starburstdata / dbt-trino
The Trino (https://trino.io/) adapter plugin for dbt (https://getdbt.com)
License: Apache License 2.0
Hi, we have a scenario where the data is written to S3 with a Glue metastore. There is a tricky situation where sometimes a table is created with no data, i.e. the query returns 0 rows, so the table is created but contains no data.
When we try to query this table later we get the error:
SQL Error [16777220]: Query failed (#20220711_175341_09350_zawua): Partition location does not exist: s3://bucket/table...
Is there a way of pointing out to Trino that the table is empty? I have no issues querying this table using Athena.
Thank you
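One avenue worth checking (an assumption on my part, not from the thread): the Hive connector has a catalog configuration property for ignoring partitions whose storage location is missing. Verify it against your Trino version's docs before relying on it:
# etc/catalog/hive.properties (sketch; assumes the Hive connector over Glue/S3)
hive.ignore-absent-partitions=true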
The properties feature implemented recently is very useful, but it would be better if it supported the following use case:
+properties:
  transactional: "{{ env_var('DBT_TRANSACTION', 'true') }}"
In our project we make extensive use of the dbt_utils library. dbt_utils.union_relations doesn't work because it generates invalid SQL:
create table "dataproduct_levenslangleren"."zone_levenslangleren"."levenslang_leren__dbt_tmp"
as (
  (
    select
      cast('"dataproduct_levenslangleren"."zone_levenslangleren"."levenslang_leren_ho"' as string) as _dbt_source_relation,
      cast("laadoperatie" as character varying(256)) as "laadoperatie",
      cast("record max laadoperatie" as character varying(256)) as "record max laadoperatie",
Two issues here:
This requires the adapter to provide a specific implementation if the default ('string') is not supported:
{% macro trino__type_string() %}
varchar
{% endmacro %}
Will post PR shortly.
** would have been a Discussion if they were enabled for this repo **
As part of our new adapter verification process, the dbt core adapters engineering team and I met today to do a high-level review of the dbt-trino codebase. Good news: your adapter passes the sniff test! That means that while the engineers haven't yet had the time to review every single line of code (though they plan to in the next few sprints), there are no glaring red flags that prevent the adapter from being verified.
That said, here are some notes on things we noticed:
We also looked at the SQLAdapter methods. It wasn't clear to us that TrinoCredentialsFactory needed to exist. But using dataclass parameter validation to guide users to the correct sets of profile fields is a great choice. Perhaps though there's a way we can work to make this easier.
dbt-trino/dbt/adapters/trino/connections.py
Lines 30 to 45 in e9194b1
dbt-trino/dbt/adapters/trino/connections.py
Lines 333 to 353 in e9194b1
dbt-trino/dbt/adapters/trino/connections.py
Lines 360 to 368 in e9194b1
For example: could TrinoAdapter.add_query() be upstreamed into dbt-core?
dbt-trino/dbt/adapters/trino/connections.py
Lines 416 to 450 in e9194b1
Please let us know if there are any other particular areas you'd like us to review more closely. Looking forward to collaborating!
I have an idea for a feature and wonder if it makes sense to be part of dbt-trino or not.
Trino is commonly deployed on AWS with the AWS Glue Catalog; you see people referencing it a lot. AWS Glue Catalog doesn't support renaming tables, so the normal table materialization doesn't work. So anyone using Trino, dbt and Glue has to override the materialization macro to be able to materialize tables.
The way we do this is we just reorder things: instead of create new, rename old, rename new, drop old, we just do drop old, create new. That causes the table to be unavailable for a period of time, but it works as a quick fix. We also have to maintain our custom macro and ship it with every dbt project.
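Schematically, the reordering looks like this (table names hypothetical):
-- Default flow, which Glue rejects because of the renames:
--   create table tmp as select ...;
--   alter table target rename to backup;
--   alter table tmp rename to target;
--   drop table backup;
-- Glue-friendly flow, with a brief unavailability window:
drop table if exists analytics.orders;
create table analytics.orders as
select * from staging.orders;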
Could this be looked at as part of dbt-trino? Maybe creating an official "table_norename" materialization, or looking into materialized views, insert overwrite, or something similar? It would make the lives of everyone using Trino with AWS much easier.
Attribute | Details
---|---
Library | PyYAML
Description | YAML parser and emitter for Python
Language | PYTHON
Vulnerability | Arbitrary Code Execution
Vulnerability description | PyYAML is vulnerable to arbitrary code execution. The .yaml files that are parsed by FullLoader use the unsafe yaml.load() by default. This vulnerability exists due to a bypass of the fix for CVE-2020-1747.
CVE | CVE-2020-14343
CVSS score | 10
Vulnerability present in version/s | 3.12-5.3.1
Found library version/s | 5.3.1
Vulnerability fixed in version | 5.4b1
Library latest version | 6.0
Fix |
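For context, the class of fix here is upgrading PyYAML or avoiding the unsafe loader; a minimal sketch:
import yaml

# Unsafe: FullLoader can be tricked into constructing arbitrary Python
# objects (CVE-2020-14343).
# data = yaml.load(open("config.yml"), Loader=yaml.FullLoader)

# Safe: SafeLoader only constructs plain Python types.
with open("config.yml") as f:
    data = yaml.safe_load(f)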
Links:
When I add source freshness configuration and run dbt source freshness with dbt-trino, I get:
Runtime Error in source mv_mantis_mfr (models/staging/sources.yml)
ERROR: current_timestamp macro not implemented for adapter trino
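For context, dbt resolves current_timestamp through adapter dispatch, so the fix is an adapter-prefixed macro along these lines (a sketch; the shipped implementation may differ):
{% macro trino__current_timestamp() %}
    current_timestamp
{% endmacro %}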
Recently, Trino added support for the MERGE statement (https://trino.io/docs/current/sql/merge.html). dbt-trino can now benefit from it by adding a merge incremental strategy.
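As a sketch of what the generated SQL could look like (table and column names hypothetical):
merge into analytics.orders as t
using analytics.orders__dbt_tmp as s
  on t.order_id = s.order_id
when matched then
  update set status = s.status
when not matched then
  insert (order_id, status) values (s.order_id, s.status)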
Looks like Trino's Python client supports JWT. An easy lift would be for a user to pass a JWT token as a target field in the user's profiles.yml.
The user below could then use some script to store the token as an environment variable.
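A sketch of what such a profile might look like (the method and jwt_token field names here are assumptions illustrating the proposal, not existing options):
my_project:
  outputs:
    dev:
      type: trino
      method: jwt
      jwt_token: "{{ env_var('TRINO_JWT_TOKEN') }}"  # token stored in an env var, per the suggestion above
      host: trino.example.com
      port: 443
      database: hive
      schema: analytics
  target: dev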
Happy to advise on how this might work, LMK!
I'm trying to connect dbt to Presto, as that is one of our data stores. The only way we are able to connect to it is through Okta verification, using a token that lasts only 24 hours. From the looks of the documentation on profile setup, only LDAP and Kerberos are supported. Is there currently a way to fetch a token, or is this something that would need to be done on my end and passed to the Presto profile?
The dbt seed command does not currently escape some data types supported by Trino.
Perform the changes in the method dbt.adapters.trino.connections.ConnectionWrapper._escape_value.
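A minimal sketch of what type-aware escaping might look like (the body below is hypothetical, not the adapter's actual code):
import datetime
from decimal import Decimal

def _escape_value(value):
    # Render a Python value as a typed Trino SQL literal.
    if value is None:
        return "NULL"
    if isinstance(value, bool):  # check before int: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, str):
        return "'{}'".format(value.replace("'", "''"))
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, Decimal):
        return "DECIMAL '{}'".format(value)
    if isinstance(value, datetime.datetime):  # check before date: datetime subclasses date
        return "TIMESTAMP '{}'".format(value.isoformat(sep=" "))
    if isinstance(value, datetime.date):
        return "DATE '{}'".format(value.isoformat())
    raise ValueError("Cannot escape value of type {}".format(type(value)))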
Make sure to add a corresponding model which contains all these types in the automated tests.
seeds:
  jaffle_shop:
    seed_data_types:
      +column_types:
        c_timestamp: timestamp
        c_date: date
        c_time: time
        c_char: char(10)
        short_decimal: decimal(10,2)
        long_decimal: decimal(10,2)
        c_real: real
        c_float: float
Check https://docs.getdbt.com/docs/building-a-dbt-project/seeds (the FAQ section as well) to get a better understanding of seeds in dbt. Also write tests for the seeds (as documented in "How do I test and document seeds?" in the FAQ section) to ensure compatibility with the functionality provided by dbt.
When using a schema with mixed case, I expect dbt-trino to be able to overwrite my existing tables.
The first dbt run succeeds, however subsequent runs fail with
Compilation Error in model base_orders (models/base/base_orders.sql)
20:23:53 When searching for a relation, dbt found an approximate match. Instead of guessing
20:23:53 which relation to use, dbt will move on. Please delete "datalake"."analyticstest"."base_orders", or rename it to be less ambiguous.
Use a mixed-case profile, e.g.:
my_dbt_trino_project:
  outputs:
    dev:
      type: trino
      method: none  # optional, one of {none | ldap | kerberos}
      user: admin
      database: datalake
      host: localhost
      port: 8080
      schema: analyticsTEST
      threads: 1
  target: dev
Run dbt run multiple times.
❯ dbt run
20:23:44 Running with dbt=1.2.1
20:23:44 Unable to do partial parsing because profile has changed
20:23:45 [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
There are 1 unused configuration paths:
20:23:45 Found 4 models, 0 tests, 1 snapshot, 0 analyses, 272 macros, 0 operations, 0 seed files, 3 sources, 0 exposures, 0 metrics
20:23:45
20:23:46 Concurrency: 1 threads (target='dev')
20:23:46
20:23:46 1 of 4 START view model analyticsTEST.base_customers ........................... [RUN]
20:23:47 1 of 4 OK created view model analyticsTEST.base_customers ...................... [SUCCESS in 0.30s]
20:23:47 2 of 4 START view model analyticsTEST.base_orders .............................. [RUN]
20:23:47 2 of 4 OK created view model analyticsTEST.base_orders ......................... [SUCCESS in 0.09s]
20:23:47 3 of 4 START view model analyticsTEST.base_payments ............................ [RUN]
20:23:47 3 of 4 OK created view model analyticsTEST.base_payments ....................... [SUCCESS in 0.11s]
20:23:47 4 of 4 START table model analyticsTEST.orders .................................. [RUN]
20:23:49 4 of 4 OK created table model analyticsTEST.orders ............................. [SUCCESS in 2.03s]
20:23:49
20:23:49 Finished running 3 view models, 1 table model in 0 hours 0 minutes and 4.24 seconds (4.24s).
20:23:49
20:23:49 Completed successfully
20:23:49
20:23:49 Done. PASS=4 WARN=0 ERROR=0 SKIP=0 TOTAL=4
❯ dbt run
20:23:53 Running with dbt=1.2.1
20:23:53 [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
There are 1 unused configuration paths:
20:23:53 Found 4 models, 0 tests, 1 snapshot, 0 analyses, 272 macros, 0 operations, 0 seed files, 3 sources, 0 exposures, 0 metrics
20:23:53
20:23:53 Concurrency: 1 threads (target='dev')
20:23:53
20:23:53 1 of 4 START view model analyticsTEST.base_customers ........................... [RUN]
20:23:53 1 of 4 ERROR creating view model analyticsTEST.base_customers .................. [ERROR in 0.01s]
20:23:53 2 of 4 START view model analyticsTEST.base_orders .............................. [RUN]
20:23:53 2 of 4 ERROR creating view model analyticsTEST.base_orders ..................... [ERROR in 0.00s]
20:23:53 3 of 4 START view model analyticsTEST.base_payments ............................ [RUN]
20:23:53 3 of 4 ERROR creating view model analyticsTEST.base_payments ................... [ERROR in 0.00s]
20:23:53 4 of 4 SKIP relation analyticsTEST.orders ...................................... [SKIP]
20:23:53
20:23:53 Finished running 3 view models, 1 table model in 0 hours 0 minutes and 0.25 seconds (0.25s).
20:23:53
20:23:53 Completed with 3 errors and 0 warnings:
20:23:53
20:23:53 Compilation Error in model base_customers (models/base/base_customers.sql)
20:23:53 When searching for a relation, dbt found an approximate match. Instead of guessing
20:23:53 which relation to use, dbt will move on. Please delete "datalake"."analyticstest"."base_customers", or rename it to be less ambiguous.
20:23:53 Searched for: "datalake"."analyticsTEST"."base_customers"
20:23:53 Found: "datalake"."analyticstest"."base_customers"
20:23:53
20:23:53 > in macro create_or_replace_view (macros/materializations/models/view/create_or_replace_view.sql)
20:23:53 > called by macro materialization_view_trino (macros/materializations/view.sql)
20:23:53 > called by model base_customers (models/base/base_customers.sql)
20:23:53
20:23:53 Compilation Error in model base_orders (models/base/base_orders.sql)
20:23:53 When searching for a relation, dbt found an approximate match. Instead of guessing
20:23:53 which relation to use, dbt will move on. Please delete "datalake"."analyticstest"."base_orders", or rename it to be less ambiguous.
20:23:53 Searched for: "datalake"."analyticsTEST"."base_orders"
20:23:53 Found: "datalake"."analyticstest"."base_orders"
20:23:53
20:23:53 > in macro create_or_replace_view (macros/materializations/models/view/create_or_replace_view.sql)
20:23:53 > called by macro materialization_view_trino (macros/materializations/view.sql)
20:23:53 > called by model base_orders (models/base/base_orders.sql)
20:23:53
20:23:53 Compilation Error in model base_payments (models/base/base_payments.sql)
20:23:53 When searching for a relation, dbt found an approximate match. Instead of guessing
20:23:53 which relation to use, dbt will move on. Please delete "datalake"."analyticstest"."base_payments", or rename it to be less ambiguous.
20:23:53 Searched for: "datalake"."analyticsTEST"."base_payments"
20:23:53 Found: "datalake"."analyticstest"."base_payments"
20:23:53
20:23:53 > in macro create_or_replace_view (macros/materializations/models/view/create_or_replace_view.sql)
20:23:53 > called by macro materialization_view_trino (macros/materializations/view.sql)
20:23:53 > called by model base_payments (models/base/base_payments.sql)
20:23:53
20:23:53 Done. PASS=0 WARN=0 ERROR=3 SKIP=1 TOTAL=4
OS: macos
dbt-trino version: 1.2.2
Trino server version: 396
Python version: 3.9.10
Attribute | Details
---|---
Library | Jinja2
Description | A very fast and expressive template engine.
Language | PYTHON
Vulnerability | Cross-Site Scripting (XSS)
Vulnerability description | Jinja2 is vulnerable to cross-site scripting. An attacker is able to inject and execute arbitrary JavaScript through the gettext and ngettext functions due to the lack of output sanitization.
CVE | null
CVSS score | 5.8
Vulnerability present in version/s | 2.5-3.0.0
Found library version/s | 2.11.3
Vulnerability fixed in version |
Library latest version | 3.1.2
Fix | There is no fix version. Apply the following fix:
Links:
Hey dbt-trino maintainers!
We've been developing a new framework for all integration/functional testing in dbt-core. One of our goals in the initiative has been to replace the suite of basic test cases for adapter plugins.
For now, the framework is comprised of test utilities shipped with dbt-core==1.1.0b1, replacing the old .dbtspec / pytest-dbt-adapter approach.
I have some draft docs (dbt-labs/docs.getdbt.com#1263) for using the new framework: https://deploy-preview-1263--docs-getdbt-com.netlify.app/docs/contributing/testing-a-new-adapter
It should be much easier to:
If it proves helpful, you can also check out the way we've implemented the framework in some of our own adapter plugins:
Thanks for all the great work :)
I have been experiencing this issue for a long time. It happens when we execute a query that takes more than 5 minutes. To reproduce, use dbt seed to insert enough rows that the operation takes more than 5 minutes.
The reason for this issue is that, in Trino, the default value of transaction.idle-timeout is 5 minutes, and dbt executes commands in the following flow, which causes this error:
START TRANSACTION
-- statements running longer than 5 minutes
COMMIT  <- this will throw an error:
trino.exceptions.DatabaseError: failed to commit transaction 73dc7443-ec53-4911-95be-a372792743b4: TrinoUserError(type=USER_ERROR, name=UNKNOWN_TRANSACTION, message="Unknown transaction ID: 73dc7443-ec53-4911-95be-a372792743b4. Possibly expired? Commands ignored until end of transaction block", query_id=20211121_145311_00013_7rttf)
Is there any specific reason why transactions are explicitly started in dbt-trino, despite using auto-commit in the Python connection?
dbt-trino/dbt/adapters/trino/connections.py
Line 195 in 8088447
Other reference:
I'm transferring this issue over, as I previously incorrectly posted to the presto adapter repo.
As far as I can tell, there is no way to pass in a path to a certificate for an SSL connection to the API. I believe this needs to be set on the trino connection that is created.
dbt-trino/dbt/adapters/trino/connections.py
Line 185 in afa4091
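For reference, trino-python-client passes its verify argument through to requests, which accepts either a boolean or a path to a CA bundle; a sketch (host and path hypothetical):
import trino

conn = trino.dbapi.connect(
    host="trino.example.com",
    port=443,
    user="dbt",
    http_scheme="https",
    verify="/etc/ssl/certs/internal-ca.pem",  # CA bundle path instead of True/False
)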
A great and unique feature of Trino is that it allows users to authenticate through OIDC. This is not something commonly seen with databases (only Snowflake and Denodo support this feature). It's also much safer, as it doesn't require storing any local credentials.
I saw that this feature is implemented in the trino-python-client but not yet supported in the dbt adapter.
Some gotchas:
I can only find the ldap method for authenticating to a Trino server (with user and password). Does it support a Trino server configured with a password file? If not, can you add this support?
I've tried, but got the following error:
$ dbt run
05:33:21 Running with dbt=1.0.1
05:33:21 Found 1 model, 2 tests, 0 snapshots, 0 analyses, 167 macros, 0 operations, 0 seed files, 0 sources, 0 exposures, 0 metrics
05:33:21
05:33:23 Encountered an error:
Runtime Error
Runtime Error
error 401: b'Unauthorized'
Hey guys,
If your destination table doesn't have the same columns as your model, the whole model fails.
Example
If Table A has ID, Text1, Date1 in its field list and the dbt model for Table A has only ID, Text1, on dbt run the model will fail regardless of the on_schema_change setting. It seems the incremental macro retrieves the list of columns to insert from the destination table and never checks them against the model for schema diffs?
Error while seeding a big CSV file:
Runtime Error in seed com_event_log_info (data/<filename>csv)
TrinoUserError(type=USER_ERROR, name=QUERY_TEXT_TOO_LARGE, message="Query text length (1380022) exceeds the maximum length (1000000)", query_id=20211116_121112_10193_2mbb4)
The same file could be seeded without problems with the dbt-presto driver.
I tried using the materialization with presto (not trino) and found that it works perfectly, except when a custom schema is used.
This line strips off the database and schema names from the temp relation, causing the query to fail.
I haven't yet had the chance to test this with trino, but I suspect that the behaviour would be the same.
Opening this task to discuss the best approach to adding incremental materializations.
Trino is planning on adding the MERGE operation (trinodb/trino#7708), but I'd say we can start by adding the feature with a delete+insert approach, as stated here.
For now, I'd like to focus on adding support for the Hive and Iceberg connectors.
Any thoughts on this?
Add testing for Starburst Galaxy to ensure dbt-trino compatibility.
Starburst Galaxy users.
We've just published the release cut of dbt-core 1.2.0: dbt-core 1.2.0rc1 (PyPI | GitHub release notes).
dbt-labs/dbt-core#5468 is an open discussion with more detailed information, and dbt-labs/dbt-core#5474 is for keeping track of the community's progress on releasing 1.2.0.
Below is a checklist of work that would enable a successful 1.2.0 release of your adapter, including the BaseDocsGenerate and BaseDocsGenReferences test cases.
dbt-labs/dbt-core#5432 might make it into the second release cut in the next week, in which case you might also want to:
As referenced in #35
In https://github.com/dbt-labs/dbt-utils/blob/73a65562bf3d9b31a1eca0b27118056ad048d8fc/macros/cross_db_utils/datatypes.sql#L3-L6 there is a utility method used when casting to string
Add Trino-specific _type_xxx methods in the Trino adapter, and also add a corresponding unit test which creates a model using CAST for all these types. Maybe this testing thought is wrong; if so, please use https://github.com/dbt-labs/dbt-utils#union_relations-source to test this functionality.
The following issue in Trino (trinodb/trino#6658) requires us to override the current_timestamp macro for the Iceberg catalog.
Trino uses 3 digits of subsecond precision in date and time functions, but Iceberg expects timestamps with 6 digits of subsecond precision.
As a result we can't create an Iceberg table from Trino:
create table iceberg.examples.test
as select now() as n;
Exception:
Timestamp precision (3) not supported for Iceberg. Use "timestamp(6) with time zone" instead.
Once this issue is resolved, we can remove this documented limitation.
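In the meantime, casting to the precision Iceberg expects works around it (a sketch):
create table iceberg.examples.test
as select cast(now() as timestamp(6) with time zone) as n;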
Is it possible to update the version available for download through PyPI? I am using some Docker images for dbt and would like to be able to pull the updated version of this package during build to get some recent updates. Currently it pulls 0.20.0.
I think this needs an update to support the current version of dbt.
When running "dbt debug" the following error is returned:
dbt was unable to connect to the specified database.
The database returned the following error:
Runtime Error
'str' object has no attribute '_message'
dbt is set to use the Trino driver, which in turn is configured to talk to an AWS Data Catalog and S3.
By default, dbt works like this (ref from dbt-labs/dbt-core#1558):
When dbt creates a relation in the db, add a special flag to the relation (defaults to False)
When checking a node name match:
if that flag is present and quoting is disabled, do a lowercase compare
otherwise remain case sensitive
The default behavior will fail in the following case:
When store_failures is enabled, if the test includes a Boolean parameter, dbt will try to save a test model with the name False. The first save will be fine, but the second will fail, because dbt will compare the False model with the false model (the one that has been saved).
The error will look like this:
06:21:21 Compilation Error in test accepted_values_model____False__0__1 (models/xxx/model.yml)
06:21:21 When searching for a relation, dbt found an approximate match. Instead of guessing
06:21:21 which relation to use, dbt will move on. Please delete "hive"."lab_dbt_xxx"."accepted_values_model____false__0__1", or rename it to be less ambiguous.
06:21:21 Searched for: "hive"."lab_dbt_xxx"."accepted_values_model____False__0__1" <-- False is in upper case
06:21:21 Found: "hive"."lab_dbt_xxx"."accepted_values_model____false__0__1"
Expected behavior: no error, and the test model (result) should be saved correctly.
To reproduce: set store_failures to true and use any test containing a boolean value, like this:
- name: some_column_name
  quote: true
  tests:
    - accepted_values:
        values:
          - 0
          - 1
        quote: false
OS: linux
dbt version: 1.1.0/1.2.1
dbt-trino version: 1.1.0/1.2.2
Python version: 3.8
Add comments to the catalog macro, since currently they are returned as NULLs. It will be a slight breaking change, since it will require access to the system catalog in Trino.
Tools that use dbt metadata.
Table and column descriptions are not persisted with dbt-trino
In the current version, impersonation is not working in combination with JWT authentication.
It seems that the authentication factory for JWT resets the username set in the profile.
For dbt-core 1.2, we're going to move the default cross-database macros into dbt-core. As a maintainer, you'll move your cross_db_utils folder from dbt-trino-utils into dbt-trino at that time.
I'll be happy to help you accomplish those details when that time comes.
You can read a little about the background of this effort here and here.
Enable snapshot and merge tests for Galaxy once Galaxy is upgraded to 393.
Check the TODOs in the code in test_incremental_schema, test_incremental_merge and test_snapshot.
Looks like using the trino (or presto) Python client allows for setting session properties; however, there is currently no way to set these and have them persist into the transaction when running dbt. It also appears the Python client needs these set on the actual connection. If a pre-hook is run, that is of course also run outside of a transaction and does not affect the running queries. Having the option to set session properties can significantly increase performance for pipelines through Trino.
The __init__ behind connect does include session_properties in trino-python-client:
def __init__(
self,
host,
port=constants.DEFAULT_PORT,
user=None,
source=constants.DEFAULT_SOURCE,
catalog=constants.DEFAULT_CATALOG,
schema=constants.DEFAULT_SCHEMA,
session_properties=None,
http_headers=None,
http_scheme=constants.HTTP,
auth=constants.DEFAULT_AUTH,
redirect_handler=None,
max_attempts=constants.DEFAULT_MAX_ATTEMPTS,
request_timeout=constants.DEFAULT_REQUEST_TIMEOUT,
isolation_level=IsolationLevel.AUTOCOMMIT,
verify=True
):
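For illustration, session properties could then be supplied at connection time (catalog, schema, and the property name below are just examples):
import trino

conn = trino.dbapi.connect(
    host="localhost",
    port=8080,
    user="dbt",
    catalog="hive",
    schema="analytics",
    session_properties={"query_max_run_time": "2h"},
)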
Can dbt-trino be unpinned from the specific dbt-core patch version?
Per dbt's docs, this seems in line with the way they are managing dbt-core versioning:
https://docs.getdbt.com/docs/core-versions#upgrading-to-new-patch-versions
In my particular use case I want to take advantage of dbt-core's 1.0.4 patch.
When using seed files and running dbt seed, the generated insert SQL does not work, as types are not specified.
For example, a seed file with the following contents:
data_date_trunc.csv
updated_at,day,month
2018-01-05 12:00:00,2018-01-05,2018-01-01
,,
with the following config in dbt_project.yml:
seeds:
  data_date_trunc:
    +column_types:
      updated_at: timestamp(3)
      day: date
      month: date
This creates the following CREATE TABLE statement:
CREATE TABLE memory.default.data_date_trunc (
updated_at timestamp(3),
day date,
month date
)
The following insert is generated:
insert into "memory"."default"."data_date_trunc" (updated_at, day, month) values
('2018-01-05 12:00:00','2018-01-05','2018-01-01'),(NULL,NULL,NULL)
which fails with
io.trino.spi.TrinoException: Insert query has mismatched column types: Table: [timestamp(3), date, date], Query: [varchar(19), varchar(10), varchar(10)]
at io.trino.sql.analyzer.SemanticExceptions.semanticException(SemanticExceptions.java:48)
at io.trino.sql.analyzer.SemanticExceptions.semanticException(SemanticExceptions.java:43)
at io.trino.sql.analyzer.StatementAnalyzer$Visitor.visitInsert(StatementAnalyzer.java:533)
at io.trino.sql.analyzer.StatementAnalyzer$Visitor.visitInsert(StatementAnalyzer.java:404)
at io.trino.sql.tree.Insert.accept(Insert.java:68)
at io.trino.sql.tree.AstVisitor.process(AstVisitor.java:27)
at io.trino.sql.analyzer.StatementAnalyzer$Visitor.process(StatementAnalyzer.java:421)
at io.trino.sql.analyzer.StatementAnalyzer.analyze(StatementAnalyzer.java:384)
at io.trino.sql.analyzer.Analyzer.analyze(Analyzer.java:79)
at io.trino.sql.analyzer.Analyzer.analyze(Analyzer.java:71)
at io.trino.execution.SqlQueryExecution.analyze(SqlQueryExecution.java:254)
at io.trino.execution.SqlQueryExecution.<init>(SqlQueryExecution.java:179)
at io.trino.execution.SqlQueryExecution$SqlQueryExecutionFactory.createQueryExecution(SqlQueryExecution.java:776)
at io.trino.dispatcher.LocalDispatchQueryFactory.lambda$createDispatchQuery$0(LocalDispatchQueryFactory.java:132)
at io.trino.$gen.Trino_369____20220206_204929_2.call(Unknown Source)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:74)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
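For comparison, an insert using Trino's typed literals would type-check against the table (a sketch of the desired output, not what the adapter currently generates):
insert into "memory"."default"."data_date_trunc" (updated_at, day, month) values
(TIMESTAMP '2018-01-05 12:00:00', DATE '2018-01-05', DATE '2018-01-01'),
(NULL, NULL, NULL)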
More examples can be run through the tests on https://github.com/innoverio/dbt-trino-utils.
The following issues are preventing us from using the default dbt merge macro:
Once these are resolved, we should remove our macro override.
Implement automated changelog tracking using Changie.
Implement a pre-commit hook with automatic file formatting (black), linting (flake8), and code analysis (mypy).
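A sketch of a .pre-commit-config.yaml wiring up the named tools (the revs are illustrative, not pinned recommendations):
repos:
  - repo: https://github.com/psf/black
    rev: 22.3.0
    hooks:
      - id: black
  - repo: https://github.com/pycqa/flake8
    rev: 4.0.1
    hooks:
      - id: flake8
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v0.961
    hooks:
      - id: mypy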
Trino at this moment does not support MERGE statements, which is why this delete+insert materialization is implemented by first using a DELETE statement to remove records in the target table that are to be updated, and then an INSERT statement, as sketched below.
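A sketch of that flow (table and key names hypothetical):
delete from analytics.orders
where order_id in (select order_id from analytics.orders__dbt_tmp);

insert into analytics.orders
select * from analytics.orders__dbt_tmp;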
In the README file there are only examples of possible values for session_properties, but there is no list of which values are accepted or how these configs relate to Trino's session properties.
To test advanced features of dbt-trino, tests should be able to execute queries like ALTER TABLE ADD/DROP COLUMN on the test catalog. Currently, all tests are executed against the memory connector, which has only limited support for SQL statements. A PostgreSQL instance could easily be spun up in the docker-compose setup used in tests, alongside the Trino/Starburst cluster. The migration process will include:
We plan to use client_tags in Trino to set resource group policies. It would be really helpful if we could set client tags in the Trino profile configuration.
If someone can point me to where profile configuration parameters are set/defined, I would really like to contribute.
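A sketch of what the requested profile option might look like (client_tags here is the proposed field, not an existing option; trino-python-client itself accepts a client_tags argument):
my_project:
  outputs:
    dev:
      type: trino
      host: localhost
      port: 8080
      user: dbt
      database: hive
      schema: analytics
      client_tags:
        - dbt
        - reporting
  target: dev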
As the title says.
I use trino->postgresql and can execute
select account_id, array_agg(c_id) from ( select * from postgresql.public.temp_out limit 1000) group by account_id;
but when I run the same in dbt I get
TrinoUserError(type=USER_ERROR, name=NOT_SUPPORTED, message="Unsupported column type: array(bigint)"
pip freeze | grep dbt
dbt-core==0.21.0
dbt-extractor==0.4.0
dbt-trino==0.21.0
dbt-core v1.1 introduced a new testing framework (#57).
In dbt-trino it was introduced by #65, and some tests were already moved to the new testing approach. Some tests from /docker/dbt/jaffle_shop/ are still using the old testing setup. The purpose of this issue is to move all outstanding tests to the new testing framework, to completely get rid of jaffle_shop. Some tests could be covered by the actively evolving dbt-tests-adapter framework.
Useful links: