Comments (2)
2024-03-29T04:37:45.772-0600 ERROR SplitRunner-20240329_103745_00063_e4xfv.0.0.0-5-739 io.trino.plugin.deltalake.DeltaLakeMetadata Failed to write checkpoint for table test_delta_concurrent_writes_df66fjlg6l.test_concurrent_inserts_select_from_same_table_h28nk6v7lf for version 2
....
Caused by: java.lang.IllegalStateException: No previously loaded snapshot found for query 20240329_103745_00063_e4xfv, table test_delta_concurrent_writes_df66fjlg6l.test_concurrent_inserts_select_from_same_table_h28nk6v7lf [local:///test_delta_concurrent_writes_df66fjlg6l/test_concurrent_inserts_select_from_same_table_h28nk6v7lf-fb80ef1630a249098808c77433e76f23] at version 1
The exception above occurs because the readVersion from DeltaLakeInsertTableHandle bypasses the mechanism for registering queriedVersions. Note that beginMerge exercises the same behavior.
Explanation
In getTableHandle we read the table at version 0 and likely register its version in queriedVersions.
In beginInsert we read the table at version 1 (because in the meantime a concurrent operation has completed successfully and created the 1.log transaction log file), but we don't register it in queriedVersions. This is why finishInsert does not find version 1 in queriedVersions.
Following this line of thinking, the TableScanNode is created based on readVersion 0 of the DeltaLakeTableHandle (obtained after getTableHandle/applyFilter), and not on the DeltaLakeInsertTableHandle created in the beginInsert method.
The DeltaLakeInsertTableHandle is created with readVersion 1 in beginInsert.
The operation is committed successfully in finishInsert and version 2 of the table is created. However, the writeCheckpointIfNeeded check fails because queriedVersion 1 is not registered.
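The sequence above can be illustrated with a small, self-contained model. This is only a sketch of the bookkeeping as described in this comment, not the Trino implementation: the class QueriedVersionsModel, its fields, and the method signatures are hypothetical, and only the method names mirror the ones discussed here.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of the behavior described above.
// It is NOT the DeltaLakeMetadata code; it only mimics the order of events.
class QueriedVersionsModel {
    // queryId -> table versions registered for that query
    private final Map<String, Set<Long>> queriedVersions = new HashMap<>();
    // latest committed version of the (single) table in this model
    private long latestVersion;

    QueriedVersionsModel(long initialVersion) {
        this.latestVersion = initialVersion;
    }

    // Reads the current version and registers it for the query.
    long getTableHandle(String queryId) {
        queriedVersions.computeIfAbsent(queryId, id -> new HashSet<>()).add(latestVersion);
        return latestVersion;
    }

    // Reads the current version again but does NOT register it,
    // which is the gap described in the comment.
    long beginInsert(String queryId) {
        return latestVersion;
    }

    // Any successful commit (ours or a concurrent one) creates the next version.
    long commit() {
        latestVersion++;
        return latestVersion;
    }

    // Fails when the read version was never registered for the query.
    void writeCheckpointIfNeeded(String queryId, long readVersion) {
        Set<Long> registered = queriedVersions.getOrDefault(queryId, new HashSet<>());
        if (!registered.contains(readVersion)) {
            throw new IllegalStateException(
                    "No previously loaded snapshot found for query " + queryId + " at version " + readVersion);
        }
    }
}
```

In this model, calling getTableHandle at version 0, then a concurrent commit, then beginInsert and commit leaves the insert with a read version of 1 that was never registered, so the final check throws.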
The expected result (if all the statements succeeded) would be that the table now contains the following data:
0,10: initial data (0.log)
1,10: the first winning commit (1.log)
1,10: the second commit, which mistakenly assumes it is scanning data beginning from readVersion 1, while the TableScanNode corresponds to readVersion 0 (2.log)
Quite likely, the readVersion on the DeltaLakeInsertTableHandle should be set from the DeltaLakeTableHandle to reflect the reality of the query plan.
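A minimal sketch of that idea follows. The classes TableHandle and InsertHandle and the beginInsert signature are hypothetical stand-ins, not the real Trino types; the only point is that the insert handle takes its readVersion from the handle the plan was built on rather than from a fresh read of the transaction log. Whether the commit then needs extra conflict checks against versions written in the meantime is outside the scope of this sketch.

```java
// Hypothetical stand-ins for DeltaLakeTableHandle and DeltaLakeInsertTableHandle.
final class ReadVersionPropagation {
    static final class TableHandle {
        final long readVersion;

        TableHandle(long readVersion) {
            this.readVersion = readVersion;
        }
    }

    static final class InsertHandle {
        final long readVersion;

        InsertHandle(long readVersion) {
            this.readVersion = readVersion;
        }
    }

    // latestVersionInLog is what a fresh read of the transaction log would return.
    // It is deliberately ignored here, so the insert handle stays consistent
    // with the version the table scan was planned against.
    static InsertHandle beginInsert(TableHandle tableHandle, long latestVersionInLog) {
        return new InsertHandle(tableHandle.readVersion);
    }
}
```

With a table handle at version 0 and a log that has already advanced to version 1, the resulting insert handle still carries readVersion 0, matching the version the scan was planned on.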
cc @pajaks
from trino.
https://github.com/trinodb/trino/actions/runs/8514229981/job/23319578692?pr=21345