
Comments (2)

findinpath commented on August 16, 2024
2024-03-29T04:37:45.772-0600	ERROR	SplitRunner-20240329_103745_00063_e4xfv.0.0.0-5-739	io.trino.plugin.deltalake.DeltaLakeMetadata	Failed to write checkpoint for table test_delta_concurrent_writes_df66fjlg6l.test_concurrent_inserts_select_from_same_table_h28nk6v7lf for version 2
....
Caused by: java.lang.IllegalStateException: No previously loaded snapshot found for query 20240329_103745_00063_e4xfv, table test_delta_concurrent_writes_df66fjlg6l.test_concurrent_inserts_select_from_same_table_h28nk6v7lf [local:///test_delta_concurrent_writes_df66fjlg6l/test_concurrent_inserts_select_from_same_table_h28nk6v7lf-fb80ef1630a249098808c77433e76f23] at version 1

The exception above occurs because the readVersion from DeltaLakeInsertTableHandle bypasses the mechanism for registering queriedVersions:

getMandatoryCurrentVersion(fileSystem, tableLocation, table.getReadVersion()),

checkState(queriedVersions.put(table, snapshot.getVersion()) == null, "queriedLocations changed concurrently for %s", table);

Note that the same behavior is exercised in beginMerge as well.

Explanation

In getTableHandle we read the table at version 0 and likely register its version in queriedVersions.
In beginInsert we read the table at version 1 (because a concurrent operation has completed successfully in the meantime and created the 1.log transaction log file), but don't register it in queriedVersions. This is why finishInsert doesn't find version 1 in queriedVersions.

Following this line of thinking, the TableScanNode is created from readVersion 0 of the DeltaLakeTableHandle (obtained after getTableHandle/applyFilter), not from the DeltaLakeInsertTableHandle created in beginInsert.

The DeltaLakeInsertTableHandle is created with readVersion 1 in beginInsert.
The operation is committed successfully in finishInsert, which creates version 2 of the table. The writeCheckpointIfNeeded check then fails because queried version 1 is not registered.
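
To make the sequence above concrete, here is a minimal, self-contained sketch of the bookkeeping. It is not Trino's code: the class QueriedVersionsSketch, its methods, and the single-string table key are hypothetical stand-ins, and the only thing it models is that the version read in beginInsert is never registered.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the behavior described above; not Trino's actual classes.
class QueriedVersionsSketch {
    // Versions registered for the current query, keyed by table name.
    private final Map<String, Long> queriedVersions = new HashMap<>();
    // Latest committed table version (stands in for the transaction log).
    private long latestVersion;

    QueriedVersionsSketch(long initialVersion) {
        this.latestVersion = initialVersion;
    }

    // Stand-in for getTableHandle: reads the latest version and registers it.
    long getTableHandle(String table) {
        long version = latestVersion;
        if (queriedVersions.put(table, version) != null) {
            throw new IllegalStateException("queriedVersions changed concurrently for " + table);
        }
        return version;
    }

    // Another writer commits between planning and beginInsert.
    void concurrentCommit() {
        latestVersion++;
    }

    // Stand-in for beginInsert: reads the latest version but does NOT register it.
    long beginInsert(String table) {
        return latestVersion;
    }

    // Stand-in for the commit in finishInsert: creates the next version.
    long commit() {
        return ++latestVersion;
    }

    // Stand-in for the checkpoint step: needs the snapshot registered for readVersion.
    void writeCheckpoint(String table, long readVersion) {
        Long registered = queriedVersions.get(table);
        if (registered == null || registered != readVersion) {
            throw new IllegalStateException(
                    "No previously loaded snapshot found for table " + table + " at version " + readVersion);
        }
    }

    public static void main(String[] args) {
        QueriedVersionsSketch query = new QueriedVersionsSketch(0);
        long plannedVersion = query.getTableHandle("t"); // registers version 0
        query.concurrentCommit();                        // 1.log appears
        long readVersion = query.beginInsert("t");       // version 1, unregistered
        long committed = query.commit();                 // version 2 is created
        System.out.println("planned=" + plannedVersion + " read=" + readVersion + " committed=" + committed);
        try {
            query.writeCheckpoint("t", readVersion);
        } catch (IllegalStateException e) {
            System.out.println("checkpoint failed: " + e.getMessage());
        }
    }
}
```

In this model the commit itself succeeds and only the checkpoint step fails, which matches the order of events in the log above.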

The expected result (if all the statements succeeded) would be:

// Considering T1, T2, T3 being the order of completion of the concurrent INSERT operations,
// if all the operations would eventually succeed, the entries inserted per thread would look like this:
// T1: (1, 10)
// T2: (2, 10)
// T3: (3, 10)

The table now contains the following data:

0,10  initial data (0.log)
1,10  the first winning commit (1.log)
1,10  the second commit, which mistakenly assumes it is scanning data from readVersion 1, while the TableScanNode corresponds to readVersion 0 (2.log)
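
The duplicate row can be reproduced with a small model. This is only a sketch under a loud assumption: each INSERT is taken to write (row count of the snapshot it scanned, 10), which is consistent with the expected T1/T2/T3 values above but is not stated in this thread. The class StaleScanSketch and its method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model: every commit appends exactly one row, so the snapshot at
// version v consists of the first v + 1 rows.
class StaleScanSketch {
    final List<long[]> rows = new ArrayList<>();

    StaleScanSketch() {
        rows.add(new long[] {0, 10}); // initial data, 0.log
    }

    // ASSUMPTION: the INSERT writes (row count of the scanned snapshot, 10).
    long[] insertCountOf(int scanVersion) {
        long count = rows.subList(0, scanVersion + 1).size();
        long[] row = {count, 10};
        rows.add(row); // becomes the next log entry
        return row;
    }

    public static void main(String[] args) {
        StaleScanSketch table = new StaleScanSketch();
        table.insertCountOf(0); // first winning commit scans version 0
        table.insertCountOf(0); // second commit still scans version 0 (stale plan)
        for (long[] row : table.rows) {
            System.out.println(row[0] + "," + row[1]);
        }
    }
}
```

Had the second insert scanned version 1, as its insert handle claims, the same model would produce 2,10 for the third row.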

Quite likely, the readVersion on the DeltaLakeInsertTableHandle should be taken from the DeltaLakeTableHandle so that it reflects the version the query plan actually scans.
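
A rough sketch of that suggestion, using hypothetical stand-in types (TableHandle, InsertHandle and both beginInsert variants are illustrative only and do not mirror Trino's real signatures):

```java
// Hypothetical stand-ins for the two handles; only the fields relevant here are modeled.
class ReadVersionPropagationSketch {
    static final class TableHandle {
        final String table;
        final long readVersion; // version the table scan was planned against

        TableHandle(String table, long readVersion) {
            this.table = table;
            this.readVersion = readVersion;
        }
    }

    static final class InsertHandle {
        final String table;
        final long readVersion; // version the commit claims to have read

        InsertHandle(String table, long readVersion) {
            this.table = table;
            this.readVersion = readVersion;
        }
    }

    // Behavior described in this comment: the insert handle picks up whatever
    // version is latest at beginInsert time, which may be newer than the scan.
    static InsertHandle beginInsertFromLatest(TableHandle handle, long latestVersion) {
        return new InsertHandle(handle.table, latestVersion);
    }

    // Suggested direction: reuse the version already carried by the table handle,
    // so the insert handle and the scan refer to the same snapshot.
    static InsertHandle beginInsertFromPlan(TableHandle handle) {
        return new InsertHandle(handle.table, handle.readVersion);
    }

    public static void main(String[] args) {
        TableHandle planned = new TableHandle("t", 0);
        System.out.println("from latest: " + beginInsertFromLatest(planned, 1).readVersion);
        System.out.println("from plan:   " + beginInsertFromPlan(planned).readVersion);
    }
}
```

With the second variant the insert handle can no longer disagree with the scan about which snapshot was read; how the actual fix should handle the concurrent commit is left open here.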

cc @pajaks


ebyhr commented on August 16, 2024

https://github.com/trinodb/trino/actions/runs/8514229981/job/23319578692?pr=21345
