Comments (3)
I tested different file sizes, and the threshold seems to be 128 MB (my HDFS block size): files <= 128 MB read fine, files > 128 MB fail.
When a file is larger than 128 MB, it is stored in multiple HDFS blocks. Could that make the dictionary page unavailable to the second data block?
from presto.
@shangxinli, please take a look.
Hi there, any update on this, @tdcmeehan?
Meanwhile, I ran a lot of tests and found a config that works around this problem. I set the following config to true:
hive.order-based-execution-enabled=true
Presto reads a file in HDFS by creating multiple splits, which divides the Parquet file into multiple parts. If PME (Parquet Modular Encryption) is enabled on the file, each page becomes indivisible, because all of its bytes are needed to decrypt it. So I think something is wrong in the split creation process.
This config makes the "hive files become non-splittable", which bypasses the splitting process and makes everything work fine.
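To illustrate why 128 MB looks like the threshold, here is a minimal Python sketch of how a file might be divided into byte-range splits. It is a simplified model, not Presto's actual split logic: `plan_splits`, `BLOCK_SIZE`, and the `splittable` flag are hypothetical names, and the split size is assumed to equal the HDFS block size.

```python
from typing import List, Tuple

# Assumption: the split size equals the 128 MB HDFS block size mentioned above.
BLOCK_SIZE = 128 * 1024 * 1024


def plan_splits(
    file_size: int,
    max_split_size: int = BLOCK_SIZE,
    splittable: bool = True,
) -> List[Tuple[int, int]]:
    """Return (offset, length) byte ranges that cover a file of file_size bytes.

    Hypothetical helper for illustration only; it models the idea of
    splitting a file, not the real Presto implementation.
    """
    if file_size <= 0:
        return []
    if not splittable:
        # A non-splittable file is always read as a single split.
        return [(0, file_size)]
    splits = []
    offset = 0
    while offset < file_size:
        length = min(max_split_size, file_size - offset)
        splits.append((offset, length))
        offset += length
    return splits


if __name__ == "__main__":
    # A file of exactly one block yields a single split.
    print(len(plan_splits(BLOCK_SIZE)))
    # One byte more and a second split appears.
    print(len(plan_splits(BLOCK_SIZE + 1)))
    # With splitting disabled, the larger file is a single split again.
    print(len(plan_splits(BLOCK_SIZE + 1, splittable=False)))
```

In this model, a file of up to 128 MB is always read as one split, so a reader never starts in the middle of the file. Anything larger produces a second split that begins at an arbitrary byte offset, which matches the observed failure; treating the file as non-splittable brings it back to a single split.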