Comments (7)
Just wanted to say I strongly agree with you on your first point, @asuhan. I would argue that even if you come up with a use case, it wouldn't be the best (that is, fastest) solution. If you're doing this with that much data, the person with the use case might need to adjust other parts of their system's flow. I view any select * from foo statement as poor practice (it lacks future-proofing) and "unreasonable", because it opens the door to many errors made by others working with the system.
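To illustrate the future-proofing concern, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in database (not HeavyDB), and the table foo with columns a, b, c is purely hypothetical. It shows that the shape of a select * result changes when someone else adds a column, while an explicit column list keeps returning the same shape.

```python
import sqlite3

# In-memory SQLite database as a stand-in; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (a INTEGER, b INTEGER)")
conn.execute("INSERT INTO foo VALUES (1, 2)")

star_before = conn.execute("SELECT * FROM foo").fetchone()
explicit_before = conn.execute("SELECT a, b FROM foo").fetchone()

# Later, someone else working with the system adds a column.
conn.execute("ALTER TABLE foo ADD COLUMN c INTEGER DEFAULT 0")

star_after = conn.execute("SELECT * FROM foo").fetchone()
explicit_after = conn.execute("SELECT a, b FROM foo").fetchone()

# The select * row grew by one field; the explicit projection did not change.
print(len(star_before), len(star_after))
print(explicit_before == explicit_after)
```

Any caller that unpacks the select * row into exactly two variables would start failing after the schema change, which is the kind of error the explicit column list avoids.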
from heavydb.
@neko940709 Thanks for reporting, which query? It usually means failure to optimize away an intermediate projection. We don't want to let it run anyway because it'd take a very long time / use a lot of memory.
The larger point: we're aware that we need to work towards TPC-H compatibility. Over the next few months, we plan to get most / all of the queries working. Several pieces of work related to this goal are already in progress.
from heavydb.
The query is like this:
select l_orderkey, sum(l_extendedprice * (1 - l_discount)) as revenue, o_orderdate, o_shippriority from lineitem, orders, customer where l_orderkey = o_orderkey and c_custkey = o_custkey and c_mktsegment = 'BUILDING' group by l_orderkey, o_orderdate, o_shippriority order by revenue desc, o_orderdate;
I ran the query against a 10GB dataset that I generated with the TPC-H benchmark tools, V2.17.2. The query is one of the provided samples, which I simplified. I have actually hit this exception several times.
The first time was with the following query:
select * from newdata;
The newdata table consists of 3 columns, all of type INTEGER, and is about 2.5GB in size. If this is a failure to optimize away an intermediate projection, I don't understand why such a simple query also throws the exception.
This is the log message:
I0524 23:10:21.887410 132126 MapDHandler.cpp:2295] select * from newdata;
I0524 23:10:21.888273 132126 Calcite.cpp:196] User mapd catalog mapd sql 'select * from newdata;
I0524 23:10:21.897106 132126 Calcite.cpp:221] Time marshalling in JNI 0 (ms), Time in Java Calcite 9 (ms)
E0524 23:10:22.045779 132126 MapDHandler.cpp:2331] Exception: Query would require a scan without a limit on table(s): newdata
from heavydb.
In addition, I got other exceptions during the TPC-H test. Some of the SQL statements make mapd_server crash. Here is an example:
select s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment from part, supplier, partsupp, nation, region where p_partkey = ps_partkey and s_suppkey = ps_suppkey and p_size = 30 and p_type like '%SMALL' and s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name = 'ASIA' order by s_acctbal desc, n_name, s_name, p_partkey;
This query throws an exception like this:
Check failed: static_cast<size_t>(curr_idx) < target_count_ (24 vs. 24)
from heavydb.
Apart from the TPC-H data, I also ran the queries you give on your official website. The query was executed on the flights data, which is 2.6GB in size. The query is the following:
select count(*) from flights_2008_7M where origin_name='Lambert-St Louis International' and dest_name = 'Lincoln Municipal;
And the exception is:
The complexity of matching the regular expression exceeded predefined bounds. Try refactoring the regular expression to make each choice made by the state machine unambiguous. This exception is thrown to prevent "eternal" matches that take an indefinite period time to locate.
How should I modify the SQL?
from heavydb.
I'll answer the questions one by one.
- select * from newdata tries to select everything from a big table. It's a projection without a "reasonable" limit (I think we ask it to be at most 10M). If you want to see a sample of the data, you can specify a limit and it'll work. Allowing this type of query without a limit would require us to implement pagination. I believe we'll end up doing it (for the sake of the principle of least surprise), but we haven't found a use case which wouldn't be served by specifying a reasonable limit.
- It's not an exception, it's a crash. We'll look into it very soon.
- You shouldn't modify the SQL, we'll fix it. Your query misses the end quote for the second string literal: it should be 'Lincoln Municipal', not 'Lincoln Municipal. We'll check / fix it on our website, thanks for the catch.
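The limit workaround described above can be sketched as follows. This is only an illustration under stated assumptions: it runs against an in-memory SQLite database through Python's sqlite3 module rather than against the server discussed here, the column names a, b, c are hypothetical (the thread only says newdata has three INTEGER columns), and the helper iter_pages is made up for this example. It shows a sampled read with LIMIT, and client-side paging with LIMIT / OFFSET over a stable ordering.

```python
import sqlite3

def iter_pages(conn, page_size):
    """Yield lists of rows from newdata, at most page_size rows per query.

    Hypothetical helper: pages with LIMIT/OFFSET over a fixed ORDER BY.
    """
    offset = 0
    while True:
        rows = conn.execute(
            "SELECT a, b, c FROM newdata ORDER BY a LIMIT ? OFFSET ?",
            (page_size, offset),
        ).fetchall()
        if not rows:
            return
        yield rows
        offset += len(rows)

# Stand-in table with three INTEGER columns (column names are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE newdata (a INTEGER, b INTEGER, c INTEGER)")
conn.executemany(
    "INSERT INTO newdata VALUES (?, ?, ?)",
    [(i, i * 2, i * 3) for i in range(10)],
)

# A sample of the data: the same projection, but with an explicit limit.
sample = conn.execute("SELECT a, b, c FROM newdata LIMIT 3").fetchall()

# Reading the whole table in bounded chunks instead of one unbounded scan.
pages = list(iter_pages(conn, page_size=4))
print(len(sample), [len(p) for p in pages])
```

OFFSET-based paging rescans skipped rows on many engines, so for very large tables a filter on an ordered key (for example, a > last_seen_a) is often the cheaper choice; whether that applies here depends on the engine and was not discussed in this thread.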
from heavydb.
Closing this issue for now, but please feel free to start another discussion at https://community.mapd.com/ if this is still a problem.
from heavydb.