aipescience / queryparser
Parsing, processing, and translation of PostgreSQL, MySQL and ADQL queries
License: Apache License 2.0
A query with a "LIMIT N" clause at the end is not handled correctly when ADQL is selected.
The ADQL CIRCLE function raises an error if its parameters are not fixed float values.
This works:
SELECT gaia.source_id, gaia.ra, gaia.dec, gd.r_est
FROM gaiadr2.gaia_source gaia, gaiadr2_contrib.geometric_distance gd
WHERE 1 = CONTAINS(POINT('ICRS', gaia.ra, gaia.dec),
CIRCLE('ICRS',245.8962, -26.5222, 0.5))
AND gaia.phot_g_mean_mag < 15
AND gd.r_est > 1500 AND gd.r_est < 2300
AND gaia.source_id = gd.source_id
This fails:
SELECT gaia.source_id, gaia.ra, gaia.dec, gd.r_est
FROM gaiadr2.gaia_source gaia, gaiadr2_contrib.geometric_distance gd
WHERE 1 = CONTAINS(POINT('ICRS', 245.8962, -26.5222),
CIRCLE('ICRS',gaia.ra, gaia.dec, 0.5))
AND gaia.phot_g_mean_mag < 15
AND gd.r_est > 1500 AND gd.r_est < 2300
AND gaia.source_id = gd.source_id
CIRCLE with variable parameters is valid and should work.
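The two queries above describe the same geometric condition: a point lies inside a circle exactly when its angular separation from the centre is at most the radius, and that separation does not depend on which of the two positions is a constant and which is a column. A minimal sketch of that check in plain Python follows. It does not use queryparser; the function names and the sample source position are hypothetical, chosen only for illustration.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def point_in_circle(ra, dec, center_ra, center_dec, radius_deg):
    """Equivalent of 1 = CONTAINS(POINT(ra, dec), CIRCLE(center_ra, center_dec, radius))."""
    return angular_separation_deg(ra, dec, center_ra, center_dec) <= radius_deg


fixed = (245.8962, -26.5222)   # the constant position from the queries above
source = (245.9, -26.4)        # hypothetical (ra, dec) of one catalogue row

# Fixed circle, variable point (the form that works today).
inside_a = point_in_circle(*source, *fixed, 0.5)
# Fixed point, variable circle (the form that currently fails to parse).
inside_b = point_in_circle(*fixed, *source, 0.5)
```

Both calls select the same rows, which is why the second query form is expected to be accepted as well.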
SELECT g.source_id, g.ra, t.ra from `GDR1`.`gaia_source` g JOIN `GDR1`.`tgas_source` t ON g.source_id = t.source_id
parses without error, but should fail with
(1060, "Duplicate column name 'ra'")
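One way such a check could look is sketched below: derive the output name of every select-list item and report names that occur more than once. This is only an illustration in plain Python that works on select-list strings; it is not how queryparser represents columns internally, and the helper names `output_name` and `duplicate_output_names` are hypothetical.

```python
from collections import Counter


def output_name(select_item):
    """Return the result-set column name for one select-list item."""
    item = select_item.strip()
    lowered = item.lower()
    # An explicit alias wins: "t.ra AS tgas_ra" -> "tgas_ra".
    if " as " in lowered:
        return item[lowered.rindex(" as ") + 4:].strip().strip('`"')
    # Otherwise use the last part of a dotted name: "g.ra" -> "ra".
    return item.rsplit(".", 1)[-1].strip('`"')


def duplicate_output_names(select_items):
    """Return the sorted output names that appear more than once."""
    counts = Counter(output_name(item).lower() for item in select_items)
    return sorted(name for name, n in counts.items() if n > 1)


duplicates = duplicate_output_names(["g.source_id", "g.ra", "t.ra"])
no_duplicates = duplicate_output_names(["g.source_id", "g.ra", "t.ra AS tgas_ra"])
```

With the select list from the query above, `duplicates` contains `ra`; giving one of the columns an alias removes the clash.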
This passes with PostgreSQL but fails with ADQL translation:
SELECT
0.5 + FLOOR(LOG(mass)) AS logmass,
COUNT(*) AS num
FROM mdr1.fof
WHERE snapnum = 85
GROUP BY FLOOR(LOG(mass))
ORDER BY logmass
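For reference, the query builds a histogram over unit-wide logarithmic mass bins and reports each bin centre. A rough Python equivalent is sketched below. The masses are made up, and the logarithm is passed in as a parameter because the base depends on the dialect: PostgreSQL's one-argument log() is base 10, while ADQL defines LOG as the natural logarithm and LOG10 as base 10.

```python
import math
from collections import Counter


def log_mass_histogram(masses, log=math.log10):
    """Count masses per bin, mirroring GROUP BY FLOOR(LOG(mass)) ORDER BY logmass."""
    bins = Counter(0.5 + math.floor(log(m)) for m in masses)
    # Sorting by bin centre corresponds to ORDER BY logmass.
    return sorted(bins.items())


masses = [2e11, 5e11, 3e12]  # hypothetical halo masses
histogram = log_mass_histogram(masses)
```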
The PostgreSQL-specific DISTINCT ON clause is not supported by queryparser.
Example:
from queryparser.postgresql import PostgreSQLQueryProcessor
qp = PostgreSQLQueryProcessor()
sql = """
SELECT
DISTINCT ON ("source"."tycho2_id") "tycho2_id",
"source"."tycho2_dist",
"source"."source_id",
"source"."raj2000"
FROM "applause_dr3"."source_calib" AS "source"
WHERE "source"."raj2000" BETWEEN 10.0 AND 10.0005
AND "source"."tycho2_dist" IS NOT NULL
ORDER BY "source"."tycho2_dist"
"""
qp.set_query(sql)
qp.process_query()
This fails with the following error message:
---------------------------------------------------------------------------
QuerySyntaxError Traceback (most recent call last)
/tmp/ipykernel_136447/2882367489.py in <module>
17 """
18 qp.set_query(sql)
---> 19 qp.process_query()
~/.pyenv/versions/3.8.7/envs/tap_env/lib/python3.8/site-packages/queryparser/common/common.py in process_query(self, replace_schema_name, indexed_objects)
757 tree = parser.query()
758 if len(self.syntax_error_listener.syntax_errors):
--> 759 raise QuerySyntaxError(self.syntax_error_listener.syntax_errors)
760
761 if replace_schema_name is not None:
QuerySyntaxError: [(2, 9, 'ON')]
I expect a query with a DISTINCT ON clause to be processed without raising an exception.
The following patch seems to solve the issue, but it may not be appropriate for all cases.
diff --git a/src/queryparser/postgresql/PostgreSQLParser.g4 b/src/queryparser/postgresql/PostgreSQLParser.g4
index 1e8a6a9..180a1b8 100644
--- a/src/queryparser/postgresql/PostgreSQLParser.g4
+++ b/src/queryparser/postgresql/PostgreSQLParser.g4
@@ -189,7 +189,9 @@ query: select_statement SEMI;
schema_name: ID ;
select_list: ( displayed_column ( COMMA displayed_column )* ) |
- ( ASTERISK ( COMMA displayed_column ( COMMA displayed_column )* )? ) ;
+ ( ASTERISK ( COMMA displayed_column ( COMMA displayed_column )* )? ) |
+ ( ON (subselect_list) ( COMMA displayed_column )* );
+subselect_list: ( displayed_column ( COMMA displayed_column )* );
select_statement: select_expression ( (UNION_SYM ( ALL )? ) select_expression )* ;
simple_expr:
Can support for window functions be added for both MySQL and PostgreSQL?
Sample query:
Select field1, sum(field2) over (partition by field3) as fieldn1 from table1
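To show what the sample query is expected to return, the sketch below runs it against an in-memory SQLite database from Python's standard library. SQLite is used here only because it needs no server, and it requires SQLite 3.25 or newer for window functions; it stands in for MySQL and PostgreSQL and does not involve queryparser. The table contents are made up, and an ORDER BY is added so the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (field1 TEXT, field2 INTEGER, field3 TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [("a", 1, "x"), ("b", 2, "x"), ("c", 5, "y")],  # made-up rows
)

# Each row keeps its identity and gains the sum of field2 over its field3 partition.
rows = conn.execute(
    "SELECT field1, SUM(field2) OVER (PARTITION BY field3) AS fieldn1 "
    "FROM table1 ORDER BY field1"
).fetchall()
conn.close()
```

Unlike GROUP BY, the window function does not collapse rows: rows `a` and `b` share partition `x` and both carry the partition sum.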
Currently, queryparser seems to be missing some functions from PostgreSQL v12+.
As an example, the following test fails:
- SELECT log10(mvir) from mdr1.bdmv
WHERE snapnum=85
- ['mdr1.bdmv.mvir', 'mdr1.bdmv.snapnum']
- ['where']
-
-
-
log10() was introduced with PostgreSQL 12.
CREATE TABLE IF NOT EXISTS throws a QuerySyntaxError, although it should not.