sn0wfree / clicksql
A Python client for ClickHouse
License: MIT License
Auto-scan dependencies
Automatically analyze execution dependencies?
When executing a SQL statement or query, map the error code received from the server to a specific exception and show the corresponding information, instead of raising a generic database error.
>>> ch('show database')
>>> DatabaseError: b"Code: 62, e.displayText() = DB::Exception: Syntax error: failed at position 6 ('database'): database format JSONCompact. Expected one of: TABLES, GRANTS, CREATE, ACCESS, QUOTA, CURRENT ROLES, PRIVILEGES, PROCESSLIST, CURRENT QUOTA, ENABLED ROLES, CREATE, DICTIONARIES, QUOTAS, ROW POLICIES, POLICIES, SETTINGS PROFILES, PROFILES, ROLES, USERS (version 20.9.3.45 (official build))\n"
Change to:
>>> ch('show database')
>>> SyntaxError: b"Code: 62, e.displayText() = DB::Exception: Syntax error: failed at position 6 ('database'): database format JSONCompact. Expected one of: TABLES, GRANTS, CREATE, ACCESS, QUOTA, CURRENT ROLES, PRIVILEGES, PROCESSLIST, CURRENT QUOTA, ENABLED ROLES, CREATE, DICTIONARIES, QUOTAS, ROW POLICIES, POLICIES, SETTINGS PROFILES, PROFILES, ROLES, USERS (version 20.9.3.45 (official build))\n"
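One way to get from the generic error to a specific one is to parse the numeric code out of the server response and look it up in a table of exception classes. The sketch below is only an illustration under assumptions: the class names ClickHouseError and ClickHouseSyntaxError and the helper error_from_response are hypothetical and not part of ClickSQL, and only code 62 is taken from the example above. A dedicated class name is used here so that Python's built-in SyntaxError is not shadowed.

```python
import re


class ClickHouseError(Exception):
    """Base class for errors reported by the server (hypothetical name)."""

    def __init__(self, code, message):
        super().__init__(message)
        self.code = code


class ClickHouseSyntaxError(ClickHouseError):
    """Server error code 62 (syntax error), as in the example above."""


# Map server error codes to exception classes. Only code 62 comes from the
# example above; further codes would be registered the same way.
ERROR_CLASSES = {62: ClickHouseSyntaxError}

_CODE_PATTERN = re.compile(r"Code:\s*(\d+)")


def error_from_response(body):
    """Build the most specific exception for a raw server error body."""
    text = body.decode("utf-8", errors="replace") if isinstance(body, bytes) else body
    match = _CODE_PATTERN.search(text)
    if match is None:
        # No recognizable code: fall back to the generic error.
        return ClickHouseError(None, text)
    code = int(match.group(1))
    return ERROR_CLASSES.get(code, ClickHouseError)(code, text)
```

A caller would then write raise error_from_response(response_body) and could catch ClickHouseSyntaxError separately from other server errors.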
Node(sql)
for sql in sql_list:
    Node(sql, delay=True, merged=True)  # does not execute the SQL yet
Node.delay_tasks.run()  # execute all SQL statements that were marked as delayed
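A merge parameter can be added to decide whether a statement may be merged with others at execution time.
With merge=True, identical SQL statements are merged and executed only once to reduce the number of data requests, while each caller still receives a result in the same form.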
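The delayed, merged execution described above can be sketched as a small queue that remembers statements and, when run, executes each distinct mergeable statement only once. This is a minimal sketch under assumptions: DelayedTasks and its execute callable are hypothetical names, not the actual Node.delay_tasks implementation.

```python
class DelayedTasks:
    """Collects SQL statements and runs them later (hypothetical helper)."""

    def __init__(self, execute):
        self._execute = execute  # callable that runs one SQL string
        self._pending = []       # (sql, merge) pairs in submission order

    def add(self, sql, merge=False):
        """Register a statement without executing it."""
        self._pending.append((sql, merge))

    def run(self):
        """Execute all pending statements and return one result per statement."""
        cache = {}    # results of merged statements, keyed by SQL text
        results = []
        for sql, merge in self._pending:
            if merge:
                # Identical mergeable statements hit the server only once.
                if sql not in cache:
                    cache[sql] = self._execute(sql)
                results.append(cache[sql])
            else:
                results.append(self._execute(sql))
        self._pending.clear()
        return results
```

Every submitted statement still gets its own entry in the returned list, so the result keeps the same form whether or not merging took place.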
extra_format_dict is not updated correctly
Node.create(db,table,keys_col,extra_format_dict ={})
raises AttributeError: 'dict' object has no attribute '_update'
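When a single INSERT statement is followed by a large amount of data, the data should be split into chunks so that the insert can sustain high throughput.
Node(sql,data)
When data exceeds 100,000 rows, the insert may fail to complete.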
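The splitting idea can be sketched as follows, assuming the rows arrive as an iterable of records. The helper names iter_chunks and insert_in_chunks, and the insert callable that sends one batch, are hypothetical; the default chunk size simply mirrors the 100,000-row threshold mentioned above.

```python
from itertools import islice


def iter_chunks(rows, chunk_size=100_000):
    """Yield successive lists of at most chunk_size rows from any iterable."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def insert_in_chunks(insert, sql, rows, chunk_size=100_000):
    """Call insert(sql, chunk) once per chunk; return the number of rows sent."""
    total = 0
    for chunk in iter_chunks(rows, chunk_size):
        insert(sql, chunk)
        total += len(chunk)
    return total
```

Iterating over a pandas DataFrame yields column names rather than rows, so a DataFrame would first need to be converted to row records or sliced by position before being passed to this helper.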
Add support for querying remote tables
Automatic sharded computation
ClickSQL/ClickSQL/clickhouse/ClickHouse.py
Line 411 in 42906c3
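Add a keyword argument to control whether to use an async query. Because async is a reserved word in Python 3.7 and later, the argument needs another name, for example use_async:
def __execute__(self, sql: (str, list, tuple), convert_to: str = 'dataframe', transfer_sql_format: bool = True, use_async: bool = False):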
alter to alter table
Create a connection pool for reuse
ClickSQL/ClickSQL/factor_table/__init__.py
Line 182 in 42906c3
Add a new usage: merge on cik_dt only rather than on (cik_dt,cik_iid).
This applies when two SQL queries are joined but the join column is only cik_dt or only cik_iid,
such as:
sql = f"select * from ({sql1}) all full join ({sql2}) using (cik_dt) {settings}"
or
sql = f"select * from ({sql1}) all full join ({sql2}) using (cik_iid) {settings}"
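Both variants above, and the existing two-column join, differ only in the using clause, so they could be produced by one helper that takes the join columns as a parameter. The function build_full_join below is a hypothetical sketch, not code from the repository.

```python
def build_full_join(sql1, sql2, using=("cik_dt", "cik_iid"), settings=""):
    """Build a full-join query over two subqueries on the given key columns."""
    allowed = {"cik_dt", "cik_iid"}
    # Accept either a single column name or a sequence of names.
    columns = [using] if isinstance(using, str) else list(using)
    if not columns or not set(columns) <= allowed:
        raise ValueError(f"join columns must be drawn from {sorted(allowed)}")
    joined = ",".join(columns)
    sql = f"select * from ({sql1}) all full join ({sql2}) using ({joined})"
    # Append the optional settings clause, avoiding a trailing space.
    return f"{sql} {settings}".strip()
```

Calling build_full_join(sql1, sql2, using="cik_dt") gives the first query shown above, and omitting using keeps the two-column behaviour.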
Add a function that creates tables automatically
ClickSQL/ClickSQL/factor_table/__init__.py
Line 251 in 42906c3
The add_factor function should be able to add a factor from a FactorTable (itself), a database table, a DataFrame, or a SQL query,
such as:
[completed]
ft = FactorTable(conn,cik_dt='dt',cik_iid='code',strict_cik=False)
ft.add_factor('test.test',['test1'],cik_dt='dt',cik_iid='iid')
ft = FactorTable(conn,cik_dt='dt',cik_iid='code',strict_cik=False)
ft.add_factor('test.test',['test1'],cik_dt='dt',cik_iid='iid')
ft2 = FactorTable(conn2,cik_dt='dt',cik_iid='code',strict_cik=False)
ft2.add_factor(ft,['test1'],cik_dt='dt',cik_iid='iid')
sql = 'select dt,code,test3,test4 from test.test2'
ft3 = FactorTable(conn,cik_dt='dt',cik_iid='code',strict_cik=False)
ft3.add_factor(sql,['test3','test4'],cik_dt='dt',cik_iid='code')
data = pd.DataFrame(data,columns=['test3','test4','dt','code'])
ft3 = FactorTable(conn,cik_dt='dt',cik_iid='code',strict_cik=False)
ft3.add_factor(data,['test3','test4'],cik_dt='dt',cik_iid='code')
conn.execute(sql, args)
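Especially when inserting data, consider writing a simple insert statement first and passing the corresponding data after it, so that the call matches the usual sqlengine usage.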