sn0wfree / clicksql

A Python client for ClickHouse

License: MIT License

Python 99.85% Shell 0.15%
clickhouse clickhouse-client database dataframe insert-data pandas python sql

clicksql's People

Contributors

dependabot[bot], sn0wfree

clicksql's Issues

Parse error types and categorize them to show information clearly

When executing a SQL statement or query, if the server returns an error code, raise an exception that matches that code and show the corresponding information, instead of a generic database error.

>>> ch('show database')

>>> DatabaseError: b"Code: 62, e.displayText() = DB::Exception: Syntax error: failed at position 6 ('database'): database format JSONCompact. Expected one of: TABLES, GRANTS, CREATE, ACCESS, QUOTA, CURRENT ROLES, PRIVILEGES, PROCESSLIST, CURRENT QUOTA, ENABLED ROLES, CREATE, DICTIONARIES, QUOTAS, ROW POLICIES, POLICIES, SETTINGS PROFILES, PROFILES, ROLES, USERS (version 20.9.3.45 (official build))\n"

This should change to:

>>> ch('show database')

>>> SyntaxError: b"Code: 62, e.displayText() = DB::Exception: Syntax error: failed at position 6 ('database'): database format JSONCompact. Expected one of: TABLES, GRANTS, CREATE, ACCESS, QUOTA, CURRENT ROLES, PRIVILEGES, PROCESSLIST, CURRENT QUOTA, ENABLED ROLES, CREATE, DICTIONARIES, QUOTAS, ROW POLICIES, POLICIES, SETTINGS PROFILES, PROFILES, ROLES, USERS (version 20.9.3.45 (official build))\n"
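
A minimal sketch of how the error code could be mapped to a specific exception class, assuming the server message always contains a `Code: N` prefix as in the output above. The names `classify_error`, `ERROR_CLASSES` and `ClickHouseSyntaxError` are hypothetical and not part of clicksql; the class is not called `SyntaxError` here only to avoid shadowing the Python built-in.

```python
import re


class DatabaseError(Exception):
    """Generic fallback used when the error code is unknown."""


class ClickHouseSyntaxError(DatabaseError):
    """Raised for code 62 (syntax error), the code shown in the issue."""


# Hypothetical registry; only code 62 is taken from the issue text.
ERROR_CLASSES = {62: ClickHouseSyntaxError}

_CODE_RE = re.compile(r"Code:\s*(\d+)")


def classify_error(raw):
    """Return an exception instance chosen from the code in the server message."""
    text = raw.decode("utf-8", errors="replace") if isinstance(raw, bytes) else str(raw)
    match = _CODE_RE.search(text)
    code = int(match.group(1)) if match else None
    exc_class = ERROR_CLASSES.get(code, DatabaseError)
    return exc_class(text)
```

Because every specific class derives from `DatabaseError`, existing `except DatabaseError` handlers would keep working while new code can catch the narrower type.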

Add a delayed-execution parameter

Node(sql)

  1. Inside a for loop, a delay argument could mark each SQL statement for later execution; after the loop, all delayed statements would be executed in parallel.
    For example:
for sql in sql_list:
    Node(sql, delay=True, merged=True)  # does not actually execute the SQL

Node.delay_tasks.run()  # executes every SQL statement that was marked as delayed

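A merge parameter could be added to decide whether a statement may be merged with others for execution.
If merge=True, identical SQL statements can be merged and executed only once to reduce the number of data requests, while each caller still receives a result of the same form.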
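
The idea could be sketched roughly as follows. This is not clicksql code: `DelayedTasks`, its `add`/`run` methods and the `execute` callable are hypothetical stand-ins, and a thread pool is assumed as the way to run the statements in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


class DelayedTasks:
    """Collects SQL statements and runs them together later (hypothetical helper)."""

    def __init__(self, execute, max_workers=4):
        self._execute = execute          # callable that runs one SQL statement
        self._max_workers = max_workers
        self._tasks = []                 # list of (sql, merge) in submission order

    def add(self, sql, merge=False):
        """Register a statement without executing it."""
        self._tasks.append((sql, merge))

    def run(self):
        """Execute all registered statements in parallel; return results in order."""
        tasks, self._tasks = self._tasks, []
        shared = {}                      # sql text -> future, for mergeable statements
        futures = []
        with ThreadPoolExecutor(max_workers=self._max_workers) as pool:
            for sql, merge in tasks:
                if merge:
                    # Identical mergeable statements share one request.
                    if sql not in shared:
                        shared[sql] = pool.submit(self._execute, sql)
                    futures.append(shared[sql])
                else:
                    futures.append(pool.submit(self._execute, sql))
            return [future.result() for future in futures]
```

Each registered statement still gets its own entry in the returned list, so merging only changes how many requests are sent, not the shape of the results.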

extra_format_dict is not updated correctly

Calling

Node.create(db,table,keys_col,extra_format_dict ={})

raises AttributeError: 'dict' object has no attribute '_update'
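
The clicksql source is not shown here, so the following is only a guess at the kind of fix involved: a plain `dict` has an `update` method but no `_update`, so merging user-supplied options into defaults might look like this. `merge_format_options` and `base_options` are hypothetical names.

```python
def merge_format_options(base_options, extra_format_dict=None):
    """Return a new dict with extra_format_dict layered over base_options."""
    merged = dict(base_options)           # copy so the defaults are not mutated
    if extra_format_dict:
        merged.update(extra_format_dict)  # dict.update exists; dict._update does not
    return merged
```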

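Inserting a very large amount of data may cause the request to fail

When the data attached to a single INSERT statement is large, the data should be split into chunks so that the insert can sustain high throughput.

Node(sql,data)

When data exceeds 100,000 rows, the insert may fail to complete.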
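
A minimal sketch of chunked insertion, assuming the rows are held in a sliceable sequence such as a list. `iter_chunks`, `insert_in_chunks` and the `insert` callable are hypothetical; the default chunk size simply reuses the 100,000-row figure from the issue and is not a tested limit.

```python
def iter_chunks(rows, chunk_size=100_000):
    """Yield consecutive slices of rows with at most chunk_size items each."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]


def insert_in_chunks(insert, sql, rows, chunk_size=100_000):
    """Call insert(sql, chunk) once per chunk and return the number of rows sent."""
    total = 0
    for chunk in iter_chunks(rows, chunk_size):
        insert(sql, chunk)
        total += len(chunk)
    return total
```

For a pandas DataFrame, `data.iloc[start:start + chunk_size]` would play the same role as the list slice.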

Add new usage: merge on cik_dt only rather than on (cik_dt,cik_iid)

sql = f"select * from ({sql1}) all full join ({sql2}) using (cik_dt,cik_iid) {settings}"

The join currently always uses both columns, as in the statement above. Add a new usage for joining two SQL queries when the join column is only cik_dt or only cik_iid.
For example:

sql = f"select * from ({sql1}) all full join ({sql2}) using (cik_dt)  {settings}" 

or

sql = f"select * from ({sql1}) all full join ({sql2}) using (cik_iid)  {settings}" 
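
One way to support all three variants is to make the join columns a parameter. The helper below is a sketch only; `build_full_join` and its `using` argument are hypothetical names, and the SQL template is copied from the statements above.

```python
def build_full_join(sql1, sql2, using=("cik_dt", "cik_iid"), settings=""):
    """Build the full-join statement for one or more join columns."""
    if isinstance(using, str):
        using = (using,)                 # allow a single column name
    if not using:
        raise ValueError("at least one join column is required")
    columns = ",".join(using)
    sql = f"select * from ({sql1}) all full join ({sql2}) using ({columns}) {settings}"
    return sql.strip()
```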

Add new processing paths to add_factor

def add_factor(self, db_table, factor_names: (list, tuple, str), cik_dt=None, cik_iid=None,

The add_factor function could accept factors from another FactorTable, a database table, a DataFrame, or a SQL query,
for example:

add factor via database table

[completed]

ft = FactorTable(conn,cik_dt='dt',cik_iid='code',strict_cik=False)
ft.add_factor('test.test',['test1'],cik_dt='dt',cik_iid='iid')

add factor via another FactorTable

ft = FactorTable(conn,cik_dt='dt',cik_iid='code',strict_cik=False)
ft.add_factor('test.test',['test1'],cik_dt='dt',cik_iid='iid')

ft2 = FactorTable(conn2,cik_dt='dt',cik_iid='code',strict_cik=False)
ft2.add_factor(ft,['test1'],cik_dt='dt',cik_iid='iid')

add factor via SQL

sql = 'select dt,code,test3,test4 from test.test2'
ft3 = FactorTable(conn,cik_dt='dt',cik_iid='code',strict_cik=False)
ft3.add_factor(sql,['test3','test4'],cik_dt='dt',cik_iid='code')

add factor via pd.DataFrame

data = pd.DataFrame(data,columns=['test3','test4','dt','code'])
ft3 = FactorTable(conn,cik_dt='dt',cik_iid='code',strict_cik=False)
ft3.add_factor(data,['test3','test4'],cik_dt='dt',cik_iid='code')
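
Supporting four kinds of input means add_factor first has to decide what it was given. The sketch below shows one possible dispatch. The function name `classify_factor_source` and the heuristics (a string starting with `select` is SQL, anything with an `add_factor` method is a FactorTable, anything with `columns` and `to_dict` is a DataFrame) are assumptions made for illustration, not clicksql behaviour.

```python
def classify_factor_source(source):
    """Return 'sql', 'table', 'factortable' or 'dataframe' for a factor source."""
    if isinstance(source, str):
        # Assumption: SQL sources start with SELECT; anything else is 'db.table'.
        if source.strip().lower().startswith("select"):
            return "sql"
        return "table"
    if hasattr(source, "add_factor"):
        return "factortable"             # duck-typed FactorTable
    if hasattr(source, "columns") and hasattr(source, "to_dict"):
        return "dataframe"               # duck-typed pandas DataFrame
    raise TypeError(f"unsupported factor source: {type(source).__name__}")
```

Duck typing keeps the sketch free of imports; real code could use `isinstance` checks against `FactorTable` and `pd.DataFrame` instead.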

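Add an execute usage similar to sqlengine

Add a usage modeled on the execute function of sqlengine:

conn.execute(sql, args)

This matters most when inserting data: the caller could write a simple INSERT statement and pass the corresponding data after it, which matches the established sqlengine usage.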
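
A rough sketch of such an interface, under the assumption that queries and inserts are handled by two separate callables. `Connection`, `run_query` and `run_insert` are hypothetical names used only to show the dispatch, not the clicksql API.

```python
class Connection:
    """Thin wrapper exposing a single execute(sql, args) entry point (sketch)."""

    def __init__(self, run_query, run_insert):
        self._run_query = run_query      # callable(sql) -> result
        self._run_insert = run_insert    # callable(sql, data) -> result

    def execute(self, sql, args=None):
        """Run a query when args is None, otherwise treat args as insert data."""
        if args is None:
            return self._run_query(sql)
        if not sql.lstrip().lower().startswith("insert"):
            raise ValueError("args are only supported with INSERT statements")
        return self._run_insert(sql, args)
```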
