datafuselabs / databend-py

Databend Cloud Python Driver with native interface support

License: Apache License 2.0

Python 99.72% Makefile 0.28%
database databend driver python

databend-py's People

Contributors

everpcpc, flaneur2020, hantmac, wubx, zhihanz

databend-py's Issues

`databend://` DSN does not work in cloud

databend-py: 0.5.8

Follow this doc:
https://docs.databend.com/guides/sql-clients/developers/python

from databend_py import Client

# USER, PASSWORD, HOST, DATABASE and WAREHOUSE_NAME are placeholders for your own values.
client = Client.from_url(f"databend://{USER}:{PASSWORD}@{HOST}:443/{DATABASE}?&warehouse={WAREHOUSE_NAME}")
client.execute('DROP TABLE IF EXISTS data')
client.execute('CREATE TABLE IF NOT EXISTS data (x Int32, y VARCHAR)')
client.execute('DESC data')
# Column names must match the table definition above (x, y).
client.execute("INSERT INTO data (x, y) VALUES ", [1, 'yy', 2, 'xx'])
_, res = client.execute('SELECT * FROM data')
print(res)

The `databend://` scheme does not work and returns the following errors:

/lib/python3.9/site-packages/urllib3/__init__.py:35: NotOpenSSLWarning: urllib3 v2 only supports OpenSSL 1.1.1+, currently the 'ssl' module is compiled with 'LibreSSL 2.8.3'. See: https://github.com/urllib3/urllib3/issues/3020
  warnings.warn(
Please enter the DSN for the database connection: databend://bohu:***@***--small.gw.aws-us-east-2.default.databend.com:443/***?&warehouse=small
http error on http://***--askbend-small.gw.aws-us-east-2.default.databend.com:443/v1/query/
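
The log shows the request going to `http://` on port 443, which suggests the `databend://` scheme was mapped to plain HTTP. The following is a minimal sketch of one way a client could choose the transport from a DSN, using only `urllib.parse`. The helper name `resolve_endpoint`, the `secure` query parameter handling, and the default port are assumptions made for illustration; this is not databend-py's actual logic.

```python
from urllib.parse import parse_qs, urlparse


def resolve_endpoint(dsn: str) -> str:
    """Hypothetical helper: build the HTTP endpoint a DSN should be sent to."""
    parsed = urlparse(dsn)
    params = parse_qs(parsed.query)
    port = parsed.port or 8000  # assumed default port for this sketch

    # Assumption: an explicit `secure` flag wins; otherwise port 443 implies TLS.
    if "secure" in params:
        secure = params["secure"][0].lower() == "true"
    else:
        secure = parsed.scheme == "https" or port == 443

    scheme = "https" if secure else "http"
    return f"{scheme}://{parsed.hostname}:{port}/v1/query/"


print(resolve_endpoint("databend://user:pw@example.com:443/db?&warehouse=small"))
```

With a rule like this, a cloud DSN on port 443 would be sent over HTTPS rather than producing the `http://...:443` request seen in the log.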

[bug] Read `boolean` type failed

ischema_names = {
    "int": INTEGER,
    "int64": INTEGER,
    "int32": INTEGER,
    "int16": INTEGER,
    "int8": INTEGER,
    "uint64": INTEGER,
    "uint32": INTEGER,
    "uint16": INTEGER,
    "uint8": INTEGER,
    "decimal": DECIMAL,
    "date": DATE,
    "timestamp": DATETIME,
    "float": FLOAT,
    "double": FLOAT,
    "float64": FLOAT,
    "float32": FLOAT,
    "string": VARCHAR,
    "array": ARRAY,
    "map": MAP,
    "json": JSON,
    "varchar": VARCHAR,
}

The mapping has no entry for the boolean type, so reading a `boolean` column fails.
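
A minimal sketch of the failure and one possible fix. The classes below are stand-ins for the SQLAlchemy type classes the dialect uses, the `lookup` helper is hypothetical, and the sketch assumes the server reports the type name as `boolean` (or `bool`).

```python
# Stand-ins for the SQLAlchemy type classes used by the dialect (assumption:
# in the real mapping these would be imported from sqlalchemy.types).
class INTEGER: ...
class VARCHAR: ...
class BOOLEAN: ...


ischema_names = {"int": INTEGER, "varchar": VARCHAR}  # abbreviated mapping


def lookup(type_name: str):
    """Hypothetical helper: resolve a server-reported type name, case-insensitively."""
    return ischema_names[type_name.lower()]


# Without a boolean entry the lookup raises KeyError.
try:
    lookup("boolean")
except KeyError as exc:
    print("unmapped type:", exc)

# Possible fix: register the boolean type names explicitly.
ischema_names.update({"boolean": BOOLEAN, "bool": BOOLEAN})
print(lookup("BOOLEAN") is BOOLEAN)
```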

Duplicated columns when the row count is greater than 10000

Version

databend_py: 0.4.2
databend_query: 1.1.45-nightly

Detail

import databend_py
conn = databend_py.Client(host='127.0.0.1', port=8000)
columns, data = conn.execute('SELECT * FROM numbers(10001)', with_column_types=True)
print(columns)

This prints:
[('number', 'UInt64'), ('number', 'UInt64')]
instead of the expected:
[('number', 'UInt64')]

When the result exceeds 10000 rows, the data is returned in pages, and the `store` function appends the columns once for every page.

https://github.com/databendcloud/databend-py/blob/6ebf8f11727834eb350cc52f805d815a457a508f/databend_py/result.py#LL44C10-L44C10
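
One way to avoid the duplication is to record the columns only from the first page. The sketch below is not the code in `result.py`; the `PagedResult` class and the page shape (a dict with `schema` and `data` keys) are assumptions chosen to illustrate the idea.

```python
class PagedResult:
    """Hypothetical accumulator for a paged query response."""

    def __init__(self):
        self.columns = []  # (name, type) pairs, recorded once
        self.rows = []

    def store(self, page: dict) -> None:
        # Every page repeats the schema, so take it from the first page only.
        if not self.columns:
            self.columns = [(f["name"], f["type"]) for f in page["schema"]]
        self.rows.extend(page["data"])


schema = [{"name": "number", "type": "UInt64"}]
result = PagedResult()
result.store({"schema": schema, "data": [[i] for i in range(10000)]})
result.store({"schema": schema, "data": [[10000]]})

print(result.columns)    # [('number', 'UInt64')]
print(len(result.rows))  # 10001
```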

Batch insert error with multi query node in k8s

Version:

  • databend-query Version: 1.1.29-nightly
  • databend-py version: 0.3.9

Deployment method:
Kubernetes Helm charts

  • 3 pods of databend-meta
  • 2 pods of databend-query
  • 1 ClusterIP Service for databend-query

Description:
When executing a batch insert, the client uploads a CSV file to object storage, executes COPY INTO, and then calls receive_result with the query ID.
Because the Kubernetes Service load-balances requests, the receive_result request may be sent to a different query node than the one that ran the query, which fails with the HTTP request error: query id not found

Temporary solution:
Set the Kubernetes Service sessionAffinity to ClientIP so that all requests from the same Pod go to the same query node. The drawback is that if one Pod uses multiprocessing, all of its requests are sent to a single query node.

I have no idea how to fix this issue completely. It seems the receive_result request must be sent to the specific query node that executed the query.
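
The failure mode and the affinity workaround can be illustrated with a toy simulation. Nothing here talks to Databend or Kubernetes: `QueryNode`, the round-robin balancer, and the `sticky` routing function are all hypothetical stand-ins, assuming only that query state lives on the node that ran the query.

```python
import itertools


class QueryNode:
    """Toy stand-in for a query node that keeps query state in local memory."""

    def __init__(self, name):
        self.name = name
        self.queries = {}

    def run(self, query_id, sql):
        self.queries[query_id] = f"result of {sql!r} on {self.name}"

    def receive_result(self, query_id):
        if query_id not in self.queries:
            raise LookupError("query id not found")
        return self.queries[query_id]


nodes = [QueryNode("query-0"), QueryNode("query-1")]

# Round-robin balancing: consecutive requests land on different nodes.
round_robin = itertools.cycle(nodes)
next(round_robin).run("q1", "COPY INTO t")
try:
    next(round_robin).receive_result("q1")
except LookupError as exc:
    print("round robin:", exc)


# Client-IP affinity: the same client address always maps to the same node.
def sticky(client_ip: str) -> QueryNode:
    return nodes[sum(client_ip.encode()) % len(nodes)]


sticky("10.0.0.7").run("q2", "COPY INTO t")
print("sticky:", sticky("10.0.0.7").receive_result("q2"))
```

The sticky variant also shows the drawback described above: every request from one address ends up on a single node.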

0.4.7 fails on import

It seems the latest version of this package relies on sdk_info.py reading a VERSION file that exists in the repo but is not included in the package installed by pip. I think this is an issue with the setup.py file, but I'm not sure how to fix it.

Logs:

>>> from databend_py import Client
  File "/usr/local/lib/python3.8/site-packages/databend_py/__init__.py", line 1, in <module>
    from .client import Client
  File "/usr/local/lib/python3.8/site-packages/databend_py/client.py", line 4, in <module>
    from .connection import Connection
  File "/usr/local/lib/python3.8/site-packages/databend_py/connection.py", line 16, in <module>
    headers = {'Content-Type': 'application/json', 'User-Agent': sdk_info(), 'Accept': 'application/json',
  File "/usr/local/lib/python3.8/site-packages/databend_py/sdk_info.py", line 18, in sdk_info
    return f"{sdk_lan()}/{sdk_version()}"
  File "/usr/local/lib/python3.8/site-packages/databend_py/sdk_info.py", line 8, in sdk_version
    with open(version_py, encoding='utf-8') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/lib/python3.8/site-packages/databend_py/VERSION'
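
One possible way to avoid depending on a bundled VERSION file is to read the version from the installed package metadata. This is a sketch, not the project's fix: it assumes the distribution is published as `databend-py` and uses the standard-library `importlib.metadata` module (Python 3.8+). Shipping the VERSION file via the package data in setup.py would be another option.

```python
from importlib.metadata import PackageNotFoundError, version


def sdk_version(dist_name: str = "databend-py") -> str:
    """Return the installed distribution version, or a fallback if unavailable."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # Running from a source checkout, or the distribution is not installed.
        return "unknown"


print(sdk_version())
```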

Support client.insert

Example of the requested API:

row1 = [1000, 'String Value 1000', 5.233]
row2 = [2000, 'String Value 2000', -107.04]
data = [row1, row2]
client.insert('new_table', data, column_names=['key', 'value', 'metric'])
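
A rough sketch of what such a method might do internally: validate the rows against the column names and build the INSERT prefix. The helper `build_insert` is hypothetical, identifiers are not quoted or escaped, and the final `client.execute` call shape is an assumption rather than a verified databend-py signature.

```python
def build_insert(table, rows, column_names):
    """Hypothetical helper: validate rows and build the INSERT statement prefix."""
    for row in rows:
        if len(row) != len(column_names):
            raise ValueError(f"row {row!r} does not match columns {column_names!r}")
    statement = f"INSERT INTO {table} ({', '.join(column_names)}) VALUES"
    return statement, [tuple(row) for row in rows]


row1 = [1000, 'String Value 1000', 5.233]
row2 = [2000, 'String Value 2000', -107.04]
data = [row1, row2]

statement, values = build_insert('new_table', data, column_names=['key', 'value', 'metric'])
print(statement)
# A client.insert implementation could then pass both parts to the driver, e.g.
# client.execute(statement, values)  # assumed call shape, not verified
```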

Support session settings

Currently, session settings are not supported in the driver.

For example:

SET fast_parquet_read_bytes = 1024;

This statement executes successfully, but the session setting does not take effect.

This is because Databend uses a client-side session implementation:

  1. When the SET statement is executed, the server responds with a new session state (which contains the session settings).
  2. The client needs to pass the session settings back to the server with the next query.
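
The two steps above can be sketched with a toy client and server. Nothing here uses the real HTTP API: `FakeServer`, `SessionClient`, and the payload field names (`sql`, `session`, `settings`) are assumptions chosen for illustration.

```python
class FakeServer:
    """Toy stateless server: returns an updated session state after SET."""

    def query(self, payload: dict) -> dict:
        session = dict(payload.get("session", {}))
        settings = dict(session.get("settings", {}))
        sql = payload["sql"].strip().rstrip(";")
        if sql.upper().startswith("SET "):
            name, value = sql[4:].split("=", 1)
            settings[name.strip()] = value.strip()
        session["settings"] = settings
        # `settings_seen` exposes which settings the server applied to this query.
        return {"session": session, "settings_seen": settings}


class SessionClient:
    """Keeps the last session state and sends it back with every query."""

    def __init__(self, server):
        self.server = server
        self.session = {}

    def execute(self, sql: str) -> dict:
        payload = {"sql": sql}
        if self.session:
            payload["session"] = self.session       # step 2: pass state back
        response = self.server.query(payload)
        self.session = response.get("session", self.session)  # step 1: remember state
        return response


client = SessionClient(FakeServer())
client.execute("SET fast_parquet_read_bytes = 1024;")
print(client.execute("SELECT 1")["settings_seen"])
```

A client that drops the returned session state would send the second query without the setting, which matches the behaviour described in this issue.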
