Comments (12)
Update: I've added a default constructor for ClickHouseConnection / ClickHouseCommand, and this was enough to use the connector with my data access library, so for now DbProviderFactory is not really needed.
After that I tried to connect to CH from a simple .NET Core console app. There was a weird exception related to TcpClient.OpenAsync().RunSynchronously(), and I've already provided a bugfix in my fork.
Finally, I was able to connect to the CH server and run a SELECT query, but I got an empty data reader (the query does return rows when executed from clickhouse-client).
Here is my code snippet:
    var conn = new ClickHouseConnection();
    conn.ConnectionString = "Compress=False;Compressor=lz4;Host=localhost;Port=9000;Database=default;User=default;Password=";
    var cmd = conn.CreateCommand();
    cmd.CommandText = "SELECT Source,ProductCode,COUNT(*) as __Count FROM ( SELECT * FROM precos ) as t GROUP BY Source,ProductCode";
    conn.Open();
    try {
        var rdr = cmd.ExecuteReader();
        while (rdr.Read()) {
            Console.WriteLine("{0}, {1}: {2}", rdr["Source"], rdr["ProductCode"], rdr["__Count"]);
        }
    } finally {
        conn.Close();
    }
@killwort do you have any idea why the data reader doesn't have any rows?
from clickhouse-net.
If you could, please send a pull request with all your changes back here, especially if they're bugfixes. However, I tried using my driver with the netcoreapp1.1 target without any problem (except for the missing System.Data :) )
Regarding your problem with the reader: you must iterate through all results in the reader (do { } while (reader.NextResult())). This is needed because CH's protocol and engine are designed so that a single query can return several cursors with different schemas. You can see this in clickhouse-client too: when you run a non-aggregating query like "SELECT col1,col2 FROM table WHERE col3=1", the output is grouped into "blocks" by the MergeTree date key. By the way, the first result is often empty.
Also, there's an extension method, ClickHouse.Ado.AdoExtensions.ReadAll, that hides the DbReader iteration for you.
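A minimal sketch of the do/while loop described above, written against the plain IDataReader interface. It does not use ClickHouse.Ado itself: the helper name ReadAllRows is hypothetical, and a DataSet with an empty first table stands in for a reader whose first result is empty, which is an assumption made only so the example runs without a server.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper: drains every result set of a reader into one list,
// so empty result sets are skipped instead of ending the read.
static List<object[]> ReadAllRows(IDataReader reader)
{
    var rows = new List<object[]>();
    do
    {
        while (reader.Read())
        {
            var values = new object[reader.FieldCount];
            reader.GetValues(values);
            rows.Add(values);
        }
    } while (reader.NextResult());
    return rows;
}

// Stand-in data (assumption): two result sets with the same schema,
// the first one empty, the second one holding the actual rows.
var empty = new DataTable("block0");
empty.Columns.Add("Source", typeof(string));
empty.Columns.Add("__Count", typeof(long));

var data = empty.Clone();
data.TableName = "block1";
data.Rows.Add("shop-a", 3L);
data.Rows.Add("shop-b", 5L);

var set = new DataSet();
set.Tables.Add(empty);
set.Tables.Add(data);

using var reader = set.CreateDataReader();
var rows = ReadAllRows(reader);
Console.WriteLine(rows.Count);
```

A plain while (reader.Read()) loop over the same reader would stop at the empty first result set and see no rows, which matches the symptom reported above.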
from clickhouse-net.
@killwort you're absolutely right, I was able to get the query result by calling "reader.NextResult()".
Nevertheless, this is quite unexpected behavior :-) because "NextResult" is normally used when several different result sets are returned (say, 2 selects, or a stored procedure that returns the results of 2 selects). In ADO.NET, when a single SELECT is executed, only one result is expected.
I'm trying to use ClickHouse.Ado with a data layer that is able to work with any SQL-compatible connector, so it handles data readers in the usual way (= for a single query, it just calls "Read").
I understand that this is caused by the nature of the CH protocol; for now I will write a special wrapper for ClickHouseDataReader that will iterate through all result sets and treat them as a "single" result.
It would be nice if ClickHouse.Ado supported this mode out of the box (maybe controlled by some special ClickHouseConnection property); I think that most users of ClickHouse.Ado will use it to execute a single query, and they will be surprised when the reader returns 0 records without calling "NextResult" :-)
I can add this option if you can point out how to determine which CH blocks are actually results of the same query. Or this option can just iterate through all blocks with just "Read()", which is the simplest variant of course. What do you think?
from clickhouse-net.
Well, ClickHouse does not support multiple queries per command roundtrip, so all results may be safely grouped. However, that does nothing to mitigate situations where blocks have different schemas (e.g. blocks from different shards or even different MergeTree date blocks may have a different column order if it was not explicitly set by the query).
As for conformance with the usual ADO.NET ways, IMO it could be dropped, as ClickHouse itself behaves differently from most databases. Its SQL dialect is incompatible with standards and it doesn't support any kind of data alteration after insert. Compatibility would simply cost too much to develop without any immediate profit. Anyway, why would one use CH as an ORM back-end or with any other automatic query builder? CH is designed to perform well with bulk inserts and highly-aggregating OLAP queries, both of which require manual query tuning.
from clickhouse-net.
"anyway why would one use CH as ORM back-end or any other automatic query builders?"
Let me explain a bit. I'm trying to use ClickHouse.Ado for a BI tool connector. This tool can build pivot tables / pivot charts from a dynamic configuration that is passed from the UI (in other words, the user is able to select dimensions / measures), and unlike a classic OLAP server this tool performs data aggregation on the fly (like ROLAP).
Technically, it produces a simple SELECT .. GROUP BY only for the columns that correspond to the dimensions/measures needed for a concrete pivot table. In this scenario ClickHouse SQL is standard enough; all dialect-specific things (functions, calculations) can be defined in the nested query SELECT ... FROM (SELECT ..)
This tool also supports user-defined conditions: the user can specify a complex filter in the UI like "(user_id=5 or user_id=6) and is_active=1". This condition is automatically translated to SQL with my NReco.Data library, and this works fine with ClickHouse too. I've found a way to use ClickHouse.Ado with NReco.Data without needing a DbProviderFactory implementation (fortunately it defines its own IDbFactory interface that uses only interfaces instead of base classes like DbConnection, DbCommand etc).
For now I've implemented special wrappers for ClickHouseCommand / ClickHouseDataReader that implement the "read all" logic, and this solves my problem. If you decide that it would be nice to have this behavior in the connector code, just let me know.
Regarding
"e.g. blocks from different shards or even different MergeTree date blocks may have different column order if it was not explicitly set by query"
is it possible to handle that somehow? I mean that ClickHouseDataReader in "ReadAll" mode could compare the previous block's schema with the next one and, if the order / set of columns is different, align them (reorder to match the order of the previous block; return DBNull for columns that are missing in the next block but present in the previous one). I guess it is an untypical situation for block schemas to differ. After all, "clickhouse-client" somehow returns the result as a single table?
from clickhouse-net.
I think it's impossible to synchronize the schema across blocks with the current reading implementation. Each protocol block contains its own header describing the block structure, so there is no way to know future blocks' structures before you have completely read the previous block. Currently only one (the current) block is kept in client memory (though I still think even that is a waste of resources, as a block may be quite big). The only possibility I see is to group sequential blocks with matching structure; in most cases you'll end up with one big block.
As for clickhouse-client behaviour, it matches the server engine behaviour. If your query translates to raw block output (i.e. no grouping constructs, no joins, no aggregation), the result will be chunked (and it really is in clickhouse-client output). Otherwise, your front-end server (the one you're connecting to) groups the blocks before outputting them, effectively eliminating chunking.
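The idea of grouping sequential blocks with matching structure could look roughly like the sketch below. It is not code from the driver: GroupBlocks and SchemaOf are hypothetical helpers over the generic IDataReader interface, each result set is treated as one block, and the DataSet is stand-in data. It also assumes the reader exposes column names and types before the first Read() call, which holds for DataTableReader but is not confirmed here for ClickHouseDataReader.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical helper: an ordered "name:type" signature of the current result set.
static string SchemaOf(IDataReader r) =>
    string.Join(",", Enumerable.Range(0, r.FieldCount)
        .Select(i => r.GetName(i) + ":" + r.GetFieldType(i).Name));

// Hypothetical helper: merges consecutive result sets that share a signature
// and starts a new group whenever the signature changes.
static List<(string Schema, List<object[]> Rows)> GroupBlocks(IDataReader reader)
{
    var groups = new List<(string Schema, List<object[]> Rows)>();
    do
    {
        var schema = SchemaOf(reader);
        if (groups.Count == 0 || groups[groups.Count - 1].Schema != schema)
            groups.Add((schema, new List<object[]>()));
        var current = groups[groups.Count - 1].Rows;
        while (reader.Read())
        {
            var values = new object[reader.FieldCount];
            reader.GetValues(values);
            current.Add(values);
        }
    } while (reader.NextResult());
    return groups;
}

// Stand-in data (assumption): two blocks with the same column order,
// followed by one block whose columns are swapped.
var b0 = new DataTable("b0");
b0.Columns.Add("Source", typeof(string));
b0.Columns.Add("Cnt", typeof(long));
b0.Rows.Add("a", 1L);

var b1 = b0.Clone();
b1.TableName = "b1";
b1.Rows.Add("b", 2L);

var b2 = new DataTable("b2");
b2.Columns.Add("Cnt", typeof(long));
b2.Columns.Add("Source", typeof(string));
b2.Rows.Add(3L, "c");

var set = new DataSet();
set.Tables.AddRange(new[] { b0, b1, b2 });

using var reader = set.CreateDataReader();
var groups = GroupBlocks(reader);
Console.WriteLine(groups.Count);
Console.WriteLine(groups[0].Rows.Count);
```

Because a new group starts only when the signature changes, a query whose blocks all share one structure collapses into a single group, while the swapped-column block stays separate instead of being silently misaligned.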
from clickhouse-net.
Wow, I've just spent the entire day trying to figure out why reader.Read() wasn't working. This library definitely needs documentation. I'm willing to help and cooperate since we're going to use it in our project. What do you think?
from clickhouse-net.
Well, this behaviour is described in readme.md: https://github.com/killwort/ClickHouse-Net#always-use-nextresult
If you wish to add some docs and/or implement new functionality, feel free to make a pull request.
from clickhouse-net.
@OlegStotsky in my project I've wrapped ClickHouseDataReader with my own wrapper that transparently calls "NextResult":
    class ReadAllDataReader : IDataReader {
        IDataReader Reader;
        internal ReadAllDataReader(IDataReader rdr) {
            Reader = rdr;
        }
        // proxy IDataReader properties/methods implementation
        public bool Read() {
            // Keep advancing past exhausted or empty result sets until a row
            // is found or no result sets remain (a single NextResult() call
            // would stop early if the next result set is also empty).
            while (!Reader.Read()) {
                if (!Reader.NextResult())
                    return false;
            }
            return true;
        }
    }
@killwort perhaps it would be a good idea to add an option (enabled by default) to ClickHouseCommand that forces the same behavior?
from clickhouse-net.
@killwort Is there any update on this?
from clickhouse-net.
DbProviderFactory is important for many ORM libraries and data helpers. How can it be implemented?
from clickhouse-net.
It will be released with 2.0.0-preview.
Don't use it in production right away, it is a major rewrite!
from clickhouse-net.