
blynk-technologies / clickhouse4j


A lighter and faster alternative to the official ClickHouse JDBC driver

License: Other

Java 100.00%
java jdbc-driver

clickhouse4j's People

Contributors

alex-krash, alexey-milovidov, amaslak, arkhipov, bgranvea, bluedogseyes, dcastanier, dmitriy-sych, doom369, dysosmus, egorlitvinenko, enqueue, ezhaka, genuss, h6ak, hamsterready, ignatberesnev, izebit, jkee, kanterov, meegooo, peturrun, robincacou, serebrserg, smagellan, starwix, tkroman, tobia, tsl-karlp, zgmnkv


clickhouse4j's Issues

Implement dynamic IN CLAUSE

Problem:

String query = "... where id in (?)";

statement.setObject(1, listOfIds);

Result:

String query = "... where id in ([1,2,3,4])";

Should be:

String query = "... where id in (1,2,3,4)";
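A minimal sketch of the requested expansion (the `expandIn` helper and the query text are hypothetical, not part of the driver):

```java
import java.util.List;
import java.util.stream.Collectors;

public class InClauseExpander {

    // Replaces the first '?' placeholder with a comma-separated value list,
    // turning "... where id in (?)" into "... where id in (1,2,3,4)".
    static String expandIn(String query, List<?> values) {
        String joined = values.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(","));
        return query.replaceFirst("\\?", joined);
    }

    public static void main(String[] args) {
        System.out.println(expandIn("select * from t where id in (?)", List.of(1, 2, 3, 4)));
        // select * from t where id in (1,2,3,4)
    }
}
```

A real implementation would also need to escape string values; this sketch only shows the placeholder expansion itself.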

How should I handle null values from Spark?

Caused by: java.lang.Throwable: Code: 27, e.displayText() = DB::Exception: Cannot parse input: expected \t before: \\N\t1\t50\t\\N\t\\N\t\\N\t\\N\t9139\t6916\t9118\t482856\n2019-10-03\tv X21UD A\t1\t3.4.3.3\t10\t\\N\t1\t52\t\\N\t\\N\t\\N\t\\N\t195\t136\t180\t1261641\n2019-10-03\tv Y66i A\t1\t2.8.1.1\t15\t\\N\t1\t: (at row 1)

It seems that clickhouse4j wrote the null values from Spark as \N, which the server could not parse for this column.

CopyManager should extend Closeable

Current approach:

try (Connection connection = dataSource.getConnection();
    OutputStream outputStream = Files.newOutputStream(outputFile, TRUNCATE_EXISTING)) {
    CopyManager copyManager = CopyManagerFactory.create(connection); //lightweight
    copyManager.copyFromDb(query, outputStream);
}

Could be improved with:

try (CopyManager copyManager = CopyManagerFactory.create(dataSource.getConnection());
     OutputStream outputStream = Files.newOutputStream(outputFile, TRUNCATE_EXISTING)) {
    copyManager.copyFromDb(query, outputStream);
}

How to handle a large query result?

My query result is huge and I get a java.lang.OutOfMemoryError. How can I solve it?

java.lang.OutOfMemoryError
  at java.io.ByteArrayOutputStream.hugeCapacity(ByteArrayOutputStream.java:123)
  at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:117)
  at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
  at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:153)
  at cc.blynk.clickhouse.util.guava.StreamUtils.copy(StreamUtils.java:70)
  at cc.blynk.clickhouse.http.DefaultHttpConnector.sendPostRequest(DefaultHttpConnector.java:182)
  at cc.blynk.clickhouse.http.DefaultHttpConnector.post(DefaultHttpConnector.java:62)
  at cc.blynk.clickhouse.http.DefaultHttpConnector.post(DefaultHttpConnector.java:74)
  at cc.blynk.clickhouse.ClickHouseStatementImpl.sendRequest(ClickHouseStatementImpl.java:674)
  at cc.blynk.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:168)
  at cc.blynk.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:150)
  at cc.blynk.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:137)
  at cc.blynk.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:88)
  ... 59 elided

Thanks very much.
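The stack trace shows the whole HTTP response being buffered in a ByteArrayOutputStream, so a sufficiently large result cannot fit in the heap. Until responses are streamed, one possible workaround (a sketch; it assumes the query has a stable ORDER BY so pages are deterministic, and table/column names are illustrative) is to fetch the result in LIMIT/OFFSET chunks:

```java
public class PagedQuery {

    // Appends a LIMIT/OFFSET clause so a huge result can be fetched in
    // fixed-size chunks instead of one giant in-memory buffer.
    static String page(String baseQuery, long offset, long pageSize) {
        return baseQuery + " LIMIT " + pageSize + " OFFSET " + offset;
    }

    public static void main(String[] args) {
        String base = "select * from big_table order by id";
        for (long offset = 0; offset < 30_000; offset += 10_000) {
            // each page would be run with statement.executeQuery(...)
            System.out.println(page(base, offset, 10_000));
        }
    }
}
```

Each page is then processed and discarded before the next one is fetched, keeping memory usage bounded by the page size.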

Allow downloading CSV files?

Current flow:

select x, y from z;
var entry = new EntryDTO(x, y);
file.write(entry.toCSV())

It would be nice to have some kind of CopyManager analog, as present in the Postgres driver, to download files directly instead of creating intermediate objects.

Writing data to a local table works, but writing to a distributed table fails.

Code that triggers the error:

scala> :paste
// Entering paste mode (ctrl-D to finish)

      lv1.repartition(30)
          .write
          .mode(SaveMode.Append)
          .format("jdbc")
          .option("url", ckUrlDa)
          .option("driver", driverClass4j)
          //.option("numPartitions", "1")
          .option("dbtable", "t_md_page_path_di_rep")
          .option("user", userMe)
          .option("password", passwordMe)
          .option("fetchsize", fetchSize)
          .option("batchsize", batchSize)
          .save()

// Exiting paste mode, now interpreting.


scala> :paste
// Entering paste mode (ctrl-D to finish)

      lv1.repartition(30)
          .write
          .mode(SaveMode.Append)
          .format("jdbc")
          .option("url", ckUrlDa)
          .option("driver", driverClass4j)
          //.option("numPartitions", "1")
          .option("dbtable", "t_md_page_path_di_dis")
          .option("user", userMe)
          .option("password", passwordMe)
          .option("fetchsize", fetchSize)
          .option("batchsize", batchSize)
          .save()

// Exiting paste mode, now interpreting.

cc.blynk.clickhouse.except.ClickHouseException: ClickHouse exception, code: 62, host: null, port: 0; Code: 62, e.displayText() = DB::Exception: Syntax error: failed at position 159: NOT NULL, "node_depth" INTEGER NOT NULL, "page_id_lv1" TEXT , "page_id_lv2" TEXT NOT NULL, "page_id_lv3" TEXT NOT NULL, "page_id_lv4" TEXT NOT NULL, "page_id_lv. Expected one of: DEFAULT, MATERIALIZED, ALIAS, COMMENT, CODEC, TTL, token, ClosingRoundBracket, Comma (version 19.15.3.6 (official build))

  at cc.blynk.clickhouse.except.ClickHouseExceptionSpecifier.specify(ClickHouseExceptionSpecifier.java:61)
  at cc.blynk.clickhouse.except.ClickHouseExceptionSpecifier.specify(ClickHouseExceptionSpecifier.java:31)
  at cc.blynk.clickhouse.http.DefaultHttpConnector.checkForErrorAndThrow(DefaultHttpConnector.java:296)
  at cc.blynk.clickhouse.http.DefaultHttpConnector.sendPostRequest(DefaultHttpConnector.java:175)
  at cc.blynk.clickhouse.http.DefaultHttpConnector.post(DefaultHttpConnector.java:62)
  at cc.blynk.clickhouse.http.DefaultHttpConnector.post(DefaultHttpConnector.java:74)
  at cc.blynk.clickhouse.ClickHouseStatementImpl.sendRequest(ClickHouseStatementImpl.java:674)
  at cc.blynk.clickhouse.ClickHouseStatementImpl.executeUpdate(ClickHouseStatementImpl.java:198)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createTable(JdbcUtils.scala:692)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:89)
  at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:518)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:218)
  ... 59 elided
Caused by: java.lang.Throwable: Code: 62, e.displayText() = DB::Exception: Syntax error: failed at position 159: NOT NULL, "node_depth" INTEGER NOT NULL, "page_id_lv1" TEXT , "page_id_lv2" TEXT NOT NULL, "page_id_lv3" TEXT NOT NULL, "page_id_lv4" TEXT NOT NULL, "page_id_lv. Expected one of: DEFAULT, MATERIALIZED, ALIAS, COMMENT, CODEC, TTL, token, ClosingRoundBracket, Comma (version 19.15.3.6 (official build))

  at cc.blynk.clickhouse.except.ClickHouseExceptionSpecifier.specify(ClickHouseExceptionSpecifier.java:56)
  ... 70 more

How can I implement batch inserts?

I am new to ClickHouse and clickhouse4j, and I can't find any helpful documentation about batch inserts.

I think more example code would be very helpful.
Thanks.
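clickhouse4j exposes the standard JDBC batching API (addBatch/executeBatch, as used in the port-exhaustion issue below), so a batch insert follows the usual PreparedStatement pattern. A sketch, with a hypothetical table and columns:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

public class BatchInsertExample {

    // Hypothetical target table and columns.
    static final String INSERT_SQL = "INSERT INTO my_table (id, name) VALUES (?, ?)";

    // Standard JDBC batching: bind each row, addBatch(), then executeBatch()
    // sends all buffered rows in one round trip.
    static void insertAll(Connection connection, Map<Integer, String> rows) throws SQLException {
        try (PreparedStatement ps = connection.prepareStatement(INSERT_SQL)) {
            for (Map.Entry<Integer, String> row : rows.entrySet()) {
                ps.setInt(1, row.getKey());
                ps.setString(2, row.getValue());
                ps.addBatch();
            }
            ps.executeBatch();
        }
    }
}
```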

Improve CopyManager usability

It would be nice if CopyManager could accept a Statement or PreparedStatement.
Right now I have to use it like this:

copyManager.copyFromDb(((ClickHousePreparedStatement) ps).asSql(), zipOutputStream);

Cannot parse quoted string

I want to insert an Array(String) value into a table, but it fails:

Cannot parse quoted string: expected opening quote: (at row 1)
['001|002|003','001|002|003']

But I can insert it successfully via the CLI.
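The server-side error suggests the array elements reached ClickHouse without the quoting it expects. For comparison, a ClickHouse Array(String) literal single-quotes each element and escapes embedded quotes; a sketch of building such a literal by hand (the `arrayLiteral` helper is hypothetical, not driver API):

```java
import java.util.List;
import java.util.stream.Collectors;

public class ArrayLiteral {

    // Builds a ClickHouse Array(String) literal: each element single-quoted,
    // with embedded backslashes and quotes escaped.
    static String arrayLiteral(List<String> items) {
        return items.stream()
                .map(s -> "'" + s.replace("\\", "\\\\").replace("'", "\\'") + "'")
                .collect(Collectors.joining(",", "[", "]"));
    }

    public static void main(String[] args) {
        System.out.println(arrayLiteral(List.of("001|002|003", "001|002|003")));
        // ['001|002|003','001|002|003']
    }
}
```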

Client service uses all ports

Hi,
We are testing ClickHouse with your connector, and you did a great job with it. But we have a problem: when we run the service in AWS as a Docker container and try to insert a lot of records (2 billion), the connector gradually uses up all ports in the container and never releases them.

I tried closing every connection in a try/catch and getting a new connection for every batch insert. I also tried getting a new connection only if the previous one was closed; that was better because the service used fewer ports, but it was still thousands of ports.

...
        try (final Connection connection = dataSource.getConnection();
             final PreparedStatement preparedStatement = connection.prepareStatement(
                     ActivitySqlGenerator.createInsertPreparedStatement(getDbAndTableName(), QuoteType.BACK_QUOTE))) {
            connection.setAutoCommit(false);

            for (Activity activity : activities) {
                ActivitySqlGenerator.bindInsert(1, preparedStatement, activity);
                preparedStatement.addBatch();
            }

            preparedStatement.executeBatch();
            preparedStatement.clearBatch();
            connection.commit();
        } catch (Exception e) {
            logger.info("Failed to bulk insert.", e);
        }
...
    public ClickHouseConnection getConnection() throws SQLException {
        if (connection == null || connection.isClosed()) {
            this.connection = dataSource.getConnection();
        }
        return this.connection;
    }

...
        try {
            final Connection connection = this.getConnection();
            final PreparedStatement preparedStatement = connection.prepareStatement(
                    ActivitySqlGenerator.createInsertPreparedStatement(getDbAndTableName(), QuoteType.BACK_QUOTE));

            connection.setAutoCommit(false);

            for (Activity activity : activities) {
                ActivitySqlGenerator.bindInsert(1, preparedStatement, activity);
                preparedStatement.addBatch();
            }

            preparedStatement.executeBatch();
            preparedStatement.clearBatch();
            connection.commit();
        } catch (Exception e) {
            logger.info("Failed to bulk insert.", e);
        }
...

Can someone help? Is this a known problem, or am I doing something wrong?
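One thing worth checking in the second snippet: the PreparedStatement is never closed, so each batch can leave resources (and, with an HTTP-based driver, potentially sockets) behind even though the connection is reused. A sketch of the same loop that reuses a long-lived connection but closes the statement with try-with-resources (table/column names are hypothetical):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class ReuseConnectionBatch {

    // Hypothetical insert-statement builder (table/column names illustrative).
    static String insertSql(String table) {
        return "INSERT INTO " + table + " (name) VALUES (?)";
    }

    // Reuses one long-lived connection, but closes the PreparedStatement after
    // every batch via try-with-resources so no statement resources leak.
    static void insertBatch(Connection connection, List<String> names) throws SQLException {
        try (PreparedStatement ps = connection.prepareStatement(insertSql("activity"))) {
            for (String name : names) {
                ps.setString(1, name);
                ps.addBatch();
            }
            ps.executeBatch();
        }
        // the connection is intentionally not closed here; the caller owns it
    }
}
```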

Add cache for the queries

A typical JDBC driver has an embedded LRU cache for queries, so in most applications a query is parsed only once and then reused. In that case the parser optimization could be avoided.

Remove CSV Manager

Right now it only makes things more complex (javadocs, examples, tests). At the moment we don't need this specialization.
