max-vostrikov commented on August 16, 2024

I managed to reproduce it on our current latest version, 24.7.1.1774.

It seems that the problem is with the MySQL Shell interface. The same query returns different results in mysqlsh and in clickhouse client: clickhouse client returns the correct result, while mysqlsh is missing one block.

mysqlsh:

MySQL  localhost:9004  test  -  SQL > WITH cte_2 AS (
                                    -> SELECT
                                    ->   cast(subq_1.c3 as Nullable(String)) as c0,
                                    ->   subq_1.c0 as c1
                                    -> FROM (select c_b as c0, c_s as c3 from t2) as subq_1
                                    -> ORDER BY 1, 2
                                    -> LIMIT 10
                                    -> )
                                    -> SELECT ref_20.c_s as c2
                                    -> FROM t1 as ref_20
                                    -> WHERE (ref_20.c_s in (select c0 from cte_2));
+----+
| c2 |
+----+
| v  |
+----+
1 row in set (0.0184 sec)

clickhouse client:

cluster_2S_2R node 1 :) WITH cte_2 AS (
SELECT
  cast(subq_1.c3 as Nullable(String)) as c0,
  subq_1.c0 as c1
FROM (select c_b as c0, c_s as c3 from t2) as subq_1
ORDER BY 1, 2
LIMIT 10
)
SELECT ref_20.c_s as c2
FROM t1 as ref_20
WHERE (ref_20.c_s in (select c0 from cte_2));

WITH cte_2 AS
    (
        SELECT
            CAST(subq_1.c3, 'Nullable(String)') AS c0,
            subq_1.c0 AS c1
        FROM
        (
            SELECT
                c_b AS c0,
                c_s AS c3
            FROM t2
        ) AS subq_1
        ORDER BY
            1 ASC,
            2 ASC
        LIMIT 10
    )
SELECT ref_20.c_s AS c2
FROM t1 AS ref_20
WHERE ref_20.c_s IN (
    SELECT c0
    FROM cte_2
)

Query id: dfdb9c00-1b75-4f3f-9a25-b3e07e7c39fb

   ┌─c2─┐
1. │ v  │
   └────┘
   ┌─c2─┐
2. │ f  │
   └────┘

2 rows in set. Elapsed: 0.021 sec.

Those two commands were run on the same DB at the same time.

Steps to reproduce:

  1. Install MySQL Shell: https://dev.mysql.com/doc/mysql-shell/8.0/en/mysql-shell-install.html
  2. Set up the cluster. I used https://github.com/ClickHouse/examples/tree/96787d22bb43e6ad8fe8dd72306ce1ae31c23b1e/docker-compose-recipes/recipes/cluster_2S_2R with these changes:
    • <mysql_port>9004</mysql_port> in config.xml for every server
    • <distributed_product_mode>allow</distributed_product_mode> in users.xml -> profiles/default
    • exposed the MySQL port by adding - "127.0.0.1:9004:9004" in docker-compose.yaml
  3. Run these queries to set up the data:
create database test on cluster default;
use test;

create table t1_source on cluster default (
  c_s String ,
  c_k Int32 primary key
  );
  
create table t1 on cluster default as t1_source ENGINE = Distributed(default, test, t1_source, c_k);

insert into t1 (c_s, c_k) values 
  ('v', 1811182695), 
  ('d', 42), 
  ('clbq1d5sc1', -654940302), 
  ('mqxkt', 32);

create table t2_source on cluster default (
  c_b Bool ,
  c_s String ,
  c_k Int32 primary key
  );
  
create table t2 on cluster default as t2_source ENGINE = Distributed(default, test, t2_source, c_k);

insert into t2 (c_b, c_s, c_k) values 
  (true, 'go7h8', -30), 
  (false, 'v', -741275615), 
  (false, 'qqvalkv3u', -1865417109), 
  (false, 'yc7e_3r50', -367254559);

insert into t2 (c_b, c_s, c_k) values 
  (false, 'yk81vk2', -10), 
  (false, 'r7', -2112028332), 
  (true, 'f', 612512867), 
  (false, 'x_vx', 1891684388);

insert into t1 (c_s, c_k) values
  ('_', -1492433226), 
  ('kw2krwo605', -37892574), 
  ('f', 100663045), 
  ('wgtfbfq45', 970950186);

insert into t2 (c_b, c_s, c_k) values  
  (false, 'p30ev6x', -565153215), 
  (false, 'rp2belxdc', 1462558685), 
  (true, 'h', 1026159363), 
  (false, 'hswe', -1226166248);
  4. The error is now ready to be reproduced. Execute:
WITH cte_2 AS (
SELECT
  cast(subq_1.c3 as Nullable(String)) as c0,
  subq_1.c0 as c1
FROM (select c_b as c0, c_s as c3 from t2) as subq_1
ORDER BY 1, 2
LIMIT 10
)
SELECT ref_20.c_s as c2
FROM t1 as ref_20
WHERE (ref_20.c_s in (select c0 from cte_2));

The query gives different results in clickhouse client and in mysqlsh.

I tried to simplify the data and the query a bit, but this seems to be as far as it goes: any further simplification of the query or the data makes the error no longer reproducible. I checked the following:

  • it is not reproducible without a cluster, Distributed tables, and distributed_product_mode=allow
  • most changes to the data make the bug no longer reproducible
  • I simplified the query itself, but removing ORDER BY, removing LIMIT, or changing the subqueries also makes the bug no longer reproducible
  • the EXPLAIN output for both interfaces looks the same:
    ┌─explain────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
 1. │ Union                                                                                                                                                                                  │
 2. │   CreatingSets (Create sets before main query execution)                                                                                                                               │
 3. │     Expression ((Project names + Projection))                                                                                                                                          │
 4. │       Filter ((WHERE + Change column names to column identifiers))                                                                                                                     │
 5. │         ReadFromMergeTree (test.t1_source)                                                                                                                                             │
 6. │     CreatingSet (Create set for subquery)                                                                                                                                              │
 7. │       Expression ((Project names + (Projection + (Change column names to column identifiers + Project names))))                                                                        │
 8. │         Limit (preliminary LIMIT (without OFFSET))                                                                                                                                     │
 9. │           Sorting (Sorting for ORDER BY)                                                                                                                                               │
10. │             Union                                                                                                                                                                      │
11. │               Expression ((Before ORDER BY + (Projection + (Change column names to column identifiers + (Project names + (Projection + Change column names to column identifiers)))))) │
12. │                 ReadFromMergeTree (test.t2_source)                                                                                                                                     │
13. │               Expression (( + ( + )))                                                                                                                                                  │
14. │                 ReadFromRemote (Read from remote replica)                                                                                                                              │
15. │   ReadFromRemote (Read from remote replica)                                                                                                                                            │
    └────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

I assume the problem is somewhere in the MySQL part, but I am not sure where to dig next.
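
One possible next step, offered as a sketch rather than a confirmed diagnosis: the two interfaces may run the query with different session settings, and the result clearly depends on how rows are placed on the shards. The queries below rely only on the system.settings table, hostName(), and the clusterAllReplicas table function. The cluster name 'default' and the test database are taken from the setup above. Whether either query exposes the cause is an assumption.

```sql
-- Run in both mysqlsh and clickhouse client, then diff the two outputs.
-- Lists only the settings whose value differs from the default in the current session.
SELECT name, value
FROM system.settings
WHERE changed
ORDER BY name;

-- Shows which host stores each t2 row, to see where the row 'f' lives.
-- Assumes the cluster is named 'default', as in the Distributed() definitions above.
SELECT hostName() AS host, c_b, c_s, c_k
FROM clusterAllReplicas('default', test.t2_source)
ORDER BY host, c_s;
```

If the first query shows a setting that differs between the two sessions, applying the same value in clickhouse client with SET and re-running the failing query would show whether that setting is involved.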
