Comments (1)
I managed to reproduce it on our current latest version, 24.7.1.1774.
The problem seems to be in the MySQL Shell interface: this query returns different results with mysqlsh
and with clickhouse client (clickhouse client returns the correct result, mysqlsh misses one block).
mysqlsh:
MySQL localhost:9004 test - SQL > WITH cte_2 AS (
-> SELECT
-> cast(subq_1.c3 as Nullable(String)) as c0,
-> subq_1.c0 as c1
-> FROM (select c_b as c0, c_s as c3 from t2) as subq_1
-> ORDER BY 1, 2
-> LIMIT 10
-> )
-> SELECT ref_20.c_s as c2
-> FROM t1 as ref_20
-> WHERE (ref_20.c_s in (select c0 from cte_2));
+----+
| c2 |
+----+
| v |
+----+
1 row in set (0.0184 sec)
clickhouse client:
cluster_2S_2R node 1 :) WITH cte_2 AS (
SELECT
cast(subq_1.c3 as Nullable(String)) as c0,
subq_1.c0 as c1
FROM (select c_b as c0, c_s as c3 from t2) as subq_1
ORDER BY 1, 2
LIMIT 10
)
SELECT ref_20.c_s as c2
FROM t1 as ref_20
WHERE (ref_20.c_s in (select c0 from cte_2));
WITH cte_2 AS
(
SELECT
CAST(subq_1.c3, 'Nullable(String)') AS c0,
subq_1.c0 AS c1
FROM
(
SELECT
c_b AS c0,
c_s AS c3
FROM t2
) AS subq_1
ORDER BY
1 ASC,
2 ASC
LIMIT 10
)
SELECT ref_20.c_s AS c2
FROM t1 AS ref_20
WHERE ref_20.c_s IN (
SELECT c0
FROM cte_2
)
Query id: dfdb9c00-1b75-4f3f-9a25-b3e07e7c39fb
┌─c2─┐
1. │ v │
└────┘
┌─c2─┐
2. │ f │
└────┘
2 rows in set. Elapsed: 0.021 sec.
Those two commands were run on the same DB, at the same time.
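To make the discrepancy explicit, the rows returned by the two interfaces can be compared as multisets. This is only a minimal sketch in Python, assuming the rows have already been fetched from each interface by whatever client is at hand; `diff_rows` is a hypothetical helper name, and the sample rows are copied from the two outputs above.

```python
from collections import Counter


def diff_rows(expected, actual):
    """Return (missing, extra) rows as sorted lists, treating both results as multisets."""
    exp, act = Counter(expected), Counter(actual)
    missing = sorted((exp - act).elements())  # rows in expected but not in actual
    extra = sorted((act - exp).elements())    # rows in actual but not in expected
    return missing, extra


# Rows copied from the outputs above (single column c2).
clickhouse_client_rows = [("v",), ("f",)]
mysqlsh_rows = [("v",)]

missing, extra = diff_rows(clickhouse_client_rows, mysqlsh_rows)
print("missing from mysqlsh:", missing)
print("only in mysqlsh:", extra)
```

Comparing as multisets rather than lists avoids false alarms from row order, which is not guaranteed here because the outer query has no ORDER BY.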
Steps to reproduce:
- install https://dev.mysql.com/doc/mysql-shell/8.0/en/mysql-shell-install.html
- set up the cluster. I used https://github.com/ClickHouse/examples/tree/96787d22bb43e6ad8fe8dd72306ce1ae31c23b1e/docker-compose-recipes/recipes/cluster_2S_2R with these changes:
  - <mysql_port>9004</mysql_port> in all config.xml files for the servers
  - <distributed_product_mode>allow</distributed_product_mode> in users.xml -> profiles/default
  - exposed the mysql port, - "127.0.0.1:9004:9004", in docker-compose.yaml
- run the queries to set up the data:
create database test on cluster default;
use test;
create table t1_source on cluster default (
c_s String ,
c_k Int32 primary key ,
);
create table t1 on cluster default as t1_source ENGINE = Distributed(default, test, t1_source, c_k);
insert into t1 (c_s, c_k) values
('v', 1811182695),
('d', 42),
('clbq1d5sc1', -654940302),
('mqxkt', 32);
create table t2_source on cluster default (
c_b Bool ,
c_s String ,
c_k Int32 primary key
);
create table t2 on cluster default as t2_source ENGINE = Distributed(default, test, t2_source, c_k);
insert into t2 (c_b, c_s, c_k) values
(true, 'go7h8', -30),
(false, 'v', -741275615),
(false, 'qqvalkv3u', -1865417109),
(false, 'yc7e_3r50', -367254559);
insert into t2 (c_b, c_s, c_k) values
(false, 'yk81vk2', -10),
(false, 'r7', -2112028332),
(true, 'f', 612512867),
(false, 'x_vx', 1891684388);
insert into t1 (c_s, c_k) values
('_', -1492433226),
('kw2krwo605', -37892574),
('f', 100663045),
('wgtfbfq45', 970950186);
insert into t2 (c_b, c_s, c_k) values
(false, 'p30ev6x', -565153215),
(false, 'rp2belxdc', 1462558685),
(true, 'h', 1026159363),
(false, 'hswe', -1226166248);
- the error is now ready to be reproduced; execute:
WITH cte_2 AS (
SELECT
cast(subq_1.c3 as Nullable(String)) as c0,
subq_1.c0 as c1
FROM (select c_b as c0, c_s as c3 from t2) as subq_1
ORDER BY 1, 2
LIMIT 10
)
SELECT ref_20.c_s as c2
FROM t1 as ref_20
WHERE (ref_20.c_s in (select c0 from cte_2));
The query gives different results with clickhouse client and mysqlsh.
I tried to simplify the data and the query a bit, but this seems to be as far as it goes: any further simplification of the query or the data makes the error no longer reproducible. I checked:
- it is not reproducible without a cluster, Distributed tables and distributed_product_mode=allow
- most changes to the data make the bug no longer reproducible
- I simplified the query itself, but removing ORDER BY, LIMIT, or changing the subqueries also makes the bug no longer reproducible
- the EXPLAIN output for both interfaces looks the same:
┌─explain────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
1. │ Union │
2. │ CreatingSets (Create sets before main query execution) │
3. │ Expression ((Project names + Projection)) │
4. │ Filter ((WHERE + Change column names to column identifiers)) │
5. │ ReadFromMergeTree (test.t1_source) │
6. │ CreatingSet (Create set for subquery) │
7. │ Expression ((Project names + (Projection + (Change column names to column identifiers + Project names)))) │
8. │ Limit (preliminary LIMIT (without OFFSET)) │
9. │ Sorting (Sorting for ORDER BY) │
10. │ Union │
11. │ Expression ((Before ORDER BY + (Projection + (Change column names to column identifiers + (Project names + (Projection + Change column names to column identifiers)))))) │
12. │ ReadFromMergeTree (test.t2_source) │
13. │ Expression (( + ( + ))) │
14. │ ReadFromRemote (Read from remote replica) │
15. │ ReadFromRemote (Read from remote replica) │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
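Checking that two EXPLAIN outputs "look the same" by eye is error-prone, because each interface frames the rows differently. The sketch below, in Python, strips the framing and compares only the plan lines. It rests on several assumptions: `plan_steps` is a hypothetical helper, the two sample inputs are abbreviated illustrations rather than captured output, and the MySQL-style frame in particular is assumed, not taken from the report.

```python
import re

# Matches a framed data row such as "1. │ Union │" (clickhouse client style)
# or "| Union |" (MySQL-style table) and captures the cell text.
ROW = re.compile(r"^\s*(?:\d+\.\s*)?[│|] ?(.*?)\s*[│|]\s*$")


def plan_steps(output, header="explain"):
    """Extract plan lines from framed EXPLAIN output, dropping borders and the header row."""
    steps = []
    for line in output.splitlines():
        m = ROW.match(line)
        if m and m.group(1) != header:
            steps.append(m.group(1))  # leading indentation (plan nesting) is kept
    return steps


# Abbreviated, illustrative inputs (assumed framing, not real captured output).
native = """┌─explain─┐
1. │ Union │
2. │   CreatingSets │
└─────────┘"""
mysql = """+----------------+
| explain        |
+----------------+
| Union          |
|   CreatingSets |
+----------------+"""

print(plan_steps(native) == plan_steps(mysql))
```

Only the first space after the frame character is dropped, so the indentation that encodes plan nesting still takes part in the comparison.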
I assume the problem is somewhere in the MySQL part, but I am not sure where to dig next.