Comments (10)
Thanks @yanjunz97 and @yuntanghsu. For your questions on `rawQuery` and `rawSql`, I have an answer now.
As Yun-Tang might not know, for the datasource plugin, we were previously using vertamedia-clickhouse-grafana and then switched to grafana-clickhouse-datasource. `grafana-clickhouse-datasource` only introduced `rawSql`, while `vertamedia-clickhouse-grafana` introduced `formattedQuery`, `query`, and `rawQuery`. So our dashboard JSONs now contain all four of these "query" fields, which is confusing. Sorry for not discovering this issue earlier. I'll delete the unused `formattedQuery`, `query`, and `rawQuery` from the dashboard JSONs. Ref issue: grafana/clickhouse-datasource#144
So for `grafana-clickhouse-datasource`, the only input is `rawSql`. They do a relatively complex parsing/translating on the `rawSql` and send it to the datasource.
Another (possibly good) update is that the plugin team suggested we can retrieve the query results from `api/ds/query`. I'll spend time verifying whether that's what we want, as it definitely contradicts what we were told by the Grafana team 🙃
https://grafana.com/docs/grafana/latest/developers/http_api/data_source/#query-a-data-source
Update: I did some small tests and it seems to work overall. So the test pipeline will be:
- call the dashboard api to retrieve the dashboard JSON
- parse the JSON and select the queries (including all the query configuration and the datasource uid) from it
- call the query api to execute the query and get the result in data frame form
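The middle step of this pipeline can be sketched roughly as follows. This is a minimal illustration, not the actual test code: it assumes the dashboard API response wraps the dashboard under a `dashboard` key, that panels carry their queries in `targets` (with row panels nesting children under `panels`), and that the query API accepts a body with `queries`, `from`, and `to`. The helper names, the sample dashboard, and the `example-uid` value are all hypothetical, and no HTTP call is made here.

```python
import json
from typing import Any, Dict, Iterator, List


def iter_panels(panels: List[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield every panel, descending into row panels that nest their children."""
    for panel in panels:
        yield panel
        yield from iter_panels(panel.get("panels", []))


def extract_queries(dashboard_response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect each panel target together with the datasource it should run against."""
    dashboard = dashboard_response["dashboard"]
    queries = []
    for panel in iter_panels(dashboard.get("panels", [])):
        for target in panel.get("targets", []):
            query = dict(target)  # copy so the dashboard JSON is left untouched
            # Fall back to the panel-level datasource when the target has none.
            query.setdefault("datasource", panel.get("datasource"))
            queries.append(query)
    return queries


def build_query_request(
    queries: List[Dict[str, Any]], time_from: str = "now-1h", time_to: str = "now"
) -> Dict[str, Any]:
    """Build the body that would be POSTed to api/ds/query (assumed shape)."""
    return {"queries": queries, "from": time_from, "to": time_to}


# Hypothetical dashboard API response, trimmed to the fields used above.
sample_response = {
    "dashboard": {
        "panels": [
            {
                "datasource": {"type": "grafana-clickhouse-datasource", "uid": "example-uid"},
                "targets": [{"refId": "A", "rawSql": "SELECT 1"}],
            },
            {
                "type": "row",
                "panels": [
                    {
                        "targets": [
                            {
                                "refId": "A",
                                "datasource": {"uid": "example-uid"},
                                "rawSql": "SELECT 2",
                            }
                        ]
                    }
                ],
            },
        ]
    }
}

queries = extract_queries(sample_response)
payload = build_query_request(queries)
print(json.dumps(payload, indent=2))
```

In a real test, `sample_response` would come from the dashboard API call in the first step, and `payload` would be sent to the query API in the third step.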
from theia.
No, I mean is it possible to test these steps ("The datasource plugin takes the query as the input, sends the query to the external datasource API, and returns the query result data as the output to Grafana") directly, without rendering dashboards and comparing screenshots in a browser?
Understood. I don't see that the plugin exposes such an interface. Let me open an issue on their repo to verify with them.
Comparing (a) talking to the ClickHouse server and (b) talking to the ClickHouse datasource plugin, I think the only difference is where we do the query parsing/translating:
- For (a), the input query should be in a form that the ClickHouse server can understand. In the dashboard JSON file, it should be the `rawQuery`.
- For (b), the input query is directly the query we wrote, including some macros like `$timeFilter()`. In the dashboard JSON file, it should be the `sqlQuery`. The datasource plugin will parse the query, translate it to a format that the ClickHouse server can understand, and send the request to the ClickHouse server.
One pending issue with (a) is that I found the `rawQuery` is not always aligned with the `sqlQuery`. I'm still waiting for the reply from the datasource plugin team.
from theia.
I think the queries regarding the flows table and other MVs should have no difference between a distributed table and a standalone table? Does "most of the queries" refer to system tables for things like storage consumption? I think even if we introduce a ClickHouse cluster, we still need to support both cases, i.e., everything should work for both standalone ClickHouse and a ClickHouse cluster.
Yes, only for system tables. Adding "cluster" makes sure we have the same behavior for standalone ClickHouse and a ClickHouse cluster.
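As a rough illustration of what adding "cluster" to a system-table query looks like, the sketch below wraps a table reference in ClickHouse's `cluster()` table function, using the `'{cluster}'` form and the `INFORMATION_SCHEMA.COLUMNS` table quoted in this thread. The helper name `clustered` and the surrounding `SELECT count()` query are hypothetical and only serve as an example.

```python
def clustered(table: str, cluster_macro: str = "{cluster}") -> str:
    """Wrap a table reference in the cluster() table function."""
    return f"cluster('{cluster_macro}', {table})"


# Example query text that reads the table across the cluster.
query = f"SELECT count() FROM {clustered('INFORMATION_SCHEMA.COLUMNS')}"
print(query)
```

Keeping the wrapping in one helper means the test code builds the same query text in one place, whichever deployment it is run against.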
from theia.
cc @dreamtalen @yanjunz97 @yuntanghsu @salv-orlando for discussion and evaluation. Any feedback would be much appreciated.
Updated the description a little bit.
from theia.
If we want to verify the query result, one alternative is: Send request to Grafana dashboard API, get the dashboard JSON file, extract the query from the dashboard JSON, and run the query independently against the datasource.
This approach sounds acceptable to me, although the steps of executing queries and getting back results in Grafana are not covered.
In our case it is the grafana-clickhouse datasource. The datasource plugin takes the query as the input, sends the query to the external datasource API, and returns the query result data as the output to Grafana.
Is it possible to run e2e tests for grafana-clickhouse datasource directly to cover this missing part in the above approach?
from theia.
Is it possible to run e2e tests for grafana-clickhouse datasource directly to cover this missing part in the above approach?
Are you referring to this e2e test https://github.com/grafana/clickhouse-datasource/tree/5ff43fc6d609912e3ae1163bf19699d43e03d1a8/cypress-e2e ? It is built with Cypress and needs to be run in a browser.
from theia.
Is it possible to run e2e tests for grafana-clickhouse datasource directly to cover this missing part in the above approach?
Are you referring to this e2e test https://github.com/grafana/clickhouse-datasource/tree/5ff43fc6d609912e3ae1163bf19699d43e03d1a8/cypress-e2e ? It is built with Cypress and needs to be run in a browser.
No, I mean is it possible to test these steps ("The datasource plugin takes the query as the input, sends the query to the external datasource API, and returns the query result data as the output to Grafana") directly, without rendering dashboards and comparing screenshots in a browser?
from theia.
I think the only difference between (a) and (b) is the `$timeFilter()`, which can be deleted? And we can just use the parameters in (a) for the `$timeInterval`.
One thing that comes to my mind is that after the ClickHouse cluster change is merged, I'm not sure if the original queries are still usable.
I'm sending queries to the ClickHouse DB to retrieve metrics and verify their values. But for most of the queries, I need to add "cluster" to the query in order to get all the data in the shards, e.g. "cluster('{cluster}', INFORMATION_SCHEMA.COLUMNS)".
If you only use distributed engine tables, I think it might not be an issue?
from theia.
I do not have much to add to the discussion. Regarding the choice between (a) and (b), (b) looks better to me, as it seems to include the datasource plugin in the whole e2e test. But as it depends on the plugin API, let's wait for their reply.
Besides, I'm curious about the fact that `rawQuery` is not always aligned with the `sqlQuery`. In this case, do we still get the result we expect based on the `sqlQuery` we write? Does that mean `rawQuery` is not something the datasource plugin uses to interact with the ClickHouse server? IMO it might be better to use what the plugin uses to talk to the ClickHouse server if we have to choose option (a).
from theia.
One thing that comes to my mind is that after the ClickHouse cluster change is merged, I'm not sure if the original queries are still usable.
I'm sending queries to the ClickHouse DB to retrieve metrics and verify their values. But for most of the queries, I need to add "cluster" to the query in order to get all the data in the shards, e.g. "cluster('{cluster}', INFORMATION_SCHEMA.COLUMNS)". If you only use distributed engine tables, I think it might not be an issue?
I think the queries regarding the flows table and other MVs should have no difference between a distributed table and a standalone table? Does "most of the queries" refer to system tables for things like storage consumption? I think even if we introduce a ClickHouse cluster, we still need to support both cases, i.e., everything should work for both standalone ClickHouse and a ClickHouse cluster.
from theia.