GithubHelp home page GithubHelp logo

Comments (14)

alexey-milovidov avatar alexey-milovidov commented on May 19, 2024

This is a cloud service on top of duckdb - an embedded database engine for Python, which has some origins and ideas borrowed from ClickHouse.

The registration page works, but login fails in some cases:

Screenshot_20230724_040346

from clickbench.

alexey-milovidov avatar alexey-milovidov commented on May 19, 2024

The UI looks cute:

Screenshot_20230724_040749

One issue with the UI - when I type my query and press Enter, the entered query disappears, and I cannot easily run it again.

SELECT 1 does not work:

Screenshot_20230724_040922

from clickbench.

alexey-milovidov avatar alexey-milovidov commented on May 19, 2024

Every example from the list shows "Access denied":

Screenshot_20230724_041022

from clickbench.

alexey-milovidov avatar alexey-milovidov commented on May 19, 2024

I tried to use it with our public bucket, s3://clickhouse-public-datasets/, but it shows "access denied".
It looks like I need to create an IAM role and switch to it: https://www.boilingdata.com/apidoc.html

from clickbench.

alexey-milovidov avatar alexey-milovidov commented on May 19, 2024

My test query is:

SELECT * FROM parquet_scan('s3://clickhouse-public-datasets/hits_compatible/hits.parquet') LIMIT 10

@santrancisco created an IAM role for me: arn:aws:iam::609927696493:role/boilingdata_test

and I've set it with

PRAGMA s3AccessRoleArn='arn:aws:iam::123456789012:role/bdS3';

as in the documentation.

But I'm still getting an access denied error.

from clickbench.

danthegoodman1 avatar danthegoodman1 commented on May 19, 2024

Sometimes it took over 2 minutes for a “select 1” to return for me

from clickbench.

alexey-milovidov avatar alexey-milovidov commented on May 19, 2024

After waiting for a while, the "Access denied" error disappeared, but now I'm getting a cryptic error message:

{
  "status": "Forward count > 1, loop?! (2)"
}

Screenshot_20230724_041404

which I don't know what it means.

from clickbench.

alexey-milovidov avatar alexey-milovidov commented on May 19, 2024

Sometimes it took over 2 minutes for a “select 1” to return for me

This contradicts the statement on their website:

Instant Computing Power For Realtime Dashboarding

Realtime dashboarding is a natural and well-explored use case for ClickHouse, and I don't like when other inferior engines, like duckdb, make unfounded claims, pretending to have similar capabilities.

from clickbench.

alexey-milovidov avatar alexey-milovidov commented on May 19, 2024

Trying other queries:

SELECT count(*) FROM parquet_scan('s3://clickhouse-public-datasets/hits_compatible/hits.parquet');
SELECT count(DISTINCT URL) FROM parquet_scan('s3://clickhouse-public-datasets/hits_compatible/hits.parquet');
SELECT count(DISTINCT URL) FROM parquet_scan('s3://clickhouse-public-datasets/hits_compatible/hits.parquet') GROUP BY UserID;

but none works giving the same error Forward count > 1, loop?! (2)

from clickbench.

alexey-milovidov avatar alexey-milovidov commented on May 19, 2024

@danthegoodman1

Sometimes it took over 2 minutes for a “select 1” to return for me

I'm waiting for several minutes, but then it logs me out.
When I log in again, there is no query history.

I didn't have a single time the SELECT 1 query succeeded before it logged out my session.

from clickbench.

dforsber avatar dforsber commented on May 19, 2024

Hello, sorry for replying this late! Since Jul we have new GUI and new backend release so things should work better. You could use the bdcli to setup the IAM Role as it crafts the IAM Policy for you.

  • I think the IAM Role needs access to any public S3 Buckets too (egress/outbound rule). I'm trying to query that public bucket above and I'm getting access denied error too (even I explicitly added that S3 Bucket clickhouse-public-datasets into the IAM policy. Is it possible that the bucket policy does not allow get-bucket-location API call?).
  • SELECT 1 now also works 🎉 😅 and it should respond fast and is probably a good measure of the overhead latency between client and Boiling cloud
  • For now, we are not focusing on building a great GUI, so the GUI is for demo purposes for now. Query history would be definitely nice and also ability to save the queries (there are tons of very good BI Tool/workbenches)
  • NOTE: For now, the only up-to-date AWS region is eu-west-1. So, you would need to have data there in this region, i.e. the S3 Bucket.

Does this help?

from clickbench.

dforsber avatar dforsber commented on May 19, 2024

This is the set of queries I'm running on Boiling. I've fixed some sort orders and column aliases to make the results comparable with local laptop running DuckDB instance. So, I have copied the hits.parquet file into Boiling demo S3 bucket on eu-west-1.

SELECT COUNT(*) AS count FROM parquet_scan('s3://boilingdata-demo/hits.parquet');
SELECT COUNT(*) AS count FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE AdvEngineID <> 0;
SELECT UserID FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE UserID = 435090932899640449;
SELECT UserID FROM parquet_scan('s3://boilingdata-demo/hits.parquet') ORDER BY UserID LIMIT 10;
SELECT COUNT(*) AS count FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE URL LIKE '%google%';
SELECT SearchPhrase, EventTime FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE SearchPhrase <> '' ORDER BY EventTime, SearchPhrase LIMIT 10;
SELECT SearchPhrase FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE SearchPhrase <> '' ORDER BY SearchPhrase LIMIT 10;
SELECT SearchPhrase FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE SearchPhrase <> '' ORDER BY EventTime, SearchPhrase LIMIT 10;
SELECT AdvEngineID, COUNT(*) AS count FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE AdvEngineID > 0 GROUP BY AdvEngineID ORDER BY COUNT(*) DESC;
SELECT SearchPhrase, MIN(URL) AS url, COUNT(*) AS c FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE URL LIKE '%google%' AND SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY c DESC, SearchPhrase LIMIT 10;
SELECT COUNT(*) AS total, SearchPhrase FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE SearchPhrase != '' GROUP BY SearchPhrase ORDER BY SearchPhrase, total DESC LIMIT 10;
SELECT COUNT(DISTINCT SearchPhrase) AS total FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE SearchPhrase != '';
SELECT COUNT(DISTINCT UserID) AS uniqueUserIds FROM parquet_scan('s3://boilingdata-demo/hits.parquet');
SELECT RegionID, COUNT(DISTINCT UserID) AS u FROM parquet_scan('s3://boilingdata-demo/hits.parquet') GROUP BY RegionID ORDER BY u DESC LIMIT 10;
SELECT UserID, COUNT(*) AS count FROM parquet_scan('s3://boilingdata-demo/hits.parquet') GROUP BY UserID ORDER BY COUNT(*) DESC, UserID LIMIT 10;
SELECT UserID, SearchPhrase, COUNT(*) FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE SearchPhrase <> '' GROUP BY UserID, SearchPhrase ORDER BY COUNT(*) DESC, UserID LIMIT 10;
SELECT MobilePhoneModel, COUNT(DISTINCT UserID) AS u FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE MobilePhoneModel <> '' AND UserID <> '' GROUP BY MobilePhoneModel ORDER BY u DESC LIMIT 10;
SELECT MobilePhone, MobilePhoneModel, COUNT(DISTINCT UserID) AS u FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE MobilePhoneModel <> ''AND UserID <> '' GROUP BY MobilePhone, MobilePhoneModel ORDER BY u DESC LIMIT 10;
SELECT SearchPhrase, COUNT(DISTINCT UserID) AS u FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY u DESC, SearchPhrase LIMIT 13;
SELECT SearchEngineID, SearchPhrase, COUNT(*) AS c FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE SearchPhrase <> '' GROUP BY SearchEngineID, SearchPhrase ORDER BY c DESC LIMIT 10;
SELECT UserID, COUNT(*) FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE UserID <> NULL GROUP BY UserID ORDER BY COUNT(*) DESC LIMIT 10;
SELECT SUM(ResolutionWidth) AS resWth, SUM(ResolutionWidth + 1) AS resW1, SUM(ResolutionWidth + 2) AS resW2, SUM(ResolutionWidth + 3) AS resW3, SUM(ResolutionWidth + 4) AS resW4, SUM(ResolutionWidth + 5) AS resW5, SUM(ResolutionWidth + 6) AS resW6, SUM(ResolutionWidth + 7) AS resW7, SUM(ResolutionWidth + 8) AS resW8, SUM(ResolutionWidth + 9) AS resW9, SUM(ResolutionWidth + 10) AS resW10, SUM(ResolutionWidth + 11) AS resW11, SUM(ResolutionWidth + 12) AS resW12, SUM(ResolutionWidth + 13) AS resW13, SUM(ResolutionWidth + 14) AS resW14, SUM(ResolutionWidth + 15) AS resW15, SUM(ResolutionWidth + 16) AS resW16, SUM(ResolutionWidth + 17) AS resW17, SUM(ResolutionWidth + 18) AS resW18, SUM(ResolutionWidth + 19) AS resW19, SUM(ResolutionWidth + 20) AS resW20, SUM(ResolutionWidth + 21) AS resW21, SUM(ResolutionWidth + 22) AS resW22, SUM(ResolutionWidth + 23) AS resW23, SUM(ResolutionWidth + 24) AS resW24, SUM(ResolutionWidth + 25) AS resW25, SUM(ResolutionWidth + 26) AS resW26, SUM(ResolutionWidth + 27) AS resW27, SUM(ResolutionWidth + 28) AS resW28, SUM(ResolutionWidth + 29) AS resW29, SUM(ResolutionWidth + 30) AS resW30, SUM(ResolutionWidth + 31) AS resW31, SUM(ResolutionWidth + 32) AS resW32, SUM(ResolutionWidth + 33) AS resW33, SUM(ResolutionWidth + 34) AS resW34, SUM(ResolutionWidth + 35) AS resW35, SUM(ResolutionWidth + 36) AS resW36, SUM(ResolutionWidth + 37) AS resW37, SUM(ResolutionWidth + 38) AS resW38, SUM(ResolutionWidth + 39) AS resW39, SUM(ResolutionWidth + 40) AS resW40, SUM(ResolutionWidth + 41) AS resW41, SUM(ResolutionWidth + 42) AS resW42, SUM(ResolutionWidth + 43) AS resW43, SUM(ResolutionWidth + 44) AS resW44, SUM(ResolutionWidth + 45) AS resW45, SUM(ResolutionWidth + 46) AS resW46, SUM(ResolutionWidth + 47) AS resW47, SUM(ResolutionWidth + 48) AS resW48, SUM(ResolutionWidth + 49) AS resW49, SUM(ResolutionWidth + 50) AS resW50, SUM(ResolutionWidth + 51) AS resW51, SUM(ResolutionWidth + 52) AS resW52, SUM(ResolutionWidth + 53) AS resW53, SUM(ResolutionWidth + 54) AS resW54, SUM(ResolutionWidth + 55) AS resW55, SUM(ResolutionWidth + 56) AS resW56, SUM(ResolutionWidth + 57) AS resW57, SUM(ResolutionWidth + 58) AS resW58, SUM(ResolutionWidth + 59) AS resW59, SUM(ResolutionWidth + 60) AS resW60, SUM(ResolutionWidth + 61) AS resW61, SUM(ResolutionWidth + 62) AS resW62, SUM(ResolutionWidth + 63) AS resW63, SUM(ResolutionWidth + 64) AS resW64, SUM(ResolutionWidth + 65) AS resW65, SUM(ResolutionWidth + 66) AS resW66, SUM(ResolutionWidth + 67) AS resW67, SUM(ResolutionWidth + 68) AS resW68, SUM(ResolutionWidth + 69) AS resW69, SUM(ResolutionWidth + 70) AS resW70, SUM(ResolutionWidth + 71) AS resW71, SUM(ResolutionWidth + 72) AS resW72, SUM(ResolutionWidth + 73) AS resW73, SUM(ResolutionWidth + 74) AS resW74, SUM(ResolutionWidth + 75) AS resW75, SUM(ResolutionWidth + 76) AS resW76, SUM(ResolutionWidth + 77) AS resW77, SUM(ResolutionWidth + 78) AS resW78, SUM(ResolutionWidth + 79) AS resW79, SUM(ResolutionWidth + 80) AS resW80, SUM(ResolutionWidth + 81) AS resW81, SUM(ResolutionWidth + 82) AS resW82, SUM(ResolutionWidth + 83) AS resW83, SUM(ResolutionWidth + 84) AS resW84, SUM(ResolutionWidth + 85) AS resW85, SUM(ResolutionWidth + 86) AS resW86, SUM(ResolutionWidth + 87) AS resW87, SUM(ResolutionWidth + 88) AS resW88, SUM(ResolutionWidth + 89) AS resW89 FROM parquet_scan('s3://boilingdata-demo/hits.parquet');
SELECT SearchPhrase, MIN(URL) AS url, MIN(Title) AS title, COUNT(*) AS c, COUNT(DISTINCT UserID) AS usrIdCount FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE Title LIKE '%Google%' AND URL NOT LIKE '%.google.%' AND SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY c, searchphrase DESC LIMIT 10;
SELECT WatchID, ClientIP, COUNT(*) AS c, SUM(IsRefresh), AVG(ResolutionWidth) FROM parquet_scan('s3://boilingdata-demo/hits.parquet') GROUP BY WatchID, ClientIP ORDER BY c DESC, WatchID, ClientIP LIMIT 10;
SELECT URL, COUNT(*) AS c FROM parquet_scan('s3://boilingdata-demo/hits.parquet') GROUP BY URL ORDER BY c DESC LIMIT 12;
SELECT URL, COUNT(*) AS c FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE url ='http://kinopoisk.ru/perm.irr.ru' GROUP BY URL;
SELECT SUM(AdvEngineID), COUNT(*), AVG(ResolutionWidth) FROM parquet_scan('s3://boilingdata-demo/hits.parquet');
SELECT AVG(UserID) FROM parquet_scan('s3://boilingdata-demo/hits.parquet');
SELECT MIN(EventDate) AS minEventDate, MAX(EventDate) AS maxEventDate FROM parquet_scan('s3://boilingdata-demo/hits.parquet');
SELECT RegionID, SUM(AdvEngineID) AS sumAdvEngineId, COUNT(*) AS c, AVG(ResolutionWidth) AS avgResolutionWidth, COUNT(DISTINCT UserID) AS distinctUsers FROM parquet_scan('s3://boilingdata-demo/hits.parquet') GROUP BY RegionID ORDER BY c DESC LIMIT 10;
SELECT 1, URL, COUNT(*) AS c FROM parquet_scan('s3://boilingdata-demo/hits.parquet') GROUP BY 1, URL ORDER BY c DESC LIMIT 10;
SELECT URL, COUNT(*) AS PageViews FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND DontCountHits = 0 AND IsRefresh = 0 AND URL <> '' GROUP BY URL ORDER BY PageViews DESC LIMIT 10;
SELECT Title, COUNT(*) AS PageViews FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND DontCountHits = 0 AND IsRefresh = 0 AND Title <> '' GROUP BY Title ORDER BY PageViews DESC LIMIT 10;
SELECT URL, COUNT(*) AS PageViews FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND IsRefresh = 0 AND IsLink <> 0 AND IsDownload = 0 GROUP BY URL ORDER BY PageViews DESC LIMIT 10 OFFSET 1000;
SELECT TraficSourceID, SearchEngineID, AdvEngineID, CASE WHEN (SearchEngineID = 0 AND AdvEngineID = 0) THEN Referer ELSE '' END AS Src, URL AS Dst, COUNT(*) AS PageViews FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND IsRefresh = 0 GROUP BY TraficSourceID, SearchEngineID, AdvEngineID, Src, Dst ORDER BY PageViews DESC LIMIT 10 OFFSET 1000;
SELECT URLHash, EventDate, COUNT(*) AS PageViews FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND IsRefresh = 0 AND TraficSourceID IN (-1, 6) AND RefererHash = 3594120000172545465 GROUP BY URLHash, EventDate ORDER BY PageViews DESC LIMIT 10 OFFSET 100;
SELECT WindowClientWidth, WindowClientHeight, COUNT(*) AS PageViews FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND IsRefresh = 0 AND DontCountHits = 0 AND URLHash = 2868770270353813622 GROUP BY WindowClientWidth, WindowClientHeight ORDER BY PageViews DESC LIMIT 10 OFFSET 10000;
SELECT REGEXP_REPLACE(Referer, '^https?://(?:www.)?([^/]+)/.*$', '\x01') AS k, AVG(STRLEN(Referer)) AS l, COUNT(*) AS c, MIN(Referer) FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE Referer <> '' GROUP BY k HAVING COUNT(*) > 100000 ORDER BY l DESC, c LIMIT 25;
SELECT SearchEngineID, ClientIP, COUNT(*) AS c, SUM(IsRefresh) AS sumIsRefresh, AVG(ResolutionWidth) AS avgResWidth FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE SearchPhrase <> '' GROUP BY SearchEngineID, ClientIP ORDER BY c DESC, SearchEngineID, ClientIP LIMIT 10;
SELECT WatchID, ClientIP, COUNT(*) AS c, SUM(IsRefresh) AS sumIsRefresh, AVG(ResolutionWidth) AS AvgRsesWidth FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE SearchPhrase <> '' GROUP BY WatchID, ClientIP ORDER BY c, WatchID, ClientIP DESC LIMIT 10;
SELECT ClientIP, ClientIP - 1, ClientIP - 2, ClientIP - 3, COUNT(*) AS c FROM parquet_scan('s3://boilingdata-demo/hits.parquet') GROUP BY ClientIP, ClientIP - 1, ClientIP - 2, ClientIP - 3 ORDER BY c DESC, ClientIP LIMIT 11;
SELECT COUNT(*) AS count FROM ( SELECT SearchPhrase FROM parquet_scan('s3://boilingdata-demo/hits.parquet') GROUP BY SearchPhrase ) AS a ORDER BY count DESC;
SELECT DATE_TRUNC('minute', EventTime::timestamp) AS M, COUNT(*) AS PageViews FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE CounterID = 62 AND EventDate >= '2013-07-14' AND EventDate <= '2013-07-15' AND IsRefresh = 0 AND DontCountHits = 0 GROUP BY DATE_TRUNC('minute', EventTime::timestamp) ORDER BY DATE_TRUNC('minute', EventTime::timestamp) LIMIT 10 OFFSET 1000;
SELECT CounterID, AVG(STRLEN(URL)) AS l, COUNT(*) AS c FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE URL <> '' GROUP BY CounterID HAVING COUNT(*) > 100000 ORDER BY l DESC LIMIT 25;
SELECT * FROM parquet_scan('s3://boilingdata-demo/hits.parquet') WHERE URL LIKE '%google%' ORDER BY EventTime, WatchID LIMIT 1;

from clickbench.

alexey-milovidov avatar alexey-milovidov commented on May 19, 2024

Thank you! Yes, let's try these queries.
I can also try if I will not forget, but you can also submit the results.

from clickbench.

dforsber avatar dforsber commented on May 19, 2024

I've created "engineering white paper" with the test results. Link is on front page: https://www.boilingdata.com/

from clickbench.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.