Investigate why queries 41, 42 and 43 from clickbench are failing #1326

Closed
robert3005 opened this issue Nov 15, 2024 · 4 comments

Comments

@robert3005
Member

After #1304 we will have ClickBench benchmarks, but queries 41, 42 and 43 fail with misaligned reads.
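
For readers unfamiliar with the term, here is a minimal, generic sketch of what a misaligned read is. It is not taken from this project's code, and the thread does not say where the failing reads occur. The helper names (`address_of`, `is_aligned`, `read_u64_le`) and the 8-byte alignment are assumptions chosen for illustration only.

```python
import ctypes
import struct

ALIGNMENT = 8  # assumed alignment for an 8-byte integer; illustrative only


def address_of(buf: bytearray, offset: int = 0) -> int:
    """Return the memory address of buf[offset]."""
    return ctypes.addressof(ctypes.c_char.from_buffer(buf, offset))


def is_aligned(buf: bytearray, offset: int, alignment: int = ALIGNMENT) -> bool:
    """True when buf[offset] sits at an address that is a multiple of alignment."""
    return address_of(buf, offset) % alignment == 0


def read_u64_le(buf: bytearray, offset: int) -> int:
    """Decode a little-endian u64 by copying bytes, which is valid at any offset."""
    return struct.unpack_from("<Q", buf, offset)[0]


buf = bytearray(32)
# First offset into buf whose address is a multiple of ALIGNMENT.
base = (-address_of(buf)) % ALIGNMENT
# Store a value one byte past the aligned slot, i.e. at a misaligned position.
struct.pack_into("<Q", buf, base + 1, 62)

print(is_aligned(buf, base))       # the aligned slot
print(is_aligned(buf, base + 1))   # one byte off
print(read_u64_le(buf, base + 1))  # the copying read still decodes the value
```

A reader that reinterprets bytes in place as a typed array typically needs the start address to pass a check like `is_aligned`. A reader that copies, as `read_u64_le` does, has no such requirement.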

@AdamGS
Contributor

AdamGS commented Nov 15, 2024

The queries are:

SELECT URLHash, EventDate, COUNT(*) AS PageViews FROM hits WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND IsRefresh = 0 AND TraficSourceID IN (-1, 6) AND RefererHash = 3594120000172545465 GROUP BY URLHash, EventDate ORDER BY PageViews DESC LIMIT 10 OFFSET 100;
SELECT WindowClientWidth, WindowClientHeight, COUNT(*) AS PageViews FROM hits WHERE CounterID = 62 AND EventDate >= '2013-07-01' AND EventDate <= '2013-07-31' AND IsRefresh = 0 AND DontCountHits = 0 AND URLHash = 2868770270353813622 GROUP BY WindowClientWidth, WindowClientHeight ORDER BY PageViews DESC LIMIT 10 OFFSET 10000;
SELECT DATE_TRUNC('minute', EventTime) AS M, COUNT(*) AS PageViews FROM hits WHERE CounterID = 62 AND EventDate >= '2013-07-14' AND EventDate <= '2013-07-15' AND IsRefresh = 0 AND DontCountHits = 0 GROUP BY DATE_TRUNC('minute', EventTime) ORDER BY DATE_TRUNC('minute', EventTime) LIMIT 10 OFFSET 1000;

@robert3005
Member Author

#1346 fixed part of the issue, but the queries are still failing on develop.

@robert3005
Member Author

#1365 fixed some more of it, but the queries are still not working.

@robert3005
Member Author

#1367, #1368, #1369 need to be fixed as well

gatesn added a commit that referenced this issue Dec 2, 2024
Adds a [clickbench](https://github.com/ClickHouse/ClickBench) run to our
datafusion benchmarks. ~Some queries fail, see #1326.~
Running this is pretty resource intensive as the dataset is ~100m
rows/15GB of parquet.

---------

Co-authored-by: Robert Kruszewski <[email protected]>
Co-authored-by: Dan King <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Andrew Duffy <[email protected]>
Co-authored-by: Will Manning <[email protected]>
Co-authored-by: Nicholas Gates <[email protected]>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>