[Feature] Add Full Iterator Pattern to BanyanDB's Query Pipeline #12913

Closed
2 of 3 tasks
hanahmily opened this issue Dec 31, 2024 · 0 comments · Fixed by apache/skywalking-banyandb#597
Labels
database BanyanDB - SkyWalking native database feature New feature

Comments

@hanahmily
Contributor

Search before asking

  • I have searched the issues and found no similar feature request.

Description

BanyanDB's query pipeline currently uses the iterator pattern for sorting, aggregating, and limiting data. However, the initial stage of the pipeline, raw data retrieval, loads all data in the segments into memory. This can lead to excessive memory usage, especially for heavy aggregation queries such as retrieving the top 10 items ordered by a tag over a large time range (e.g., "last month").

To address this, we propose extending the iterator pattern to the initial raw data retrieval step. Streaming data from segments on demand, rather than loading all segment data into memory at once, can significantly reduce memory consumption.
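
As a rough illustration of the idea, the Go sketch below shows a pull-based iterator feeding a bounded top-N aggregation. It is not BanyanDB's actual code: `Row`, `Segment`, `Iterator`, `segmentIterator`, and `topN` are hypothetical names invented for this example, and in-memory slices stand in for on-disk segments that a real implementation would read lazily. The point it tries to show is that the consumer never holds more than N rows, however many rows the segments contain.

```go
package main

import (
	"container/heap"
	"fmt"
	"sort"
)

// Row is a hypothetical data point: a timestamp plus the tag value used for ordering.
type Row struct {
	Timestamp int64
	Value     int64
}

// Iterator is a hypothetical pull-based interface for the retrieval stage.
type Iterator interface {
	Next() bool
	Current() Row
}

// Segment is a stand-in for an on-disk segment (assumption: a real one reads blocks lazily).
type Segment struct {
	rows []Row
}

// segmentIterator yields rows one at a time, moving to the next segment only when needed.
type segmentIterator struct {
	segments []Segment
	segIdx   int
	rowIdx   int
	cur      Row
}

func newSegmentIterator(segments []Segment) *segmentIterator {
	return &segmentIterator{segments: segments}
}

// Next advances to the next row, skipping empty or exhausted segments.
func (it *segmentIterator) Next() bool {
	for it.segIdx < len(it.segments) {
		rows := it.segments[it.segIdx].rows
		if it.rowIdx < len(rows) {
			it.cur = rows[it.rowIdx]
			it.rowIdx++
			return true
		}
		it.segIdx++
		it.rowIdx = 0
	}
	return false
}

func (it *segmentIterator) Current() Row { return it.cur }

// minHeap keeps the smallest Value at the root so it can be evicted first.
type minHeap []Row

func (h minHeap) Len() int           { return len(h) }
func (h minHeap) Less(i, j int) bool { return h[i].Value < h[j].Value }
func (h minHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x any)        { *h = append(*h, x.(Row)) }
func (h *minHeap) Pop() any {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// topN consumes the iterator and retains at most n rows at any time.
func topN(it Iterator, n int) []Row {
	h := &minHeap{}
	for it.Next() {
		r := it.Current()
		if h.Len() < n {
			heap.Push(h, r)
		} else if n > 0 && r.Value > (*h)[0].Value {
			(*h)[0] = r // replace the current minimum
			heap.Fix(h, 0)
		}
	}
	out := append([]Row(nil), (*h)...)
	sort.Slice(out, func(i, j int) bool { return out[i].Value > out[j].Value })
	return out
}

func main() {
	segments := []Segment{
		{rows: []Row{{1, 5}, {2, 9}}},
		{rows: []Row{{3, 1}, {4, 7}}},
	}
	for _, r := range topN(newSegmentIterator(segments), 2) {
		fmt.Println(r.Timestamp, r.Value)
	}
}
```

With this shape, the sorting, aggregating, and limiting stages that already use iterators could pull from the retrieval stage directly, so peak memory would depend on N and the size of one read unit rather than on the queried time range.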

Use case

No response

Related issues

No response

Are you willing to submit a pull request to implement this on your own?

  • Yes I am willing to submit a pull request on my own!

Code of Conduct
