[Feature] Add Full Iterator Pattern to BanyanDB's Query Pipeline #12913

hanahmily · 2024-12-31T01:46:01Z

Search before asking

I have searched the issues and found no similar feature request.

Description

BanyanDB's query pipeline currently uses the iterator pattern for sorting, aggregating, and limiting data. However, in the first stage of the pipeline (raw data retrieval), all data in the segments is loaded into memory. This can lead to excessive memory usage, especially for heavy aggregation queries, such as retrieving the top 10 items ordered by a tag over a large time range (e.g., "last month").

We propose extending the iterator pattern to the initial raw data retrieval step to address this issue. Streaming data from segments on demand, rather than loading all segment data into memory at once, should significantly reduce memory consumption.
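To illustrate the idea, here is a minimal sketch in Go. Every name in it (`Row`, `Iterator`, `segmentIter`, `chainIter`, `topN`) is hypothetical and is not BanyanDB's actual API; the in-memory slices only stand in for segment readers. It assumes a pull-based iterator over segments feeding a top-N step backed by a bounded min-heap, so that the top-N step itself holds at most N rows at a time.

```go
package main

import (
	"container/heap"
	"fmt"
	"sort"
)

// Row is a hypothetical data point; Tag is the value the query orders by.
type Row struct {
	ID  string
	Tag int64
}

// Iterator is a hypothetical pull-based interface for the retrieval stage.
type Iterator interface {
	Next() bool
	Val() Row
}

// segmentIter stands in for a reader over one segment.
type segmentIter struct {
	rows []Row
	pos  int
}

func (s *segmentIter) Next() bool {
	if s.pos >= len(s.rows) {
		return false
	}
	s.pos++
	return true
}

func (s *segmentIter) Val() Row { return s.rows[s.pos-1] }

// chainIter walks segments one after another, pulling rows on demand.
type chainIter struct {
	segs []Iterator
	cur  int
}

func (c *chainIter) Next() bool {
	for c.cur < len(c.segs) {
		if c.segs[c.cur].Next() {
			return true
		}
		c.cur++ // current segment exhausted, move to the next one
	}
	return false
}

func (c *chainIter) Val() Row { return c.segs[c.cur].Val() }

// minHeap keeps the row with the smallest Tag at the root.
type minHeap []Row

func (h minHeap) Len() int           { return len(h) }
func (h minHeap) Less(i, j int) bool { return h[i].Tag < h[j].Tag }
func (h minHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x any)        { *h = append(*h, x.(Row)) }
func (h *minHeap) Pop() any {
	old := *h
	x := old[len(old)-1]
	*h = old[:len(old)-1]
	return x
}

// topN consumes the iterator and returns the n rows with the largest Tag,
// in descending order, holding at most n rows in the heap.
func topN(it Iterator, n int) []Row {
	h := &minHeap{}
	for it.Next() {
		r := it.Val()
		if h.Len() < n {
			heap.Push(h, r)
		} else if n > 0 && r.Tag > (*h)[0].Tag {
			(*h)[0] = r // replace the current minimum
			heap.Fix(h, 0)
		}
	}
	out := append([]Row(nil), (*h)...)
	sort.Slice(out, func(i, j int) bool { return out[i].Tag > out[j].Tag })
	return out
}

func main() {
	it := &chainIter{segs: []Iterator{
		&segmentIter{rows: []Row{{"a", 3}, {"b", 9}}},
		&segmentIter{rows: []Row{{"c", 7}, {"d", 1}}},
	}}
	fmt.Println(topN(it, 2)) // prints [{b 9} {c 7}]
}
```

In this sketch the top-N step never needs more than N rows at once; whether the real retrieval layer can stream row by row or only in batches is a separate design question that the sketch does not settle.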

Use case

No response

Related issues

No response

Are you willing to submit a pull request to implement this on your own?

Yes, I am willing to submit a pull request on my own!

Code of Conduct

I agree to follow this project's Code of Conduct

hanahmily added the "feature" (New feature) and "database" (BanyanDB - SkyWalking native database) labels Dec 31, 2024

hanahmily modified the milestones: BanyanDB-0.9.0, BanyanDB - 0.8.0 Dec 31, 2024

hanahmily mentioned this issue Jan 18, 2025

Introduce the batch scan to improve the performance of the query and limit the memory usage apache/skywalking-banyandb#597

Merged

2 tasks

hanahmily closed this as completed in apache/skywalking-banyandb#597 Jan 18, 2025
