feat: metrics engine cache #1624

Open
wants to merge 9 commits into
base: main

zealchen
Contributor

Rationale

#1623

Detailed Changes

  1. Sequence diagram (image not included here).
  2. Implement three caches: MetricsCache, SeriesCache, and TagIndexCache, each with an asynchronous serialization function.
  3. Implement async write to storage in batch mode.
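
For readers unfamiliar with the batch-write idea, here is a minimal sketch of the pattern. It is not the code in this PR: it swaps the async runtime for a plain `std` thread and channel so it stays self-contained, and `spawn_batch_writer`, `batch_size`, and the `flush` callback are hypothetical names chosen for illustration.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Spawns a background writer that groups incoming items into batches of
/// `batch_size` and hands each full batch to `flush`.
/// Assumption: `flush` stands in for the real storage write.
pub fn spawn_batch_writer<T, F>(batch_size: usize, mut flush: F) -> (Sender<T>, JoinHandle<()>)
where
    T: Send + 'static,
    F: FnMut(Vec<T>) + Send + 'static,
{
    let (tx, rx) = mpsc::channel::<T>();
    let handle = thread::spawn(move || {
        let mut batch = Vec::with_capacity(batch_size);
        // The loop ends once every sender has been dropped.
        for item in rx {
            batch.push(item);
            if batch.len() >= batch_size {
                // Hand over the full batch and start a fresh one.
                flush(std::mem::replace(&mut batch, Vec::with_capacity(batch_size)));
            }
        }
        // Flush whatever is left when the channel closes.
        if !batch.is_empty() {
            flush(batch);
        }
    });
    (tx, handle)
}
```

The caller only pays for a channel send on the write path; the storage write happens on the background thread. A real implementation would likely also flush on a timer so a partially filled batch does not wait indefinitely.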

TODO:

  1. Some parameters need to be made configurable.

Test Plan

UT

running 1 test
test index::cache::tests::test_cache_manager_updates ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 3.05s

     Running unittests src/lib.rs (target/debug/deps/pb_types-2fb383c00286addd)

@zealchen zealchen changed the title from "Feat metrics engine cache" to "feat: metrics engine cache" Jan 25, 2025
@github-actions github-actions bot added the feature New feature or request label Jan 25, 2025
@zealchen zealchen requested a review from jiacai2050 January 25, 2025 08:26
let metric_ids = samples
.iter()
.map(|s| MetricId(hash(s.name.as_slice())))
.collect::<Vec<_>>();

We should avoid creating a new Vec in the write path; the extra allocation will hurt performance.
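
One possible shape for this, as a rough sketch: return a lazy iterator and let the consumer walk it, so no intermediate `Vec` is allocated. `Sample`, `MetricId`, and `hash` below are simplified stand-ins, not the PR's actual definitions, and `metric_ids` is a hypothetical helper name.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for the PR's metric id type (assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct MetricId(pub u64);

/// Stand-in for the PR's sample type; only the name is modeled here.
pub struct Sample {
    pub name: Vec<u8>,
}

/// Placeholder hash; the real `hash` function is not shown in this thread.
fn hash(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// Yields metric ids lazily instead of collecting them,
/// so the ids themselves cost no heap allocation.
pub fn metric_ids(samples: &[Sample]) -> impl Iterator<Item = MetricId> + '_ {
    samples.iter().map(|s| MetricId(hash(s.name.as_slice())))
}
```

If the ids are needed more than once, the iterator can be recreated cheaply from `samples`, or the ids can be written into a buffer that is reused across calls.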

});

// 2.1 update cache metrics
futures::future::join_all(

We should do this in the MetricManager module.

}
}

pub struct CacheManager {

The index cache manager should only cache series; metrics will be managed by the metrics manager.

In this way, our code is more modular.

}

struct TagIndexCache {
cache: DashMap<SegmentDuration, ConcurrentTagKVMap>,

Two thoughts here:

  1. We should prefer the std maps over DashMap; third-party crates like this may introduce unexpected bugs.
  2. For the current segment's cache, we can use a dedicated field, which saves one hashmap lookup.

Other questions we need to consider:

  1. How will you evict segments?
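
To make the two suggestions and the eviction question concrete, here is a minimal sketch under several assumptions: segments are identified by their start timestamp, `TagKVMap` is a plain map from tag key to a set of values, and eviction simply drops the oldest sealed segment once more than `max_sealed` are held. None of these names or policies come from the PR.

```rust
use std::collections::{BTreeMap, HashMap, HashSet};
use std::sync::RwLock;

/// Hypothetical segment key: the segment's start timestamp.
pub type SegmentStart = i64;
/// Hypothetical tag index: tag key -> set of tag values.
pub type TagKVMap = HashMap<String, HashSet<String>>;

struct Inner {
    /// Segment currently receiving writes, kept in its own field
    /// so the hot path skips the map lookup.
    current: (SegmentStart, TagKVMap),
    /// Older segments, ordered by start time so the oldest is first.
    sealed: BTreeMap<SegmentStart, TagKVMap>,
}

pub struct TagIndexCache {
    inner: RwLock<Inner>,
    max_sealed: usize,
}

impl TagIndexCache {
    pub fn new(start: SegmentStart, max_sealed: usize) -> Self {
        let inner = Inner { current: (start, TagKVMap::new()), sealed: BTreeMap::new() };
        Self { inner: RwLock::new(inner), max_sealed }
    }

    pub fn insert(&self, segment: SegmentStart, key: &str, value: &str) {
        let mut guard = self.inner.write().unwrap();
        let inner = &mut *guard;
        if segment > inner.current.0 {
            // Roll over: seal the old current segment, then evict the oldest ones.
            let old = std::mem::replace(&mut inner.current, (segment, TagKVMap::new()));
            inner.sealed.insert(old.0, old.1);
            while inner.sealed.len() > self.max_sealed {
                inner.sealed.pop_first();
            }
        }
        let map = if segment == inner.current.0 {
            &mut inner.current.1
        } else {
            match inner.sealed.get_mut(&segment) {
                Some(m) => m,
                None => return, // late write to an evicted or unknown segment: dropped
            }
        };
        map.entry(key.to_string()).or_default().insert(value.to_string());
    }

    pub fn contains(&self, segment: SegmentStart, key: &str, value: &str) -> bool {
        let inner = self.inner.read().unwrap();
        let map = if segment == inner.current.0 {
            Some(&inner.current.1)
        } else {
            inner.sealed.get(&segment)
        };
        let found = map.and_then(|m| m.get(key)).map_or(false, |vs| vs.contains(value));
        found
    }

    pub fn sealed_len(&self) -> usize {
        self.inner.read().unwrap().sealed.len()
    }
}
```

A single `RwLock` around std collections is coarser than DashMap's sharding, so whether it is fast enough would need to be measured; the point here is only that the current segment is reachable without a lookup and that eviction has an explicit, ordered policy.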

))
}

async fn load_from_storage(&mut self) -> Result<()> {

The cache module shouldn't know how the persistence layer is implemented; we can move this method to IndexManager.
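
A minimal sketch of that split, with the caveat that every name except `IndexManager` and `load_from_storage` is made up for illustration, and the method is synchronous here only to keep the example free of an async runtime:

```rust
use std::collections::HashMap;

/// Pure in-memory cache: it has no knowledge of where its data comes from.
#[derive(Default)]
pub struct SeriesCache {
    series: HashMap<u64, Vec<u8>>,
}

impl SeriesCache {
    pub fn insert(&mut self, id: u64, key: Vec<u8>) {
        self.series.insert(id, key);
    }

    pub fn get(&self, id: u64) -> Option<&[u8]> {
        self.series.get(&id).map(|v| v.as_slice())
    }
}

/// Hypothetical storage abstraction owned by the manager, not the cache.
pub trait IndexStorage {
    fn scan_series(&self) -> Vec<(u64, Vec<u8>)>;
}

/// The manager owns both sides and is the only place that connects them.
pub struct IndexManager<S: IndexStorage> {
    storage: S,
    cache: SeriesCache,
}

impl<S: IndexStorage> IndexManager<S> {
    pub fn new(storage: S) -> Self {
        Self { storage, cache: SeriesCache::default() }
    }

    /// Reads persisted series and fills the cache; returns how many were loaded.
    pub fn load_from_storage(&mut self) -> usize {
        let rows = self.storage.scan_series();
        let loaded = rows.len();
        for (id, key) in rows {
            self.cache.insert(id, key);
        }
        loaded
    }

    pub fn cache(&self) -> &SeriesCache {
        &self.cache
    }
}

/// In-memory stand-in for a real store, useful in unit tests.
pub struct MemStorage(pub Vec<(u64, Vec<u8>)>);

impl IndexStorage for MemStorage {
    fn scan_series(&self) -> Vec<(u64, Vec<u8>)> {
        self.0.clone()
    }
}
```

With this layout the cache can be unit tested without any storage, and the storage backend can change without touching the cache module.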

Ok(())
}

fn schema() -> Arc<Schema> {

The cache module has no schema; this belongs to IndexManager.

Labels
feature New feature or request

2 participants