From 329766e79f305539ea37849291724f731cd81684 Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Tue, 8 Oct 2024 15:47:04 -0400 Subject: [PATCH 01/22] adapter: Expression cache design doc This commit adds a design doc for an optimized expression cache. Works towards resolving #MaterializeInc/database-issues/issues/8384 --- .../design/2024_10_08_expression_cache.md | 135 ++++++++++++++++++ 1 file changed, 135 insertions(+) create mode 100644 doc/developer/design/2024_10_08_expression_cache.md diff --git a/doc/developer/design/2024_10_08_expression_cache.md b/doc/developer/design/2024_10_08_expression_cache.md new file mode 100644 index 0000000000000..e4b05c1f67b42 --- /dev/null +++ b/doc/developer/design/2024_10_08_expression_cache.md @@ -0,0 +1,135 @@ +# Expression Cache + +## The Problem + +Optimization is a slow process that is on the critical path for `environmentd` startup times. Some +environments spend 30 seconds just on optimization. All optimization time spent in startup is +experienced downtime for users when `environmentd` restarts. + +## Success Criteria + +Startup spends less than 1 second optimizing expressions. + +## Solution Proposal + +The solution being proposed in this document is a cache of optimized expressions. During startup, +`environmentd` will first look in the cache for optimized expressions and only compute a new +expression if it isn't present in the cache. If enough expressions are cached and the cache is fast +enough, then the time spent on this part of startup should be small. + +The cache will present similarly as a key-value value store where the key is a composite of + + - deployment generation + - object global ID + - expression type (local MIR, global MIR, LIR, etc) + +The value will be a serialized version of the optimized expression. The cache will also be made +durable so that it's available after a restart, at least within the same deployment generation. + +Upgrading an environment will look something like this: + +1. 
Start deploy generation `n` in read-only mode. +2. Populate the expression cache for generation `n`. +3. Start deploy generation `n` in read-write mode. +4. Read optimized expressions from cache. + +Restarting an environment will look something like this: + +1. Start deploy generation `n` in read-write mode. +2. Read optimized expressions from cache. + +### Cache API + +Below is the API that the cache will present. It may be further wrapped with typed methods that +take care of serializing and deserializing bytes. Additionally, we probably don't need a trait when +implementing. + +```Rust +trait ExpressionCache { + /// Returns the `expression_type` of `global_id` that is currently deployed in a cluster. This + /// will not change in-between restarts as result of DDL, as long as `global_id` exists. + fn get_deployed_expression(&self, global_id: GlobalId, expression_type: ExpressionType) -> Option; + + /// Returns the `expression_type` of `global_id` based on the current catalog contents of + /// `deploy_generation`. This may change in-between restarts as result of DDL. + fn get_durable_expression(&self, deploy_generation: u64, global_id: GlobalId, expression_type: ExpressionType) -> Option; + + /// Durably inserts `expression`, with key `(deploy_generation, global_id, expression_type)`. + /// + /// Panics if `(deploy_generation, global_id, expression_type)` already exists. + fn insert_expression(&mut self, deploy_generation: u64, global_id: GlobalId, expression_type: ExpressionType, expression: Bytes); + + /// Durably remove and return all entries in `deploy_generation` that depend on an ID in + /// `dropped_ids`. + fn invalidate_entries(&mut self, deploy_generation: u64, dropped_ids: BTreeSet) -> Vec<(GlobalId, ExpressionType)>; + + /// Durably removes all entries in `deploy_generation`. + fn remove_deploy_generation(&mut self, deploy_generation: u64); + + /// Remove all entries that depend on a global ID that is not present in `txn`. 
+ fn reconcile(&mut self, txn: mz_catalog::durable::Transaction); +} +``` + +### Startup + +Below is a detailed set of steps that will happen in startup. + +1. Call `ExpressionCache::reconcile` to remove any invalid entries. +2. While opening the catalog, for each object: + a. If the object is present in the cache, read the cached optimized expression via + `ExpressionCache::get_durable_expression`. + b. Else generate the optimized expressions and insert the expressions via + `ExpressionCache::insert_expression`. +3. If in read-write mode, call `ExpressionCache::remove_deploy_generation` to remove the previous + deploy generation. + +### DDL - Create + +1. Execute catalog transaction. +2. Update cache via `ExpressionCache::insert_expression`. + +### DDL - Drop +1. Execute catalog transaction. +2. Invalidate cache entries via `ExpressionCache::invalidate_entries`. +3. Re-compute and repopulate cache entries that depended on dropped entries via + `ExpressionCache::insert_expression`. + +### File System Implementation + +One potential implementation is via the filesystem of an attached durable storage to environmentd. +Each cache entry would be saved as a file of the format +`/path/to/cache///`. + +#### Pros +- No need to worry about coordination across K8s pods. +- Bulk deletion is a simple directory delete. + +#### Cons +- Need to worry about Flushing/fsync. +- Need to worry about concurrency. +- Need to worry about atomicity. +- Need to worry about mocking things in memory for tests. +- If we lose the pod, then we also lose the cache. + +### Persist implementation + +Another potential implementation is via persist. Each cache entry would be keyed by +`(deploy_generation, global_id, expression_type)` and the value would be a serialized version of the +expression. + +#### Pros +- Flushing, concurrency, atomicity, mocking are already implemented by persist. + +#### Cons +- We need to worry about coordinating access across multiple pods. 
It's expected that during + upgrades at least two `environmentd`s will be communicating with the cache. + +## Open questions + +- Which implementation should we use? +- If we use the persist implementation, how do we coordinate writes across pods? + - I haven't thought much about this, but here's one idea. The cache will maintain a subscribe on + the persist shard. Everytime it experiences an upper mismatch, it will listen for all new + changes. If any of the changes contain the current deploy generation, then panic, else ignore + them. From d371cdfb7b92ce757c1eb56db4aebbe8cda33ac5 Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Tue, 8 Oct 2024 17:34:57 -0400 Subject: [PATCH 02/22] Fix lint --- doc/developer/design/2024_10_08_expression_cache.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/doc/developer/design/2024_10_08_expression_cache.md b/doc/developer/design/2024_10_08_expression_cache.md index e4b05c1f67b42..624c2d5837c93 100644 --- a/doc/developer/design/2024_10_08_expression_cache.md +++ b/doc/developer/design/2024_10_08_expression_cache.md @@ -62,10 +62,10 @@ trait ExpressionCache { /// Durably remove and return all entries in `deploy_generation` that depend on an ID in /// `dropped_ids`. fn invalidate_entries(&mut self, deploy_generation: u64, dropped_ids: BTreeSet) -> Vec<(GlobalId, ExpressionType)>; - + /// Durably removes all entries in `deploy_generation`. fn remove_deploy_generation(&mut self, deploy_generation: u64); - + /// Remove all entries that depend on a global ID that is not present in `txn`. fn reconcile(&mut self, txn: mz_catalog::durable::Transaction); } @@ -81,23 +81,23 @@ Below is a detailed set of steps that will happen in startup. `ExpressionCache::get_durable_expression`. b. Else generate the optimized expressions and insert the expressions via `ExpressionCache::insert_expression`. -3. If in read-write mode, call `ExpressionCache::remove_deploy_generation` to remove the previous +3. 
If in read-write mode, call `ExpressionCache::remove_deploy_generation` to remove the previous deploy generation. ### DDL - Create 1. Execute catalog transaction. -2. Update cache via `ExpressionCache::insert_expression`. +2. Update cache via `ExpressionCache::insert_expression`. ### DDL - Drop 1. Execute catalog transaction. 2. Invalidate cache entries via `ExpressionCache::invalidate_entries`. -3. Re-compute and repopulate cache entries that depended on dropped entries via +3. Re-compute and repopulate cache entries that depended on dropped entries via `ExpressionCache::insert_expression`. ### File System Implementation -One potential implementation is via the filesystem of an attached durable storage to environmentd. +One potential implementation is via the filesystem of an attached durable storage to `environmentd`. Each cache entry would be saved as a file of the format `/path/to/cache///`. From 90a3a5b5ce63c289784b9a4ff27770b7a652886d Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Tue, 8 Oct 2024 17:36:49 -0400 Subject: [PATCH 03/22] Add compatibility guarantees --- doc/developer/design/2024_10_08_expression_cache.md | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/doc/developer/design/2024_10_08_expression_cache.md b/doc/developer/design/2024_10_08_expression_cache.md index 624c2d5837c93..f44ee6b29505b 100644 --- a/doc/developer/design/2024_10_08_expression_cache.md +++ b/doc/developer/design/2024_10_08_expression_cache.md @@ -23,8 +23,11 @@ The cache will present similarly as a key-value value store where the key is a c - object global ID - expression type (local MIR, global MIR, LIR, etc) -The value will be a serialized version of the optimized expression. The cache will also be made -durable so that it's available after a restart, at least within the same deployment generation. +The value will be a serialized version of the optimized expression. 
An `environmentd` process with +deploy generation `n`, will never be expected to look at a serialized expression with a deploy +generation `m` s.t. `n != m`. Therefore, there are no forwards or backwards compatibility needed on +the serialized representation of expressions. The cache will also be made durable so that it's +available after a restart, at least within the same deployment generation. Upgrading an environment will look something like this: From b7d4aae777f85329a4607259d1ff01df403455b3 Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Wed, 9 Oct 2024 09:38:32 -0400 Subject: [PATCH 04/22] Fix lint --- doc/developer/design/2024_10_08_expression_cache.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/developer/design/2024_10_08_expression_cache.md b/doc/developer/design/2024_10_08_expression_cache.md index f44ee6b29505b..57a4550b86c0a 100644 --- a/doc/developer/design/2024_10_08_expression_cache.md +++ b/doc/developer/design/2024_10_08_expression_cache.md @@ -23,8 +23,8 @@ The cache will present similarly as a key-value value store where the key is a c - object global ID - expression type (local MIR, global MIR, LIR, etc) -The value will be a serialized version of the optimized expression. An `environmentd` process with -deploy generation `n`, will never be expected to look at a serialized expression with a deploy +The value will be a serialized version of the optimized expression. An `environmentd` process with +deploy generation `n`, will never be expected to look at a serialized expression with a deploy generation `m` s.t. `n != m`. Therefore, there are no forwards or backwards compatibility needed on the serialized representation of expressions. The cache will also be made durable so that it's available after a restart, at least within the same deployment generation. 
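The compatibility argument above can be illustrated with a minimal in-memory sketch. It assumes the three-part key `(deploy_generation, global_id, expression_type)` described at this point in the series. Everything here is hypothetical and for illustration only: `ExpressionCacheSketch`, the `GlobalId` alias, the `ExpressionType` variants, and the `reopen` helper are stand-ins, not the real Materialize types or the proposed API, and a `BTreeMap` stands in for the durable store.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the real `GlobalId` type.
type GlobalId = u64;

/// Hypothetical stand-in for the expression types named in the doc.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum ExpressionType {
    LocalMir,
    GlobalMir,
    Lir,
}

/// In-memory sketch of a cache handle bound to one deploy generation.
/// The map plays the role of the durable store (an assumption for brevity).
struct ExpressionCacheSketch {
    deploy_generation: u64,
    entries: BTreeMap<(u64, GlobalId, ExpressionType), Vec<u8>>,
}

impl ExpressionCacheSketch {
    /// Creates an empty cache for `deploy_generation`.
    fn new(deploy_generation: u64) -> Self {
        Self {
            deploy_generation,
            entries: BTreeMap::new(),
        }
    }

    /// Simulates a different `environmentd` generation opening the same store.
    fn reopen(self, deploy_generation: u64) -> Self {
        Self {
            deploy_generation,
            entries: self.entries,
        }
    }

    /// Inserts serialized bytes under this handle's own generation.
    /// Panics if the key already exists, mirroring the doc's insert contract.
    fn insert(&mut self, id: GlobalId, ty: ExpressionType, bytes: Vec<u8>) {
        let key = (self.deploy_generation, id, ty);
        let previous = self.entries.insert(key, bytes);
        assert!(previous.is_none(), "cache entry already exists");
    }

    /// Looks up an entry, but only within this handle's own generation, so
    /// bytes written by generation `m` are never decoded by generation `n`.
    fn get(&self, id: GlobalId, ty: ExpressionType) -> Option<&[u8]> {
        self.entries
            .get(&(self.deploy_generation, id, ty))
            .map(|bytes| bytes.as_slice())
    }
}
```

Because every lookup is prefixed with the handle's own generation, an entry written by generation 1 is simply a cache miss for generation 2, which is why the serialized format needs no cross-version compatibility in this sketch.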
From 1cf7925b29d6fd3d6e121d7128eb00d2d5296660 Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Wed, 9 Oct 2024 10:13:38 -0400 Subject: [PATCH 05/22] Expand on persist cons and alternatives --- doc/developer/design/2024_10_08_expression_cache.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/doc/developer/design/2024_10_08_expression_cache.md b/doc/developer/design/2024_10_08_expression_cache.md index 57a4550b86c0a..635d28433e2c1 100644 --- a/doc/developer/design/2024_10_08_expression_cache.md +++ b/doc/developer/design/2024_10_08_expression_cache.md @@ -127,6 +127,12 @@ expression. #### Cons - We need to worry about coordinating access across multiple pods. It's expected that during upgrades at least two `environmentd`s will be communicating with the cache. +- We need to worry about compaction and read latency during startup. + +## Alternatives + +For the persist implementation, we could mint a new shard for each deploy generation. This would +require us to finalize old shards during startup which would accumulate shard tombstones in CRDB. ## Open questions From 677bcfc4c339bc72956cc0a047e70c0c7a8e2e26 Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Wed, 9 Oct 2024 10:14:38 -0400 Subject: [PATCH 06/22] Update --- doc/developer/design/2024_10_08_expression_cache.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/developer/design/2024_10_08_expression_cache.md b/doc/developer/design/2024_10_08_expression_cache.md index 635d28433e2c1..41398740364b8 100644 --- a/doc/developer/design/2024_10_08_expression_cache.md +++ b/doc/developer/design/2024_10_08_expression_cache.md @@ -66,8 +66,8 @@ trait ExpressionCache { /// `dropped_ids`. fn invalidate_entries(&mut self, deploy_generation: u64, dropped_ids: BTreeSet) -> Vec<(GlobalId, ExpressionType)>; - /// Durably removes all entries in `deploy_generation`. 
- fn remove_deploy_generation(&mut self, deploy_generation: u64); + /// Durably removes all entries with a deploy generation <= `deploy_generation`. + fn remove_deploy_generations(&mut self, deploy_generation: u64); /// Remove all entries that depend on a global ID that is not present in `txn`. fn reconcile(&mut self, txn: mz_catalog::durable::Transaction); From a7ff06120aa5eb6d30f956d76b95853306f028f8 Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Wed, 9 Oct 2024 10:24:11 -0400 Subject: [PATCH 07/22] Add rationale for --- .../design/2024_10_08_expression_cache.md | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/doc/developer/design/2024_10_08_expression_cache.md b/doc/developer/design/2024_10_08_expression_cache.md index 41398740364b8..25a86aa187d50 100644 --- a/doc/developer/design/2024_10_08_expression_cache.md +++ b/doc/developer/design/2024_10_08_expression_cache.md @@ -41,6 +41,19 @@ Restarting an environment will look something like this: 1. Start deploy generation `n` in read-write mode. 2. Read optimized expressions from cache. +### Prior Art + +The catalog currently has an in-memory expression cache. + + - [https://github.com/MaterializeInc/materialize/blob/bff231953f4bb97b70cae81bdd6dd1716dbf8cec/src/adapter/src/catalog.rs#L127](https://github.com/MaterializeInc/materialize/blob/bff231953f4bb97b70cae81bdd6dd1716dbf8cec/src/adapter/src/catalog.rs#L127) + - [https://github.com/MaterializeInc/materialize/blob/bff231953f4bb97b70cae81bdd6dd1716dbf8cec/src/adapter/src/catalog.rs#L145-L345](https://github.com/MaterializeInc/materialize/blob/bff231953f4bb97b70cae81bdd6dd1716dbf8cec/src/adapter/src/catalog.rs#L145-L345) + +This cache is used to serve `EXPLAIN` queries to ensure accurate and consistent responses. When an +index is dropped, it may change how an object _would_ be optimized, but it does not change how the +object is currently deployed in a cluster. 
This cache contains the expressions that are deployed in +a cluster, but not necessarily the expressions that would result from optimization from the current +catalog contents. + ### Cache API Below is the API that the cache will present. It may be further wrapped with typed methods that @@ -51,6 +64,8 @@ implementing. trait ExpressionCache { /// Returns the `expression_type` of `global_id` that is currently deployed in a cluster. This /// will not change in-between restarts as result of DDL, as long as `global_id` exists. + /// + /// This is useful for serving `EXPLAIN` queries. fn get_deployed_expression(&self, global_id: GlobalId, expression_type: ExpressionType) -> Option; /// Returns the `expression_type` of `global_id` based on the current catalog contents of From d5bc28cf82c54a6e4b0b9c38cbe3257900529628 Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Wed, 9 Oct 2024 10:38:35 -0400 Subject: [PATCH 08/22] Remove from API --- .../design/2024_10_08_expression_cache.md | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/doc/developer/design/2024_10_08_expression_cache.md b/doc/developer/design/2024_10_08_expression_cache.md index 25a86aa187d50..637d861463cd6 100644 --- a/doc/developer/design/2024_10_08_expression_cache.md +++ b/doc/developer/design/2024_10_08_expression_cache.md @@ -62,27 +62,31 @@ implementing. ```Rust trait ExpressionCache { + /// Opens a new [`ExpressionCache`] for `deploy_generation`. + fn open(deploy_generation: u64) -> Self; + /// Returns the `expression_type` of `global_id` that is currently deployed in a cluster. This /// will not change in-between restarts as result of DDL, as long as `global_id` exists. /// /// This is useful for serving `EXPLAIN` queries. fn get_deployed_expression(&self, global_id: GlobalId, expression_type: ExpressionType) -> Option; - /// Returns the `expression_type` of `global_id` based on the current catalog contents of - /// `deploy_generation`. 
This may change in-between restarts as result of DDL. - fn get_durable_expression(&self, deploy_generation: u64, global_id: GlobalId, expression_type: ExpressionType) -> Option; + /// Returns the `expression_type` of `global_id` based on the current catalog contents. This + /// may change in-between restarts as result of DDL. + fn get_durable_expression(&self, global_id: GlobalId, expression_type: ExpressionType) -> Option; /// Durably inserts `expression`, with key `(deploy_generation, global_id, expression_type)`. /// /// Panics if `(deploy_generation, global_id, expression_type)` already exists. - fn insert_expression(&mut self, deploy_generation: u64, global_id: GlobalId, expression_type: ExpressionType, expression: Bytes); + fn insert_expression(&mut self, global_id: GlobalId, expression_type: ExpressionType, expression: Bytes); /// Durably remove and return all entries in `deploy_generation` that depend on an ID in /// `dropped_ids`. - fn invalidate_entries(&mut self, deploy_generation: u64, dropped_ids: BTreeSet) -> Vec<(GlobalId, ExpressionType)>; + fn invalidate_entries(&mut self, dropped_ids: BTreeSet) -> Vec<(GlobalId, ExpressionType)>; - /// Durably removes all entries with a deploy generation <= `deploy_generation`. - fn remove_deploy_generations(&mut self, deploy_generation: u64); + /// Durably removes all entries with a deploy generation less than this cache's deploy + /// generation. + fn cleanup_all_prior_deploy_generations(&mut self); /// Remove all entries that depend on a global ID that is not present in `txn`. 
fn reconcile(&mut self, txn: mz_catalog::durable::Transaction); From d8cdec10b78526e79327996833fa4eb7f2ccb732 Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Wed, 9 Oct 2024 10:43:25 -0400 Subject: [PATCH 09/22] Make invalidation optional --- doc/developer/design/2024_10_08_expression_cache.md | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/doc/developer/design/2024_10_08_expression_cache.md b/doc/developer/design/2024_10_08_expression_cache.md index 637d861463cd6..979f62c5da16b 100644 --- a/doc/developer/design/2024_10_08_expression_cache.md +++ b/doc/developer/design/2024_10_08_expression_cache.md @@ -82,14 +82,16 @@ trait ExpressionCache { /// Durably remove and return all entries in `deploy_generation` that depend on an ID in /// `dropped_ids`. + /// + /// Optional for v1. fn invalidate_entries(&mut self, dropped_ids: BTreeSet) -> Vec<(GlobalId, ExpressionType)>; /// Durably removes all entries with a deploy generation less than this cache's deploy /// generation. fn cleanup_all_prior_deploy_generations(&mut self); - /// Remove all entries that depend on a global ID that is not present in `txn`. - fn reconcile(&mut self, txn: mz_catalog::durable::Transaction); + /// Remove all entries that depend on a global ID that is not present in `ids`. + fn reconcile(&mut self, ids: &BTreeSet); } ``` @@ -112,6 +114,10 @@ Below is a detailed set of steps that will happen in startup. 2. Update cache via `ExpressionCache::insert_expression`. ### DDL - Drop + +This is optional for v1, `ExpressionCache::reconcile` on startup will update the cache to the +correct state. + 1. Execute catalog transaction. 2. Invalidate cache entries via `ExpressionCache::invalidate_entries`. 3. 
Re-compute and repopulate cache entries that depended on dropped entries via From 94719731ff394f375a75cfe2429c743fc748a1c3 Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Wed, 9 Oct 2024 10:56:02 -0400 Subject: [PATCH 10/22] Allow batching inserts --- doc/developer/design/2024_10_08_expression_cache.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/doc/developer/design/2024_10_08_expression_cache.md b/doc/developer/design/2024_10_08_expression_cache.md index 979f62c5da16b..3dfc03f6000b0 100644 --- a/doc/developer/design/2024_10_08_expression_cache.md +++ b/doc/developer/design/2024_10_08_expression_cache.md @@ -61,7 +61,7 @@ take care of serializing and deserializing bytes. Additionally, we probably don' implementing. ```Rust -trait ExpressionCache { +trait ExpressionCache { /// Opens a new [`ExpressionCache`] for `deploy_generation`. fn open(deploy_generation: u64) -> Self; @@ -69,16 +69,18 @@ trait ExpressionCache { /// will not change in-between restarts as result of DDL, as long as `global_id` exists. /// /// This is useful for serving `EXPLAIN` queries. - fn get_deployed_expression(&self, global_id: GlobalId, expression_type: ExpressionType) -> Option; + fn get_deployed_expression(&self, global_id: GlobalId, expression_type: ExpressionType) -> Option; /// Returns the `expression_type` of `global_id` based on the current catalog contents. This /// may change in-between restarts as result of DDL. - fn get_durable_expression(&self, global_id: GlobalId, expression_type: ExpressionType) -> Option; + fn get_durable_expression(&self, global_id: GlobalId, expression_type: ExpressionType) -> Option; /// Durably inserts `expression`, with key `(deploy_generation, global_id, expression_type)`. /// - /// Panics if `(deploy_generation, global_id, expression_type)` already exists. 
- fn insert_expression(&mut self, global_id: GlobalId, expression_type: ExpressionType, expression: Bytes); + /// Returns a [`Future`] that completes once `expressions` have been made durable. + /// + /// Panics if any `(GlobalId, ExpressionType)` pair already exists in the cache. + fn insert_expressions(&mut self, expressions: Vec<(GlobalId, ExpressionType, T)>) -> impl Future; /// Durably remove and return all entries in `deploy_generation` that depend on an ID in /// `dropped_ids`. From cad39d42472614e248a0468ebc0e544044c6e905 Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Wed, 9 Oct 2024 13:14:47 -0400 Subject: [PATCH 11/22] Combine all expressions into a single blob --- .../design/2024_10_08_expression_cache.md | 44 ++++++++++++------- 1 file changed, 28 insertions(+), 16 deletions(-) diff --git a/doc/developer/design/2024_10_08_expression_cache.md b/doc/developer/design/2024_10_08_expression_cache.md index 3dfc03f6000b0..33eebf0b1ef52 100644 --- a/doc/developer/design/2024_10_08_expression_cache.md +++ b/doc/developer/design/2024_10_08_expression_cache.md @@ -21,7 +21,6 @@ The cache will present similarly as a key-value value store where the key is a c - deployment generation - object global ID - - expression type (local MIR, global MIR, LIR, etc) The value will be a serialized version of the optimized expression. An `environmentd` process with deploy generation `n`, will never be expected to look at a serialized expression with a deploy @@ -61,32 +60,45 @@ take care of serializing and deserializing bytes. Additionally, we probably don' implementing. ```Rust -trait ExpressionCache { + + +/// All the cached expressions for a single `GlobalId`. +/// +/// Note: This is just a placeholder for now, don't index too hard on the exact fields. I haven't +/// done the necessary research to figure out what they are. 
+struct Expressions { + local_mir: OptimizedMirRelationExpr, + global_mir: DataflowDescription, + physical_plan: DataflowDescription, + dataflow_metainfos: DataflowMetainfo>, + notices: SmallVec<[Arc; 4]>, +} + +trait ExpressionCache { /// Opens a new [`ExpressionCache`] for `deploy_generation`. fn open(deploy_generation: u64) -> Self; - /// Returns the `expression_type` of `global_id` that is currently deployed in a cluster. This - /// will not change in-between restarts as result of DDL, as long as `global_id` exists. + /// Returns the optimized expressions of `global_id` that is currently deployed in a cluster. + /// This will not change in-between restarts as result of DDL, as long as `global_id` exists. /// /// This is useful for serving `EXPLAIN` queries. - fn get_deployed_expression(&self, global_id: GlobalId, expression_type: ExpressionType) -> Option; + fn get_deployed_expressions(&self, global_id: GlobalId) -> Option<&Expressions>; - /// Returns the `expression_type` of `global_id` based on the current catalog contents. This + /// Returns the optimized expressions of `global_id` based on the current catalog contents. This /// may change in-between restarts as result of DDL. - fn get_durable_expression(&self, global_id: GlobalId, expression_type: ExpressionType) -> Option; + fn get_durable_expressions(&self, global_id: GlobalId) -> Option<&Expressions>; - /// Durably inserts `expression`, with key `(deploy_generation, global_id, expression_type)`. + /// Durably inserts `expressions`. /// /// Returns a [`Future`] that completes once `expressions` have been made durable. /// - /// Panics if any `(GlobalId, ExpressionType)` pair already exists in the cache. - fn insert_expressions(&mut self, expressions: Vec<(GlobalId, ExpressionType, T)>) -> impl Future; + /// Panics if any `GlobalId` already exists in the cache. 
+ fn insert_expressions(&mut self, expressions: Vec<(GlobalId, Expressions)>) -> impl Future; - /// Durably remove and return all entries in `deploy_generation` that depend on an ID in - /// `dropped_ids`. + /// Durably remove and return all entries that depend on an ID in `dropped_ids`. /// /// Optional for v1. - fn invalidate_entries(&mut self, dropped_ids: BTreeSet) -> Vec<(GlobalId, ExpressionType)>; + fn invalidate_entries(&mut self, dropped_ids: BTreeSet) -> Vec<(GlobalId, Expressions)>; /// Durably removes all entries with a deploy generation less than this cache's deploy /// generation. @@ -117,7 +129,7 @@ Below is a detailed set of steps that will happen in startup. ### DDL - Drop -This is optional for v1, `ExpressionCache::reconcile` on startup will update the cache to the +This is optional for v1, on startup `ExpressionCache::reconcile` will update the cache to the correct state. 1. Execute catalog transaction. @@ -129,7 +141,7 @@ correct state. One potential implementation is via the filesystem of an attached durable storage to `environmentd`. Each cache entry would be saved as a file of the format -`/path/to/cache///`. +`/path/to/cache//`. #### Pros - No need to worry about coordination across K8s pods. @@ -145,7 +157,7 @@ Each cache entry would be saved as a file of the format ### Persist implementation Another potential implementation is via persist. Each cache entry would be keyed by -`(deploy_generation, global_id, expression_type)` and the value would be a serialized version of the +`(deploy_generation, global_id)` and the value would be a serialized version of the expression. 
#### Pros From 9eec23bc09b51768374a2cad71a77a377c007045 Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Wed, 9 Oct 2024 13:24:46 -0400 Subject: [PATCH 12/22] Fix lint --- doc/developer/design/2024_10_08_expression_cache.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/developer/design/2024_10_08_expression_cache.md b/doc/developer/design/2024_10_08_expression_cache.md index 33eebf0b1ef52..b885151eb6edb 100644 --- a/doc/developer/design/2024_10_08_expression_cache.md +++ b/doc/developer/design/2024_10_08_expression_cache.md @@ -77,10 +77,10 @@ struct Expressions { trait ExpressionCache { /// Opens a new [`ExpressionCache`] for `deploy_generation`. fn open(deploy_generation: u64) -> Self; - + /// Returns the optimized expressions of `global_id` that is currently deployed in a cluster. /// This will not change in-between restarts as result of DDL, as long as `global_id` exists. - /// + /// /// This is useful for serving `EXPLAIN` queries. fn get_deployed_expressions(&self, global_id: GlobalId) -> Option<&Expressions>; From 03753e595ec8a9f5b8aa6ba1bd920199c3719419 Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Wed, 9 Oct 2024 16:41:07 -0400 Subject: [PATCH 13/22] Fix file name --- ...024_10_08_expression_cache.md => 20241008_expression_cache.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename doc/developer/design/{2024_10_08_expression_cache.md => 20241008_expression_cache.md} (100%) diff --git a/doc/developer/design/2024_10_08_expression_cache.md b/doc/developer/design/20241008_expression_cache.md similarity index 100% rename from doc/developer/design/2024_10_08_expression_cache.md rename to doc/developer/design/20241008_expression_cache.md From 57e8d648b2acd4707b9acfed5997055a066a5020 Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Wed, 9 Oct 2024 16:51:20 -0400 Subject: [PATCH 14/22] Simplify API further --- .../design/20241008_expression_cache.md | 45 ++++++++----------- 1 file changed, 19 insertions(+), 26 
deletions(-) diff --git a/doc/developer/design/20241008_expression_cache.md b/doc/developer/design/20241008_expression_cache.md index b885151eb6edb..af2a1eefec5ef 100644 --- a/doc/developer/design/20241008_expression_cache.md +++ b/doc/developer/design/20241008_expression_cache.md @@ -75,37 +75,31 @@ struct Expressions { } trait ExpressionCache { - /// Opens a new [`ExpressionCache`] for `deploy_generation`. - fn open(deploy_generation: u64) -> Self; + /// Creates a new [`ExpressionCache`] for `deploy_generation`. + fn new(&mut self, deploy_generation: u64) -> Self; - /// Returns the optimized expressions of `global_id` that is currently deployed in a cluster. - /// This will not change in-between restarts as result of DDL, as long as `global_id` exists. + /// Remove all entries in current deploy generation that depend on a global ID that is not + /// present in `current_ids`. /// - /// This is useful for serving `EXPLAIN` queries. - fn get_deployed_expressions(&self, global_id: GlobalId) -> Option<&Expressions>; + /// Returns all cached expressions. + fn reconcile_and_open(&mut self, current_ids: &BTreeSet) -> Vec<(GlobalId, Expressions)>; - /// Returns the optimized expressions of `global_id` based on the current catalog contents. This - /// may change in-between restarts as result of DDL. - fn get_durable_expressions(&self, global_id: GlobalId) -> Option<&Expressions>; + /// Durably removes all entries with a deploy generation less than this cache's deploy + /// generation. + fn cleanup_all_prior_deploy_generations(&mut self); - /// Durably inserts `expressions`. + /// Durably inserts `expressions` into current deploy generation. /// /// Returns a [`Future`] that completes once `expressions` have been made durable. /// /// Panics if any `GlobalId` already exists in the cache. fn insert_expressions(&mut self, expressions: Vec<(GlobalId, Expressions)>) -> impl Future; - /// Durably remove and return all entries that depend on an ID in `dropped_ids`. 
+ /// Durably remove and return all entries in current deploy generation that depend on an ID in + /// `dropped_ids` . /// /// Optional for v1. fn invalidate_entries(&mut self, dropped_ids: BTreeSet) -> Vec<(GlobalId, Expressions)>; - - /// Durably removes all entries with a deploy generation less than this cache's deploy - /// generation. - fn cleanup_all_prior_deploy_generations(&mut self); - - /// Remove all entries that depend on a global ID that is not present in `ids`. - fn reconcile(&mut self, ids: &BTreeSet); } ``` @@ -113,29 +107,28 @@ trait ExpressionCache { Below is a detailed set of steps that will happen in startup. -1. Call `ExpressionCache::reconcile` to remove any invalid entries. +1. Call `ExpressionCache::open` to remove any invalid entries and retrieve cached entries. 2. While opening the catalog, for each object: - a. If the object is present in the cache, read the cached optimized expression via - `ExpressionCache::get_durable_expression`. + a. If the object is present in the cache, use cached optimized expression. b. Else generate the optimized expressions and insert the expressions via - `ExpressionCache::insert_expression`. -3. If in read-write mode, call `ExpressionCache::remove_deploy_generation` to remove the previous + `ExpressionCache::insert_expressions`. +3. If in read-write mode, call `ExpressionCache::cleanup_all_prior_deploy_generations` to remove the previous deploy generation. ### DDL - Create 1. Execute catalog transaction. -2. Update cache via `ExpressionCache::insert_expression`. +2. Update cache via `ExpressionCache::insert_expressions`. ### DDL - Drop -This is optional for v1, on startup `ExpressionCache::reconcile` will update the cache to the +This is optional for v1, on startup `ExpressionCache::open` will update the cache to the correct state. 1. Execute catalog transaction. 2. Invalidate cache entries via `ExpressionCache::invalidate_entries`. 3. 
Re-compute and repopulate cache entries that depended on dropped entries via - `ExpressionCache::insert_expression`. + `ExpressionCache::insert_expressions`. ### File System Implementation From 8f0bc28293c72f11a23dad831be8acbbd38aa347 Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Wed, 9 Oct 2024 16:59:51 -0400 Subject: [PATCH 15/22] Add more alternatives --- doc/developer/design/20241008_expression_cache.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/doc/developer/design/20241008_expression_cache.md b/doc/developer/design/20241008_expression_cache.md index af2a1eefec5ef..be8fce4d70bd6 100644 --- a/doc/developer/design/20241008_expression_cache.md +++ b/doc/developer/design/20241008_expression_cache.md @@ -163,8 +163,12 @@ expression. ## Alternatives -For the persist implementation, we could mint a new shard for each deploy generation. This would +- For the persist implementation, we could mint a new shard for each deploy generation. This would require us to finalize old shards during startup which would accumulate shard tombstones in CRDB. +- We could use persist's `FileBlob` for durability. It's extremely well tested (most of CI uses it + for persist) and solves at least some of the file system cons. +- We could use persist for durability, but swap in the `FileBlob` as the blob store and some local + consensus implementation. 
## Open questions From 4a75c9ef373d9bad8009f436b38d373e3cad0656 Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Thu, 10 Oct 2024 11:11:17 -0400 Subject: [PATCH 16/22] Combine open methods --- .../design/20241008_expression_cache.md | 23 +++++++++++-------- 1 file changed, 13 insertions(+), 10 deletions(-) diff --git a/doc/developer/design/20241008_expression_cache.md b/doc/developer/design/20241008_expression_cache.md index be8fce4d70bd6..c1c362ce0faa7 100644 --- a/doc/developer/design/20241008_expression_cache.md +++ b/doc/developer/design/20241008_expression_cache.md @@ -74,19 +74,23 @@ struct Expressions { notices: SmallVec<[Arc; 4]>, } -trait ExpressionCache { +struct ExpressionCache { + deploy_generation: u64, + information_needed_to_connect_to_durable_store: _, +} + +impl ExpressionCache { /// Creates a new [`ExpressionCache`] for `deploy_generation`. - fn new(&mut self, deploy_generation: u64) -> Self; + fn new(&mut self, deploy_generation: u64, information_needed_to_connect_to_durable_store: _) -> Self; /// Remove all entries in current deploy generation that depend on a global ID that is not /// present in `current_ids`. /// + /// If `remove_prior_gens` is `true`, all previous generations are durably removed from the + /// cache. + /// /// Returns all cached expressions. - fn reconcile_and_open(&mut self, current_ids: &BTreeSet) -> Vec<(GlobalId, Expressions)>; - - /// Durably removes all entries with a deploy generation less than this cache's deploy - /// generation. - fn cleanup_all_prior_deploy_generations(&mut self); + fn open(&mut self, current_ids: &BTreeSet, remove_prior_gens: bool) -> Vec<(GlobalId, Expressions)>; /// Durably inserts `expressions` into current deploy generation. /// @@ -107,13 +111,12 @@ trait ExpressionCache { Below is a detailed set of steps that will happen in startup. -1. Call `ExpressionCache::open` to remove any invalid entries and retrieve cached entries. +1. 
Call `ExpressionCache::open` to remove any invalid entries and retrieve cached entries. When + passing in the arguments, `remove_prior_gens == !read_only_mode`. 2. While opening the catalog, for each object: a. If the object is present in the cache, use cached optimized expression. b. Else generate the optimized expressions and insert the expressions via `ExpressionCache::insert_expressions`. -3. If in read-write mode, call `ExpressionCache::cleanup_all_prior_deploy_generations` to remove the previous - deploy generation. ### DDL - Create From 46aa4399eaf16063b6c0336e8dfae3c9e0c3d8f3 Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Thu, 10 Oct 2024 14:37:28 -0400 Subject: [PATCH 17/22] Add optimizer feature overrides --- doc/developer/design/20241008_expression_cache.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/doc/developer/design/20241008_expression_cache.md b/doc/developer/design/20241008_expression_cache.md index c1c362ce0faa7..fc77e4f602c1d 100644 --- a/doc/developer/design/20241008_expression_cache.md +++ b/doc/developer/design/20241008_expression_cache.md @@ -72,6 +72,7 @@ struct Expressions { physical_plan: DataflowDescription, dataflow_metainfos: DataflowMetainfo>, notices: SmallVec<[Arc; 4]>, + optimizer_feature_overrides: OptimizerFeatureOverrides, } struct ExpressionCache { @@ -84,13 +85,13 @@ impl ExpressionCache { fn new(&mut self, deploy_generation: u64, information_needed_to_connect_to_durable_store: _) -> Self; /// Remove all entries in current deploy generation that depend on a global ID that is not - /// present in `current_ids`. + /// present in `current_ids` or that do not have a matching `optimizer_feature_overrides`. /// - /// If `remove_prior_gens` is `true`, all previous generations are durably removed from the + /// If `remove_prior_gens` is `true`, all previous generations are durably removed from the /// cache. /// - /// Returns all cached expressions. 
- fn open(&mut self, current_ids: &BTreeSet, remove_prior_gens: bool) -> Vec<(GlobalId, Expressions)>; + /// Returns all cached expressions in the current deploy generation. + fn open(&mut self, current_ids: &BTreeSet, optimizer_feature_overrides: &OptimizerFeatureOverrides, remove_prior_gens: bool) -> Vec<(GlobalId, Expressions)>; /// Durably inserts `expressions` into current deploy generation. /// From 4f3524af8f876837aec23c924993590a1b996bb9 Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Thu, 10 Oct 2024 14:43:35 -0400 Subject: [PATCH 18/22] Switch optimizer struct --- doc/developer/design/20241008_expression_cache.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/doc/developer/design/20241008_expression_cache.md b/doc/developer/design/20241008_expression_cache.md index fc77e4f602c1d..2742636e59a72 100644 --- a/doc/developer/design/20241008_expression_cache.md +++ b/doc/developer/design/20241008_expression_cache.md @@ -72,7 +72,7 @@ struct Expressions { physical_plan: DataflowDescription, dataflow_metainfos: DataflowMetainfo>, notices: SmallVec<[Arc; 4]>, - optimizer_feature_overrides: OptimizerFeatureOverrides, + optimizer_feature_overrides: OptimizerFeatures, } struct ExpressionCache { @@ -85,13 +85,13 @@ impl ExpressionCache { fn new(&mut self, deploy_generation: u64, information_needed_to_connect_to_durable_store: _) -> Self; /// Remove all entries in current deploy generation that depend on a global ID that is not - /// present in `current_ids` or that do not have a matching `optimizer_feature_overrides`. + /// present in `current_ids` or that do not have a matching `optimizer_feature`. /// /// If `remove_prior_gens` is `true`, all previous generations are durably removed from the /// cache. /// /// Returns all cached expressions in the current deploy generation. 
- fn open(&mut self, current_ids: &BTreeSet, optimizer_feature_overrides: &OptimizerFeatureOverrides, remove_prior_gens: bool) -> Vec<(GlobalId, Expressions)>; + fn open(&mut self, current_ids: &BTreeSet, optimizer_feature: &OptimizerFeatures, remove_prior_gens: bool) -> Vec<(GlobalId, Expressions)>; /// Durably inserts `expressions` into current deploy generation. /// From 17659a20875aacb37354e6c518a65ca8e18dd1a9 Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Fri, 11 Oct 2024 11:30:12 -0400 Subject: [PATCH 19/22] WIP --- doc/developer/design/20241008_expression_cache.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/doc/developer/design/20241008_expression_cache.md b/doc/developer/design/20241008_expression_cache.md index 2742636e59a72..a17c290b58b99 100644 --- a/doc/developer/design/20241008_expression_cache.md +++ b/doc/developer/design/20241008_expression_cache.md @@ -84,14 +84,14 @@ impl ExpressionCache { /// Creates a new [`ExpressionCache`] for `deploy_generation`. fn new(&mut self, deploy_generation: u64, information_needed_to_connect_to_durable_store: _) -> Self; - /// Remove all entries in current deploy generation that depend on a global ID that is not - /// present in `current_ids` or that do not have a matching `optimizer_feature`. + /// Reconciles all entries in current deploy generation with the current objects, `current_ids`, + /// and current optimizer features, `optimizer_features`. /// /// If `remove_prior_gens` is `true`, all previous generations are durably removed from the /// cache. /// - /// Returns all cached expressions in the current deploy generation. - fn open(&mut self, current_ids: &BTreeSet, optimizer_feature: &OptimizerFeatures, remove_prior_gens: bool) -> Vec<(GlobalId, Expressions)>; + /// Returns all cached expressions in the current deploy generation, after reconciliation. 
+ fn open(&mut self, current_ids: &BTreeSet, optimizer_features: &OptimizerFeatures, remove_prior_gens: bool) -> Vec<(GlobalId, Expressions)>; /// Durably inserts `expressions` into current deploy generation. /// From fee46f3cc949064960bdf0030dfdb1ad54d60fca Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Tue, 15 Oct 2024 13:48:20 -0400 Subject: [PATCH 20/22] Update insert invalidation logic --- .../design/20241008_expression_cache.md | 42 +++++++++++++++---- 1 file changed, 34 insertions(+), 8 deletions(-) diff --git a/doc/developer/design/20241008_expression_cache.md b/doc/developer/design/20241008_expression_cache.md index a17c290b58b99..d1c1b95aec2e6 100644 --- a/doc/developer/design/20241008_expression_cache.md +++ b/doc/developer/design/20241008_expression_cache.md @@ -60,8 +60,6 @@ take care of serializing and deserializing bytes. Additionally, we probably don' implementing. ```Rust - - /// All the cached expressions for a single `GlobalId`. /// /// Note: This is just a placeholder for now, don't index too hard on the exact fields. I haven't @@ -75,6 +73,15 @@ struct Expressions { optimizer_feature_overrides: OptimizerFeatures, } +struct NewEntry { + /// `GlobalId` of the new expression. + id: GlobalId, + /// New `Expressions` to cache. + expressions: Expressions, + /// `GlobalId`s to invalidate as a result of the new entry. + invalidate_ids: BTreeSet, +} + struct ExpressionCache { deploy_generation: u64, information_needed_to_connect_to_durable_store: _, @@ -93,12 +100,13 @@ impl ExpressionCache { /// Returns all cached expressions in the current deploy generation, after reconciliation. fn open(&mut self, current_ids: &BTreeSet, optimizer_features: &OptimizerFeatures, remove_prior_gens: bool) -> Vec<(GlobalId, Expressions)>; - /// Durably inserts `expressions` into current deploy generation. + /// Durably inserts `expressions` into current deploy generation. This may also invalidate + /// entries given by `expressions`.
/// - /// Returns a [`Future`] that completes once `expressions` have been made durable. + /// Returns a [`Future`] that completes once the changes have been made durable. /// /// Panics if any `GlobalId` already exists in the cache. - fn insert_expressions(&mut self, expressions: Vec<(GlobalId, Expressions)>) -> impl Future; + fn insert_expressions(&mut self, expressions: Vec<NewEntry>) -> impl Future; /// Durably remove and return all entries in current deploy generation that depend on an ID in /// `dropped_ids` . @@ -112,18 +120,36 @@ impl ExpressionCache { Below is a detailed set of steps that will happen in startup. -1. Call `ExpressionCache::open` to remove any invalid entries and retrieve cached entries. When - passing in the arguments, `remove_prior_gens == !read_only_mode`. +1. Call `ExpressionCache::open` to read the cache into memory and perform reconciliation (see + [Startup Reconciliation](#startup-reconciliation)). When passing in the arguments, + `remove_prior_gens == !read_only_mode`. 2. While opening the catalog, for each object: a. If the object is present in the cache, use cached optimized expression. b. Else generate the optimized expressions and insert the expressions via - `ExpressionCache::insert_expressions`. + `ExpressionCache::insert_expressions`. This will also perform any necessary invalidations if + the new expression is an index. See [Create Invalidations](#create-invalidations). + +#### Startup Reconciliation +When opening the cache for the first time, we need to perform the following reconciliation tasks: + + - Remove any entries that exist in the cache but not in the catalog. + - If `remove_prior_gens` is true, then remove all prior gens. + ### DDL - Create 1. Execute catalog transaction. 2. Update cache via `ExpressionCache::insert_expressions`. +#### Create Invalidations + +When creating and inserting a new index, we need to invalidate some entries that may optimize to +new expressions.
When creating index `i` on object `o`, we need to invalidate the following objects: + + - `o`. + - All compute objects that depend directly on `o`. + - All compute objects that would directly depend on `o`, if all views were inlined. + ### DDL - Drop This is optional for v1, on startup `ExpressionCache::open` will update the cache to the From 3b491b538d496f51428a1e2e7bd7507efc18d093 Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Tue, 15 Oct 2024 14:29:44 -0400 Subject: [PATCH 21/22] Commit to persist implementation --- .../design/20241008_expression_cache.md | 67 ++++++++++++++----- 1 file changed, 50 insertions(+), 17 deletions(-) diff --git a/doc/developer/design/20241008_expression_cache.md b/doc/developer/design/20241008_expression_cache.md index d1c1b95aec2e6..417f762a199b3 100644 --- a/doc/developer/design/20241008_expression_cache.md +++ b/doc/developer/design/20241008_expression_cache.md @@ -160,28 +160,44 @@ correct state. 3. Re-compute and repopulate cache entries that depended on dropped entries via `ExpressionCache::insert_expressions`. -### File System Implementation +### Implementation -One potential implementation is via the filesystem of an attached durable storage to `environmentd`. -Each cache entry would be saved as a file of the format -`/path/to/cache//`. +The implementation will use persist for durability. The cache will be a single dedicated shard. +Each cache entry will be keyed by `(deploy_generation, global_id)` and the value will be a +serialized version of the expression. -#### Pros -- No need to worry about coordination across K8s pods. -- Bulk deletion is a simple directory delete. +#### Conflict Resolution -#### Cons -- Need to worry about Flushing/fsync. -- Need to worry about concurrency. -- Need to worry about atomicity. -- Need to worry about mocking things in memory for tests. -- If we lose the pod, then we also lose the cache. 
+It is possible and expected that multiple environments will be writing to the cache at the same +time. This would manifest in an upper mismatch error during an insert or invalidation. In case of +this error, the cache should read in all new updates, apply each update as described below, and +retry the operation from the beginning. -### Persist implementation +If the update is in a different deploy generation as the current cache, then ignore it. It is in a +different logical namespace and won't conflict with the operation. -Another potential implementation is via persist. Each cache entry would be keyed by -`(deploy_generation, global_id)` and the value would be a serialized version of the -expression. +If the update is in the same deploy generation, then we must be in a split-brain scenario where +both the current process and another process think they are the leader. We should still update any +in-memory state as if the current cache had made that change. This relies on the following +invariants: + + - Two processes with the same deploy generation MUST be running the same version of code. + - A global ID only ever maps to a single object. + - Optimization is deterministic. + +Therefore, we can be sure that any new global IDs refer to the same object that the current cache +thinks it refers to. Also, the optimized expressions that the other process produced is identical +to the optimized expression that the current process would have produced. Eventually, one of the +processes will be fenced out on some other operation. The reason that we don't panic immediately, +is because the current process may actually be the leader and enter a live-lock scenario like the +following: + +1. Process `A` starts up and becomes the leader. +2. Process `B` starts up and becomes the leader. +3. Process `A` writes to the cache. +4. Process `B` panics. +5. Process `A` is fenced. +6. Go back to step (1). 
#### Pros - Flushing, concurrency, atomicity, mocking are already implemented by persist. @@ -200,6 +216,23 @@ require us to finalize old shards during startup which would accumulate shard to - We could use persist for durability, but swap in the `FileBlob` as the blob store and some local consensus implementation. +### File System Implementation + +One potential implementation is via the filesystem of an attached durable storage to `environmentd`. +Each cache entry would be saved as a file of the format +`/path/to/cache//`. + +#### Pros +- No need to worry about coordination across K8s pods. +- Bulk deletion is a simple directory delete. + +#### Cons +- Need to worry about Flushing/fsync. +- Need to worry about concurrency. +- Need to worry about atomicity. +- Need to worry about mocking things in memory for tests. +- If we lose the pod, then we also lose the cache. + ## Open questions - Which implementation should we use? From d5b6f0033f404f401ea253805360a74f26301740 Mon Sep 17 00:00:00 2001 From: Joseph Koshakow Date: Wed, 16 Oct 2024 16:55:14 -0400 Subject: [PATCH 22/22] Add note about recomputed expressions --- doc/developer/design/20241008_expression_cache.md | 1 + 1 file changed, 1 insertion(+) diff --git a/doc/developer/design/20241008_expression_cache.md b/doc/developer/design/20241008_expression_cache.md index 417f762a199b3..b91b0d97b53db 100644 --- a/doc/developer/design/20241008_expression_cache.md +++ b/doc/developer/design/20241008_expression_cache.md @@ -140,6 +140,7 @@ When opening the cache for the first time, we need to perform the following reco 1. Execute catalog transaction. 2. Update cache via `ExpressionCache::insert_expressions`. +3. [Optional for v1] Recompute and insert any invalidated expressions. #### Create Invalidations
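
The create-invalidation rule (invalidate `o`, every compute object that depends directly on `o`, and every compute object that would depend directly on `o` if all views were inlined) can be read as a graph traversal that looks through views and stops at compute objects. The sketch below is only an illustration under that reading: `Kind`, `Object`, `uses`, and `create_invalidations` are hypothetical names, and plain `u64` values stand in for `GlobalId`.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Hypothetical stand-in for `GlobalId`.
type Id = u64;

#[derive(Clone, Copy, PartialEq, Eq)]
enum Kind {
    /// A view, which is inlined into the objects that use it.
    View,
    /// A compute object, such as an index or a materialized view.
    Compute,
    /// Anything else, such as a table or a source.
    Other,
}

struct Object {
    kind: Kind,
    /// IDs of the objects that this object directly depends on.
    uses: BTreeSet<Id>,
}

/// Returns the IDs to invalidate when a new index is created on object `on`.
fn create_invalidations(catalog: &BTreeMap<Id, Object>, on: Id) -> BTreeSet<Id> {
    let mut invalidate = BTreeSet::from([on]);
    let mut seen = BTreeSet::from([on]);
    let mut queue = VecDeque::from([on]);
    while let Some(current) = queue.pop_front() {
        for (id, object) in catalog {
            // Skip objects that do not use `current` or that were already visited.
            if !object.uses.contains(&current) || !seen.insert(*id) {
                continue;
            }
            match object.kind {
                // Compute objects are invalidated; the traversal stops at them.
                Kind::Compute => {
                    invalidate.insert(*id);
                }
                // Views are looked through, as if they were inlined.
                Kind::View => queue.push_back(*id),
                Kind::Other => {}
            }
        }
    }
    invalidate
}

fn main() {
    let mut catalog = BTreeMap::new();
    // 1: table `o`; 2: view over 1; 3: materialized view over view 2;
    // 4: materialized view over 1; 5: index on materialized view 4; 6: unrelated table.
    catalog.insert(1, Object { kind: Kind::Other, uses: BTreeSet::new() });
    catalog.insert(2, Object { kind: Kind::View, uses: BTreeSet::from([1]) });
    catalog.insert(3, Object { kind: Kind::Compute, uses: BTreeSet::from([2]) });
    catalog.insert(4, Object { kind: Kind::Compute, uses: BTreeSet::from([1]) });
    catalog.insert(5, Object { kind: Kind::Compute, uses: BTreeSet::from([4]) });
    catalog.insert(6, Object { kind: Kind::Other, uses: BTreeSet::new() });

    let ids = create_invalidations(&catalog, 1);
    assert_eq!(ids, BTreeSet::from([1, 3, 4]));
    println!("{ids:?}");
}
```

In this reading the index on materialized view 4 is not invalidated, because it depends on the materialized view rather than on `o`; whether views themselves also need invalidating is not stated in the doc, so the sketch leaves them alone.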