
compute: spillable MV correction buffer #30083

Merged: 5 commits into MaterializeInc:main from the correction-lgalloc branch on Jan 28, 2025

Conversation

@teskje (Contributor) commented Oct 18, 2024

This PR introduces a "correction v2" buffer that differs from the existing one in that it stores data in columnated regions that can be spilled to disk. It follows the general design of the arrangement merge batcher, with the major difference that it sorts updates by time first, in an attempt to more efficiently deal with the presence of updates in
the future (commonly introduced by temporal filters).

The new correction buffer can be switched on through a feature flag, enable_compute_correction_v2, and is switched off by default. The plan is to keep it disabled in production but have it available for emergencies where replicas fail to hydrate due to the MV memory spike. Eventually we'll want to make the new correction buffer the default, but we should do more performance testing before that.
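The time-first ordering described above can be sketched as a comparator on `(data, time, diff)` update triples (a hypothetical standalone signature; the actual buffer sorts columnated updates inside chunks):

```rust
use std::cmp::Ordering;

// Hypothetical sketch of the time-first update ordering: compare by time
// before data, so that future updates (e.g. those introduced by temporal
// filters) sort after all current ones and can be skipped cheaply.
fn time_first_cmp<D: Ord, T: Ord>(a: &(D, T, i64), b: &(D, T, i64)) -> Ordering {
    let (data_a, time_a, _) = a;
    let (data_b, time_b, _) = b;
    time_a.cmp(time_b).then_with(|| data_a.cmp(data_b))
}
```

With this ordering, all updates at or beyond some future time form a contiguous suffix, which is what makes them cheap to set aside during consolidation.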

Motivation

  • This PR adds a known-desirable feature.

Part of https://github.com/MaterializeInc/database-issues/issues/8464

Tips for reviewer

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

@teskje teskje force-pushed the correction-lgalloc branch 2 times, most recently from ffdf64d to 77afd64 Compare October 21, 2024 08:31
@teskje teskje force-pushed the correction-lgalloc branch 2 times, most recently from 25b0987 to 0b2e359 Compare October 25, 2024 12:13
@teskje teskje force-pushed the correction-lgalloc branch 4 times, most recently from ffb5488 to 4697f9c Compare November 25, 2024 10:22
@teskje teskje force-pushed the correction-lgalloc branch 3 times, most recently from 4356350 to 185c99c Compare December 1, 2024 12:06
@teskje teskje force-pushed the correction-lgalloc branch 2 times, most recently from e95c32f to 49c0569 Compare January 3, 2025 11:39
@teskje (Contributor Author) commented Jan 3, 2025

The feature benchmarks report a bunch of regressions. Some increased CPU and memory usage is expected, but the memory regressions here are worryingly large:

NAME                                | TYPE            |      THIS       |      OTHER      |  UNIT  | THRESHOLD  |  Regression?  | 'THIS' is
--------------------------------------------------------------------------------------------------------------------------------------------------------
FastPathOrderByLimit                | wallclock       |           0.827 |           0.871 |   s    |    10%     |      no       | better:  5.0% faster
FastPathOrderByLimit                | memory_mz       |         467.682 |         464.535 |   MB   |    20%     |      no       | worse:   0.7% more
FastPathOrderByLimit                | memory_clusterd |         959.396 |         149.727 |   MB   |    50%     |    !!YES!!    | worse:   6.4 TIMES more
OrderBy                             | wallclock       |           9.010 |           8.825 |   s    |    10%     |      no       | worse:   2.1% slower
OrderBy                             | memory_mz       |         473.213 |         473.499 |   MB   |    20%     |      no       | better:  0.1% less
OrderBy                             | memory_clusterd |        1179.695 |         307.178 |   MB   |    50%     |    !!YES!!    | worse:   3.8 TIMES more
DifferentialJoin                    | wallclock       |           1.396 |           1.401 |   s    |    10%     |      no       | better:  0.4% faster
DifferentialJoin                    | memory_mz       |         455.093 |         466.061 |   MB   |    20%     |      no       | better:  2.4% less
DifferentialJoin                    | memory_clusterd |         930.405 |         144.672 |   MB   |    50%     |    !!YES!!    | worse:   6.4 TIMES more
FastPathFilterIndex                 | wallclock       |           3.423 |           4.440 |   s    |    10%     |      no       | better: 22.9% faster
FastPathFilterIndex                 | memory_mz       |         470.448 |         465.775 |   MB   |    20%     |      no       | worse:   1.0% more
FastPathFilterIndex                 | memory_clusterd |         935.078 |         155.830 |   MB   |    50%     |    !!YES!!    | worse:   6.0 TIMES more
CrossJoin                           | wallclock       |           3.416 |           2.429 |   s    |    10%     |    !!YES!!    | worse:  40.7% slower
CrossJoin                           | memory_mz       |         468.540 |         463.772 |   MB   |    20%     |      no       | worse:   1.0% more
CrossJoin                           | memory_clusterd |         918.388 |         205.231 |   MB   |    50%     |    !!YES!!    | worse:   4.5 TIMES more
FullOuterJoin                       | wallclock       |           9.714 |           8.691 |   s    |    10%     |    !!YES!!    | worse:  11.8% slower
FullOuterJoin                       | memory_mz       |         461.483 |         468.254 |   MB   |    20%     |      no       | better:  1.4% less
FullOuterJoin                       | memory_clusterd |        1033.783 |         220.871 |   MB   |    50%     |    !!YES!!    | worse:   4.7 TIMES more
MFPPushdown                         | wallclock       |           0.356 |           0.363 |   s    |    10%     |      no       | better:  2.2% faster
MFPPushdown                         | memory_mz       |         470.924 |         466.919 |   MB   |    20%     |      no       | worse:   0.9% more
MFPPushdown                         | memory_clusterd |        6407.738 |         145.721 |   MB   |    50%     |    !!YES!!    | worse:  44.0 TIMES more
CountDistinct                       | wallclock       |           1.807 |           1.802 |   s    |    10%     |      no       | worse:   0.3% slower
CountDistinct                       | memory_mz       |         464.344 |         462.818 |   MB   |    20%     |      no       | worse:   0.3% more
CountDistinct                       | memory_clusterd |         727.177 |         150.299 |   MB   |    50%     |    !!YES!!    | worse:   4.8 TIMES more
AccumulateReductions                | wallclock       |          65.003 |          50.974 |   s    |    10%     |    !!YES!!    | worse:  27.5% slower
AccumulateReductions                | memory_mz       |        5145.073 |        5155.563 |   MB   |    20%     |      no       | better:  0.2% less
AccumulateReductions                | memory_clusterd |        7984.161 |         144.672 |   MB   |    50%     |    !!YES!!    | worse:  55.2 TIMES more
FastPathFilterNoIndex               | wallclock       |           1.309 |           1.316 |   s    |    10%     |      no       | better:  0.5% faster
FastPathFilterNoIndex               | memory_mz       |         501.537 |         488.853 |   MB   |    20%     |      no       | worse:   2.6% more
FastPathFilterNoIndex               | memory_clusterd |        6087.303 |         206.852 |   MB   |    50%     |    !!YES!!    | worse:  29.4 TIMES more
GroupBy                             | wallclock       |           4.553 |           4.545 |   s    |    10%     |      no       | worse:   0.2% slower
GroupBy                             | memory_mz       |         459.099 |         464.630 |   MB   |    20%     |      no       | better:  1.2% less
GroupBy                             | memory_clusterd |         960.350 |         244.617 |   MB   |    50%     |    !!YES!!    | worse:   3.9 TIMES more
FinishOrderByLimit                  | wallclock       |           1.361 |           1.386 |   s    |    10%     |      no       | better:  1.8% faster
FinishOrderByLimit                  | memory_mz       |         467.396 |         463.486 |   MB   |    20%     |      no       | worse:   0.8% more
FinishOrderByLimit                  | memory_clusterd |         916.767 |         158.215 |   MB   |    50%     |    !!YES!!    | worse:   5.8 TIMES more
GroupByMaintained                   | wallclock       |          16.215 |          16.302 |   s    |    10%     |      no       | better:  0.5% faster
GroupByMaintained                   | memory_mz       |         478.172 |         475.502 |   MB   |    20%     |      no       | worse:   0.6% more
GroupByMaintained                   | memory_clusterd |        1375.198 |         510.120 |   MB   |    50%     |    !!YES!!    | worse:   2.7 TIMES more
MinMaxMaintained                    | wallclock       |           6.351 |           6.444 |   s    |    10%     |      no       | better:  1.4% faster
MinMaxMaintained                    | memory_mz       |         465.298 |         468.063 |   MB   |    20%     |      no       | better:  0.6% less
MinMaxMaintained                    | memory_clusterd |        1131.058 |         296.307 |   MB   |    50%     |    !!YES!!    | worse:   3.8 TIMES more
MinMax                              | wallclock       |           1.538 |           1.582 |   s    |    10%     |      no       | better:  2.8% faster
MinMax                              | memory_mz       |         464.916 |         459.671 |   MB   |    20%     |      no       | worse:   1.1% more
MinMax                              | memory_clusterd |        1009.941 |         191.689 |   MB   |    50%     |    !!YES!!    | worse:   5.3 TIMES more
FastPathLimit                       | wallclock       |           0.315 |           0.322 |   s    |    10%     |      no       | better:  2.1% faster
FastPathLimit                       | memory_mz       |         469.971 |         470.257 |   MB   |    20%     |      no       | better:  0.1% less
FastPathLimit                       | memory_clusterd |        1018.524 |         216.293 |   MB   |    50%     |    !!YES!!    | worse:   4.7 TIMES more
DeltaJoinMaintained                 | wallclock       |           2.521 |           2.581 |   s    |    10%     |      no       | better:  2.3% faster
DeltaJoinMaintained                 | memory_mz       |         463.009 |         464.535 |   MB   |    20%     |      no       | better:  0.3% less
DeltaJoinMaintained                 | memory_clusterd |         989.914 |         203.991 |   MB   |    50%     |    !!YES!!    | worse:   4.9 TIMES more
DeltaJoin                           | wallclock       |           2.303 |           2.330 |   s    |    10%     |      no       | better:  1.2% faster
DeltaJoin                           | memory_mz       |         456.619 |         464.916 |   MB   |    20%     |      no       | better:  1.8% less
DeltaJoin                           | memory_clusterd |         776.482 |         161.076 |   MB   |    50%     |    !!YES!!    | worse:   4.8 TIMES more
Retraction                          | wallclock       |           5.739 |           4.276 |   s    |    10%     |    !!YES!!    | worse:  34.2% slower
Retraction                          | memory_mz       |         467.777 |         468.922 |   MB   |    20%     |      no       | better:  0.2% less
Retraction                          | memory_clusterd |        1175.880 |         282.764 |   MB   |    50%     |    !!YES!!    | worse:   4.2 TIMES more

I have a suspicion that this isn't an actual regression but a result of the usage of lgalloc, which holds onto allocated memory and only releases it slowly over time. The memory_clusterd measurement is made by inspecting the Docker memory usage after the test has completed, at which point the old implementation will have freed most of the correction buffer memory while the new implementation still retains some of it in lgalloc. I should be able to validate that suspicion by switching off lgalloc.

@teskje teskje force-pushed the correction-lgalloc branch from 49c0569 to e01a7dc Compare January 6, 2025 12:11
@teskje (Contributor Author) commented Jan 8, 2025

I retried the feature benchmarks with lgalloc disabled and things do look better:

NAME                                | TYPE            |      THIS       |      OTHER      |  UNIT  | THRESHOLD  |  Regression?  | 'THIS' is
--------------------------------------------------------------------------------------------------------------------------------------------------------
CrossJoin                           | wallclock       |           4.623 |           2.237 |   s    |    10%     |    !!YES!!    | worse:   2.1 TIMES slower
CrossJoin                           | memory_mz       |         468.159 |         476.074 |   MB   |    20%     |      no       | better:  1.7% less
CrossJoin                           | memory_clusterd |         213.051 |         157.261 |   MB   |    50%     |      no       | worse:  35.5% more
FastPathOrderByLimit                | wallclock       |           0.846 |           0.845 |   s    |    10%     |      no       | worse:   0.2% slower
FastPathOrderByLimit                | memory_mz       |         463.104 |         474.739 |   MB   |    20%     |      no       | better:  2.5% less
FastPathOrderByLimit                | memory_clusterd |         133.514 |          87.414 |   MB   |    50%     |    !!YES!!    | worse:  52.7% more
FastPathFilterIndex                 | wallclock       |           3.614 |           3.142 |   s    |    10%     |    !!YES!!    | worse:  15.0% slower
FastPathFilterIndex                 | memory_mz       |         481.033 |         466.251 |   MB   |    20%     |      no       | worse:   3.2% more
FastPathFilterIndex                 | memory_clusterd |         139.046 |          93.822 |   MB   |    50%     |      no       | worse:  48.2% more
AccumulateReductions                | wallclock       |          79.178 |          51.314 |   s    |    10%     |    !!YES!!    | worse:  54.3% slower
AccumulateReductions                | memory_mz       |        5117.416 |        5090.714 |   MB   |    20%     |      no       | worse:   0.5% more
AccumulateReductions                | memory_clusterd |         563.717 |          88.825 |   MB   |    50%     |    !!YES!!    | worse:   6.3 TIMES more
FullOuterJoin                       | wallclock       |           9.918 |           8.866 |   s    |    10%     |    !!YES!!    | worse:  11.9% slower
FullOuterJoin                       | memory_mz       |         458.431 |         460.243 |   MB   |    20%     |      no       | better:  0.4% less
FullOuterJoin                       | memory_clusterd |         195.122 |         161.076 |   MB   |    50%     |      no       | worse:  21.1% more
MFPPushdown                         | wallclock       |           0.343 |           0.363 |   s    |    10%     |      no       | better:  5.4% faster
MFPPushdown                         | memory_mz       |         482.559 |         473.213 |   MB   |    20%     |      no       | worse:   2.0% more
MFPPushdown                         | memory_clusterd |         464.916 |          83.294 |   MB   |    50%     |    !!YES!!    | worse:   5.6 TIMES more
FastPathFilterNoIndex               | wallclock       |           1.275 |           1.503 |   s    |    10%     |      no       | better: 15.1% faster
FastPathFilterNoIndex               | memory_mz       |         508.404 |         494.099 |   MB   |    20%     |      no       | worse:   2.9% more
FastPathFilterNoIndex               | memory_clusterd |         537.968 |         139.618 |   MB   |    50%     |    !!YES!!    | worse:   3.9 TIMES more
Retraction                          | wallclock       |           6.622 |           4.453 |   s    |    10%     |    !!YES!!    | worse:  48.7% slower
Retraction                          | memory_mz       |         458.622 |         467.682 |   MB   |    20%     |      no       | better:  1.9% less
Retraction                          | memory_clusterd |         300.980 |         175.953 |   MB   |    50%     |    !!YES!!    | worse:  71.1% more

There's still a higher-than-expected memory regression for some of them; I want to look into those.

@teskje teskje force-pushed the correction-lgalloc branch from 9d51f8f to c8846d6 Compare January 15, 2025 12:30
@teskje teskje changed the title [WIP] compute: spillable MV correction buffer compute: spillable MV correction buffer Jan 15, 2025
let mut heap = MergeHeap::from_iter(cursors);
let mut merged = Chain::default();
while let Some(cursor1) = heap.pop() {
    let (data, time, mut diff) = cursor1.get();
@teskje (Contributor Author) commented:

Here we'd have the option to reuse whole chunks, if all updates in the current chunk are less than the first update in the next cursor on the heap. However, if we did this naively we could end up with a lot of empty space in our chunks, and therefore with chains that have many more chunks than they'd need if their updates were tightly packed.

Comment on lines +80 to +88
//! Unfortunately, performing consolidation as described above can break the chain invariant and we
//! might need to restore it by merging chains, including ones containing future updates. This is
//! something that would be great to fix! In the meantime the hope is that in steady state it
//! doesn't matter too much because either there are no future retractions and U is approximately
//! equal to N, or the amount of future retractions is much larger than the amount of current
//! changes, in which case removing the current changes has a good chance of leaving the chain
//! invariant intact.
@teskje (Contributor Author) commented:

This is, I think, the main wrinkle in the current implementation. It'd be nice to never have to look at future updates when consolidating the current ones. All alternatives I've come up with so far are not great:

  • Don't consolidate the correction updates at all, just provide a read-only merging iterator. That's sufficient for the MV sink to work, but it leaves open when the correction contents actually get consolidated. Inserts will trigger merges, but if few inserts happen, we can't rely on those. We'd need some form of idle merging, which adds significant complexity.
  • Skip restoring the chain invariant and rely on subsequent inserts to do so. It's not clear to me whether that actually improves anything or just moves work from one operation to the other. It definitely makes the data structure harder to reason about, since the chain invariant then isn't an invariant anymore and we can't make crisp statements about the complexity of inserts.

@teskje teskje force-pushed the correction-lgalloc branch 4 times, most recently from 0993a98 to 6607ce2 Compare January 15, 2025 16:08
@antiguru antiguru self-requested a review January 15, 2025 16:12
@teskje teskje force-pushed the correction-lgalloc branch 3 times, most recently from c825ca6 to f0c6e77 Compare January 17, 2025 11:32
@teskje teskje force-pushed the correction-lgalloc branch 2 times, most recently from 2262dee to 70d0355 Compare January 17, 2025 18:06
@teskje teskje marked this pull request as ready for review January 20, 2025 08:59
@teskje teskje requested a review from a team as a code owner January 20, 2025 08:59
@teskje teskje requested a review from frankmcsherry January 20, 2025 09:08
@antiguru (Member) left a comment:

Some comments, but I'm still reading.

@antiguru (Member) left a comment:

I think this change looks good on a high level, and I think we can soon merge it and iterate on it from there. I left some comments, but what stands out for me are two things:

  • It seems better to use the Array type instead of StackWrapper, since we want constant-capacity chunks. I fear ChunkedStack will give us strange behavior: it has space for more elements than we'd ever want in a chunk. Note that Array doesn't know about columnation, so some glue code would need to be written.
  • The cursor API expects data to be copied element by element, which can be inefficient. Is there an alternative where we can move whole chunks, or partial chunks?

}

/// Return consolidated updates before the given `upper`.
pub fn updates_before<'a>(
@antiguru (Member) commented:

This function could return an iterator over (&D, _, _) to avoid cloning all contents. This is important when computing the initial snapshot since it would roughly 2x the memory requirements.

It might not be possible to do this right now, and I see that your implementation follows existing APIs, so please record this as an issue so we can change it later.

@antiguru (Member) commented:

Maybe it's hard as long as we're using the SourceData type. Well, something to note as future work I guess.

@teskje (Contributor Author) commented:

Indeed. Though I think this doesn't 2x the memory requirements since the cloning only happens during iteration and there is no place where the entire iterator is collected at the same time. It is given to a persist BatchBuilder which uses bounded memory by uploading parts to S3 as it goes.
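The borrowing shape antiguru asks for could look roughly like this (a hypothetical, non-columnated sketch with concrete types; the real API is constrained by SourceData and persist's BatchBuilder):

```rust
// Hypothetical sketch of an `updates_before` that yields (&D, T, R) borrows
// instead of owned clones, so callers only clone the rows they actually keep.
fn updates_before<'a>(
    updates: &'a [(String, u64, i64)],
    upper: u64,
) -> impl Iterator<Item = (&'a String, u64, i64)> + 'a {
    updates
        .iter()
        .filter(move |(_, t, _)| *t < upper)
        .map(|(d, t, r)| (d, *t, *r))
}
```

Times and diffs are small `Copy` types, so only the data component benefits from being borrowed.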

Comment on lines 303 to 317
let needs_merge = self.chains.get(i).is_some_and(|a| {
    let b = &self.chains[i - 1];
    a.len() * CHAIN_PROPORTIONALITY > b.len()
});
if needs_merge {
    let a = self.chains.remove(i);
    let b = std::mem::take(&mut self.chains[i - 1]);
    let merged = merge_chains(vec![a, b], &self.since);
    self.chains[i - 1] = merged;
} else {
    i -= 1;
}
@antiguru (Member) commented:

Trying to understand the `self.chains.get(i).is_some_and` part: if we unconditionally decremented `i` by 1 in every iteration, the chain at `i` should always exist, no? That would simplify the logic a bit.

@teskje (Contributor Author) commented:

Yes, but unfortunately we cannot unconditionally decrement i. Merging two chains might result in a smaller chain that breaks the invariant again for chains you have already looked at, so you have to go back and merge those too.

For example, assume you have these three chains:

[0] xxxx
[1] xxxx
[2] x     <- i

--- no merge required ---

[0] xxxx
[1] xxxx  <- i
[2] x       

--- merge ---

[0] x     <- i
[1] x

--- done ---

You end up not having restored the invariant.

Whereas if you only advance i when you don't merge, you merge chains at i until the invariant is restored up to i again, before you continue. When you get to i = 0 the invariant holds for all chains.
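This loop can be simulated on chain lengths alone. A runnable sketch, where `CHAIN_PROPORTIONALITY = 2` and the `merge` function are assumptions: `merge` models that merging two chains may consolidate them into far fewer updates, which is exactly what can re-break the invariant upstream.

```rust
const CHAIN_PROPORTIONALITY: usize = 2;

// Sketch of the invariant-restoring loop, on chain lengths only: chain i
// must be at most 1/CHAIN_PROPORTIONALITY the length of chain i - 1.
fn restore_invariant(chains: &mut Vec<usize>, merge: impl Fn(usize, usize) -> usize) {
    let mut i = chains.len().saturating_sub(1);
    while i > 0 {
        let needs_merge = chains.get(i).is_some_and(|&a| {
            let b = chains[i - 1];
            a * CHAIN_PROPORTIONALITY > b
        });
        if needs_merge {
            let a = chains.remove(i);
            let b = chains[i - 1];
            chains[i - 1] = merge(a, b);
            // `i` is deliberately NOT decremented: the (possibly much
            // smaller) merged chain is re-checked against its predecessor.
        } else {
            i -= 1;
        }
    }
}
```

Running this on teskje's example `[4, 4, 1]` with a heavily consolidating merge reproduces the re-check: the merged chain shrinks and triggers a second merge before the loop moves on.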

Comment on lines +740 to +753
while let Some(cursor) = remaining.take() {
    if cursor.chunk_offset == 0 {
        remaining = Some(cursor);
        break;
    }
    let update = cursor.get();
    chain.push(update);
    remaining = cursor.step();
}
@antiguru (Member) commented:

This loop here (and elsewhere) performs element-by-element copies. It seems to me that we could trade off how compactly we store data against how much we need to copy around. For example, when we extract some prefix of a chunk, we could just leave the remaining data in the chunk and remember the offset. Then we'd need to add a check that chunks in a chain are always at least half full, to avoid sitting on too much memory.

@teskje (Contributor Author) commented:

Interesting! This PR takes the conservative approach of always packing updates as tightly as possible, in exchange for more work that has to be done. There are definitely ways to optimize things, but I'd like to leave these as follow-ups given that they make the code more complicated and this PR is already large enough.
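The offset-based alternative antiguru describes could be sketched like this (hypothetical types; not the PR's implementation, which packs chunks tightly instead):

```rust
// Hypothetical sketch: leave extracted data in place and advance an offset,
// trading copying for temporarily retained memory. The "half full" check
// bounds how much dead space a chunk may carry before repacking.
struct Chunk<T> {
    data: Vec<T>,
    start: usize, // elements before `start` have already been extracted
}

impl<T> Chunk<T> {
    /// Extract up to `n` elements without moving the remainder.
    fn extract_prefix(&mut self, n: usize) -> &[T] {
        let old = self.start;
        self.start = (old + n).min(self.data.len());
        &self.data[old..self.start]
    }

    /// True if more than half the chunk is dead space, i.e. it should be
    /// repacked to avoid sitting on too much memory.
    fn needs_repack(&self) -> bool {
        self.start * 2 > self.data.len()
    }
}
```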

Comment on lines +761 to +777
impl<D: Data> From<Cursor<D>> for Chain<D> {
    fn from(cursor: Cursor<D>) -> Self {
        match cursor.try_unwrap() {
            Ok(chain) => chain,
            Err((_, cursor)) => {
                let mut chain = Chain::default();
                chain.push_cursor(cursor);
                chain
            }
        }
    }
}
@antiguru (Member) commented:

I'm not sure I can follow when the try_unwrap call in this From implementation can fail.

@teskje (Contributor Author) commented:

We know nothing about the input cursor here, so it can fail for all the reasons described in the docs of try_unwrap: there might be a limit or overwrite_ts set, or some of the referenced chunks might be shared with other cursors.
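The shared-chunk failure case has the same shape as `std::rc::Rc::try_unwrap`, which only succeeds for a unique owner. A minimal illustration (the helper functions are hypothetical; a shared "chunk" is modeled as `Rc<Vec<i32>>`):

```rust
use std::rc::Rc;

// Taking ownership of a chunk succeeds only if no other handle
// (e.g. another cursor) still references it.
fn try_take(handle: Rc<Vec<i32>>) -> Result<Vec<i32>, Rc<Vec<i32>>> {
    Rc::try_unwrap(handle)
}

fn unique_chunk() -> Rc<Vec<i32>> {
    Rc::new(vec![1, 2, 3])
}

fn shared_chunk() -> Rc<Vec<i32>> {
    let chunk = Rc::new(vec![1, 2, 3]);
    // Leak a second handle to simulate another cursor holding the chunk.
    std::mem::forget(Rc::clone(&chunk));
    chunk
}
```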

@teskje teskje force-pushed the correction-lgalloc branch from 8a95782 to 4336fb6 Compare January 24, 2025 11:01
@teskje teskje requested a review from antiguru January 24, 2025 13:44
@teskje (Contributor Author) commented Jan 24, 2025

I addressed most comments and created follow-up issues for the rest. RFAL!

@antiguru (Member) left a comment:

Looks good! Let's merge and then iterate based on that!

let Some(since_ts) = self.since.as_option() else {
    // If the since frontier is empty, discard all updates.
    return;
};

-for (_, time, _) in &mut updates {
+for (_, time, _) in &mut *updates {
@antiguru (Member) commented:

Isn't this equivalent to just using updates directly?

@teskje (Contributor Author) commented:

For some reason it's not. If you try that you get an error with the helpful hint that you should use a reborrow instead 🤷
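The distinction can be reproduced in a standalone function (hypothetical names): iterating the `&mut Vec` by value moves it into the loop, which fails to compile if `updates` is used again afterwards, whereas `&mut *updates` creates a temporary reborrow scoped to the loop.

```rust
fn advance_times(updates: &mut Vec<(&'static str, u64, i64)>) -> usize {
    // `&mut *updates` reborrows: the loop consumes a temporary &mut Vec,
    // so `updates` itself is not moved and remains usable afterwards.
    // Writing `for (_, time, _) in updates` here would move the reference
    // and make the `updates.len()` call below a compile error.
    for (_, time, _) in &mut *updates {
        *time += 1;
    }
    updates.len()
}
```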

This commit introduces a "correction v2" buffer that differs from the
existing one in that it stores data in columnated regions that can be
spilled to disk. It follows the general design of the arrangement merge
batcher, with the major difference that it sorts updates by time first,
in an attempt to more efficiently deal with the presence of updates in
the future (commonly introduced by temporal filters).
@teskje teskje force-pushed the correction-lgalloc branch from d6a275b to dee1f3f Compare January 28, 2025 14:10
@teskje teskje enabled auto-merge January 28, 2025 14:10
@teskje (Contributor Author) commented Jan 28, 2025

TFTR!

This commit adds a new dyncfg, `enable_compute_correction_v2`, that controls whether the MV sink v2 should use the old or the new implementation of the correction buffer.

Disabled by default for now, even in CI, to give us more time to improve
performance.
@teskje teskje force-pushed the correction-lgalloc branch from dee1f3f to 1c3d58b Compare January 28, 2025 14:29
@teskje teskje merged commit 450fb06 into MaterializeInc:main Jan 28, 2025
79 checks passed