Implements concurrent Smt::compute_mutations #365

Open
wants to merge 7 commits into
base: next

krushimir

This PR introduces a concurrent implementation of Smt::compute_mutations, leveraging an approach similar to the existing parallel construction logic.

Benchmark results were collected on a 64-core (128-thread) AMD EPYC 7662 processor, with Rayon’s thread pool explicitly limited to the specified thread counts.

For context, construction benchmarks are also included for performance comparison.

1. Construction Benchmark

10k key-value pairs

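| Threads | Parallel Time (s) | Sequential Time (s) | Speedup |
|---------|-------------------|---------------------|---------|
| 16      | 0.5               | 5.7                 | 11.11x  |
| 32      | 0.4               | 5.7                 | 15.22x  |
| 64      | 0.3               | 5.7                 | 17.35x  |
| 128     | 0.4               | 5.7                 | 16.90x  |
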
  • Optimal performance was achieved with 64 threads.
  • Going from 64 to 128 threads gave no further improvement (17.35x vs. 16.90x).

2. Batched Insertion Benchmark

10k key-value pairs

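| Threads | Parallel Time (ms) | Sequential Time (ms) | Speedup | Avg Insert Time (μs) |
|---------|--------------------|----------------------|---------|----------------------|
| 16      | 517.0              | 6308.7               | 12.20x  | 52                   |
| 32      | 395.8              | 6334.5               | 16.00x  | 40                   |
| 64      | 333.0              | 6321.6               | 18.98x  | 33                   |
| 128     | 383.7              | 6300.7               | 16.42x  | 38                   |
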
  • 64 threads offered the best performance, reducing average insertion time to 33 μs.
  • Scaling beyond 64 threads led to slight performance degradation.
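
For readers who want to check the derived columns, here is a minimal sketch of how Speedup and Avg Insert Time follow from the raw timings. The function names are made up for illustration and are not part of the benchmark code in this PR; the inputs are the 64-thread row of the table above.

```rust
/// Speedup of the parallel run relative to the sequential run.
fn speedup(sequential_ms: f64, parallel_ms: f64) -> f64 {
    sequential_ms / parallel_ms
}

/// Average time per operation, in microseconds, for a batch of `ops` operations.
fn avg_micros_per_op(total_ms: f64, ops: u32) -> f64 {
    total_ms * 1_000.0 / f64::from(ops)
}

fn main() {
    // 64-thread row of the batched insertion table (10k key-value pairs).
    let parallel_ms = 333.0;
    let sequential_ms = 6321.6;
    let batch_size = 10_000;

    println!("speedup: {:.2}x", speedup(sequential_ms, parallel_ms));
    println!("avg insert: {:.0} μs", avg_micros_per_op(parallel_ms, batch_size));
}
```

This prints a speedup of 18.98x and an average of 33 μs per insert, matching the table.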

3. Batched Update Benchmark

10k key-value pairs

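| Threads | Parallel Time (ms) | Sequential Time (ms) | Speedup | Avg Update Time (μs) |
|---------|--------------------|----------------------|---------|----------------------|
| 16      | 482.7              | 6369.8               | 13.20x  | 48                   |
| 32      | 357.7              | 6351.5               | 17.76x  | 36                   |
| 64      | 304.7              | 6378.5               | 20.93x  | 30                   |
| 128     | 273.5              | 6418.8               | 23.47x  | 27                   |
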
  • Unlike insertions, batched updates kept getting faster all the way up to 128 threads.
  • 128 threads achieved the fastest update speed, reducing average time to 27 μs.

Contributor

@PhilippGackstatter PhilippGackstatter left a comment


Looks great to me! I think the logic itself looks good. My comments are mostly about naming, docs and deduplication. I might have to take another look anyway, since I first had to understand how the Smt is implemented in sequential code 😅, so I'll just comment for now.

In general, I think adding comments to code parts that are not easy to understand would improve readability and understandability.

Regarding the approach, please correct me if I have misunderstandings, but my understanding of the approach is the following.

Assuming a tree of depth 64 with subtrees of depth 8 and mutations of just two (for example's sake) leaves at indices 0 and 65536, compute_mutations would do this, at a high level and making some simple assumptions about how rayon assigns threads:

  1. Compute subtrees that were modified. This happens in sorted_pairs_to_mutated_leaves. This would yield two subtrees, covering the column ranges 0..256 and 65536..65792.
  2. Then in build_subtree_mutations, the subtrees are updated in parallel.
    • 1st iteration:
      • Thread 0: Compute updates for leaves with indices 0..256 at depth 64. Then updates for leaves at depth 63 within this subtree, and so on, until it eventually results in new root at depth 56, column 0.
      • Thread 1: Compute updates for leaves with indices 65536..65792 at depth 64. Then updates for leaves at depth 63 within this subtree, and so on, until it eventually results in new root at depth 56, column 256 (= 65536 >> 8).
    • 2nd iteration:
      • Thread 0: Compute updates for leaves with indices 0..256 at depth 56 (only root 0 has changed). Eventually this results in a new root at depth 48, column 0.
      • Thread 1: Compute updates for leaves with indices 256..512 at depth 56 (only root 256 has changed). Eventually this results in a new root at depth 48, column 1.
    • 3rd iteration:
      • Thread 0: Compute updates for leaves with indices 0..256 at depth 48 (only root 0 has changed). Eventually this results in a new root at depth 40, column 0.
    • More iterations like the 3rd until the root at depth 0 has been reached.

Is this accurate? Would it make sense to add something like this as a doc comment to compute_mutations_subtree (with corrections if it's inaccurate)?
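
The column arithmetic in the walkthrough above can be replayed with a small sketch. This is only a model of the index math under the stated assumptions (depth 64, subtrees of depth 8, leaves 0 and 65536); `subtree_roots` and the constants are hypothetical names, not functions from this PR, and no hashing or thread scheduling is modelled.

```rust
use std::collections::BTreeSet;

/// Depth of each subtree processed as one unit (assumption from the example).
const SUBTREE_DEPTH: u8 = 8;
/// Number of columns covered by one subtree at its bottom level: 2^8 = 256.
const COLS_PER_SUBTREE: u64 = 1 << SUBTREE_DEPTH;

/// Maps the changed columns at a subtree's bottom level to the columns of the
/// subtree roots eight levels up. Columns in the same subtree share one root.
fn subtree_roots(changed_columns: &BTreeSet<u64>) -> BTreeSet<u64> {
    changed_columns.iter().map(|col| col >> SUBTREE_DEPTH).collect()
}

fn main() {
    let mut depth: u8 = 64;
    let mut changed: BTreeSet<u64> = vec![0u64, 65536].into_iter().collect();

    // Each loop pass is one "iteration" of the walkthrough; each root is one
    // independent unit of work that could be handed to a separate thread.
    while depth > 0 {
        let roots = subtree_roots(&changed);
        for root in &roots {
            let start = root * COLS_PER_SUBTREE;
            println!(
                "depth {}: columns {}..{} -> new root at depth {}, column {}",
                depth,
                start,
                start + COLS_PER_SUBTREE,
                depth - SUBTREE_DEPTH,
                root
            );
        }
        changed = roots;
        depth -= SUBTREE_DEPTH;
    }
}
```

Under this model the first two iterations each produce two independent subtrees (roots at columns 0 and 256, then 0 and 1). From depth 48 onward both changed columns, 0 and 1, fall inside the single subtree covering columns 0..256, so only one unit of work remains per iteration until the root at depth 0 is reached.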

@krushimir
Author

10M entries tree.

batch insertions (10k inserts):
without smt_hashmaps: 383.3 ms (~38 μs per insert)
with smt_hashmaps: 281.9 ms (~28 μs per insert)
~26% faster
concurrent vs. sequential: 17.7x faster
concurrent with smt_hashmaps vs. sequential: 24.1x faster

batch updates (10k updates):
without smt_hashmaps: 287.9 ms (~29 μs per update)
with smt_hashmaps: 265.5 ms (~27 μs per update)
~8% faster
concurrent vs. sequential: 23.6x faster
concurrent with smt_hashmaps vs. sequential: 25.6x faster
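
The "~26% faster" and "~8% faster" figures can be reproduced from the timings above. A minimal sketch, assuming the second update figure (265.5 ms) is the run with smt_hashmaps enabled; `percent_faster` is an illustrative helper, not part of the PR.

```rust
/// Relative reduction in run time, as a percentage of the baseline.
fn percent_faster(baseline_ms: f64, improved_ms: f64) -> f64 {
    (baseline_ms - improved_ms) / baseline_ms * 100.0
}

fn main() {
    // Batch insertions: without vs. with smt_hashmaps.
    println!("insertions: ~{:.0}% faster", percent_faster(383.3, 281.9));
    // Batch updates: without vs. with smt_hashmaps.
    println!("updates: ~{:.0}% faster", percent_faster(287.9, 265.5));
}
```

This prints roughly 26% for insertions and 8% for updates.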

Co-authored-by: Philipp Gackstatter <[email protected]>
@PhilippGackstatter
Contributor

Hey @krushimir, quick question: Is this still Work-In-Progress or can it be marked as ready for review?

@krushimir
Author

Hi @PhilippGackstatter, I'll push some more changes today and then I'll mark it ready.

@krushimir krushimir marked this pull request as ready for review January 23, 2025 07:12
@krushimir krushimir changed the title [WIP] implements concurrent Smt::compute_mutations Implements concurrent Smt::compute_mutations Jan 23, 2025


Side note: I'm surprised we don't use criterion and Rust's built-in benchmark support for this.

Contributor


Some of these benchmarks take a while to run (many minutes). Also, I believe @polydez tried to use criterion but found that there is a pretty significant discrepancy in results.

@@ -233,7 +233,7 @@ impl<const DEPTH: u8> SimpleSmt<DEPTH> {
         &self,
         kv_pairs: impl IntoIterator<Item = (LeafIndex<DEPTH>, Word)>,
     ) -> MutationSet<DEPTH, LeafIndex<DEPTH>, Word> {
-        <Self as SparseMerkleTree<DEPTH>>::compute_mutations(self, kv_pairs)
+        <Self as SparseMerkleTree<DEPTH>>::compute_mutations_sequential(self, kv_pairs)


Why is this sequential?

Author


The parallel implementation works only with trees whose depth is a multiple of 8 - some context here


Ah thanks. I would add a comment explaining that 👍
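
To illustrate the constraint being discussed, here is a hedged sketch of a depth check for a const-generic tree. Everything in it (`Tree`, `supports_subtree_split`, `strategy`) is hypothetical; in the diff above, SimpleSmt simply calls compute_mutations_sequential unconditionally rather than checking the depth.

```rust
/// Subtree depth assumed by the parallel algorithm (hypothetical constant).
const SUBTREE_DEPTH: u8 = 8;

/// Returns true if a tree of this depth splits evenly into depth-8 subtrees.
const fn supports_subtree_split(depth: u8) -> bool {
    depth > 0 && depth % SUBTREE_DEPTH == 0
}

/// Hypothetical stand-in for a tree type with a const-generic depth.
struct Tree<const DEPTH: u8>;

impl<const DEPTH: u8> Tree<DEPTH> {
    /// Names the strategy this depth would allow.
    fn strategy(&self) -> &'static str {
        if supports_subtree_split(DEPTH) {
            "concurrent"
        } else {
            "sequential"
        }
    }
}

fn main() {
    // Depth 64 is a multiple of 8, so the subtree split applies.
    println!("depth 64: {}", Tree::<64>.strategy());
    // An arbitrary depth such as 10 cannot be split into depth-8 subtrees.
    println!("depth 10: {}", Tree::<10>.strategy());
}
```

Because SimpleSmt accepts an arbitrary DEPTH, always taking the sequential path is the simpler choice; a check like the one above would only matter if both paths were offered for that type.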

Comment on lines +186 to +193
    #[cfg(feature = "concurrent")]
    {
        self.compute_mutations_concurrent(kv_pairs)
    }
    #[cfg(not(feature = "concurrent"))]
    {
        self.compute_mutations_sequential(kv_pairs)
    }


Do we actually ever want sequential outside of test purposes? Can we not just have no feature split?

Contributor


I think the reason is that this also needs to work in no_std setting.
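
A rough sketch of why the split helps in a no_std setting: the thread-based path is compiled only when the feature is enabled, so builds without it never reference threading at all. This stand-in uses std::thread::scope instead of rayon and counts non-zero values instead of computing mutations; all function names are illustrative assumptions, not the crate's API.

```rust
/// Sequential path: always compiled and needs no threads.
fn count_sequential(values: &[u64]) -> usize {
    values.iter().filter(|v| **v != 0).count()
}

/// Threaded path: exists only when the `concurrent` feature is enabled.
#[cfg(feature = "concurrent")]
fn count_concurrent(values: &[u64]) -> usize {
    let (left, right) = values.split_at(values.len() / 2);
    std::thread::scope(|scope| {
        let l = scope.spawn(|| count_sequential(left));
        let r = scope.spawn(|| count_sequential(right));
        l.join().unwrap() + r.join().unwrap()
    })
}

/// Single entry point: callers never choose an implementation themselves.
fn count_non_empty(values: &[u64]) -> usize {
    #[cfg(feature = "concurrent")]
    {
        count_concurrent(values)
    }
    #[cfg(not(feature = "concurrent"))]
    {
        count_sequential(values)
    }
}

fn main() {
    let values = [1, 0, 3, 0, 5];
    println!("non-empty values: {}", count_non_empty(&values));
}
```

Both paths return the same result, so the feature only changes how the work is scheduled, not what callers observe.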

Contributor

@PhilippGackstatter PhilippGackstatter left a comment


Looks good to me!

@krushimir krushimir force-pushed the krushimir/subtree_mutations branch from 9242cff to e89daa9 Compare January 23, 2025 17:03
4 participants