Implements concurrent `Smt::compute_mutations` #365

krushimir · 2025-01-15T22:22:46Z

This PR introduces a concurrent implementation of Smt::compute_mutations, leveraging an approach similar to the existing parallel construction logic.

Benchmark results were collected on a 64-core (128-thread) AMD EPYC 7662 processor, with Rayon’s thread pool explicitly limited to the specified thread counts.

For context, construction benchmarks are also included for performance comparison.

1. Construction Benchmark

10k key-value pairs

Threads	Parallel Time (s)	Sequential Time (s)	Speedup
16	0.5	5.7	11.11x
32	0.4	5.7	15.22x
64	0.3	5.7	17.35x
128	0.4	5.7	16.90x

Optimal performance was achieved with 64 threads.
Diminishing returns were observed with 128 threads

2. Batched Insertion Benchmark

10k key-value pairs

Threads	Parallel Time (ms)	Sequential Time (ms)	Speedup	Avg Insert Time (μs)
16	517.0	6308.7	12.20x	52
32	395.8	6334.5	16.00x	40
64	333.0	6321.6	18.98x	33
128	383.7	6300.7	16.42x	38

64 threads offered the best performance, reducing average insertion time to 33 μs.
Scaling beyond 64 threads led to slight performance degradation.

3. Batched Update Benchmark

10k key-value pairs

Threads	Parallel Time (ms)	Sequential Time (ms)	Speedup	Avg Update Time (μs)
16	482.7	6369.8	13.20x	48
32	357.7	6351.5	17.76x	36
64	304.7	6378.5	20.93x	30
128	273.5	6418.8	23.47x	27

Batched updates scaled better with increased threads.
128 threads achieved the fastest update speed, reducing average time to 27 μs.

PhilippGackstatter

Looks great to me! I think the logic itself looks good. My comments are mostly about naming, docs and deduplication. I might have to take another look anyway, since I first had to understand how the Smt is implemented in sequential code 😅, so I'll just comment for now.

In general, I think adding comments to code parts that are not easy to understand would improve readability and understandability.

Regarding the approach, please correct me if I have misunderstandings, but my understanding of the approach is the following.

Assuming a tree of depth 64 with subtrees of depth 8 and mutations of just two (for example's sake) leaves at indices 0 and 65536, compute_mutations would do this, on a high-level and making some simple assumptions about how rayon assigns threads:

Compute subtrees that were modified. This happens in sorted_pairs_to_mutated_leaves. This would yield two subtrees, covering the column ranges 0..256 and 65536..65792.
Then in build_subtree_mutations, the subtrees are updated in parallel.
- 1st iteration:
  - Thread 0: Compute updates for leaves with indices 0..256 at depth 64. Then updates for leaves at depth 63 within this subtree, and so on, until it eventually results in new root at depth 56, column 0.
  - Thread 1: Compute updates for leaves with indices 65536..65792 at depth 64. Then updates for leaves at depth 63 within this subtree, and so on, until it eventually results in new root at depth 56, column 256 (= 65536 >> 8).
- 2nd iteration:
  - Thread 0: Compute updates for leaves with indices 0..256 at depth 56 (only root 0 has changed). Eventually this results in a new root at depth 48, column 0.
  - Thread 1: Compute updates for leaves with indices 256..512 at depth 56 (only root 256 has changed). Eventually this results in a new root at depth 48, column 1.
- 3rd iteration:
  - Thread 0: Compute updates for leaves with indices 0..256 at depth 48 (only root 0 has changed). Eventually this results in a new root at depth 40, column 0.
- More iterations like the 3rd until the root at depth 0 has been reached.

Is this accurate? Would it make sense to add something like this as a doc comment to compute_mutations_subtree (with corrections if it's inaccurate)?

src/merkle/smt/mod.rs

src/merkle/smt/tests.rs

src/merkle/smt/mod.rs

krushimir · 2025-01-17T21:37:36Z

10M entries tree.

batch insertions (10k inserts):
without smt_hashmaps: 383.3 ms (~38 μs per insert)
with smt_hashmaps: 281.9 ms (~28 μs per insert)
~26% faster
concurrent vs. sequential: 17.7x faster
concurrent with smt_hashmaps vs. sequential: 24.1x faster

batch updates (10k updates):
without smt_hashmaps: 287.9 ms (~29 μs per update)
without smt_hashmaps: 265.5 ms (~27 μs per update)
~8% faster
concurrent vs. sequential: 23.6x faster
concurrent with smt_hashmaps vs. sequential: 25.6x faster

Co-authored-by: Philipp Gackstatter <[email protected]>

src/merkle/smt/full/mod.rs

PhilippGackstatter · 2025-01-22T09:46:41Z

Hey @krushimir, quick question: Is this still Work-In-Progress or can it be marked as ready for review?

krushimir · 2025-01-22T11:11:22Z

Hi @PhilippGackstatter, I'll push some more changes today and then I'll mark it ready.

Mirko-von-Leipzig · 2025-01-23T08:06:54Z

src/main.rs

Side note: I'm surprised we don't use criterion and rust's builtin benchmark support for this.

Some of these benchmarks take a while to run (many minutes). Also, I believe @polydez tried to use criterion but found that there is a pretty significant discrepancy in results.

Mirko-von-Leipzig · 2025-01-23T08:11:03Z

src/merkle/smt/simple/mod.rs

@@ -233,7 +233,7 @@ impl<const DEPTH: u8> SimpleSmt<DEPTH> {
        &self,
        kv_pairs: impl IntoIterator<Item = (LeafIndex<DEPTH>, Word)>,
    ) -> MutationSet<DEPTH, LeafIndex<DEPTH>, Word> {
-        <Self as SparseMerkleTree<DEPTH>>::compute_mutations(self, kv_pairs)
+        <Self as SparseMerkleTree<DEPTH>>::compute_mutations_sequential(self, kv_pairs)


Why is this sequential?

Parallel implementation works only with trees whose depth is a multiple of 8 - some context here

Ah thanks. I would add a comment explaining that 👍

Mirko-von-Leipzig · 2025-01-23T08:17:35Z

src/merkle/smt/mod.rs

+        #[cfg(feature = "concurrent")]
+        {
+            self.compute_mutations_concurrent(kv_pairs)
+        }
+        #[cfg(not(feature = "concurrent"))]
+        {
+            self.compute_mutations_sequential(kv_pairs)
+        }


Do we actually ever want sequential outside of test purposes? Can we not just have no feature split.

I think the reason is that this also needs to work in no_std setting.

src/merkle/smt/mod.rs

PhilippGackstatter

Looks good to me!

src/main.rs

src/merkle/smt/tests.rs

Co-authored-by: Philipp Gackstatter <[email protected]>

sonarqubecloud · 2025-01-23T17:18:35Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

feat: adds concurrent Smt::compute_mutations

c3bbe1c

bobbinth requested a review from PhilippGackstatter January 16, 2025 17:35

PhilippGackstatter reviewed Jan 17, 2025

View reviewed changes

chore: cleanup bench

c447c6f

chore: adds comment

f42d597

Co-authored-by: Philipp Gackstatter <[email protected]>

bobbinth reviewed Jan 22, 2025

View reviewed changes

src/merkle/smt/full/mod.rs Outdated Show resolved Hide resolved

chore: addressing comments

a76506f

krushimir marked this pull request as ready for review January 23, 2025 07:12

krushimir changed the title ~~[WIP] implements concurrent Smt::compute_mutations~~ Implements concurrent Smt::compute_mutations Jan 23, 2025

Mirko-von-Leipzig reviewed Jan 23, 2025

View reviewed changes

PhilippGackstatter reviewed Jan 23, 2025

View reviewed changes

src/merkle/smt/mod.rs Outdated Show resolved Hide resolved

src/merkle/smt/mod.rs Outdated Show resolved Hide resolved

PhilippGackstatter approved these changes Jan 23, 2025

View reviewed changes

src/main.rs Outdated Show resolved Hide resolved

src/merkle/smt/tests.rs Show resolved Hide resolved

krushimir and others added 2 commits January 23, 2025 17:32

chore: update docs

ec35f28

Co-authored-by: Philipp Gackstatter <[email protected]>

chore: linting and addressing comments

e89daa9

krushimir force-pushed the krushimir/subtree_mutations branch from 9242cff to e89daa9 Compare January 23, 2025 17:03

Merge branch 'next' into krushimir/subtree_mutations

c1bcd6d

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implements concurrent `Smt::compute_mutations` #365

Implements concurrent `Smt::compute_mutations` #365

krushimir commented Jan 15, 2025

PhilippGackstatter left a comment

krushimir commented Jan 17, 2025

PhilippGackstatter commented Jan 22, 2025

krushimir commented Jan 22, 2025

Mirko-von-Leipzig Jan 23, 2025

bobbinth Jan 23, 2025

Mirko-von-Leipzig Jan 23, 2025

krushimir Jan 23, 2025

Mirko-von-Leipzig Jan 24, 2025

Mirko-von-Leipzig Jan 23, 2025

bobbinth Jan 23, 2025

PhilippGackstatter left a comment

sonarqubecloud bot commented Jan 23, 2025

Implements concurrent Smt::compute_mutations #365

Are you sure you want to change the base?

Implements concurrent Smt::compute_mutations #365

Conversation

krushimir commented Jan 15, 2025

1. Construction Benchmark

2. Batched Insertion Benchmark

3. Batched Update Benchmark

PhilippGackstatter left a comment

Choose a reason for hiding this comment

krushimir commented Jan 17, 2025

PhilippGackstatter commented Jan 22, 2025

krushimir commented Jan 22, 2025

Mirko-von-Leipzig Jan 23, 2025

Choose a reason for hiding this comment

bobbinth Jan 23, 2025

Choose a reason for hiding this comment

Mirko-von-Leipzig Jan 23, 2025

Choose a reason for hiding this comment

krushimir Jan 23, 2025

Choose a reason for hiding this comment

Mirko-von-Leipzig Jan 24, 2025

Choose a reason for hiding this comment

Mirko-von-Leipzig Jan 23, 2025

Choose a reason for hiding this comment

bobbinth Jan 23, 2025

Choose a reason for hiding this comment

PhilippGackstatter left a comment

Choose a reason for hiding this comment

sonarqubecloud bot commented Jan 23, 2025

Quality Gate passed

Implements concurrent `Smt::compute_mutations` #365

Implements concurrent `Smt::compute_mutations` #365