feat(SMT): reverse mutations generation, mutations serialization #355
Looks good! Thank you! Not an in-depth review yet, but I left a few small comments inline.
The only concern I have is that we now generate a reverse mutation set every time a mutation set is applied. For the node that's fine, because we always need it after changing the latest accounts SMT. But for anyone else using our SMT, computing the reverse set on every application may be unnecessary.
Yeah - I was thinking about this as well. Could you run benchmarks to see how this affects things - especially for relatively large trees? (we may already have benchmarks for this).
If the effect is significant, I wonder if there is some clean interface to let the user apply mutations with or without generating the reverse set. (but let's run the benchmarks first).
One of solutions would be introducing new |
As discussed offline, having two separate methods would be a more flexible solution (because the same codebase may want to use different versions of the method in different situations). Under the hood, both methods could share a single method that looks something like:

    fn apply_mutations(
        &mut self,
        mutations: MutationSet<DEPTH, Self::Key, Self::Value>,
        return_inverse_mutations: bool,
    ) -> Result<MutationSet<DEPTH, Self::Key, Self::Value>, MerkleError>
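A minimal sketch of the "two public methods over one shared helper" idea discussed above. Everything here is a toy stand-in: `ToyTree`, `ToyMutations` and `apply_inner` are hypothetical names, the "tree" is a plain sorted map with no hashing or inner nodes, and only the name `apply_mutations_with_reversion` comes from this PR. It is meant to show the interface shape, not the actual implementation.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a mutation set (hypothetical type):
/// key -> new value, where `None` means "remove the leaf".
#[derive(Debug, Clone, PartialEq, Default)]
pub struct ToyMutations {
    pub changes: BTreeMap<u64, Option<u64>>,
}

/// Toy stand-in for the tree (hypothetical type): only leaves, no hashing.
#[derive(Debug, Default)]
pub struct ToyTree {
    leaves: BTreeMap<u64, u64>,
}

impl ToyTree {
    /// Applies the mutations without building the reverse set.
    pub fn apply_mutations(&mut self, mutations: ToyMutations) {
        self.apply_inner(mutations, false);
    }

    /// Applies the mutations and returns the set that undoes them.
    pub fn apply_mutations_with_reversion(&mut self, mutations: ToyMutations) -> ToyMutations {
        self.apply_inner(mutations, true).unwrap_or_default()
    }

    /// Shared helper (hypothetical): the flag decides whether old values are recorded.
    fn apply_inner(&mut self, mutations: ToyMutations, want_reverse: bool) -> Option<ToyMutations> {
        let mut reverse = want_reverse.then(ToyMutations::default);
        for (key, new_value) in mutations.changes {
            // `insert` and `remove` both hand back the previous value, if any.
            let old_value = match new_value {
                Some(value) => self.leaves.insert(key, value),
                None => self.leaves.remove(&key),
            };
            if let Some(rev) = reverse.as_mut() {
                rev.changes.insert(key, old_value);
            }
        }
        reverse
    }

    pub fn get(&self, key: u64) -> Option<u64> {
        self.leaves.get(&key).copied()
    }
}
```

With this shape, callers that never need to roll back pay nothing for the bookkeeping, while applying the returned set via `apply_mutations` restores the previous leaves.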
I implemented a benchmark for

With generation of the reverse mutation set:
    apply_mutations/SimpleSmt: apply_mutations/1000
    apply_mutations/SimpleSmt: apply_mutations/100000

Without generation of the reverse mutation set:
    apply_mutations/SimpleSmt: apply_mutations/1000
    apply_mutations/SimpleSmt: apply_mutations/100000

Here we can see up to an 11% performance improvement (for 100,000 pairs). The bad news is that it takes up to 3.6 seconds to apply a mutation set (without any other operations, such as computing the mutation set). 100,000 updates is more than the current per-block limit (65,536 accounts), but I think this is still too slow to fit within the 2-second-per-block limit.
To clarify: how big are the trees in these benchmarks and how many leaves are we updating? For now, we can assume that we probably won't be updating more than 10K leaves per block - so, I'd benchmark this number. Could you benchmark how long this takes on a fairly large tree (e.g., 1M or 10M entries)? If this still takes a long time, we may need to start thinking about how to optimize the |
In these benchmarks we set up the trees to have 10x more leaves than updates (10K leaves for 1K updates and 1M leaves for 100K updates). I will run separate benchmarks for trees with 1M and 10M leaves and 10K updated leaves.
We should also benchmark |
@bobbinth, I have the first numbers. For a tree with 1M leaves, benchmarking 10K updates: apply_mutations/SimpleSmt: apply_mutations/10000 |
This is pretty slow (probably around 10x slower than I thought it would be). This basically means we spend 0.1 ms per updated leaf. Could you check what's the bottleneck here? |
The bottleneck here is I will try to rewrite |
I've rewritten apply_mutations/SimpleSmt: apply_mutations/10000 But overall, benchmarking takes more than half an hour for 10 samples. So probably |
Very nice! Could you create a separate PR for this? Not sure if possible, but would be awesome to have this behind a feature flag. Also, we are mostly interested in
We are working on parallelizing this - so, hopefully, we should be able to run |
Do we rely on ordering of iterators over inner nodes or leaves? This is the only obstacle I can see now for full migration of |
Hmmm - good question. I don't think we should be, but I'm not 100% sure we are not. |
One thing that comes to mind is that order of leaves is important during |
btw, I think we already have some benchmarks for computing mutations here. |
Benchmarking of compute_mutations/SimpleSmt: compute_mutations/10000
Ah, didn't find this, thank you! |
If it's important, we can sort leaves before serialization. |
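A minimal sketch of "sort leaves before serialization", assuming the leaves live in a map with unspecified iteration order. The `HashMap<u64, u64>` layout, the function name `serialize_leaves_sorted` and the byte format (little-endian count followed by key/value pairs) are all assumptions made for illustration and are not the crate's actual types or wire format.

```rust
use std::collections::HashMap;

/// Serializes (key, value) leaves in ascending key order, so the output bytes
/// do not depend on the map's iteration order.
/// Hypothetical format: u64 count, then each key and value as little-endian u64.
pub fn serialize_leaves_sorted(leaves: &HashMap<u64, u64>) -> Vec<u8> {
    // Copy the entries out of the map and sort them by key.
    let mut entries: Vec<(u64, u64)> = leaves.iter().map(|(k, v)| (*k, *v)).collect();
    entries.sort_unstable_by_key(|(key, _)| *key);

    let mut out = Vec::with_capacity(8 + entries.len() * 16);
    out.extend_from_slice(&(entries.len() as u64).to_le_bytes());
    for (key, value) in entries {
        out.extend_from_slice(&key.to_le_bytes());
        out.extend_from_slice(&value.to_le_bytes());
    }
    out
}
```

The sort costs O(n log n) per serialization, which may be an acceptable price if it removes the only place where iteration order matters.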
Looks good! Thank you! I left a couple of small comments inline.
As we decided earlier, I rebased this branch onto the v0.12.0 tag. We're going to release this as a patch (v0.12.1, I guess).
All looks good! Thank you!
Related to 0xPolygonMiden/miden-node#527
While refactoring the in-memory storage of account inclusion proofs, I realized that it's very hard to keep persistent state for an additional "initial" structure, and that it would be easier to keep only the reverse mutation sets and the latest state. Thus, we only need to store the reverse mutation sets, while the latest state can still be reconstructed from the database.
In this PR I implemented generation of the reverse mutation set when applying mutations (using a new method, apply_mutations_with_reversion()). I also implemented serialization of mutation sets and added benchmarks for computing and applying mutations.
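To illustrate the motivation, here is a rough sketch of keeping only the latest state plus one reverse mutation set per block, and rebuilding an older state on demand. The names `Leaves`, `Mutations`, `History`, `apply_with_reversion` and `state_blocks_ago` are hypothetical, and the "tree" is a plain sorted map with no Merkle hashing, so this only models the bookkeeping and not the real SMT.

```rust
use std::collections::BTreeMap;

// Toy stand-ins (hypothetical): leaves as key -> value, and a mutation set as
// key -> new value, where `None` means "remove the leaf".
pub type Leaves = BTreeMap<u64, u64>;
pub type Mutations = BTreeMap<u64, Option<u64>>;

/// Applies `mutations` to `leaves` and returns the set that undoes them.
pub fn apply_with_reversion(leaves: &mut Leaves, mutations: &Mutations) -> Mutations {
    let mut reverse = Mutations::new();
    for (&key, &new_value) in mutations {
        let old_value = match new_value {
            Some(value) => leaves.insert(key, value),
            None => leaves.remove(&key),
        };
        reverse.insert(key, old_value);
    }
    reverse
}

/// Latest state plus one reverse mutation set per applied block (oldest first).
#[derive(Debug, Default)]
pub struct History {
    latest: Leaves,
    reverse_sets: Vec<Mutations>,
}

impl History {
    /// Applies one block's mutations and remembers how to undo them.
    pub fn apply_block(&mut self, mutations: &Mutations) {
        let reverse = apply_with_reversion(&mut self.latest, mutations);
        self.reverse_sets.push(reverse);
    }

    /// Rebuilds the state as it was `n` blocks ago by undoing the newest `n`
    /// blocks on a copy of the latest state. Returns `None` if `n` is too large.
    pub fn state_blocks_ago(&self, n: usize) -> Option<Leaves> {
        if n > self.reverse_sets.len() {
            return None;
        }
        let mut state = self.latest.clone();
        for reverse in self.reverse_sets.iter().rev().take(n) {
            let _ = apply_with_reversion(&mut state, reverse);
        }
        Some(state)
    }
}
```

Only the reverse sets need to be kept in memory alongside the latest state; no separate "initial" snapshot is stored.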