-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
split node files #360
split node files #360
Conversation
richardpringle
commented
Nov 20, 2023
- Add path to branch-node
- Split leaf, branch, and extension node files
- Remove path again
f53251a
to
dc9001f
Compare
firewood/src/merkle/node.rs
Outdated
let mut items = items.into_iter(); | ||
let decoded_key: Vec<u8> = items.next().unwrap().decode()?; | ||
|
||
let decoded_key_nibbles = Nibbles::<0>::new(&decoded_key); | ||
|
||
let (cur_key_path, term) = | ||
PartialPath::from_nibbles(decoded_key_nibbles.into_iter()); | ||
dbg!(PartialPath::from_nibbles(decoded_key_nibbles.into_iter())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove dbg
let meta_raw = | ||
mem.get_view(addr, META_SIZE as u64) | ||
.ok_or(ShaleError::InvalidCacheView { | ||
offset: addr, | ||
size: META_SIZE as u64, | ||
})?; | ||
let attrs = meta_raw.as_deref()[32]; | ||
|
||
let attrs = meta_raw.as_deref()[TRIE_HASH_LEN]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
every use of meta_raw
calls as_deref()
. Can we move this to the assignment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All this code is going away in a future PR
@@ -727,12 +505,15 @@ impl Storable for Node { | |||
.collect(); | |||
|
|||
let (path, _) = PartialPath::decode(&nibbles); | |||
let value = Data(remainder.as_deref()[path_len as usize..].to_vec()); | |||
Ok(Self::new_from_hash( | |||
let data = Data(remainder.as_deref()[path_len as usize..].to_vec()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same thing with remainder
. We immediately call as_deref()
on every use.
@@ -792,6 +574,7 @@ impl Storable for Node { | |||
|
|||
match &self.inner { | |||
NodeType::Branch(n) => { | |||
// TODO: add path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what this means?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is challenging to review, because I can't tell if changes were made to the data structures without extracting them and doing a different set of diffs.
Would it be possible to split this into 3 parts:
(1) rename leaf.0 and leaf.1 to leaf.key and leaf.data
(2) rename NBRANCH to MAX_CHILDREN
(3) move different node types to different files
dc9001f
to
6ac3637
Compare
It would be possible, but it's a lot of work for very little value. You have already identified the three major changes in this PR, and you can rest assured that the functionality hasn't changed when the tests pass. You'll notice that I did not change any of the integration test files here. You may also notice that there are currently three commits here, one of which will be very easy to remove when continuing to get the roughly 2500 lines of #344 applied. If I didn't have that many more lines to go, I wouldn't push back, but this is the tip of the ice-berg. |
6ac3637
to
8f28cbd
Compare
49e1562
to
a1e34a9
Compare
There was some value here, but to review it I had to split it up locally, identifying a few problems I would have otherwise missed; comments coming. Please consider splitting these up in the future; it literally took just over an hour. |
} | ||
} | ||
|
||
impl From<Vec<u8>> for PartialPath { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like this addition. It's now easy for someone to put non-nibbles into a PartialPath. If you really think this is needed as a convenience, it should be TryFrom, and each byte should be checked.
This also ties PartialPath more strongly to a Vec, which is something I'd like to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created #366 suggesting to add convenience methods for extending and shrinking PartialPath
. The current problem is that in order to perform any change to the path, you essentially need to copy the PartialPath
, and then make changes to the internal Vec
and reset the value.
Adding From
was a convenience for getting the rest of the "branch with path" implementation further along. I will improve this once it's complete.
} | ||
|
||
// TODO: add path | ||
let path = Vec::new().into(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PartialPath::new()
(or empty()
or default()
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is temporary
#[test_case(0b10 << 4, vec![0x12, 0x34], vec![1, 2, 3, 4]; "even length")] | ||
// first nibble is part of the prefix | ||
#[test_case((0b11 << 4) + 2, vec![0x34], vec![2, 3, 4]; "odd length")] | ||
fn encode_regression_test(prefix: u8, path: Vec<u8>, nibbles: Vec<u8>) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: at a minimum add a comment in the PR that this is a new, not moved, test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this just testing partial_path
? If you're checking to see if the node is storing it correctly, rather than hardcoding this here (which can change if partial_path changes), perhaps use partial_path and compare that encoding?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it looks like this is just testing partial-path
I've created #365 to move the test
use crate::merkle::{from_nibbles, PartialPath}; | ||
|
||
#[derive(PartialEq, Eq, Clone)] | ||
pub struct LeafNode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: LeafNode shouldn't escape the crate, so the inner ones can just be pub, but this one should be pub(crate)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed, added to #362
} | ||
} | ||
|
||
pub fn path(&self) -> &PartialPath { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: since these are pub fn
we should reduce the visibility of the members themselves, or get rid of this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed (both path
and data
methods are redundant)
#[test_case(0b10 << 4, vec![0x12, 0x34], vec![1, 2, 3, 4]; "even length")] | ||
// first nibble is part of the prefix | ||
#[test_case((0b11 << 4) + 2, vec![0x34], vec![2, 3, 4]; "odd length")] | ||
fn encode_regression_test(prefix: u8, path: Vec<u8>, nibbles: Vec<u8>) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this just testing partial_path? If so, can't we move these tests there? If not, a better choice here would be to call partial_path in this test
iter::once, | ||
}; | ||
|
||
// TODO: use smallvec |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment only: We might even get by with Box<[u8]> since we almost always know how big it will be when we start. I'm not sure how often these grow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Box<[u8]>
would definitely be an improvement, but the goal of smallvec
or something like it would be quick stack-copies since we actually copy these partial-paths a lot.
Something like Bytes
might be more of a benefit, though, where we store the longest keys on the heap and reference subsets of the keys in the nodes themselves. Again, the goal is a stack-copy when creating a node (branch of leaf) instead of having to allocate copied data.
a1e34a9
to
ab685a2
Compare
702f671
to
9a78695
Compare
9a78695
to
f12df2b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
approved with new issues