Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

split node files #360

Merged
merged 8 commits into from
Nov 22, 2023
Merged

split node files #360

merged 8 commits into from
Nov 22, 2023

Conversation

richardpringle
Copy link
Contributor

  • Add path to branch-node
  • Split leaf, branch, and extension node files
  • Remove path again

@richardpringle richardpringle force-pushed the split-node-files branch 2 times, most recently from f53251a to dc9001f Compare November 20, 2023 21:47
let mut items = items.into_iter();
let decoded_key: Vec<u8> = items.next().unwrap().decode()?;

let decoded_key_nibbles = Nibbles::<0>::new(&decoded_key);

let (cur_key_path, term) =
PartialPath::from_nibbles(decoded_key_nibbles.into_iter());
dbg!(PartialPath::from_nibbles(decoded_key_nibbles.into_iter()));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove dbg

let meta_raw =
mem.get_view(addr, META_SIZE as u64)
.ok_or(ShaleError::InvalidCacheView {
offset: addr,
size: META_SIZE as u64,
})?;
let attrs = meta_raw.as_deref()[32];

let attrs = meta_raw.as_deref()[TRIE_HASH_LEN];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

every use of meta_raw calls as_deref(). Can we move this to the assignment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All this code is going away in a future PR

@@ -727,12 +505,15 @@ impl Storable for Node {
.collect();

let (path, _) = PartialPath::decode(&nibbles);
let value = Data(remainder.as_deref()[path_len as usize..].to_vec());
Ok(Self::new_from_hash(
let data = Data(remainder.as_deref()[path_len as usize..].to_vec());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thing with remainder. We immediately call as_deref() on every use.

@@ -792,6 +574,7 @@ impl Storable for Node {

match &self.inner {
NodeType::Branch(n) => {
// TODO: add path
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what this means?

Copy link
Collaborator

@rkuris rkuris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is challenging to review, because I can't tell if changes were made to the data structures without extracting them and doing a different set of diffs.

Would it be possible to split this into 3 parts:
(1) rename leaf.0 and leaf.1 to leaf.key and leaf.data
(2) rename NBRANCH to MAX_CHILDREN
(3) move different node types to different files

@richardpringle
Copy link
Contributor Author

@rkuris

Would it be possible to split this into 3 parts

It would be possible, but it's a lot of work for very little value. You have already identified the three major changes in this PR, and you can rest assured that the functionality hasn't changed when the tests pass. You'll notice that I did not change any of the integration test files here.

You may also notice that there are currently three commits here, one of which will be very easy to remove when continuing to get the roughly 2500 lines of #344 applied. If I didn't have that many more lines to go, I wouldn't push back, but this is the tip of the ice-berg.

@richardpringle richardpringle enabled auto-merge (squash) November 20, 2023 23:46
@rkuris
Copy link
Collaborator

rkuris commented Nov 21, 2023

It would be possible, but it's a lot of work for very little value. You have already identified the three major changes in this PR, and you can rest assured that the functionality hasn't changed when the tests pass. You'll notice that I did not change any of the integration test files here.

There was some value here, but to review it I had to split it up locally, identifying a few problems I would have otherwise missed; comments coming. Please consider splitting these up in the future; it literally took just over an hour.

firewood/src/shale/mod.rs Show resolved Hide resolved
}
}

impl From<Vec<u8>> for PartialPath {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this addition. It's now easy for someone to put non-nibbles into a PartialPath. If you really think this is needed as a convenience, it should be TryFrom, and each byte should be checked.

This also ties PartialPath more strongly to a Vec, which is something I'd like to change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created #366 suggesting to add convenience methods for extending and shrinking PartialPath. The current problem is that in order to perform any change to the path, you essentially need to copy the PartialPath, and then make changes to the internal Vec and reset the value.

Adding From was a convenience for getting the rest of the "branch with path" implementation further along. I will improve this once it's complete.

}

// TODO: add path
let path = Vec::new().into();
Copy link
Collaborator

@rkuris rkuris Nov 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PartialPath::new() (or empty() or default())

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is temporary

#[test_case(0b10 << 4, vec![0x12, 0x34], vec![1, 2, 3, 4]; "even length")]
// first nibble is part of the prefix
#[test_case((0b11 << 4) + 2, vec![0x34], vec![2, 3, 4]; "odd length")]
fn encode_regression_test(prefix: u8, path: Vec<u8>, nibbles: Vec<u8>) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: at a minimum add a comment in the PR that this is a new, not moved, test

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just testing partial_path? If you're checking to see if the node is storing it correctly, rather than hardcoding this here (which can change if partial_path changes), perhaps use partial_path and compare that encoding?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it looks like this is just testing partial-path

I've created #365 to move the test

use crate::merkle::{from_nibbles, PartialPath};

#[derive(PartialEq, Eq, Clone)]
pub struct LeafNode {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: LeafNode shouldn't escape the crate, so the inner ones can just be pub, but this one should be pub(crate).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, added to #362

}
}

pub fn path(&self) -> &PartialPath {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: since these are pub fn we should reduce the visibility of the members themselves, or get rid of this.

Copy link
Contributor Author

@richardpringle richardpringle Nov 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed (both path and data methods are redundant)

#364

#[test_case(0b10 << 4, vec![0x12, 0x34], vec![1, 2, 3, 4]; "even length")]
// first nibble is part of the prefix
#[test_case((0b11 << 4) + 2, vec![0x34], vec![2, 3, 4]; "odd length")]
fn encode_regression_test(prefix: u8, path: Vec<u8>, nibbles: Vec<u8>) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just testing partial_path? If so, can't we move these tests there? If not, a better choice here would be to call partial_path in this test

iter::once,
};

// TODO: use smallvec
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment only: We might even get by with Box<[u8]> since we almost always know how big it will be when we start. I'm not sure how often these grow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Box<[u8]> would definitely be an improvement, but the goal of smallvec or something like it would be quick stack-copies since we actually copy these partial-paths a lot.

Something like Bytes might be more of a benefit, though, where we store the longest keys on the heap and reference subsets of the keys in the nodes themselves. Again, the goal is a stack-copy when creating a node (branch of leaf) instead of having to allocate copied data.

Copy link
Collaborator

@rkuris rkuris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved with new issues

@richardpringle richardpringle merged commit 9e1fd97 into main Nov 22, 2023
@richardpringle richardpringle deleted the split-node-files branch November 22, 2023 18:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants