Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delegate branch-storage logic to the branch mod #373

Merged
merged 2 commits into from
Nov 29, 2023

Conversation

richardpringle
Copy link
Contributor

@richardpringle richardpringle commented Nov 22, 2023

There is a code move here and well as some code changes. They are unfortunately in one commit. Breaking it up would mean re-implementing it and possibly introducing a bug.

I will be doing the same thing with both leaf and extension nodes.

Upon reviewing my own code, I see that this is actually just serialization. There's a todo!() that will come later for deserialization.

@richardpringle richardpringle force-pushed the move-storable-logic-to-branch branch from f5a3c51 to 2c794ea Compare November 22, 2023 22:59
@richardpringle richardpringle force-pushed the refactor-node-serialization branch 2 times, most recently from d1650bd to fd6e3b1 Compare November 22, 2023 23:25
@richardpringle richardpringle force-pushed the move-storable-logic-to-branch branch from 2c794ea to 751b7fa Compare November 22, 2023 23:26
@richardpringle richardpringle force-pushed the refactor-node-serialization branch from fd6e3b1 to 49a0e8a Compare November 23, 2023 00:16
@richardpringle richardpringle force-pushed the move-storable-logic-to-branch branch from 751b7fa to cc490cf Compare November 23, 2023 00:21
Base automatically changed from refactor-node-serialization to main November 23, 2023 00:23
@richardpringle richardpringle marked this pull request as ready for review November 23, 2023 00:26
@richardpringle richardpringle force-pushed the move-storable-logic-to-branch branch from cc490cf to 0dceb44 Compare November 23, 2023 00:26
Copy link
Collaborator

@rkuris rkuris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial review, will finish on the moved bits later


let offset = if raw_len == u32::MAX as u64 {
let offset = if raw_len == u32::MAX {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😆

Comment on lines 31 to 35
enum BranchDataLength {
None,
Length(u32),
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is confusing; why is this not an Option<u32>, or even better, Option<NonZeroU32>?

If zero is a valid value, consider https://crates.io/crates/nonmax

This way you get niche optimization with None so the size is smaller and serialization is easier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be less confusing if it was

enum BranchDataLength {
    NoData,
    /// `0` is a valid length, *empty* data and *no* data are treated separately
    Length(u32),
}

Shepmaster put this pretty nicely in a comment on SO:

I'll just have to do more profiling to see if it's truly worth it. Using less bytes seems like an obvious win; using less bytes and mandatory math everywhere is less sure-fire.

(emphasis mine)

In any case, I didn't end up using this code. I'll delete it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleted 313cae7

@@ -179,3 +212,59 @@ impl BranchNode {
.unwrap()
}
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will have to review this later to verify it didn't change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You actually do not have to verify that it didn't change. What you have to verify is that we have tests that make sure that things that we serialize can also be deserialized since we have no burden of backward compatibility (yet).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 , now that I say that... I should have explicit tests for this. This is actually tested in all the merkle.rs tests because of this line, but that's pretty fragile.

I created a new issue: #375

@@ -30,6 +30,8 @@ impl DerefMut for DiskAddress {
}

impl DiskAddress {
pub const MSIZE: u64 = size_of::<Self>() as u64;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why was the visibility increase needed?

Copy link
Contributor Author

@richardpringle richardpringle Nov 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used in (de)serialization

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pub(crate) please

Copy link
Collaborator

@rkuris rkuris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved, almost all nits

}

fn optional_data_len<Len, T: AsRef<[u8]>>(data: Option<T>) -> u64 {
size_of::<Len>() as u64 + data.as_ref().map_or(0, |data| data.as_ref().len() as u64)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove unnecessary as_ref(), which is just changing Option to Option<&T>:

Suggested change
size_of::<Len>() as u64 + data.as_ref().map_or(0, |data| data.as_ref().len() as u64)
size_of::<Len>() as u64 + data.map_or(0, |data| data.as_ref().len() as u64)

I wonder why clippy doesn't care about this.

}
}

fn optional_data_len<Len, T: AsRef<[u8]>>(data: Option<T>) -> u64 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method needs a comment that describes what it does, and maybe better names, so if you mouse over it you can see what it does:

Suggested change
fn optional_data_len<Len, T: AsRef<[u8]>>(data: Option<T>) -> u64 {
/// Adds the size of some base object plus the length of some optional data
fn base_plus_optional_data_len<Base, T: AsRef<[u8]>>(data: Option<T>) -> u64 {

I'm also not convinced this is any more readable than either (a) just inlining the size_of part; or (b) doing both parts in the callers.

.value
.as_ref()
.map(|val| (val.len() as DataLen, val.deref()))
.unwrap_or((DataLen::MAX, &[]));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This needs a new const somewhere (with a comment) since this isn't actually MAX. Maybe DataLen::None? Even better would be for this to be an Option<T> where the value of None is DataLen::MAX.

Comment on lines +197 to +198
let children_encoded_len = self.children_encoded.iter().fold(0, |len, child| {
len + optional_data_len::<EncodedChildLen, _>(child.as_ref())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: readability:

Suggested change
let children_encoded_len = self.children_encoded.iter().fold(0, |len, child| {
len + optional_data_len::<EncodedChildLen, _>(child.as_ref())
let children_encoded_len = self.children_encoded.iter().fold(0, |acc_len, child| {
acc_len + optional_data_len::<EncodedChildLen, _>(child.as_ref())

let (child_len, child) = child_encoded
.as_ref()
.map(|child| (child.len() as EncodedChildLen, child.as_slice()))
.unwrap_or((EncodedChildLen::MIN, &[]));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar comment. Using MAX and MIN is magics reduces readability

@richardpringle richardpringle force-pushed the move-storable-logic-to-branch branch from 313cae7 to 9fa4827 Compare November 29, 2023 03:10
@richardpringle richardpringle force-pushed the move-storable-logic-to-branch branch from 9fa4827 to 4c9f31b Compare November 29, 2023 04:19
@richardpringle
Copy link
Contributor Author

@rkuris, I see your nits and raise you one
#394

I think storage could be done in a much more idiomatic way.

@richardpringle richardpringle merged commit 4a4f4ba into main Nov 29, 2023
5 checks passed
@richardpringle richardpringle deleted the move-storable-logic-to-branch branch November 29, 2023 04:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants