Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(corelib): ByteArray impl Hash trait #7042

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

hudem1
Copy link

@hudem1 hudem1 commented Jan 9, 2025

Close #7015

Implement Hash trait for ByteArray

Example

let byte_array: ByteArray = "A simple ByteArray";
PoseidonTrait::new().update_with(byte_array).finalize()

let byte_array_2: ByteArray = "A 2nd simple ByteArray";
PedersenTrait::new(0).update_with(byte_array_2).finalize()

@reviewable-StarkWare
Copy link

This change is Reviewable

@hudem1 hudem1 force-pushed the feat/bytearray_hash branch from 841c4c5 to 1380127 Compare January 9, 2025 23:54
Copy link
Collaborator

@orizi orizi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @hudem1)


corelib/src/hash.cairo line 244 at r1 (raw file):

        state.update(value.pending_word_len.into()).update(value.pending_word)
    }
}

should probably consider this internally.
this looks like the correct implementation for pedersen but no necessarily for poseidon.

Code quote:

impl ByteArrayHash<S, +HashStateTrait<S>, +Drop<S>> of Hash<ByteArray, S> {
    fn update_state(mut state: S, value: ByteArray) -> S {
        state = state.update(value.data.len().into());

        for word_index in 0..value.data.len() {
            let word_in_data = (*value.data.at(word_index)).into();
            state = state.update(word_in_data);
        };

        state.update(value.pending_word_len.into()).update(value.pending_word)
    }
}

@hudem1 hudem1 closed this Jan 10, 2025
@hudem1 hudem1 reopened this Jan 10, 2025
@hudem1
Copy link
Author

hudem1 commented Jan 10, 2025

Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @hudem1)

corelib/src/hash.cairo line 244 at r1 (raw file):

        state.update(value.pending_word_len.into()).update(value.pending_word)
    }
}

should probably consider this internally. this looks like the correct implementation for pedersen but no necessarily for poseidon.

Code quote:

impl ByteArrayHash<S, +HashStateTrait<S>, +Drop<S>> of Hash<ByteArray, S> {
    fn update_state(mut state: S, value: ByteArray) -> S {
        state = state.update(value.data.len().into());

        for word_index in 0..value.data.len() {
            let word_in_data = (*value.data.at(word_index)).into();
            state = state.update(word_in_data);
        };

        state.update(value.pending_word_len.into()).update(value.pending_word)
    }
}

Hi @orizi,

Thank you for reviewing my PR! :)

I understand the concern to generalize the implementation for the 2 different hashing algorithms. Could you clarify a bit what you are suggesting to do ? Do you mean you are going to ask internally at Starkware ?

Also, what do you refer to to make the statement that it looks like the correct implementation for pedersen but not necessarily for poseidon? Understanding this should help me better grasp how both hashing algorithms handle ByteArray and therefore improve my PR :)

Thank you for your feedback!

Copy link
Collaborator

@orizi orizi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @hudem1)


corelib/src/hash.cairo line 244 at r1 (raw file):

Previously, hudem1 wrote…

Hi @orizi,

Thank you for reviewing my PR! :)

I understand the concern to generalize the implementation for the 2 different hashing algorithms. Could you clarify a bit what you are suggesting to do ? Do you mean you are going to ask internally at Starkware ?

Also, what do you refer to to make the statement that it looks like the correct implementation for pedersen but not necessarily for poseidon? Understanding this should help me better grasp how both hashing algorithms handle ByteArray and therefore improve my PR :)

Thank you for your feedback!

No problem.

I meant discuss internally at starkware indeed.

specifically - poseidon has a capacity part of its hash state, unlike pedersen - this enables us to know what is a "valid" start state, and what isn't.

For example:
assuming the hash of (a, b, c) is hash(hash(a, b), c):
in pedersen - it would look exactly this way.
in poseidon - it would look more like hash(hash((a, 0), b), c)

so in pedersen i would easily claim this is the hash of the pair (not the triplet!) (hash(a, b), c) - and you'd have no choice but to believe me.
in posiedon you won't believe me - as the state would be not the hash of hash((hash(a, b), 0), c).

this doesn't matter if we already only hash triples - but in hashing arrays it might matter!

solution in pedersen - we add the array length to the hash, so you won't be able to start from a middle state (as you gotta hash the length first)

in poseidon - you don't need it, as the capacity takes care of this issue.

@hudem1
Copy link
Author

hudem1 commented Jan 10, 2025

Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @hudem1)


corelib/src/hash.cairo line 244 at r1 (raw file):

Previously, orizi wrote...

No problem.

I meant discuss internally at starkware indeed.

specifically - poseidon has a capacity part of its hash state, unlike pedersen - this enables us to know what is a "valid" start state, and what isn't.

For example: assuming the hash of (a, b, c) is hash(hash(a, b), c): in pedersen - it would look exactly this way. in poseidon - it would look more like hash(hash((a, 0), b), c)

so in pedersen i would easily claim this is the hash of the pair (not the triplet!) (hash(a, b), c) - and you'd have no choice but to believe me. in posiedon you won't believe me - as the state would be not the hash of hash((hash(a, b), 0), c).

this doesn't matter if we already only hash triples - but in hashing arrays it might matter!

solution in pedersen - we add the array length to the hash, so you won't be able to start from a middle state (as you gotta hash the length first)

in poseidon - you don't need it, as the capacity takes care of this issue.

Ok, thank you for your explanation!

I think I understand the issue. I'll leave this PR open as stale for now, let me know how to move forward after it was discussed internally!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat: ByteArray should impl Hash trait
3 participants