feat(corelib): ByteArray impl Hash trait #7042

hudem1 · 2025-01-09T23:52:12Z

Implement Hash trait for ByteArray

Example

let byte_array: ByteArray = "A simple ByteArray";
PoseidonTrait::new().update_with(byte_array).finalize()

let byte_array_2: ByteArray = "A 2nd simple ByteArray";
PedersenTrait::new(0).update_with(byte_array_2).finalize()

reviewable-StarkWare · 2025-01-09T23:52:19Z

This change is

orizi

Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @hudem1)

corelib/src/hash.cairo line 244 at r1 (raw file):

        state.update(value.pending_word_len.into()).update(value.pending_word)
    }
}

should probably consider this internally.
this looks like the correct implementation for pedersen but no necessarily for poseidon.

Code quote:

impl ByteArrayHash<S, +HashStateTrait<S>, +Drop<S>> of Hash<ByteArray, S> {
    fn update_state(mut state: S, value: ByteArray) -> S {
        state = state.update(value.data.len().into());

        for word_index in 0..value.data.len() {
            let word_in_data = (*value.data.at(word_index)).into();
            state = state.update(word_in_data);
        };

        state.update(value.pending_word_len.into()).update(value.pending_word)
    }
}

hudem1 · 2025-01-10T11:26:09Z

Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @hudem1)

corelib/src/hash.cairo line 244 at r1 (raw file):

        state.update(value.pending_word_len.into()).update(value.pending_word)
    }
}

should probably consider this internally. this looks like the correct implementation for pedersen but no necessarily for poseidon.

Code quote:

impl ByteArrayHash<S, +HashStateTrait<S>, +Drop<S>> of Hash<ByteArray, S> {
    fn update_state(mut state: S, value: ByteArray) -> S {
        state = state.update(value.data.len().into());

        for word_index in 0..value.data.len() {
            let word_in_data = (*value.data.at(word_index)).into();
            state = state.update(word_in_data);
        };

        state.update(value.pending_word_len.into()).update(value.pending_word)
    }
}

Hi @orizi,

Thank you for reviewing my PR! :)

I understand the concern to generalize the implementation for the 2 different hashing algorithms. Could you clarify a bit what you are suggesting to do ? Do you mean you are going to ask internally at Starkware ?

Also, what do you refer to to make the statement that it looks like the correct implementation for pedersen but not necessarily for poseidon? Understanding this should help me better grasp how both hashing algorithms handle ByteArray and therefore improve my PR :)

Thank you for your feedback!

orizi

Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @hudem1)

corelib/src/hash.cairo line 244 at r1 (raw file):

Previously, hudem1 wrote…

Hi @orizi,

Thank you for reviewing my PR! :)

I understand the concern to generalize the implementation for the 2 different hashing algorithms. Could you clarify a bit what you are suggesting to do ? Do you mean you are going to ask internally at Starkware ?

Also, what do you refer to to make the statement that it looks like the correct implementation for pedersen but not necessarily for poseidon? Understanding this should help me better grasp how both hashing algorithms handle ByteArray and therefore improve my PR :)

Thank you for your feedback!

No problem.

I meant discuss internally at starkware indeed.

specifically - poseidon has a capacity part of its hash state, unlike pedersen - this enables us to know what is a "valid" start state, and what isn't.

For example:
assuming the hash of (a, b, c) is hash(hash(a, b), c):
in pedersen - it would look exactly this way.
in poseidon - it would look more like hash(hash((a, 0), b), c)

so in pedersen i would easily claim this is the hash of the pair (not the triplet!) (hash(a, b), c) - and you'd have no choice but to believe me.
in posiedon you won't believe me - as the state would be not the hash of hash((hash(a, b), 0), c).

this doesn't matter if we already only hash triples - but in hashing arrays it might matter!

solution in pedersen - we add the array length to the hash, so you won't be able to start from a middle state (as you gotta hash the length first)

in poseidon - you don't need it, as the capacity takes care of this issue.

hudem1 · 2025-01-10T15:13:01Z

Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @hudem1)

corelib/src/hash.cairo line 244 at r1 (raw file):

Previously, orizi wrote...

No problem.

I meant discuss internally at starkware indeed.

specifically - poseidon has a capacity part of its hash state, unlike pedersen - this enables us to know what is a "valid" start state, and what isn't.

For example: assuming the hash of (a, b, c) is hash(hash(a, b), c): in pedersen - it would look exactly this way. in poseidon - it would look more like hash(hash((a, 0), b), c)

so in pedersen i would easily claim this is the hash of the pair (not the triplet!) (hash(a, b), c) - and you'd have no choice but to believe me. in posiedon you won't believe me - as the state would be not the hash of hash((hash(a, b), 0), c).

this doesn't matter if we already only hash triples - but in hashing arrays it might matter!

solution in pedersen - we add the array length to the hash, so you won't be able to start from a middle state (as you gotta hash the length first)

in poseidon - you don't need it, as the capacity takes care of this issue.

Ok, thank you for your explanation!

I think I understand the issue. I'll leave this PR open as stale for now, let me know how to move forward after it was discussed internally!

feat(corelib): ByteArray impl Hash trait

1380127

hudem1 force-pushed the feat/bytearray_hash branch from 841c4c5 to 1380127 Compare January 9, 2025 23:54

orizi requested changes Jan 10, 2025

View reviewed changes

hudem1 closed this Jan 10, 2025

hudem1 reopened this Jan 10, 2025

orizi requested changes Jan 10, 2025

View reviewed changes

feat(corelib): Added test for ByteArray hash using Pedersen

a3172b4

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(corelib): ByteArray impl Hash trait #7042

feat(corelib): ByteArray impl Hash trait #7042

hudem1 commented Jan 9, 2025 •

edited

Loading

reviewable-StarkWare commented Jan 9, 2025

orizi left a comment

hudem1 commented Jan 10, 2025

orizi left a comment

hudem1 commented Jan 10, 2025

feat(corelib): ByteArray impl Hash trait #7042

Are you sure you want to change the base?

feat(corelib): ByteArray impl Hash trait #7042

Conversation

hudem1 commented Jan 9, 2025 • edited Loading

reviewable-StarkWare commented Jan 9, 2025

orizi left a comment

Choose a reason for hiding this comment

hudem1 commented Jan 10, 2025

orizi left a comment

Choose a reason for hiding this comment

hudem1 commented Jan 10, 2025

hudem1 commented Jan 9, 2025 •

edited

Loading