Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

256-bit variant MurMurMurMur #75

Open
DonaldTsang opened this issue Sep 21, 2019 · 1 comment
Open

256-bit variant MurMurMurMur #75

DonaldTsang opened this issue Sep 21, 2019 · 1 comment

Comments

@DonaldTsang
Copy link

Will there ever be a 256-bit version of MurMur for avoiding hash collisions in larger datasets?

@sebres
Copy link

sebres commented Sep 23, 2019

I guess it would then not as much deviate from SHA-256, according to performance and distribution quality (note I said SHA, not SHA3).
Anyway the performance comparison of both looks like:

MethodCalculate over 32 bytesCalculate over 1KB
PerformanceRatePerformanceRate
MM3-1280.236128 µs/#4234986 #/sec0.462609 µs/#2161655 #/sec
SHA-2560.487272 µs/#2052241 #/sec3.917825 µs/#255243 #/sec

Why I think so - because too many nested shifts/mixes expected by interim calculations on (non-native) 256-bit "integers".
This is comparable to use MM64-128 compiled for 32-bit systems (due to non-native or missing 128-bit int's there) - just try to compare by yourself, you'd see similar performance "degradation".

Note that MM3 is good cascade-able (if we're talking about not crypto-strong hashing, to avoid collisions only)... thus combining N hashes you'll be able to build hash with sizes 256, 384, 512, etc bits.
So you can simply try to use 2-blocks MM3-128 for hashing one by one the blocks of large strings each 32 bytes (128 bits): so calculate first 32-bytes block of string into one hash, second into another, 3rd again into first hash, etc, then you'd reach almost the same performance (comparable with 0.46µs for 1KB like above), and your both 128-bits hashes representing one 256-bits hash.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants