uhash should take a seed #6
Here's the rationale for why seeding isn't explicitly handled: tl;dr: There is more than one good way to do this (per-process or per-hash-functor), and both are easy to do. So this proposal declined to specify what is the best way for everyone. Also, the various seedable hash algorithms typically use the seed in an algorithm-specific way to initialize the algorithm's state. Blindly taking a seed and xor-ing that seed into any algorithm's result sounds dangerous to me. Hash algorithms are a black art that I am nearly completely ignorant of. The point of this proposal is to not mess with hash algorithms, just enable clients to use them as their authors designed, without modifications or compromises. I.e. to not do, require, or even enable anything like |
Yes, I did read this section; it would be much simplified by my suggested addition.
For any reasonable "good hash" criterion you give me, I'll give you a mathematical proof that it also holds for h^const. That said, my initial impulse was to
Yes, that's true, and to take advantage of algorithm-specific ways you have to write an algorithm-specific hasher as per the aforequoted section. We're talking here about the universal hasher, which by its universal nature is not algorithm-specific. Putting the responsibility to support seeding on the hash algorithm may be more principled, but from the user's point of view, in a seeded scenario it would be much more convenient if all hash algorithms worked, so that the user can easily try them and choose the best one. |
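As a brief aside on the h^const claim above, here is a sketch of the standard argument (the symbols $h$, $c$, $a$, $b$ and the bit width $n$ are notation introduced here, not taken from the thread). For a fixed constant $c$, xor with $c$ is its own inverse and therefore a bijection on $n$-bit values:

$$
(x \oplus c) \oplus c = x
\quad\Longrightarrow\quad
h(a) = h(b) \iff h(a) \oplus c = h(b) \oplus c .
$$

So the set of colliding key pairs is exactly the same for $h$ and $h \oplus c$, and if $h(x)$ is uniformly distributed over the $2^n$ outputs, so is $h(x) \oplus c$. This covers only criteria that are invariant under a bijection of the output; it does not by itself address the concern about how algorithms are meant to consume a seed.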
I think one of my big mistakes with N3980 was too much rambling, and not a clear presentation of what was actually being proposed. It implied more complication than there is. If I ever get around to a revision, I will try to greatly simplify the presentation. Most of the hash algorithms that are seedable use (or can be made to use) a generic syntax for seeding (construct the algorithm with a seed). So it seems logical to take advantage of that. The proposal could include both a per-process seed hasher, and a per-instance seed hasher, but that adds complication where I need to take it away. But maybe it is worth it. The per-instance seed hasher turns a stateless hasher into a stateful one, adding space overhead to every container. Plus gcc-5.2 has a bug where it doesn't handle stateful hashers. I'm not sure if it has been fixed yet. The per-process seed hasher requires a singleton in the source to retrieve the seed from. Your suggestion makes |
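To make the two options in the comment above concrete, here is a minimal sketch of a per-instance and a per-process seeded hasher, both of which seed by constructing the algorithm with a seed. Everything named here is an assumption for illustration and not code from N3980 or this repository: `seedable_fnv1a` (a toy FNV-1a variant whose initial state absorbs the seed), the single reduced `hash_append` overload, `per_instance_seeded_hash`, `per_process_seeded_hash`, and `process_seed`.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <type_traits>

// Toy seedable algorithm (assumption): FNV-1a with the seed mixed into
// the initial state. Real seedable algorithms use their seed differently.
class seedable_fnv1a {
    std::uint64_t state_;
public:
    using result_type = std::uint64_t;
    explicit seedable_fnv1a(std::uint64_t seed = 0) noexcept
        : state_(14695981039346656037ull ^ seed) {}
    void operator()(void const* key, std::size_t len) noexcept {
        auto p = static_cast<unsigned char const*>(key);
        for (std::size_t i = 0; i < len; ++i)
            state_ = (state_ ^ p[i]) * 1099511628211ull;
    }
    explicit operator result_type() const noexcept { return state_; }
};

// Hypothetical, heavily reduced hash_append: integral types only.
template <class H, class T>
std::enable_if_t<std::is_integral<T>::value>
hash_append(H& h, T const& t) noexcept { h(&t, sizeof t); }

// Per-instance seeding: the functor stores the seed, so it is stateful
// and adds space to every container that holds one.
template <class HashAlgorithm>
class per_instance_seeded_hash {
    std::uint64_t seed_;
public:
    using result_type = typename HashAlgorithm::result_type;
    explicit per_instance_seeded_hash(std::uint64_t seed) noexcept : seed_(seed) {}
    template <class T>
    result_type operator()(T const& t) const noexcept {
        HashAlgorithm h(seed_);          // seed by construction
        hash_append(h, t);
        return static_cast<result_type>(h);
    }
};

// Per-process seeding: a function-local static acts as the singleton
// the seed is retrieved from; it is initialized once per process.
inline std::uint64_t process_seed() {
    static std::uint64_t const seed = [] {
        std::random_device rd;
        return (std::uint64_t(rd()) << 32) | rd();
    }();
    return seed;
}

// The functor itself stays stateless (an empty class).
template <class HashAlgorithm>
struct per_process_seeded_hash {
    using result_type = typename HashAlgorithm::result_type;
    template <class T>
    result_type operator()(T const& t) const {
        HashAlgorithm h(process_seed());
        hash_append(h, t);
        return static_cast<result_type>(h);
    }
};
```

The trade-off described in the comment shows up directly in the sketch: the per-instance functor carries a `std::uint64_t`, while the per-process functor is empty but depends on a process-wide singleton.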
There are alternative approaches to this, but I don't like them. Such as for instance It'd be better if authors of hash algorithms weren't required to have a constructor taking a seed. Although I suppose that we could do that for the standard algorithms we define. |
Something like
|
Or, after looking at your examples more closely, make that
and then
and
|
Not bad. And it looks like I can derive from |
I didn't think of that; I was assuming a specialization for the no-args case. On third thought though, I take back the forwarding constructor; I don't want default-constructibility in the argumentful case. Better to have two constructors taking |
uhash should have a constructor that takes a seed. One possible (performance-oriented) implementation could then xor the seed with h before returning it from operator(). Combined with initializing the seed to 0 in the default constructor, this preserves existing behavior without introducing a branch and an additional call to hash_append.
It's of course possible to seed in a hash adapter, as explained, but this makes seeded hashing second-class, something that the user needs to work for, and it needs to be easy.
Seeding hash functions is all the rage nowadays among the security-savvy, and without explicit support, the standard library may well decide to do it on the container level, which makes it impossible for the user to influence it or supply a seed. It would be better for those standard library implementations to have the option to process-wide seed in the default constructor of uhash instead, in which case the user would be able to override it.
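A minimal sketch of the xor-based variant requested above, under stated assumptions: `seeded_uhash` is a hypothetical stand-in for a seed-taking `uhash`, `fnv1a` is a small FNV-1a written in the N3980 style (feed bytes, then convert to `result_type`), and the two `hash_append` overloads are heavily reduced illustrations rather than the proposal's actual overload set.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <type_traits>

// Minimal FNV-1a in the N3980 style (illustrative, unseeded).
class fnv1a {
    std::uint64_t state_ = 14695981039346656037ull;
public:
    using result_type = std::uint64_t;
    void operator()(void const* key, std::size_t len) noexcept {
        auto p = static_cast<unsigned char const*>(key);
        for (std::size_t i = 0; i < len; ++i)
            state_ = (state_ ^ p[i]) * 1099511628211ull;
    }
    explicit operator result_type() const noexcept { return state_; }
};

// Hypothetical, reduced hash_append overloads: integers and std::string.
template <class H, class T>
std::enable_if_t<std::is_integral<T>::value>
hash_append(H& h, T const& t) noexcept { h(&t, sizeof t); }

template <class H>
void hash_append(H& h, std::string const& s) noexcept {
    h(s.data(), s.size());
    hash_append(h, s.size());
}

// Hypothetical seed-taking universal hasher: the seed defaults to 0, so
// the default-constructed functor returns the unmodified result, and
// operator() needs neither a branch nor an extra hash_append call.
template <class HashAlgorithm = fnv1a>
class seeded_uhash {
public:
    using result_type = typename HashAlgorithm::result_type;
    seeded_uhash() noexcept : seed_(0) {}
    explicit seeded_uhash(result_type seed) noexcept : seed_(seed) {}

    template <class T>
    result_type operator()(T const& t) const noexcept {
        HashAlgorithm h;
        hash_append(h, t);
        return static_cast<result_type>(h) ^ seed_;  // xor the seed into the result
    }
private:
    result_type seed_;
};
```

With this shape, `seeded_uhash<>{}` behaves like an unseeded hasher, while `seeded_uhash<>{some_seed}` shifts every result by the same xor mask. A standard library that wanted process-wide seeding could, under the same assumptions, pick a nonzero seed in the default constructor and leave the explicit constructor as the user's override.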