-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/tighten hash usage (fixing Issues #316 and #423) #441
Conversation
…o defines a hash value and equality.
…n achieve the same by adjusting hashbuf_length accordingly.
…hes the format specifier.
…SEGFAULTing on the first EXAMPLES problem.
…(or anything else that may come along, presumably).
…INSIZE), allowing access to that value from outside FXF, and carefully storing whatever we can before moving on to the next segment.
…t an empty translation unit that we get when not using FXF.
…s otherwise size would have necessarily fit).
NOTE: The calculations in |
I'm currently updating the calculations in |
Really looking forward for this PR to be merged. This solves many known issues, including the one described in #360. |
I'm closing this PR as the updates to |
I think it's finally time to throw this monstrosity to the wolves so that it can receive some impartial review.
My original modest goal was to resolve my personal Issue #316 that likely bothered no one else. The standard
dhtElement
type storedKey
andData
asdhtConstValue
s which weretypedef
ed asvoid const *
s, but we apparently wanted to storedata_type
s (i.e.,unsigned int
s) natively. Theunion
aimed to treat the memory holding avoid const *
as instead holding adata_type
, but such punning left me squeamish and my attempt to simply cast back and forth between pointers and integers didn't obviously work. I therefore went with a different idea, namely to makeKey
andData
themselvesunion
s. Theseunion
s can (I believe) store any fundamental type natively, though I leave floating point as a compile-time option that we don't make use of.Making the above change required me to update the various hash data types accordingly. I tightened a number of their operations, allowing them to handle more cases correctly and catch more bugs via
assert
s. I also made the code more portable.I was pretty satisfied with the above, but then Pull Request #360 came in which described Issue #423. The suggested fix there was to increase hashed buffer sizes, and without doing a deep analysis of the Issue I could see that that could be necessary. However, I was skeptical of the arbitrary nature of the new size, and I wanted to determine the actual size needed. As this seemed related to hash operations, I worked on that update here. My results can be seen in
optimisations/hash.h
, and I invite scrutiny of my calculations there. (In particular, I'm still just guessing onMAX_EN_PASSANT_TOP_DIFFERENCE
, and the right value there could be smaller [or conceivably larger].)I thought that the above would fix Issue #423, but testing showed that I also needed to increase
MaxPieceId
. That update has nothing to do with hashing, but I went ahead and made it. I think I've fixed whatever relied onMaxPieceId
being only 63, but please double-check this.It seemed that this should all have been sufficient, but further testing yielded surprising SEGFAULTs. It took a lot of investigation to sort this out, and I found that the issue came down to alignment. Specifically,
fxfAlloc
didn't really enforce data type alignment, despite comments suggesting otherwise. This wasn't a problem when our allocated objects only heldunsigned char
s and similar, but as we were upgrading one element to anunsigned short
(to solve Issue #423) this was becoming a problem. I therefor dug into the fxf infrastructure, figuring out how it operates and modifying it to respect alignment requirements. I think I got this right, but this could also use a lot of testing on different systems.While working on the above, I in some sense came full circle and began to worry about strict aliasing violations. I asked about one usage in this StackOverflow question, and the responses—especially this linked question—suggested that I was right to be worried. I therefore modified my implementation, doing my best to avoid assigning an obvious type to the pointers. This seems to work in practice, though we may want to consider compiler options like
-fno-strict-aliasing
and/or-fno-ipa-strict-aliasing
(where available) to be safe. (Only testing can determine where or if they're necessary and what performance impacts they may have. Unfortunately, I haven't found any documentation that would allow me to say which one(s) would be most appropriate.)With all of the above changes, testing on the non-lengthy examples seems to work fine. I do see small differences in a few files, but they all have relatively large numbers of hash table accesses, so my assumption is that these are cases where we're running out of hash space. Since I changed the allocation logic it's not so surprising that we'd see some differences when that happens, but notably the new numbers are sometimes better and sometimes worse than the previous ones.