-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
lib: actually hash all 16 bytes of IPv6 addresses, not just 4 #17901
base: master
Are you sure you want to change the base?
Conversation
While the loop is currently exited in all cases after using nexthop, it is a footgun to have "nh" around to be reused in another iteration of the loop. This would leave nexthop with partial data from the previous use. Make it local where needed instead. Signed-off-by: David Lamparter <[email protected]>
Zero out the 12 unused bytes (for the IPv6 address) when reading in an IPv4 address. Signed-off-by: David Lamparter <[email protected]>
Doesn't seem to break anything but really poor style to pass potentially uninitialized data to hash_lookup. Signed-off-by: David Lamparter <[email protected]>
@@ -2624,7 +2624,6 @@ static void bgp_pbr_policyroute_add_to_zebra(struct bgp *bgp, | |||
static void bgp_pbr_handle_entry(struct bgp *bgp, struct bgp_path_info *path, | |||
struct bgp_pbr_entry_main *api, bool add) | |||
{ | |||
struct nexthop nh; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm - could we just initialize this with = {};
instead of adding multiple versions in inner scopes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That doesn't fix the footgun mentioned in the commit description: this is used in a loop below, and while currently all paths exit the loop after using the nh
variable if that doesn't remain the case we can end up with leftover garbage from a previous iteration of the loop.
It not being initialized isn't the thing being addressed here, in fact it's being cleared with memset twice in the current code (for good measure, I guess?)
*/ | ||
char _hash_begin[0]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm - I get that there's an assert, but ... could we just fix the core problem, that we're not hashing enough v6 octets? if someone could measure a meaningful improvement to nexthop hashing - at the level of performance that we offer - then there could be a separate improvement PR that reorganized the code struct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My perspective is that the bug we're dealing with is best described as "we weren't hashing the things we thought we were hashing". While it's also intended to perform well by doing it in one block — the doing it in one block in itself also makes it clearer what exactly is being hashed?
But I'm not terribly set on this
rmap_src wasn't initialized, so for IPv4 the unused 12 bytes would contain whatever junk is on the stack on function entry. Also move the IPv4 parse before the IPv6 parse so if it's successful we can be sure the other bytes haven't been touched. Signed-off-by: David Lamparter <[email protected]>
When reading in a nexthop from ZAPI, only set the fields that actually have meaning. While it shouldn't happen to begin with, we can otherwise carry padding garbage into the unused leftover union bytes. Signed-off-by: David Lamparter <[email protected]>
We were hashing 4 bytes of the address. Even for IPv6 addresses. Oops. The reason this was done was to try to make it faster, but made a complex maze out of everything. Time for a refactor. Signed-off-by: David Lamparter <[email protected]>
We've accumulated a pair of rather fundamental mixups regarding nexthop hashing and recursive nexthops. Add kind of a "canary" test to make sure the basics are in place -- if this breaks, something has seriously gone wrong. Signed-off-by: Donald Sharp <[email protected]>
fb87a20
to
05b7942
Compare
Core change: We were hashing 4 bytes of the address. Even for IPv6 addresses. Oops.
Related changes:
gate
src
andrmap_src
to check that we're not feeding garbage into the 12 unused bytes and hash thatrmap_src
use is just… wrong… this does a best-effort fixup but it's only a bandaid.nexthop_hash
as a whole. It just feeds a contiguous block ofstruct nexthop
intojhash
now, this should perform better than thejhash_word
shenanigans we've gone to before.this PR is incomplete,
nhg_compare_nexthops
needs to descend into->resolved
(cf. #16970, but that PR is also incomplete)Unfortunately the test should only go red if both the compare and the hash are broken at the same time, so I'd prefer to also fix NHG comparison to see if fixing that (and having the hash broken) makes the test go green to validate that. But fixing the compare needs a bit more work (I'll keep updating this PR)