fix(net): Fix a potential hang caused by accessing the address book directly #7902

Merged (5 commits) on Nov 5, 2023

@teor2345 (Contributor) commented Nov 3, 2023

Motivation

The outbound connector accesses the address book directly when a connection fails, which can block other connections or async tasks. Instead, it can send the failure to the address book update channel.

Also, some address book updates take unnecessary services arguments.

These fixes were made as part of #7787.

PR Author Checklist

Check before marking the PR as ready for review:

  • Will the PR name make sense to users?
  • Does the PR have a priority label?
  • Have you added or updated tests? Hangs are extremely difficult to test for.
  • Is the documentation up to date?

If a checkbox isn't relevant to the PR, mark it as done.

Complex Code or Requirements

Sending to a channel is much faster and cheaper than locking a mutex: under heavy load, acquiring the lock can be delayed for a long time.
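As a rough illustration of the channel-based pattern, here is a minimal sketch in which a single updater owns the address book and connectors only send messages to it. It uses the standard library's bounded channel and a thread so that it stays self-contained; the `AddrChange`, `AddressBook`, `spawn_updater`, and `report_failure` names are hypothetical and are not taken from zebra-network, which may use different types and an async channel.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;
use std::sync::mpsc::{sync_channel, SyncSender};
use std::thread::{self, JoinHandle};

/// Hypothetical update message (an assumption, not Zebra's real type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AddrChange {
    Responded(SocketAddr),
    Failed(SocketAddr),
}

/// Hypothetical address book: records whether the last attempt to each peer failed.
#[derive(Debug, Default)]
pub struct AddressBook {
    pub failed: HashMap<SocketAddr, bool>,
}

impl AddressBook {
    /// Applies one update. Only the updater thread calls this, so no mutex is needed.
    pub fn apply(&mut self, change: AddrChange) {
        match change {
            AddrChange::Responded(addr) => {
                self.failed.insert(addr, false);
            }
            AddrChange::Failed(addr) => {
                self.failed.insert(addr, true);
            }
        }
    }
}

/// Spawns the single owner of the address book and returns a sender for updates.
/// The thread exits, returning the final book, once every sender is dropped.
pub fn spawn_updater(capacity: usize) -> (SyncSender<AddrChange>, JoinHandle<AddressBook>) {
    let (tx, rx) = sync_channel(capacity);
    let handle = thread::spawn(move || {
        let mut book = AddressBook::default();
        for change in rx {
            book.apply(change);
        }
        book
    });
    (tx, handle)
}

/// What a connector might do when a connection fails: queue the failure and move on.
/// `try_send` never blocks; it returns `false` if the channel is full or closed.
pub fn report_failure(tx: &SyncSender<AddrChange>, addr: SocketAddr) -> bool {
    tx.try_send(AddrChange::Failed(addr)).is_ok()
}
```

In this sketch a full channel drops the update rather than waiting. That is a choice made for the example only; async code could instead await free capacity without blocking other tasks.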

Solution

  • Send failures to the address book update channel
  • Fix a (possibly unreachable) panic with a zero channel size
  • Remove unnecessary services arguments from some address book updates
  • Update tests
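For context on the zero channel size item: Tokio's bounded `mpsc::channel` panics when asked for a capacity of 0. One common way to rule that out is to clamp a configuration-derived size to at least 1 before creating the channel. The sketch below shows that clamp with hypothetical names (`MIN_CHANNEL_SIZE`, `update_channel_capacity`, `update_channel`) and uses the standard library channel to stay self-contained. Whether this PR fixes the panic in the same way is an assumption, not something stated in the description.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};

/// Hypothetical lower bound for the update channel size (an assumption).
pub const MIN_CHANNEL_SIZE: usize = 1;

/// Clamps a configured limit so the resulting capacity is never zero.
pub fn update_channel_capacity(configured_peer_limit: usize) -> usize {
    configured_peer_limit.max(MIN_CHANNEL_SIZE)
}

/// Creates a bounded channel whose capacity is derived from configuration.
/// Note: std's `sync_channel(0)` is a rendezvous channel and does not panic,
/// but the clamp still guarantees that at least one update can be buffered.
pub fn update_channel<T>(configured_peer_limit: usize) -> (SyncSender<T>, Receiver<T>) {
    sync_channel(update_channel_capacity(configured_peer_limit))
}
```

With the clamp in place, a configured limit of 0 still yields a channel that can hold one pending update.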

Testing

Existing tests cover this code, but hangs are hard to test for.

Review

This is a low priority bug fix.

Reviewer Checklist

Check before approving the PR:

  • Does the PR scope match the ticket?
  • Are there enough tests to make sure it works? Do the tests cover the PR motivation?
  • Are all the PR blockers dealt with?
    PR blockers can be dealt with in new tickets or PRs.

And check the PR Author checklist is complete.

@teor2345 added labels on Nov 3, 2023: C-bug (Category: This is a bug), P-Low ❄️, I-hang (A Zebra component stops responding to requests), A-network (Area: Network protocol updates or fixes), A-concurrency (Area: Async code, needs extra work to make it work properly.)
@teor2345 self-assigned this Nov 3, 2023
@teor2345 requested a review from a team as a code owner November 3, 2023 05:35
@teor2345 requested review from arya2 and removed request for a team November 3, 2023 05:35
@teor2345 added the I-panic (Zebra panics with an internal error message) label Nov 3, 2023
@mpguerra linked an issue Nov 3, 2023 that may be closed by this pull request
@arya2 (Contributor) left a comment

This is really nice.

zebra-network/src/address_book_updater.rs (resolved)
zebra-network/src/peer_set/candidate_set.rs (resolved)
mergify bot added a commit that referenced this pull request Nov 5, 2023
mergify bot merged commit 43e54d1 into main Nov 5, 2023
102 checks passed
mergify bot deleted the addr-fixes branch November 5, 2023 22:29
Labels
A-concurrency (Area: Async code, needs extra work to make it work properly.), A-network (Area: Network protocol updates or fixes), C-bug (Category: This is a bug), I-hang (A Zebra component stops responding to requests), I-panic (Zebra panics with an internal error message)