feat(debug): Log a privacy preserving hash of IP and UserAgent to assist in rate limiting debugging #148

Merged
merged 8 commits into from
Jul 24, 2024

Conversation

ryanschneider
Contributor

@ryanschneider ryanschneider commented Jul 10, 2024

📝 Summary

We are seeing abuse of our endpoint that is bypassing our firewall rules. To help determine how they are getting past the firewall we need to log the "source" of the traffic in a privacy-preserving way.

What I came up with is xxhash( x-forwarded-for + user-agent ), or more specifically a hash of the left-most non-private IP in the X-Forwarded-For header combined with the User-Agent header.
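As a minimal sketch of the "left-most non-private IP" selection, assuming the header is a comma-separated list of addresses. The helper name `leftmostPublicIP` and the exact set of filtered ranges (private, loopback, link-local, unspecified) are illustrative assumptions, not the code in this PR:

```go
package main

import (
	"net/netip"
	"strings"
)

// leftmostPublicIP is a hypothetical helper: it scans an X-Forwarded-For
// value from left to right and returns the first entry that parses as an IP
// and is not private, loopback, link-local, or unspecified.
// The second return value is false when no such entry exists.
func leftmostPublicIP(xff string) (netip.Addr, bool) {
	for _, part := range strings.Split(xff, ",") {
		addr, err := netip.ParseAddr(strings.TrimSpace(part))
		if err != nil {
			continue // skip malformed entries
		}
		addr = addr.Unmap() // treat ::ffff:a.b.c.d as plain IPv4
		if addr.IsPrivate() || addr.IsLoopback() ||
			addr.IsLinkLocalUnicast() || addr.IsUnspecified() {
			continue
		}
		return addr, true
	}
	return netip.Addr{}, false
}
```

For a header value such as `10.0.0.1, 203.0.113.7, 198.51.100.2`, this sketch skips the private `10.0.0.1` and returns `203.0.113.7`.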

However, in discussion @0x416e746f6e made the good point that User-Agent can be gamed by randomly generating a new UA w/ each request. Furthermore, the fingerprint should not be long-term stable, as that would give a malicious actor with access to Flashbots logs the ability to track user behavior long-term. So now the UA is removed and the hash is salted w/ the current timestamp truncated to the hour. Finally, the truncated timestamp is XOR'ed w/ a random uint64 generated at startup to prevent "rainbow table" attacks, where a malicious RPC operator exhaustively hashes all IP/timestamp combinations to determine the source IP of a fingerprint.

In addition, we currently use proxyd as our upstream, which by default uses the X-Forwarded-For header for rate limiting. Since an xxhash value is a uint64, it fits easily within an IPv6 address, so we convert the fingerprint to a fake IPv6 address in the reserved "example address" documentation prefix (2001:db8::/32) from https://datatracker.ietf.org/doc/html/rfc3849 and insert this "IP" as the X-Forwarded-For header.

⛱ Motivation and Context

Give us the means to perform log aggregation to see whether the offending traffic is coming from a single source or multiple sources.

📚 References


✅ I have run these commands

  • make lint
  • make test
  • go mod tidy

@ryanschneider ryanschneider marked this pull request as ready for review July 11, 2024 22:39
Collaborator

@metachris metachris left a comment


Code looks great. Left a few minor comments, but nothing important.

@ryanschneider ryanschneider merged commit ebf1086 into main Jul 24, 2024
2 checks passed
@ryanschneider ryanschneider deleted the log-x-forwarded branch July 24, 2024 15:02