EAP Section #13

Open
wants to merge 1 commit into
base: main
from


ethan-thompson
Contributor

Added EAP section to include details from RFC 6613

Comment on lines +385 to +388
## EAP Sessions
When RADIUS clients send EAP requests using RADIUS/(D)TLS, they MUST choose the same connection for all packets related to one EAP session. This practice ensures that problems with any one connection affect the minimum number of EAP sessions.

A simple method that may work in many situations is to hash the contents of the Calling-Station-Id attribute, which normally contains the Media Access Control (MAC) address. The output of that hash can be used to select a particular connection.
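
As a rough illustration of the hashing approach in the paragraph above, the following Python sketch maps a Calling-Station-Id value to a connection index. The function name `select_connection`, the choice of SHA-256, and the MAC address normalisation are assumptions made for this example; none of them come from the proposed text.

```python
import hashlib


def select_connection(calling_station_id: str, num_connections: int) -> int:
    """Return a stable connection index for a Calling-Station-Id value."""
    if num_connections <= 0:
        raise ValueError("at least one connection is required")
    # Assumption: strip common MAC separators and lowercase the value so that
    # "AA-BB-CC-DD-EE-FF" and "aa:bb:cc:dd:ee:ff" select the same connection.
    normalised = calling_station_id.lower()
    for separator in ("-", ":", "."):
        normalised = normalised.replace(separator, "")
    digest = hashlib.sha256(normalised.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer and reduce it
    # modulo the number of connections.
    return int.from_bytes(digest[:8], "big") % num_connections
```

Because the index depends only on the attribute value and the pool size, every packet carrying the same Calling-Station-Id lands on the same connection for as long as the number of connections stays the same.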
Contributor


I'm a bit confused about this MUST requirement.

Just for my own understanding: we are talking about actual clients (i.e. NAS), and not proxies?

I'm not very familiar with EAP on a NAS. I would expect that a client is always aware of its EAP sessions, but the details are outside the scope of this specification. To me this is a prerequisite for this MUST statement. If we can't guarantee this for a client, then we can't put a MUST here.

The second paragraph is confusing me. Having to keep EAP sessions together based on a RADIUS attribute is something a proxy might do; as stated above, a client should already know its EAP sessions. If a client has to guess the EAP sessions from a RADIUS attribute, we can't call this a MUST.

I would propose to remove the second paragraph (if my initial assumption of clients having perfect knowledge about EAP sessions is true).

@Janfred
Collaborator

Janfred commented Oct 4, 2024

I think it's worth noting that one has to be careful when proxying EAP requests.
I agree with Fabian that if the RADIUS/(D)TLS client is the "first in line", they should know the EAP session and take care of the correct path themselves, so it is only relevant for proxies.

I would include a bit more reasoning why EAP is different than other RADIUS traffic, so people reading this can understand why they should care.

@alandekok
Contributor

I think the connection / load-balancing issues are similar for a client which originates traffic, and a proxy which forwards traffic. So the text should be similar.

The connection issue isn't limited to EAP, it's "all packets for one session". e.g., MFA challenge / response.

If an authentication step takes one packet, then it doesn't matter how the client does packet load-balancing. The failure of any one connection will only affect authentications which go over that one connection.

When authentication requires multiple packets, we have a choice for load-balancing:

  1. put all packets for the same session over the same connection
  2. distribute packets for the same session across multiple connections

The client does not know if the server is an IdP / home server, or a proxy. The client does not even know if the connections are to multiple "equivalent" home servers. As a result, when the client initiates an authentication session with a particular server, that authentication session MUST be tied to one connection. Distributing packets for one session across multiple connections means that in some valid architectures, multiple home servers will see packets for one authentication session.

While this situation may work in some limited scenarios, it guarantees that authentication cannot work in other scenarios which are in use today.

The situation isn't much different if the next hop is a proxy.

With (1), the failure of any one connection will affect only the sessions which use that connection, e.g. with 5 active connections, one of which then fails, 1/5 of authentication sessions will be affected. It is likely that those authentication sessions will fail, especially if the connections are to separate home servers. At a minimum, 1/5 of authentication sessions will experience increased timeouts and instability as the client fails over.

With (2) the failure of any one connection will affect all of the authentication sessions. In the above scenario, any multi-round authentication session will have 1/5 of the packets sent across each connection. So when a connection fails, every ongoing session is affected.
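
The difference between (1) and (2) can be checked with a small simulation. This is only a sketch: the function `affected_fraction`, the round-robin spreading used to model option (2), and the figure of 10 packets per session are assumptions chosen for illustration.

```python
import random


def affected_fraction(num_sessions: int, packets_per_session: int,
                      num_connections: int, failed: int,
                      pin_sessions: bool, seed: int = 0) -> float:
    """Fraction of sessions that sent at least one packet over `failed`."""
    rng = random.Random(seed)
    affected = 0
    for _ in range(num_sessions):
        start = rng.randrange(num_connections)
        if pin_sessions:
            # Option (1): every packet of the session uses one connection.
            used = {start}
        else:
            # Option (2), modelled here as round-robin over all connections.
            used = {(start + i) % num_connections
                    for i in range(packets_per_session)}
        if failed in used:
            affected += 1
    return affected / num_sessions


pinned = affected_fraction(10_000, 10, 5, failed=2, pin_sessions=True)
spread = affected_fraction(10_000, 10, 5, failed=2, pin_sessions=False)
print(f"pinned: {pinned:.2f}, spread: {spread:.2f}")
```

With pinning, the affected share stays close to 1/5. With round-robin spreading, any session of at least 5 packets touches every connection, so every session is affected.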

In order to help with network robustness, clients MUST send all packets for one authentication session over the same connection. If that connection fails, clients MAY move packets to a different connection, but this will work only in limited situations, and those authentication sessions are likely to fail.

The issue is different for accounting. Each packet is stand-alone, and the packets are far enough apart in time that there is no benefit to putting them on the same connection.

@Janfred
Collaborator

Janfred commented Oct 4, 2024

But then we have the problem of defining a "session".
We don't have any reliable easy-to-find indication which RADIUS packets belong to the same session.
So as far as I can see we have two options here:

Option 1: Mandate that load-balancing must only be done on the basis of the source of the RADIUS packet, i.e. if your proxy has two connected clients (A, B) and two servers (C, D), the proxy MAY load-balance so that all packets from A go to C and all packets from B go to D. If every RADIUS peer in the chain follows this procedure, we will always have a deterministic path through the RADIUS proxy fabric.
This is the safe option, but also ineffective.
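
A minimal sketch of what Option 1 could look like in a proxy, using the A/B clients and C/D servers from the example. The class `SourcePinnedBalancer` and its assignment rule (new clients are given servers in order of first appearance) are hypothetical and only meant to show that the next hop depends on nothing but the packet's source.

```python
class SourcePinnedBalancer:
    """Pin each connected client to one next-hop server (hypothetical helper)."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._assignment = {}

    def next_hop(self, client_id: str) -> str:
        # A client seen for the first time is given the next server in turn;
        # afterwards every packet from that client goes to the same server.
        if client_id not in self._assignment:
            index = len(self._assignment) % len(self._servers)
            self._assignment[client_id] = self._servers[index]
        return self._assignment[client_id]


balancer = SourcePinnedBalancer(["C", "D"])
hops = [balancer.next_hop(client) for client in ("A", "B", "A", "B")]
```

Here `hops` holds the next hop for four packets arriving alternately from A and B. Load is only balanced at the granularity of whole clients, which is the trade-off of this option.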

Option 2: Mandate nothing. Just say "Proxying is hard. Beware, there be dragons" and elaborate on what operators may have to consider if they want to enable load-balancing.
If the operators are running a RADIUS service that is strictly single request-response, then there is no harm in randomly throwing the packets to all the servers equally.
If the operators know that their RADIUS service has sessions and the sessions have to go to the same server, then they can disable load-balancing and instead move to a hot-standby configuration.

@alandekok
Contributor

alandekok commented Oct 4, 2024 via email

@Janfred
Collaborator

Janfred commented Oct 4, 2024

I would opt for documenting some best practices, and suggest, but not mandate behavior.

@alandekok
Contributor

alandekok commented Oct 4, 2024 via email

@h-vn

h-vn commented Oct 8, 2024

Hashing with primarily Calling-Station-Id and User-Name; 👍 for that. We use both by default and it works. I have never seen them change during an EAP authentication exchange. MAC address randomisation may cause the C-S-I to change, but that would affect only subsequent (re)authentications.

A note about hashing: Let's say that there are 5 connections and hashing uses modulo 5 to choose the next hop. When a connection goes down, it would be good if the hashing is still done so that the existing EAP exchanges do not switch the next hop, for example, hashing being done with modulo 4.
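
One possible way to get that property is sketched below: hash over the full set of configured connections first, and fall back to a second hash over the surviving connections only when the primary choice is down. This is an illustrative reading of the suggestion, not a description of any existing implementation; `pick_next_hop`, the key layout, and the `fallback` prefix are all assumptions.

```python
import hashlib


def _hash(key: bytes) -> int:
    # First 8 bytes of SHA-256, interpreted as an unsigned integer.
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")


def pick_next_hop(calling_station_id: str, user_name: str,
                  connections_up: list) -> int:
    """Pick a connection index; `connections_up` has one bool per connection."""
    live = [i for i, up in enumerate(connections_up) if up]
    if not live:
        raise RuntimeError("no live connections")
    # Assumption: key on both attributes, separated by a NUL byte.
    key = (calling_station_id + "\x00" + user_name).encode("utf-8")
    # Primary choice uses the configured pool size (modulo 5 in the example),
    # so exchanges on surviving connections keep their next hop.
    primary = _hash(key) % len(connections_up)
    if connections_up[primary]:
        return primary
    # Only exchanges whose primary connection is down are re-hashed,
    # this time over the surviving connections (modulo 4 in the example).
    return live[_hash(b"fallback\x00" + key) % len(live)]
```

With 5 configured connections and one of them down, only the exchanges that were using the failed connection move; all others keep the next hop they had before the failure.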


5 participants