uip | title | description | author | status | type | category | created |
---|---|---|---|---|---|---|---|
74 |
Informal Ping |
informal (stateless) pinging to reduce galaxy load |
~master-morzod (@joemfb), ~norsyr-torryn (@yosoyubik) |
Draft |
Standards Track |
Kernel |
~2023.8.9 |
This proposal introduces an informal (stateless) ping mechanism, moving route maintenance out of arvo and dramatically reducing the load on galaxies, without requiring operator configuration.
For any ship to be accessible to others in the network, its sponsoring galaxy must be able to forward packets to it. For the sponsoring galaxy to forward packets to a ship, it must have a lane to that ship; to learn a lane to a ship, one must hear packets from that ship.
Galaxies are always axiomatically available: any ship must always be able to get packets to their galaxy. Routinely pinging the sponsoring galaxy serves two purposes: the galaxy can learn a route to the ship in question (always direct), and any stateful firewalls or "NAT pinholes" are kept open, maintaining the route so that the galaxy can use it to forward requests to us.
Many local networks are configured to enforce client/server behavior: inbound packets from the broader internet are not allowed on to the local network unless they can be interpreted as a response to a request previously issued from the local network. Practically, this means that peers on such a network must send packets first before they can be received, and must resend often to maintain inbound connectivity -- the de-facto timeout for such behavior is ~s30.
These pings are currently stateful request messages (ie, pokes). Since they must be sent quite frequently, the load they impose on the galaxies is substantial. If we assume that a poke (and the associated disk write) takes 5ms and every ship must ping their galaxy every ~s25, galaxies can forward for no more than 5000 ships (of any class). That puts a hard upper bound of 1.2 million active ships on the entire network (assuming all galaxies active and perfect distribution across them).
This scaling limit can be raised dramatically by decoupling these functions of the galaxy pings: allowing the galaxy to observe that our route has changed, and maintaining an inbound route from our galaxy through stateful firewalls and/or NAT devices.
To separate these functions, we propose to:
- add a STUN server to vere (sharing %ames' UDP socket, for maximum precision)
- add a gift to %ames, informing vere of our full sponsorship chain
- add a task to %ames, allowing outside control of the
:ping
app- disable formal pinging entirely
- send a single formal ping to the sponsorship chain
- resume regular formal pinging
- on %born (boot or restart)
:ping
will commence its regular, formal pings- %ames will give the sponsorship chain to vere
- once vere has the sponsorship chain
- vere eagerly resolves the sponsoring galaxy's DNS record
- vere sends a STUN request to the galaxy
- on STUN response
- vere saves the result
- vere sets a retry timer for every
~s25
- vere injects an event disabling formal ping
- on subsequent STUN responses
- vere compares the result to the saved result
- if the differ, vere injects an event sending a single formal ping
- if vere ever stops hearing STUN responses for too long
- vere clears the saved STUN result, cancels the retry timer
- vere injects an event resuming regular, formal pings
This procedure can be generalized to work with arbitrary ships, but it's not necessary that it be used for anyone other than our galaxy until we either start forwarding through stars, or need some keep-alive mechanism for "sticky" scry requests. Both of those are planned for the future. The current design keeps formal pings from /app/ping to the rest of the (non-galaxy) sponsorship hierarchy. These could be removed, since galaxies can statelessly forward packets from the sponsor star to the sponsee.
Initially only galaxies will need to act as STUN servers, but any ship could support STUN requests. These could be used to check connectivity with other ships outside of the event loop, and, if additionally listening on a standard port, could provide useful service to applications running outside of urbit (such as webrtc clients).
The current behavior of /app/ping pinging the sponsorship hierarchy will remain, and only clients that support STUN will send informal packets to galaxies. Only successful STUN responses will inject an event into %arvo that deactivates /app/ping, otherwise it will continue functioning as normal. Subsequently, STUN requests that stop being honored by the galaxy will re-activate /app/ping again
/app/ping will continue tracking changes in sponsorship and breaches, restarting the ping logic as necessary and emitting an effect to %vere to update the sponsor galaxy lane if the ship ever escapes.
TBD
Copyright and related rights waived via CC0.