Reliable End-to-End Application Layer Transactions #909
SmithSamuelM
These are just notes for now; I will revise them later.
When the time scales of commitments are indefinite, the concept of replay attack protection becomes problematic. The security principle at work is liveness of control: regardless of when the commitment was made, is the "presenter" presently in control of the identifier associated with the commitment? This means that for applications such as access control, authorization, or what can be generalized as entitlements, liveness of control is essential, so long-time-scale commitments must be augmented with short-time-scale commitments that prove liveness.
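As a minimal sketch of that augmentation (not an existing KERI API): the long-lived entitlement is honored only together with a fresh, short-window challenge-response, where `sign` and `verify_current` are hypothetical stand-ins for signing and verification against the identifier's current key state.

```python
# Sketch: a long-lived commitment augmented with a short-time-scale
# liveness proof. sign/verify_current are hypothetical stand-ins for
# operations against the identifier's *current* key state.
import secrets
import time

CHALLENGE_WINDOW = 30.0  # seconds: liveness lives at network time scales


def issue_challenge() -> dict:
    """Verifier issues a fresh salty nonce for the presenter to sign."""
    return {"nonce": secrets.token_hex(16), "issued": time.time()}


def prove_liveness(challenge: dict, sign) -> dict:
    """Presenter signs the fresh nonce with its current keys."""
    return {"nonce": challenge["nonce"], "sig": sign(challenge["nonce"])}


def verify_liveness(challenge: dict, proof: dict, verify_current) -> bool:
    """Honor the long-lived entitlement only with a timely proof that the
    presenter presently controls the identifier."""
    fresh = (time.time() - challenge["issued"]) <= CHALLENGE_WINDOW
    return (fresh and proof["nonce"] == challenge["nonce"]
            and verify_current(proof["nonce"], proof["sig"]))
```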
But for applications that are not entitlements (where liveness of control is essential), replay attacks are less critical in the sense of authorizing misbehavior and more critical in the sense of enabling DoS attacks. The core principles of replay attack prevention still apply: timeliness and uniqueness. They just exhibit in different ways. Uniqueness looks like duplicate detection. And timeliness is largely bounded by resources, not by any explicit "time"; that is, timeliness is determined by relative resource availability. A message is untimely when one no longer has the resources to accept it.
So with SAIDs we have a really good mechanism for uniqueness, especially if we require that the message include a salty nonce in its body over which the SAID is calculated. We can de-duplicate any message based on its SAID. We can use the SAID of the first message in any multi-message transaction as the unique identifier of the transaction and then chain all the other messages to each other using references to the SAID of the prior message. That is, a transaction set at the application layer is a verifiable data structure which is self-de-duplicating. So we have uniqueness covered. One can replay messages all one wants, but if a message has already been received then the replay is detectable, and it is either idempotent with respect to the current transaction state or is "out of place" with respect to the current transaction state and is dropped.
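Here is a minimal sketch of that de-duplication and chaining, with sha256 over a canonical JSON serialization standing in for a real CESR SAID; the field names `tid`, `prior`, and `salt` are illustrative, not a KERI schema.

```python
# Sketch: SAID-style de-duplication and message chaining.
import hashlib
import json
import secrets


def said(msg: dict) -> str:
    """Digest of the full message body, including its salty nonce."""
    return hashlib.sha256(json.dumps(msg, sort_keys=True).encode()).hexdigest()


def start(body: dict) -> dict:
    """First message of a transaction; its SAID names the transaction."""
    return {"salt": secrets.token_hex(16), **body}


def follow(prior: dict, tid: str, body: dict) -> dict:
    """Later messages chain to the prior message's SAID."""
    return {"salt": secrets.token_hex(16), "tid": tid, "prior": said(prior), **body}


seen: set = set()  # de-duplication cache keyed by message SAID


def receive(msg: dict) -> str:
    d = said(msg)
    if d in seen:
        return "duplicate"  # replayed message: idempotent or out of place, drop
    seen.add(d)
    return "accepted"


# Usage: the transaction is a self-de-duplicating verifiable chain.
first = start({"act": "offer"})
tid = said(first)                      # transaction id = first message's SAID
second = follow(first, tid, {"act": "agree"})
assert receive(first) == "accepted" and receive(first) == "duplicate"
```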
This then makes timeliness a function of the resources that are caching the transaction state. If transactions are indefinite in time, then the cache grows without bound (blockchain bloat, anyone?). So a reliable end-to-end application layer transaction benefits from a couple of services.
I have already mentioned that all messages are acknowledged. These are positive ACKs that signal to the sending party that a message has been received, so the sending party can stop retrying the same message (at the application layer). Now the receiving party is the one advancing the transaction state and at some point becomes the new sending party. The former sending party now becomes the new receiving party and just waits indefinitely for the new sender to advance the transaction state. Obviously this is problematic because transactions live forever.
To mitigate this problem, reliable end-to-end application layer transactions (which in the interest of brevity I will for the rest of this discussion just call "transactions") typically add negative acknowledgments (NACKs). One use of a NACK is to let the receiving party, while it waits for the sending party to send, notify the sending party that the transaction is aborted. There are more nuanced types of NACKs. But let's say that the receiving party, i.e. the party who is waiting to receive a message, runs out of resources and needs to prune "stale" transactions. Once a stale transaction is pruned, its transaction state is lost. Now the sending party, who waited so long to advance the transaction that the receiving party pruned it, finally gets around to sending the next message. Because we have universally unique transaction identifiers (SAIDs), the receiver will not find any transaction that corresponds to the newly received message. The receiver then NACKs with an abort, which says: this transaction is null and void. The send-and-NACK exchange continues until the sender gets the NACK (so it is reliable). Now the sender knows to abort the transaction too.
Another useful type of NACK is an "I am not ready" or "BUSY" NACK, which says: the transaction is still live for me, but I don't have the resources to respond right now, so keep retrying, albeit at a lower frequency.
So timeliness is purely determined by the size of the transaction cache, and it is enforced efficiently with NACKs.
So we have:
- Universally unique (UU) transaction identifiers
- UU message identifiers as part of transactions
- A transaction cache
- Reliable messaging using ACKs
- Reliable timeliness using NACKs
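Here is a minimal receiver-side sketch of how these pieces could fit together: a bounded transaction cache whose size, not wall-clock time, enforces timeliness, answering with ACK, NACK-ABORT, or NACK-BUSY. The class and field names are illustrative assumptions, and each message is assumed to carry its transaction SAID as `tid`.

```python
# Sketch: bounded receiver-side transaction cache with ACK/NACK responses.
import time
from collections import OrderedDict

MAX_LIVE = 1024  # resource bound on concurrently cached transactions
IDLE = 3600.0    # seconds of inactivity before a transaction counts as stale


class TransactionCache:
    """Receiver side: ACK known or new transactions, NACK everything else."""

    def __init__(self) -> None:
        self.live: OrderedDict = OrderedDict()  # tid -> last-touched time

    def handle(self, msg: dict) -> str:
        tid, now = msg["tid"], time.time()
        if tid in self.live:
            self.live[tid] = now
            self.live.move_to_end(tid)       # refresh recency
            return "ACK"                     # positive ack: sender stops retrying
        if msg.get("prior") is not None:     # continuation of a pruned transaction
            return "NACK-ABORT"              # null and void; sender aborts too
        if len(self.live) >= MAX_LIVE and not self._prune_stale(now):
            return "NACK-BUSY"               # no room, nothing stale: back off
        self.live[tid] = now                 # start caching a new transaction
        return "ACK"

    def _prune_stale(self, now: float) -> bool:
        """Evict the least recently touched transaction if it has gone stale."""
        tid, touched = next(iter(self.live.items()))
        if now - touched > IDLE:
            del self.live[tid]
            return True
        return False
```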
Now, when the transacting parties want to make verifiable commitments that are robust to the properties of such transactions, those commitments are best made by KEL anchors, or indirectly by TEL anchors which are in turn KEL anchored.
This means that changes in key state during the lifespan of an indefinite duration transaction do not cause the transaction to fail.
But these transactions are more vulnerable to DoS attack because the attack is amplified by the fact that the packets can't be dropped at any layer lower than the application layer. Consequently, a design that tunnels application layer transactions through lower layer transactions with properties like those KRAM provides is much better overall.
Let's say you want to have application logic where a human makes a non-repudiable commitment to some data. This commitment needs to be valid for an indefinite period of time (or some defined period of time that is human scale, like 30 days, 3 months, a year, or a lifetime). No failure at any lower layer may cause this commitment to fail. So a reliable application layer transaction will often use acknowledged messages with retries at the application layer. There is no limit to the number of retries; they go on forever unless and until an acknowledgment is received or the user explicitly aborts the transaction. They may use an exponentially increasing retry timer so that the network is not flooded with retries. But the only conditions under which the retries stop are that an acknowledgment is received or that the originator of the message explicitly interacts at the application layer to interrupt or abort the transaction (and that may only be possible after a certain human-scale timer has expired).
The transaction itself then either lasts forever, waiting for a response from the other party, or ends when the originator aborts it.
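A minimal sender-side sketch of that policy, where `send` and `poll_reply` are hypothetical lower-layer hooks: retries are unbounded and back off exponentially, and only an ACK or an explicit abort stops them.

```python
# Sketch: application-layer reliability on the sending side.
import threading


def reliable_send(msg: dict, send, poll_reply, aborted: threading.Event,
                  base: float = 1.0, cap: float = 3600.0) -> bool:
    """Returns True on ACK; returns False only if the originator aborts.
    send/poll_reply are hypothetical lower-layer hooks."""
    delay = base
    while not aborted.is_set():
        send(msg)                        # retries are idempotent: same SAID
        if poll_reply(timeout=delay) == "ACK":
            return True
        delay = min(delay * 2, cap)      # back off so retries don't flood the net
    return False
```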
This application layer commitment is embedded or transported as a payload inside of transport/session/presentation layer wrappers. So the application layer commitment is on the payload, not on the wrapper of the over-the-wire packet. However, because authentication at the presentation layer requires a signature on the wrapped packet to protect against impersonation, there is a non-repudiable signature on the wrapper that is transparently verified and stripped before the application layer payload, with its own application layer commitment, is delivered to the application layer. With KERI, instead of using bare signatures as non-repudiable application layer commitments, I suggest that anchoring the payload in the KEL is a more appropriate application layer commitment that provides better human-time-scale compatibility. The anchor seal reference is also more compact than a signature. For example, should the keys be rotated during the time frame of a transaction that spans weeks, a bare signature would become invalid (unverifiable) and the transaction would have to be restarted from scratch. But a KEL anchor would still be valid (verifiable) because it is inextricably tied to the key state at the time it was anchored and remains valid after the keys are rotated. Hence my previous statement that to support application layer human-time-scale transactions we are better off using KEL anchors as commitments, not bare signatures. Bare signatures are necessarily ephemeral and are only appropriate for ephemeral wrappers that exist at lower layers, like IPEX EXNs, rather than at the application layer (the EXN payload).
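To illustrate, here is a minimal sketch of anchoring a payload commitment in a KEL, with a schematic event shape (not actual KERI/CESR serialization) and sha256 standing in for a SAID digest. The point is that the seal binds the payload digest to the event at a given sequence number, so verification survives later key rotations as long as the KEL itself verifies.

```python
# Sketch: a payload commitment made by KEL anchoring rather than a bare
# signature. Event shape is schematic, not real KERI serialization.
import hashlib
import json


def digest(obj: dict) -> str:
    """sha256 over canonical JSON, standing in for a real SAID."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()


def anchor(kel: list, payload: dict) -> int:
    """Append an interaction event whose seal commits to the payload;
    returns the sequence number where the commitment is anchored."""
    event = {
        "s": len(kel),                        # sequence number
        "p": digest(kel[-1]) if kel else "",  # prior event digest (chaining)
        "a": [{"d": digest(payload)}],        # anchor seal: payload SAID
    }
    kel.append(event)
    return event["s"]


def verify_anchor(kel: list, payload: dict, sn: int) -> bool:
    """Still verifiable after any later rotation events: the seal is tied to
    the key state at sequence number sn, not to the current keys."""
    return any(seal["d"] == digest(payload) for seal in kel[sn]["a"])


# Usage: anchor, then rotate (schematically), and the anchor still verifies.
kel: list = []
sn = anchor(kel, {"act": "commit", "data": "..."})
kel.append({"s": len(kel), "p": digest(kel[-1]), "t": "rot"})  # schematic rotation
assert verify_anchor(kel, {"act": "commit", "data": "..."}, sn)
```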
So for me, any discussion of bare signatures on packets should only be with reference to transactions in layers lower than the application layer, whereas application layer transactions should be using KEL-anchored commitments.
Things like KRAM date-time stamps are in the EXN wrapper, not in its payload. So a retry of the original message that gets a new EXN wrapper does not require a recommitment to its payload but merely a re-signing of the new wrapper, because the new wrapper has a different date-time stamp. The original payload and its associated payload-level (i.e. application layer) commitment are unaffected by what happens to the wrapper.
This seems intuitively obvious to me.
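A minimal sketch of that separation, with a hypothetical `wrap_sign` for the ephemeral wrapper signature: each (re)send gets a fresh KRAM date-time stamp and wrapper signature, while the payload is byte-for-byte unchanged.

```python
# Sketch: re-wrapping on retry; the payload commitment is untouched.
import datetime
import hashlib
import json


def wrap(payload: dict, wrap_sign) -> dict:
    """Fresh wrapper per (re)send: new KRAM date-time, new wrapper signature."""
    stamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
    body = {"dt": stamp, "payload": payload}
    sig = wrap_sign(json.dumps(body, sort_keys=True))  # ephemeral, wrapper-only
    return {**body, "sig": sig}


def fake_sign(data: str) -> str:
    """Stand-in signer for illustration; a real wrapper uses key-state sigs."""
    return hashlib.sha256(data.encode()).hexdigest()


# Two retries of the same payload: different wrappers and wrapper signatures,
# identical payload, so the application layer commitment is unaffected.
w1 = wrap({"act": "offer"}, fake_sign)
w2 = wrap({"act": "offer"}, fake_sign)
assert w1["payload"] == w2["payload"]
```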
EXN messages are authenticatable, replay-attack-protected wrappers of their payloads. Should the wrapper get dropped because the EXN was not delivered in a timely fashion, the application layer just resends a retry, which gets a new wrapper as it descends the protocol stack. The replay attack protection windows only need to accommodate network latencies, not human-time-scale commitments. Presentations, in contrast, want to be at ephemeral time scales for security reasons, to prove that the entity making the presentation is currently the same entity that controls the key state at the time of the presentation.
At some point, if long-time-delay store-and-forward intermediaries are needed for presentations that span longer time scales, when immediacy of proof of control is less important, then SPAC/TSP is designed to solve that problem. Notwithstanding that TSP can solve this problem, when not using TSP but using full KRAM, it is not unreasonable to have replay attack protection caches that allow days- or weeks-long time windows without needing TSP-class intermediaries.
In comparison, synchronous nonces that depend on a TCP connection staying up for days or weeks as a replay attack mechanism will also fail. Making these reliable would require a timed cache of nonces, which cache must outlive any given TCP connection. This is essentially the same infrastructure as full KRAM, but with double the number of packets.
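For comparison, a minimal sketch of what such a connection-outliving timed nonce cache would have to look like, which is essentially the replay-window cache that full KRAM already implies:

```python
# Sketch: timed nonce cache that must outlive any TCP connection.
import time


class NonceCache:
    """Replay-window cache keyed by nonce, independent of any connection."""

    def __init__(self, window: float = 7 * 24 * 3600.0):  # e.g. one week
        self.window = window
        self.seen: dict = {}  # nonce -> time first seen

    def accept(self, nonce: str) -> bool:
        now = time.time()
        # expire nonces that have aged out of the replay window
        self.seen = {n: t for n, t in self.seen.items() if now - t <= self.window}
        if nonce in self.seen:
            return False      # replay within the window: drop
        self.seen[nonce] = now
        return True
```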
From my perspective, these sorts of human-interaction challenges can only be met with what are called in the literature true reliable application-layer end-to-end transactions. These always sit on top of whatever transport/session/presentation layer features exist, such as replay attack protection, authenticity, and encryption. To be clear, these are features that exist at the lower layers of the protocol stack, i.e. the transport, session, and presentation layers. Reliability at the application layer means that any failures at lower layers, such as timeouts, have no effect on the reliability of the application layer transaction. If they do, then we have a layer violation. Your examples are layer violations because you want the human-interaction time scales of the application layer logic to be met by lower layers.
We don't have an application layer end-to-end transaction protocol. IPEX is not an application layer end-to-end transaction protocol. As its name indicates, it is a presentation layer protocol. So the use cases you point out above are, in my opinion, correctly not going to be met by IPEX, because there is no support for reliable application layer end-to-end transactions in IPEX. I whole-heartedly agree that IPEX is inadequate, but not surprisingly so, because that is not its purpose. Reliable application layer end-to-end transaction support is needed for the use cases you mention. And we don't have that anywhere.
Let me use an example. TCP, a transport layer protocol, can fail catastrophically and lose all in-flight packets, never to be recovered. It has no concept of restarting a prior connection in a way that reliably restores in-flight packets. So any protocol that sits above TCP has to have another reliability layer that detects if a TCP connection ever drops and then restarts whatever it was doing in a way that is robust to the fact that some packets were irretrievably lost. It can't restart where it was because it can't know what packets were dropped when the connection dropped. It must have a higher-layer concept of a reliable transaction.

IP is weakest when one analyzes these sorts of problems because IP has only 4 layers. It doesn't have Session, Presentation, and Application layers, so there is no normative concept of a reliable application layer end-to-end transaction. But ISO OSI protocols have this as a normative part of the stack, and it sits on top of the session and presentation layers where encryption and authentication occur. So when using TCP/IP, the top three layers are often bespoke to a given application, with a mashup of bespoke features. For example, the whole idea of replay attack protection is not an application layer feature. It belongs in a lower layer, and it would never be something observable in human-interaction application activity like "consent". So I think we are in agreement, but maybe not in how to fix it. EXNs have payloads; those payloads could include reliable application layer transaction state. EXNs are a transport layer mechanism that has session and presentation layer wrappers.
To be clear, a reliable application layer end-to-end transaction keeps the transaction state despite failures at any lower layer. So, for example, if the transaction pauses to wait for human-time-scale consent, the transaction layer should not care that the transport layer times out or drops packets. It should not care that the current session ends or is dropped, or that keys were rotated. It should not care that a presentation times out or that a replay attack protection window drops a packet. They can all fail, and a reliable application layer transaction will be robust to those failures and able to resume the transaction despite them. Usually this means it can appropriately restart any and all activity needed at lower layers when and if the application layer decides a packet needs to be transported/sessioned/presented/authenticated/encrypted/replay-attack-protected.
So we should not be trying to shoehorn reliable application layer transaction logic into lower layers. We should design application layer transactions as application layer transactions with any and all appropriate reliability mechanisms for the needs of the application layer. If the layers below fail, the application layer transaction by design should still succeed.
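A minimal sketch of that design, assuming a hypothetical `transport` callable that may raise on any lower-layer failure: transaction state is persisted keyed by the transaction SAID, so a dropped connection, an ended session, or a rotated key never loses the transaction, and unsent messages are simply re-presented down a fresh stack.

```python
# Sketch: a resumable application-layer transaction robust to lower-layer
# failures. The persistence scheme and field names are illustrative.
import json
import pathlib


class Transaction:
    """State survives any lower-layer failure by persisting under its SAID."""

    def __init__(self, tid: str, store: pathlib.Path):
        store.mkdir(parents=True, exist_ok=True)
        self.path = store / f"{tid}.json"
        self.state = (json.loads(self.path.read_text()) if self.path.exists()
                      else {"tid": tid, "step": 0, "outbox": []})

    def advance(self, msg: dict) -> None:
        """Advance the state machine; durable before anything is sent."""
        self.state["step"] += 1
        self.state["outbox"].append(msg)
        self.path.write_text(json.dumps(self.state))

    def flush(self, transport) -> None:
        """Re-runnable after any failure: unsent messages are re-presented
        down a fresh transport/session/presentation stack."""
        while self.state["outbox"]:
            transport(self.state["outbox"][0])   # may raise; just flush again later
            self.state["outbox"].pop(0)
            self.path.write_text(json.dumps(self.state))
```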
A specific example: the LonTalk protocol (ISO/IEC 14908-1, ANSI/CEA 709.1) is a full seven-layer OSI protocol. I wrote the C reference implementation that is in the standard. One of the features that I loved (which it had back in 1989, before the idea of IoT existed) was end-to-end application layer transactions. So if I needed to ensure that a transaction state machine progressed independent of any timers, timeouts, escrows, or failures of any features of the lower layers, I created an application layer end-to-end transaction with that state machine, and it just worked. Most people who have never used OSI seven-layer protocols have never encountered a reliable application layer end-to-end transaction. They don't know what they are or how they work.
In my experience, human-scale interactions need reliable application layer end-to-end transactions, not anything at any lower layer. But IPEX is not a reliable application layer end-to-end transaction protocol. Adding such transactions to IPEX would make it a different protocol. To recapitulate, IPEX (as its name indicates) was meant to be a presentation layer protocol, not an application layer protocol.
To elaborate, IMHO a multi-sig group signature collection protocol is an example of either a tiered presentation layer protocol or a simple application layer protocol, depending on how it manages failures at the IPEX layer (i.e. the presentation layer).
IMHO, IP has created two generations of protocol designers who create bespoke protocols that are often poorly designed mashups of session/presentation/application layer functionality. Repurposing special nonces meant for replay attack protection for application layer logic is one example.
What is no doubt disappointing is that the KERI suite of protocols does not have support for reliable application layer end-to-end transactions. So I agree with you that there is a clear need for such support, but this is beyond the current KERI suite of protocols. It will require some effort to design and build.
When issuing, where the issuer is multisig and there is no negotiation, the EXNs like the offer, grant, etc. don't have to be multisig because they are merely wrappers. The issuance itself has already been multiply signed by virtue of the event that anchors the issuance being in the KEL. The commitment of the multisig group is established that way, so having multi-sig on the EXN is superfluous. Any member of the multisig group (in this case the leader) can single-sig the EXN that conveys the issuance. If you want the offer to be signed, then you collect multiple signatures on the payload of the offer, not on the offer itself. Once again, the offer is just single-sig by the leader.
All the timeout problems were arising because the EXNs for the IPEX negotiations were timing out trying to collect signatures. But the pre-protocol collects signatures on an offer (not a grant, since the grant is post-issuance), i.e. on the EXN payload, which does not time out. And it only becomes invalid if enough participants rotate keys during the transaction's life span.
So I think we should propose that IPEX, when used for issuance from a multi-sig group, use a leader, with all the issuer EXNs single-sig but with thresholded multi-sig on the payload.
This is essentially creating two layers (see the sketch after this list):
- Top layer: collecting signatures on the payload.
- Next layer down: the EXN wrapper, which is single-sig by one member of the group multisig. (No race condition, unless malicious group members propose a competing issuance to one they already signed.)
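A minimal sketch of those two layers, where signers are hypothetical `(aid, sign)` pairs rather than keripy objects: member signatures are collected on the payload SAID at the top layer (no timeout), and the leader single-signs the EXN wrapper below it.

```python
# Sketch: threshold multi-sig on the payload, single-sig on the EXN wrapper.
import hashlib
import json


def said(obj: dict) -> str:
    """sha256 over canonical JSON, standing in for a real SAID."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()


def collect_payload_sigs(payload: dict, members: list, threshold: int) -> list:
    """Top layer: gather member signatures on the payload SAID. No timeout;
    this collection may take human time scales."""
    d = said(payload)
    sigs = []
    for aid, sign in members:            # members: hypothetical (aid, sign) pairs
        sigs.append({"aid": aid, "sig": sign(d)})
        if len(sigs) >= threshold:       # stop at a threshold-satisficing set
            break
    return sigs


def make_exn(payload: dict, sigs: list, leader) -> dict:
    """Lower layer: the EXN wrapper is single-signed by the leader only."""
    aid, sign = leader
    exn = {"i": aid, "payload": payload, "payload_sigs": sigs}
    exn["sig"] = sign(said(exn))
    return exn
```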
The same goes for presenting. The multisig needs to be on a reference to the issuance being "presented", which is the payload of the EXN, not the EXN itself. Any member of the presenting multi-sig can singly sign the EXN. But there needs to be attached or embedded a threshold-satisficing number of signatures on the reference to the issuance being presented. These can be collected with a higher-layer collection protocol, indeed one similar to the collection protocol for issuances.
The difference in code is that in this proposal the EXN is sourced by the AID of a single group member, not the AID of the group itself. The receiver then needs to validate that the sender AID is a member of the group, or at least that the signing public key in the sender's current key state is one of the signing keys in the Group AID's current key state: the Group AID of the issuer when it is an issuance exchange, or the Group AID of the presenter of the presented issuance when it is a presentation exchange.
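A minimal sketch of that receiver-side check, with a hypothetical `group_keystate` lookup and per-signature cryptographic verification elided: the wrapper's single signer must be a group member, and the attached payload signatures must satisfy the group's threshold.

```python
# Sketch: validate that the EXN's single signer is a group member and that
# the payload carries a threshold-satisficing set of member signatures.
def validate_exn(exn: dict, group_aid: str, group_keystate) -> bool:
    """group_keystate is a hypothetical lookup returning the group's current
    member AIDs; actual signature verification is elided in this sketch."""
    members = group_keystate(group_aid)          # e.g. set of member AIDs
    if exn["i"] not in members:
        return False                             # wrapper signer not in the group
    good = {s["aid"] for s in exn["payload_sigs"] if s["aid"] in members}
    threshold = (len(members) // 2) + 1          # illustrative simple majority
    return len(good) >= threshold                # threshold-satisficing payload sigs
```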