A concise description #916

Open
bmottershead opened this issue Jan 1, 2025 · 7 comments
Labels
bug Something isn't working triage

Comments

@bmottershead

bmottershead commented Jan 1, 2025

Version

$ kli version
Library version: 1.2.0-rc1

Environment

Linux Fedora 40, Python 3.12.8

Expected behavior

After executing the following kli commands, from the initial state,

$ kli init -n test --nopasscode
$ kli incept -n test -a allie --icount 1 --isith 1 --ncount 1 --nsith 1 --toad 0
$ kli export -n test -a allie

The result should be a valid JSON-encoded CESR stream.

Actual behavior

The kli export command outputs the following:
{"v":"KERI10JSON0000fd_","t":"icp","d":"EE7Uf-Kf6c81FlujBXpdewc_lWsxOf-LC61GztFpK8Ce","i":"BK1HPsMhM2O1SEmY8ig0bbnqtwQeUIBFEac3o_pNuFFL","s":"0","kt":"1","k":["BK1HPsMhM2O1SEmY8ig0bbnqtwQeUIBFEac3o_pNuFFL"],"nt":"0","n":[],"bt":"0","b":[],"c":[],"a":[]}-VAn-AABAADlS7aGNiF4sH9FY1F8NWPS-I-E981wfvQWnQ99EMbV4cfNF0T1RGrPa0TydVcSKNSTrH469M9_xjhljqhS3DEH-EAB0AAAAAAAAAAAAAAAAAAAAAAA1AAG2025-01-01T00c25c42d722479p00c00

This has invalid data appended to what seems like it might be valid JSON.

Steps to reproduce

See above

@bmottershead bmottershead added bug Something isn't working triage labels Jan 1, 2025
@daidoji
Contributor

daidoji commented Jan 1, 2025

This is a well-formed CESR stream. CESR streams can be composed of field maps (like the JSON output in this stream), op codes (which aren't specified yet), count codes (a TLV scheme), and commented annotations. See https://trustoverip.github.io/tswg-cesr-specification/ for details.

@bmottershead
Author

bmottershead commented Jan 1, 2025 via email

@dc7
Contributor

dc7 commented Jan 1, 2025

The spec is a bit of a work in progress! I would suggest starting at Section 9.5: Cold start Stream parsing. Take the first 3 bits and look them up in this table. The values are cunningly chosen so that whether you start with a left brace, a dash, a MessagePack map, a CBOR map, etc., the first 3 bits are always unique. Your stream starts with a left brace, so you parse it as a JSON object until you get to the matching right brace, then you go back to the table. Next you have a dash, so it must be a count code in the text domain. Look it up in this table (don't forget to use the table for the right genus and version!) and it tells you how many bytes to read to get the length of the payload. Once you know the length you can read the payload. In this case the payload also starts with a count code, so recursively parse the payload as a CESR stream until the payload is exhausted, then continue parsing the original stream until that's exhausted.
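
The first two steps described above (sniff the leading bits, then peel the JSON object off the front of the stream) can be sketched in Python roughly as follows. This is a minimal illustration, not keripy's parser. The names `sniff` and `split_json_head` are made up for this example, only the two leading-bit values that occur in this issue's stream are handled, and the size check assumes the version string layout seen in `KERI10JSON0000fd_`, where the six hex digits before the underscore give the serialized length of the event.

```python
import json

# Top three bits of the first byte, for the two cases seen in this issue.
# The labels are illustrative; the CESR spec's cold-start table covers the rest.
TRITETS = {
    0b011: "json",        # '{' is 0x7b, whose top three bits are 0b011
    0b001: "count-text",  # '-' is 0x2d, whose top three bits are 0b001
}

def sniff(stream: str) -> str:
    """Classify what starts at the head of the stream by its first three bits."""
    tritet = ord(stream[0]) >> 5
    return TRITETS.get(tritet, "unhandled")

def split_json_head(stream: str):
    """Return (event, raw_event_text, remainder) for a stream that starts with JSON."""
    event, end = json.JSONDecoder().raw_decode(stream)
    raw = stream[:end]
    # Assumed layout: "KERI" + 2 version chars + "JSON" + 6 hex size chars + "_"
    declared = int(event["v"][10:16], 16)
    actual = len(raw.encode("utf-8"))
    if declared != actual:
        raise ValueError(f"declared size {declared} != actual size {actual}")
    return event, raw, stream[end:]
```

Applied to the stream in this issue, `sniff` would report JSON for the leading `{`, and the remainder returned by `split_json_head` begins with `-VAn`, which sniffs as a text-domain count code. Keeping `raw` as the exact text received matters because the attached signature is over those serialized bytes.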

I would parse the attachments as follows:

CD_V_AttachmentGroup: (160) bytes
  CD_A_ControllerIdxSigs: (1) count
    IDX_A_Ed25519_Sig: AADlS7aGNiF4sH9FY1F8NWPS-I-E981wfvQWnQ99EMbV4cfNF0T1RGrPa0TydVcSKNSTrH469M9_xjhljqhS3DEH
  CD_E_FirstSeenReplayCouples: (2) count
    CD_0A_Random_salt_seed_nonce: 0AAAAAAAAAAAAAAAAAAAAAAA
    CD_1AAG_DateTime: 1AAG2025-01-01T00c25c42d722479p00c00

In other words, the outer layer is a variable length attachment group describing a digest seal (for the JSON map). The first element is an indexed list of signatures containing a single Ed25519 signature. The second element is a replay couplet, containing a sequence number (0) and a datetime (with our "custom encoding" applied).
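
To make the last two primitives concrete, here is a small Python sketch that decodes them. It assumes that the 22 characters after the `0A` code are a URL-safe Base64 big-endian integer, and that the datetime "custom encoding" substitutes `c` for `:`, `d` for `.`, and `p` for `+` so the value stays within the Base64 alphabet. The helper names are hypothetical and this is not keripy's API.

```python
from datetime import datetime

# URL-safe Base64 alphabet; a character's position is its 6-bit value.
B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"

def b64_to_int(text: str) -> int:
    """Interpret Base64 characters as a big-endian base-64 number."""
    n = 0
    for ch in text:
        n = n * 64 + B64.index(ch)
    return n

def decode_seqnum(prim: str) -> int:
    """Decode a 24-character primitive with code '0A' as an integer (assumed layout)."""
    if not prim.startswith("0A") or len(prim) != 24:
        raise ValueError("expected a 24-character primitive with code 0A")
    return b64_to_int(prim[2:])

def decode_datetime(prim: str) -> datetime:
    """Decode a 36-character primitive with code '1AAG' as an ISO 8601 datetime."""
    if not prim.startswith("1AAG") or len(prim) != 36:
        raise ValueError("expected a 36-character primitive with code 1AAG")
    text = prim[4:].replace("c", ":").replace("d", ".").replace("p", "+")
    return datetime.fromisoformat(text)

seq = "0A" + "A" * 22  # the sequence-number primitive from the parse above
print(decode_seqnum(seq))  # 0
print(decode_datetime("1AAG2025-01-01T00c25c42d722479p00c00").isoformat())
# 2025-01-01T00:25:42.722479+00:00
```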

Calculating the length of things can get a little complicated... I'd suggest starting with Section 9: Text coding scheme design. You might also notice that some count codes (V) have their length specified in bytes and others (A & E) have their length specified in a number of primitives, so for those you need to parse the payloads until the proper number of primitives have been read.
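
As a rough illustration of the length bookkeeping, the Python sketch below decodes the two four-character count codes at the front of the attachment. It assumes the layout seen in this stream: a dash, one type character, then two Base64 characters holding the count. It also assumes the `-V` count is measured in quadlets (groups of four text characters). Under those assumptions `-VAn` covers 39 x 4 = 156 characters, which together with the 4-character code itself would account for the 160 listed above. The function names are illustrative only.

```python
# URL-safe Base64 alphabet; a character's position is its 6-bit value.
B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"

def b64_to_int(text: str) -> int:
    """Interpret Base64 characters as a big-endian base-64 number."""
    n = 0
    for ch in text:
        n = n * 64 + B64.index(ch)
    return n

def read_counter(stream: str, pos: int = 0):
    """Return (type_char, count, next_pos) for a 4-character text-domain count code."""
    if stream[pos] != "-":
        raise ValueError("not a count code")
    kind = stream[pos + 1]
    count = b64_to_int(stream[pos + 2 : pos + 4])
    return kind, count, pos + 4

head = "-VAn-AAB"  # first eight characters of the attachment in this issue

kind, count, pos = read_counter(head)
print(kind, count, count * 4)  # V 39 156

kind, count, pos = read_counter(head, pos)
print(kind, count)  # A 1
```

For `-V` the parser can slice off `count * 4` characters and recurse into them. For `-A` the count says how many indexed signatures to read, so the parser has to decode each primitive in turn to learn where the group ends.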

@bmottershead
Author

OK, thank you for the explanation. But what is this attachment? The help for kli states that "export" outputs a stream of key events, presumably the KEL for the key, and you can see that there is one key event for this key, a JSON-encoded "icp" inception event. This is followed by an "attachment", encoded for some reason a different way. Is this attachment related to that "icp" event? Do events always have attachments, and what ties the attachment to the event? Do its attachments always immediately follow the event in the stream? Where is the KEL stream documented? The attachment as you have parsed it seems to be a signature, judging from the labels. Is it a signature on the icp event? If so, why aren't the signatures JSON-encoded and included with the rest of the event, which is JSON-encoded?

@daidoji
Contributor

daidoji commented Jan 5, 2025

> But what is this attachment? The help for kli states that "export" outputs a stream of key events, presumably the KEL for the key, and you can see that there is one key event for this key, a JSON-encoded "icp" inception event. This is followed by an "attachment", encoded for some reason a different way. Is this attachment related to that "icp" event?

This attachment is a replay attack prevention mechanism consisting of an indexed signature (by the key listed in the event, in the order it's listed; in this case one key in the first place) as well as the sequence number and datetime of the event. The event is a JSON-encoded field map; the attachments are the authentication and replay attack protection mechanisms. In KERI (and ACDC and other protocols) the events or credentials are associated with the attachments necessary to do whatever they're trying to do securely. So in KERI, in order to prevent all kinds of attacks, various attachments are added to events, and the receiver of the event then has to verify them for shenanigans.

> Do events always have attachments, and what ties the attachment to the event?

In KERI, yes, as far as I'm aware. We have to prove we hold the keys in order to verify that the event we're communicating hasn't been tampered with, and we want to prove that no one else is doing stuff, etc. Most cryptographic systems will require some form of proof, and in KERI the attachments are the general mechanism (as well as the well-formedness of the events themselves).

> Do its attachments always immediately follow the event in the stream?

In KERI, in the current implementation, mostly. Some things like delegated signatures may have chopped-up CESR streams, I think, but most of the time these attachments will follow the events directly for ease of parsing by the receiver.

> Where is the KEL stream documented?

The things that make up the attachments that correspond to the event stream's primitives and CESR encoding are specified in the CESR spec. The combination of those primitives and count codes into particular types of attachments have been deemed out of scope to the KERI, CESR, ACDC specs. Currently these must be reverse engineered from keripy output itself.

> The attachment as you have parsed it seems to be a signature, judging from the labels. Is it a signature on the icp event? If so, why aren't the signatures JSON-encoded and included with the rest of the event, which is JSON-encoded?

Yes, as mentioned above, it is a signature on the event by the key included in the event (proving that you hold the key and signed the event). The signatures aren't JSON-encoded and included in the rest of the event because the spec authors made a design choice not to do so, in order to get around a bunch of difficult canonicalization and serialization problems that result in malleability attacks on other protocols. By serializing the event and then attaching signatures, the verification process is much more straightforward and none of these attacks are possible (at the expense of a stream with multiple data formats).
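
A small Python illustration of the canonicalization problem mentioned here, using only the standard library. Several serializations of the same JSON object produce different bytes and therefore different digests, so a signature embedded inside the JSON would force signer and verifier to agree on one exact re-serialization. Signing the bytes exactly as transmitted and attaching the signature afterwards sidesteps that. SHA-256 stands in here for whatever is computed over the signed bytes, and the event content is made up for the example.

```python
import hashlib
import json

# A made-up event; the field values are placeholders, not real KERI data.
event = {"t": "icp", "s": "0", "k": ["key-placeholder"]}

# Three serializations of the same logical object.
compact = json.dumps(event, separators=(",", ":")).encode("utf-8")
spaced = json.dumps(event).encode("utf-8")  # default separators add spaces
reordered = json.dumps(event, separators=(",", ":"), sort_keys=True).encode("utf-8")

def digest(data: bytes) -> str:
    """Hex SHA-256 of the exact bytes given."""
    return hashlib.sha256(data).hexdigest()

# All three parse back to the same object...
assert json.loads(compact) == json.loads(spaced) == json.loads(reordered)

# ...but the bytes, and so anything computed over them, differ.
print(digest(compact) == digest(spaced))     # False
print(digest(compact) == digest(reordered))  # False
```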

Hope this helps.

@bmottershead
Author

bmottershead commented Jan 5, 2025

Thank you for the further detail. I know you are spending a fair amount of time writing these responses, and I appreciate it. But regarding this: "The combination of those primitives and count codes into particular types of attachments have been deemed out of scope to the KERI, CESR, ACDC specs. Currently these must be reverse engineered from keripy output itself." I am somewhat amazed by this statement. Why go to all the trouble of writing (or reading) a spec if it does not provide complete information that enables someone to implement it, or for differences between multiple implementations to be resolved? It isn't as if the precise contents and format of the KEL are side issues. Have I misunderstood something, or aren't KEL generation, discovery, and verification of the essence when it comes to KERI? But none of these are covered by the KERI spec, despite it not being particularly succinct. What is the other spec where these are deemed in scope?

@daidoji
Contributor

daidoji commented Jan 5, 2025

@bmottershead that's an excellent question and one I had when first joining the community. The major deployment of keripy (and the associated KERI implementations in that family: keripy, keria, signify-ts, signifypy) is the GLEIF ecosystem, and the GLEIF Governance Framework describes what members of the GLEIF ecosystem must do with the KERI tooling. The KERI/ACDC/CESR specs describe the major components that make up KERI/ACDC/CESR, and a variety of (discontinued) IETF specs describe other pieces that are used today, like IPEX/OOBI and some other protocols and standards implemented in the main keripy family of tools.

However, as far as I'm aware, there is no spec that describes how these tools work down to that level of detail. This has been pointed out in the community discussions, and while I'm personally for a specification down to that level of detail (because we're implementing KERI at vleida.net and would like to interoperate with the keripy family of tools), the community has unfortunately chosen NOT to create specifications to that level of detail.

We have some informal data sets and specifications we use at vleida, and perhaps GLEIF or the QVIs have their own informal specifications, but there is nothing that's public to the whole community.
