spec: Clarify Referrers Tag Schema vs. alternative algorithms #563
I think this is a reasonable proposal. Implementing that client-side would be a bit of an extra hassle because the referrers tags may now have collisions for multiple manifests that share a digest prefix, but certainly doable.
Alternatively, would we consider outright forbidding the use of referrers tags for sha512, and mandating the use of the 1.1 API exclusively? Given the tag limit, it is possible to read today’s spec to already require that; and the referrers tag schema is a compatibility mechanism, so it is a bit surprising to keep adding new features to it.
But then again, I realize that not specifying referrers tags for sha512 would impose more requirements on both clients and registries to support sha512 — they would also have to support the referrers API as a precondition. Maybe we prefer wide and early adoption of sha512 support, and we don’t want to add this precondition.
The outgoing:
certainly looks like it is open to the idea of
as "truncate the
that would be another option. I'd guess that the focus on I don't see much benefit to restricting Referrers Tags to
But the existing distribution spec releases don't forbid registries from supporting |
With 64 characters, it's the same likelihood as a sha256 collision, extremely small. But clients are able to detect that when they pull an entry and the full subject digest doesn't match.
A random collision, yes. When considering intentional attacks (the risk of which motivates the move away from sha256)… above my pay grade.
Yes; if we decide to document the truncate-to-64 behavior, I think an explicit “client SHOULD check the |
I think checking |
To be explicit, I’m a nobody on this repo.
The tag schema is considered a placeholder while registries implement referrers. I would rather we scope the usage to only that, and even consider removing the tag schema section in 2.0 or leaving it in as a backward-compatibility mechanism, similar to Docker media types.
From today's call, the direction I'm leaning is two separate truncations: a 32-character truncation for the algorithm, and a 64-character truncation for the hash. And keep the recommended replacement of any unsupported characters with a `-`. That remains compatible with the v1.1 spec for registered algorithms, and leaves 31 characters in the future if we need to define another tag schema for something else.
From the referenced OCI spec:

    digest                ::= algorithm ":" encoded
    algorithm             ::= algorithm-component (algorithm-separator algorithm-component)*
    algorithm-component   ::= [a-z0-9]+
    algorithm-separator   ::= [+._-]
    encoded               ::= [a-zA-Z0-9=_-]+

But from the distribution-spec:

    Throughout this document, `<reference>` as a tag MUST be at most 128 characters in length and MUST match the following regular expression: `[a-zA-Z0-9_][a-zA-Z0-9._-]{0,127}`

Happily, the first character of algorithm must match algorithm-component, and its [a-z0-9] is a subset of the tag regexp's opening [a-zA-Z0-9_]. And the colon separating algorithm from encoded was already addressed in the outgoing text. But the digest definition also allows + in the algorithm-separator and = in the encoded portion, which the tag regexp does not allow, so with the incoming wording I'm requiring that to be replaced by a - as well, so clients make consistent choices when deciding how to handle that character while forming distribution-spec referrer tags.

We need some overall truncation to keep the tag under 128 characters, again so clients make consistent choices when trying to compress from the strings the digest specification allows to the strings tags allow. There is no requirement in the distribution spec as far as I can tell that registries support tags up to 128 characters, but given that the spec explicitly requires clients to not exceed that length, it seems likely that registries will allow tags of that length, and not require further truncation.

I'm requiring clients to truncate the algorithm to 32 characters and the encoded section to 64 characters, because that's one possible reading of the outgoing "limit of 64 characters" parenthetical, at least one client had implemented it that way [1], and Brandon explicitly requested the 32-and-64 approach [2].

And clients are obviously free to create whatever tags they like that the registry will accept. The MUST I'm adding does not forbid that. It only clarifies the single distribution-spec Referrers Tag associated with a given digest, because if there could be multiple Referrers Tags for each digest, all distribution-spec referrer-retrieving clients would have to iterate over that whole set of possibilities, in case some distribution-spec referrer-pushing client happened to use one of that digest's other Referrers Tag formats.

[1]: https://github.com/regclient/regclient/blob/dbb1434fd4b8b650983e8c51933789712e05eeaa/types/referrer/referrer.go#L157
[2]: opencontainers#563 (review)

Signed-off-by: W. Trevor King <[email protected]>
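The 32-and-64 mapping described in that commit message can be sketched in a few lines of POSIX shell. This is only an illustration under stated assumptions: `referrers_tag` and its inner `sanitize` helper are hypothetical names, the input is assumed to be a well-formed `algorithm:encoded` digest, and the exact wording of the spec text takes precedence over this sketch.

```sh
# Hypothetical helper (not from any spec or client): map a digest to a
# Referrers Tag by replacing the ':' separator and any character the tag
# regexp rejects with '-', truncating the algorithm to 32 characters and
# the encoded portion to 64 characters.
referrers_tag() (
  digest="$1"
  algorithm="${digest%%:*}"   # text before the first ':'
  encoded="${digest#*:}"      # text after the first ':'

  # Replace every character outside [a-zA-Z0-9._-] with '-', then keep
  # at most $2 characters. The newline is kept so cut sees one line.
  sanitize() {
    printf '%s\n' "$1" | LC_ALL=C tr -c 'a-zA-Z0-9._\n-' '-' | cut -c"1-$2"
  }

  printf '%s-%s\n' "$(sanitize "$algorithm" 32)" "$(sanitize "$encoded" 64)"
)
```

Because the replacement is one character for one character, the order of sanitizing and truncating does not change the result. A `sha256` digest passes through with only the colon replaced, while a `sha512` digest keeps the first 64 of its 128 hex characters.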
LGTM
@@ -721,14 +721,17 @@ A client querying the [referrers API](#listing-referrers) and receiving a `404 N

##### Referrers Tag Schema
There have been two main debates associated with this pull request:
- These earlier comments discussing "do we even need a Referrers Tag Schema for `sha512`?". I'm starting this new thread to collect that discussion in one place, without spreading it around and interleaving with other discussion.
- "If we clarify how the Referrers Tag Schema handles `sha512` and other algorithms, what should the tag for a given digest actually be?", which is getting hashed out in this prior thread.
Arguments in favor of not declaring Referrers Tag Schema tags for `sha512` and other algorithms seem to mostly point out that the Referrers Tag Schema only comes up in 1.1.0 as a fallback for 404ing Referrers APIs. In registries that support the Referrers API, there is no need for the Referrers Tag Schema.
So the motivation for `sha512` and other non-`sha256` algorithm support in Referrers Tag Schema would be allowing distribution-spec 1.1.2 (or whatever) compatible clients to interact with a registry that does not support the Referrers API, but where the client might plausibly want to look up referrers to a `sha512:...` resource. Increasing the odds is this registry constraint:
A registry MUST initially accept an otherwise valid manifest with a subject field that references a manifest that does not exist in the repository, allowing clients to push a manifest and referrers to that manifest in either order.
I'm not clear on the scope of "initially" there; maybe it means that the registry is allowed to prune such referring manifests after some time if it feels like the referenced manifest has not appeared? But let's do some testing with a real registry!
Setting up some auth:
$ BASIC="$(jq -r '.auths["quay.io:443"].auth' ~/.config/containers/auth.json)"
$ TOKEN="$(curl -sH "Authorization: Basic ${BASIC}" 'https://quay.io/v2/auth?account=wking-red-hat%2Btesting&scope=repository%3Awking-red-hat%2Ftest%3Apull%2Cpush&service=quay.io' | jq -r .token)"
Building an example manifest using the empty descriptor so we have something to refer to:
$ cat example-manifest.json
{
"schemaVersion": 2,
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"config": {
"mediaType": "application/vnd.oci.empty.v1+json",
"digest": "sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a",
"size": 2
},
"layers": [
{
"mediaType": "application/vnd.oci.empty.v1+json",
"digest": "sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a",
"size": 2
}
]
}
$ sha256sum example-manifest.json
7bba6e1df3b71adc1e22cce2d2cd7706199e283d8a915170de1ffba6970ecb2c example-manifest.json
$ sha512sum example-manifest.json
3e28ec27b08b5454f3fb07d8eb255be177dfeb6810947b3fe0b6099d6eb3f6d08d4a28cd6128a1d5bb12851c0037f8a72653a11342dbea6d52851571fbbe7017 example-manifest.json
$ wc -c example-manifest.json
456 example-manifest.json
Pushing the empty `{}` blob to my test repository on the production Quay.io:
$ curl -iH "Authorization: Bearer ${TOKEN}" -H Content-Type:application/octet-stream --data-binary '{}' 'https://quay.io:443/v2/wking-red-hat/test/blobs/uploads/?digest=sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a'
HTTP/2 201
...
docker-content-digest: sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a
...
Pushing that manifest to my test repository on the production Quay.io:
$ curl -iH "Authorization: Bearer ${TOKEN}" -XPUT -H Content-Type:application/vnd.oci.image.manifest.v1+json --data-binary '@example-manifest.json' https://quay.io:443/v2/wking-red-hat/test/manifests/testing
HTTP/2 201
...
docker-content-digest: sha256:7bba6e1df3b71adc1e22cce2d2cd7706199e283d8a915170de1ffba6970ecb2c
location: /v2/wking-red-hat/test/manifests/sha256:7bba6e1df3b71adc1e22cce2d2cd7706199e283d8a915170de1ffba6970ecb2c
...
That's the current Quay saying "we expect you'll be wanting to reference this manifest via `sha256`...". But we have our own opinions! Let's create a second manifest to reference the first via `sha512`:
$ cat referencing-manifest.json
{
"schemaVersion": 2,
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"config": {
"mediaType": "application/vnd.oci.empty.v1+json",
"digest": "sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a",
"size": 2
},
"layers": [
{
"mediaType": "application/vnd.oci.empty.v1+json",
"digest": "sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a",
"size": 2
}
],
"subject": {
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"digest": "sha512:3e28ec27b08b5454f3fb07d8eb255be177dfeb6810947b3fe0b6099d6eb3f6d08d4a28cd6128a1d5bb12851c0037f8a72653a11342dbea6d52851571fbbe7017",
"size": 456
}
}
$ sha256sum referencing-manifest.json
9abb81db9d47792d52b7cf071b889b73605998ae832b1633b17e2a2613c8a228 referencing-manifest.json
$ wc -c referencing-manifest.json
708 referencing-manifest.json
And pushing that too:
$ curl -iH "Authorization: Bearer ${TOKEN}" -XPUT -H Content-Type:application/vnd.oci.image.manifest.v1+json --data-binary '@referencing-manifest.json' https://quay.io:443/v2/wking-red-hat/test/manifests/testing-referrer
HTTP/2 201
...
docker-content-digest: sha256:9abb81db9d47792d52b7cf071b889b73605998ae832b1633b17e2a2613c8a228
location: /v2/wking-red-hat/test/manifests/sha256:9abb81db9d47792d52b7cf071b889b73605998ae832b1633b17e2a2613c8a228
...
Quay did not respond with the `OCI-Subject` header, and we did have `subject` set, so to be compliant with the 1.1.0 spec, my `curl`-based distribution-spec client MUST:
- Pull the current referrers list using the referrers tag schema.
- ...
Here's my image index:
$ cat referrers-index.json
{
"schemaVersion": 2,
"mediaType": "application/vnd.oci.image.index.v1+json",
"manifests": [
{
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"digest": "sha256:9abb81db9d47792d52b7cf071b889b73605998ae832b1633b17e2a2613c8a228",
"size": 708
}
]
}
And pushing that index:
$ curl -iH "Authorization: Bearer ${TOKEN}" -XPUT -H Content-Type:application/vnd.oci.image.index.v1+json --data-binary '@referrers-index.json' https://quay.io:443/v2/wking-red-hat/test/manifests/sha512-3e28ec27b08b5454f3fb07d8eb255be177dfeb6810947b3fe0b6099d6eb3f6d0
HTTP/2 201
...
docker-content-digest: sha256:8518dc8ebd4374d19bb7451b93d6fd7d8f7dfe1f784c75a3b30faac03c80ef62
location: /v2/wking-red-hat/test/manifests/sha256:8518dc8ebd4374d19bb7451b93d6fd7d8f7dfe1f784c75a3b30faac03c80ef62
...
So success, it's working today, against a production registry, and I do need the clarity on how to map the `sha512:3e28...` `subject` to a Referrers Tag that this pull is trying to deliver.
#543 is in flight asking registries of the future to be able to return `Docker-Content-Digest` for manifests with manifest-pusher-requested hash algorithms. But you don't need that to make the Referrers Tag system useful for `sha512` and other algorithms. Perhaps you have a system like the OpenShift Update Service, where tags on your referenced resource are just a way to keep the registry from garbage collecting that image, and you're passing around by-digest references with your custom system:
$ curl -s 'https://api.openshift.com/api/upgrades_info/graph?arch=multi&channel=fast-4.17' | jq -r '.nodes[] | .version + " " + .payload' | sort -V | tail -n2
4.17.14 quay.io/openshift-release-dev/ocp-release@sha256:e749b5e6664e3ed5a1adce6b0b1d5a81fbdd83bcae866e5aa445c9818742584b
4.17.15 quay.io/openshift-release-dev/ocp-release@sha256:91badf6436123b93e0ba050a59d3b148215675db692cc513814e19e427736c25
If a by-digest system like that tells me that I am interested in `quay.io/wking-red-hat/test@sha512:3e28ec27b08b5454f3fb07d8eb255be177dfeb6810947b3fe0b6099d6eb3f6d08d4a28cd6128a1d5bb12851c0037f8a72653a11342dbea6d52851571fbbe7017`, and I want to pull referrers, I first check the Referrers API:
$ curl -iH "Authorization: Bearer ${TOKEN}" https://quay.io:443/v2/wking-red-hat/test/referrers/sha512:3e28ec27b08b5454f3fb07d8eb255be177dfeb6810947b3fe0b6099d6eb3f6d08d4a28cd6128a1d5bb12851c0037f8a72653a11342dbea6d52851571fbbe7017
HTTP/2 400
date: Thu, 06 Feb 2025 22:02:57 GMT
content-type: application/json
content-length: 82
server: nginx/1.22.1
vary: Cookie
{"errors":[{"code":"MANIFEST_INVALID","detail":"","message":"manifest invalid"}]}
The current spec doesn't require me to fail out on that 400, so I can fall back to this pull request's Referrers Tag Schema to map to `sha512-3e28ec27b08b5454f3fb07d8eb255be177dfeb6810947b3fe0b6099d6eb3f6d0` and pull:
$ curl -sH "Authorization: Bearer ${TOKEN}" -H Accept:application/vnd.oci.image.index.v1+json https://quay.io:443/v2/wking-red-hat/test/manifests/sha512-3e28ec27b08b5454f3fb07d8eb255be177dfeb6810947b3fe0b6099d6eb3f6d0
{
"schemaVersion": 2,
"mediaType": "application/vnd.oci.image.index.v1+json",
"manifests": [
{
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"digest": "sha256:9abb81db9d47792d52b7cf071b889b73605998ae832b1633b17e2a2613c8a228",
"size": 708
}
]
}
$ curl -sH "Authorization: Bearer ${TOKEN}" -H Accept:application/vnd.oci.image.manifest.v1+json https://quay.io:443/v2/wking-red-hat/test/manifests/sha256:9abb81db9d47792d52b7cf071b889b73605998ae832b1633b17e2a2613c8a228
{
"schemaVersion": 2,
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"config": {
"mediaType": "application/vnd.oci.empty.v1+json",
"digest": "sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a",
"size": 2
},
"layers": [
{
"mediaType": "application/vnd.oci.empty.v1+json",
"digest": "sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a",
"size": 2
}
],
"subject": {
"mediaType": "application/vnd.oci.image.manifest.v1+json",
"digest": "sha512:3e28ec27b08b5454f3fb07d8eb255be177dfeb6810947b3fe0b6099d6eb3f6d08d4a28cd6128a1d5bb12851c0037f8a72653a11342dbea6d52851571fbbe7017",
"size": 456
}
}
So all I needed to make this commit useful for referencing manifests `sha512:...` vs. the current Quay.io implementation was "if the Referrers API 400s you, also fall back to the Referrers Tag schema, just in case". Which seems like not a terrible stretch for future clients trying to work compatibly with older/existing registries, right?
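That client-side rule can be sketched as a small status-code check. This is an assumption-laden illustration, not spec text: `should_try_referrers_tag` is a hypothetical helper name, the 404 branch reflects the fallback the 1.1.0 spec already describes, and the 400 branch is only the "just in case" behavior floated in this comment.

```sh
# Hypothetical helper: given the HTTP status returned by the Referrers
# API, succeed (exit 0) when the client should try the Referrers Tag
# Schema instead.
should_try_referrers_tag() {
  case "$1" in
    404) return 0 ;;  # fallback already described by the 1.1.0 spec
    400) return 0 ;;  # extra fallback suggested in this comment (assumption)
    *)   return 1 ;;  # anything else: do not fall back
  esac
}

# Example wiring (not run here; TOKEN and url are placeholders):
#   status="$(curl -s -o /dev/null -w '%{http_code}' \
#     -H "Authorization: Bearer ${TOKEN}" "${url}")"
#   should_try_referrers_tag "${status}" && echo "query the Referrers Tag"
```

Keeping the two status codes as separate branches makes it easy to drop the 400 case if the maintainers decide that fallback should not be encouraged.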
I'm not clear on the scope of "initially" there; maybe it means that the registry is allowed to prune such referring manifests after some time if it feels like the referenced manifest has not appeared?
Yes, without the subject manifest, a registry that supports referrers could make a GC policy that prunes these just like they prune blobs if a manifest referencing the blob isn't pushed after the blob is pushed.
In your example, can you pull the tagged image by the sha512 digest? Would a client that is pulling the tag have a way to discover the referrers to that manifest if it queries the digest from the registry (using a HEAD request on the tag)? How do clients know when to fetch referrers with a different algorithm than the registry defaults to?
In your example, can you pull the tagged image by the sha512 digest?
Ah, nope, current Quay rejects that:
$ curl -iH "Authorization: Bearer ${TOKEN}" -H Accept:application/vnd.oci.image.index.v1+json https://quay.io:443/v2/wking-red-hat/test/manifests/sha512:3e28ec27b08b5454f3fb07d8eb255be177dfeb6810947b3fe0b6099d6eb3f6d08d4a28cd6128a1d5bb12851c0037f8a72653a11342dbea6d52851571fbbe7017
HTTP/2 404
...
{"errors":[{"code":"MANIFEST_UNKNOWN","detail":{},"message":"manifest unknown"}]}
How do clients know when to fetch referrers with a different algorithm than the registry defaults to?
That's the external by-digest reference system, like the OpenShift Update Service in my previous comment. But good point about the registry not supporting by-`sha512`-digest manifest retrieval. So to make it work, you'd need the external by-digest reference system to transmit both `sha512` or whatever (if the users have a digest algorithm they prefer for security reasons) digests and also `sha256` digests, to support looking up the referenced manifest in legacy registries that didn't support `sha512` yet. Not impossible, but less likely than if registries did support `sha512` digest retrieval for manifests.
Ok, so pivoting to "could there be future registries that support `sha512` but not support the Referrers API?". I can't speak to any established registries, so if there's room for linking those together with a MUST in #543, that's fine with me. I think it would be still worth adding language to restrict the Referrers Tag Schema to `sha256`, if we go that path. Because of restrictions set on `sha256` the text could just be something like:
The Referrers Tag associated with a Content Digest apdx-3 for the `sha256` algorithm MUST match the digest with the `:` replaced by a `-`. There is no Referrers Tag associated with Content Digests that use other algorithms. Client behavior for retrieving referrers from registries that do not support the Referrers API for other digest algorithms is undefined.
without having to talk about truncation or anything else.
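For comparison, that `sha256`-only alternative would reduce the client-side mapping to a single substitution. A minimal sketch, assuming a hypothetical helper named `sha256_referrers_tag` and a well-formed digest as input:

```sh
# Hypothetical helper for the sha256-only wording: print the Referrers
# Tag for a sha256 digest, and fail (exit 1) for any other algorithm,
# since that wording defines no tag for them.
sha256_referrers_tag() {
  case "$1" in
    sha256:*) printf 'sha256-%s\n' "${1#sha256:}" ;;
    *)        return 1 ;;
  esac
}
```

A caller would treat the non-zero exit as "no Referrers Tag exists for this digest", matching the "undefined" client behavior in the suggested text.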
But from the 1.1.0 spec:
All registries conforming to this specification MUST support, at a minimum, all APIs in the Pull category.
Registries SHOULD also support the Push, Content Discovery, and Content Management categories. A registry claiming conformance with one of these specification categories MUST implement all APIs in the claimed category.
$ grep '^#' spec.md
...
#### Content Discovery
##### Listing Tags
##### Listing Referrers
#### Content Management
...
So it seems like it's currently possible to have a 1.1-compliant registry that does not support the referrers API, as long as you do not claim conformance with Content Discovery. Which would make "how do Referrers Tags work for `sha512`?" a useful question to answer for clients interacting with registries that implement the ability to pull manifests by `sha512` digests under the Pull category, but chose not to implement the Referrers API in the Content Discovery category. Will any registries make that set of choices when implementing? I have no idea. No worries if you want to leave the current ambiguous Referrers Tag wording in place until there is an actual registry that makes that choice, although that would mean that clients might have to scramble to align with whatever Referrers Tag Schema is selected at that point.
From the referenced OCI spec:
But from #256's distribution-spec wording:
Happily, the first character of `algorithm` must match `algorithm-component`, and its `[a-z0-9]` is a subset of the tag regexp's opening `[a-zA-Z0-9_]`. And the colon separating `algorithm` from `encoded` was already addressed in the outgoing text. But the `digest` definition also allows `+` in the `algorithm-separator` and `=` in the `encoded` portion, which the tag regexp does not allow, so with the incoming wording I'm requiring that to be replaced by a `-` as well, so clients make consistent choices when deciding how to handle that character while forming distribution-spec referrer tags.

And I'm requiring clients to truncate the tag to 128 characters, again so clients make consistent choices when trying to compress from the strings the digest specification allows to the strings tags allow. There is no requirement in the distribution spec as far as I can tell that registries support tags up to 128 characters, but given that the spec explicitly requires clients to not exceed that length, it seems likely that registries will allow tags of that length, and not require further truncation. And there is discussion in #256 that suggests that requiring compliant registries to support at least 128 characters in tag strings was intended, even if I can't find language for that requirement in the spec itself.

And clients are obviously free to create whatever tags they like that the registry will accept. The `MUST` I'm adding does not forbid that. It only clarifies the single distribution-spec Referrers Tag associated with a given digest, because if there could be multiple Referrers Tags for each digest, all distribution-spec referrer-retrieving clients would have to iterate over that whole set of possibilities, in case some distribution-spec referrer-pushing client happened to use one of that digest's other Referrers Tag formats.

Spun out from this discussion.