-
Notifications
You must be signed in to change notification settings - Fork 233
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
IPIP-412: Signaling Block Order in CARs on Gateways
First draft based on various prior art and recent discussions cited in the header front matter.
- Loading branch information
Showing
2 changed files
with
241 additions
and
1 deletion.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,240 @@ | ||
--- | ||
title: "IPIP-0412: Signaling Block Order in CARs on HTTP Gateways" | ||
date: 2023-05-15 | ||
ipip: proposal | ||
editors: | ||
- name: Marcin Rataj | ||
github: lidel | ||
url: https://lidel.org/ | ||
- name: Jorropo | ||
github: Jorropo | ||
relatedIssues: | ||
- https://github.com/ipfs/specs/issues/348 | ||
- https://github.com/ipfs/specs/pull/330 | ||
- https://github.com/ipfs/specs/pull/402 | ||
- https://github.com/ipfs/specs/pull/412 | ||
order: 412 | ||
tags: ['ipips'] | ||
--- | ||
|
||
## Summary | ||
|
||
Adds support for additional, optional content type options that allow the | ||
client and server to signal or negotiate a specific block order in the returned | ||
CAR. | ||
|
||
## Motivation | ||
|
||
We want to make it easier to build light-clients for IPFS. We want them to have | ||
low memory footprints on arbitrary sized files. The main pain point preventing | ||
this is the fact that CAR ordering isn't specified. | ||
|
||
This require to keeping some kind of reference either on disk, or in memory to | ||
previously seen blocks for two reasons. | ||
|
||
1. Blocks can arrive out of order, meaning when a block is consumed (data is | ||
red and returned to the consumer) and when it's received might not match. | ||
1. Blocks can be reused multiple times, this is handy for cases when you plan | ||
to cache on disk but not at all when you want to process a stream with use & | ||
forget policy. | ||
|
||
What we really want is for the gateway to help us a bit, and give us blocks in | ||
a useful order. | ||
|
||
The existing Trustless Gateway specification does not provide a mechanism for | ||
negotiating the order of blocks in CAR responses. | ||
|
||
This IPIP aims to improve the status quo. | ||
|
||
## Detailed design | ||
|
||
CAR content type | ||
([`application/vnd.ipld.car`](https://www.iana.org/assignments/media-types/application/vnd.ipld.car)) | ||
already supports `version` parameter, which allows gateway to indicate which | ||
CAR flavour is returned with the response. | ||
|
||
The proposed solution introduces two new parameters for the content type headers | ||
in HTTP requests and responses: `order` and `dups`. | ||
|
||
The `order` parameter allows the client to indicate its preference for a | ||
specific block order in the CAR response, and the `dups` parameter specifies | ||
whether duplicate blocks are allowed in the response. | ||
|
||
### Signaling in Request | ||
|
||
Content type negotiation is based on section 12.5.1 of :cite[rfc9110]. | ||
|
||
Clients MAY indicate their preferred block order by sending an `Accept` header in | ||
the HTTP request. The `Accept` header format is as follows: | ||
|
||
``` | ||
Accept: application/vnd.ipld.car; version=1; order=dfs; dups=y | ||
``` | ||
|
||
In the future, when more orders or parameters exist, clients will be able to | ||
specify a list of preferences, for example: | ||
|
||
``` | ||
Accept: application/vnd.ipld.car;order=foo, application/vnd.ipld.car;order=dfs;dups=y;q=0.5 | ||
``` | ||
|
||
The above example is a list of preferences, the client would really like to use | ||
the hypothetical `order=foo` however if this isn't available it would accept | ||
`order=dfs` with `dups=y` instead (lower priority indicated via `q` parameter, | ||
as noted in :cite[rfc9110]). | ||
|
||
#### `order` CAR content type parameter | ||
|
||
The `order` parameter accepts the following values: | ||
|
||
- `dfs`: [Depth-First Search](https://en.wikipedia.org/wiki/Depth-first_search) | ||
order, allows for streaming responses with minimal memory usage | ||
- `rnd`: Unknown (random) order, the implicit default when `order` parameter is missing. | ||
|
||
#### `dups` CAR content type parameter | ||
|
||
The `dups` parameter specifies whether duplicate blocks (the same block | ||
occuring multiple times in the requested DAG) will be present in the CAR | ||
response. | ||
|
||
It accepts two values: | ||
- `y`: duplicate blocks are allowed | ||
- `n`: duplicates are not allowed | ||
|
||
When allowed (`y`), light clients are able to discard blocks after | ||
reading them, removing the need for caching in-memory or on-disk. | ||
|
||
<!-- TODO: do we need a parameter for inclusion of identity CIDs? | ||
It seems to be only relevant in Filecoin due to legacy hiccup: | ||
https://github.com/ipfs/specs/pull/330#issuecomment-1274106892 --> | ||
|
||
### Signaling in Response | ||
|
||
The Trustless Gateway MUST always respond with a `Content-Type` header that includes | ||
information about all supported/known parameters, even if the client did not | ||
specify them in the request. | ||
|
||
The `Content-Type` header format is as follows: | ||
|
||
``` | ||
Content-Type: application/vnd.ipld.car;version=1;order=dfs;dups=y | ||
``` | ||
|
||
|
||
Gateway implementations are free to decide on the implicit default ordering or | ||
other parameters, and use it in responses when client did not explicitly | ||
specify, or requested unsupported or unknown query parameter. | ||
|
||
Implementations MAY choose to implement only some of the parameters. | ||
|
||
## Design rationale | ||
|
||
The proposed specification change aims to address the limitations of the | ||
existing Trustless Gateway specification by introducing a mechanism for | ||
negotiating the block order in CAR responses. | ||
|
||
By allowing clients to indicate their preferred block order, Trustless Gateways | ||
can cache CAR responses for popular content, resulting in improved performance | ||
and reduced network load. Clients benefit from more efficient data handling by | ||
deserializing blocks as they arrive, | ||
|
||
We reuse exiting HTTP content type negotiation, and the CAR content type, which | ||
already had the optional `version` parameter. | ||
|
||
### User benefit | ||
|
||
The proposed specification change brings several benefits to end users: | ||
|
||
1. Improved Performance: Gateways can decide on their implicit default ordering | ||
and cache CAR responses for popular content. In turn, clients can benefit | ||
from strong `Etag` in ordered (deterministic) responses. This reduces the | ||
response time for subsequent requests, resulting in faster content retrieval | ||
for users. | ||
|
||
2. Reduced Memory Usage: Clients no longer need to buffer the entire CAR | ||
response in memory until the deserialization of the requested entity is | ||
finished. With the ability to deserialize blocks as they arrive, users can | ||
conserve memory resources, especially when dealing with large CAR responses. | ||
|
||
3. Efficient Data Handling: By discarding blocks as soon as the CID is | ||
validated and data is deserialized, clients can efficiently process the data | ||
in real-time. This is particularly useful for light clients, IoT devices, | ||
mobile web browsers, and other streaming applications where immediate access | ||
to the data is required. | ||
|
||
4. Customizable Ordering: Clients can indicate their preferred block order in the | ||
`Accept` header, allowing them to prioritize specific ordering strategies that | ||
align with their use cases. This flexibility enhances the user experience | ||
and empowers users to optimize content retrieval according to their needs. | ||
|
||
### Compatibility | ||
|
||
The proposed specification change is backward compatible with existing client | ||
and server implementations. | ||
|
||
Trustless Gateways that do not support the negotiation of block order in CAR | ||
responses will continue to function as before, providing their existing default | ||
behavior, and the clients will be able to detect it by inspecting the | ||
`Content-Type` header present in HTTP response. | ||
|
||
Clients that do not send the `Accept` header or do not recognize the `order` | ||
and `dups` parameters in the `Content-Type` header will receive and process CAR | ||
responses as they did before: buffering/caching all blocks until done with the | ||
final deserialization. | ||
|
||
Existing implementations can choose to adopt the new specification and | ||
implement support for the negotiation of block order incrementally. This allows | ||
for a smooth transition and ensures compatibility with both new and old | ||
clients. | ||
|
||
### Security | ||
|
||
The proposed specification change does not introduce any negative security | ||
implications beyond those already present in the existing Trustless Gateway | ||
specification. It focuses on enhancing performance and data handling without | ||
affecting the underlying security model of IPFS. | ||
|
||
Light clients with support for `order` and `dups` CAR content type parameters | ||
will be able to detect malicious response faster, reducing risks of | ||
memory-based DoS attacks from malicious gateways. | ||
|
||
### Alternatives | ||
|
||
Several alternative approaches were considered before arriving at the proposed solution: | ||
|
||
1. Implicit Server-Side Configuration: Instead of negotiating the block order, | ||
in the CAR response, the Trustless Gateway could have a server-side | ||
configuration that specifies the default order. However, this approach would | ||
limit the flexibility for clients, requiring them to have prior knowledge | ||
about order supported by each gateway. | ||
|
||
2. Fixed Block Order: Another option was to enforce a fixed block order in the | ||
CAR responses. However, this approach would not cater to the varying needs | ||
and preferences of different clients and use cases, and is not backward | ||
compatible with the existing Trustless Gateways which return CAR responses | ||
with Weak `Etag` and unspecified block order. | ||
|
||
3. Separate `X-` HTTP Header: Introduction of a separate HTTP reader was | ||
rejected because we try to use HTTP semantics where possible, and gateways | ||
already use HTTP content type negotiation for CAR `version` and reusing it | ||
saves a few bytes in each round-trip. Also, :cite[rfc6648] advises against | ||
use of `X-` and similar constructs in new protocols. | ||
|
||
The proposed solution of negotiating the block order through headers si | ||
future-proof, allows for flexibility, interoperability, and customization while | ||
maintaining compatibility with existing implementations. | ||
|
||
## Test fixtures | ||
|
||
Implementation compliance can be determined by testing the negotiation process | ||
between clients and Trustless Gateways using various combinations of `order` and | ||
`dups` parameters. | ||
|
||
TODO: | ||
1. a CAR with blocks for a small file in DFS order | ||
2. a CAR with blocks for a small file with one block appearing twice | ||
|
||
|
||
### Copyright | ||
|
||
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). |