diff --git a/src/http-gateways/path-gateway.md b/src/http-gateways/path-gateway.md index 95249fa29..acb73e881 100644 --- a/src/http-gateways/path-gateway.md +++ b/src/http-gateways/path-gateway.md @@ -214,35 +214,13 @@ These are the equivalents: - `format=cbor` → `Accept: application/cbor` - `format=ipns-record` → `Accept: application/vnd.ipfs.ipns-record` -## Query Parameters for CAR Requests +### `dag-scope` (request query parameter) -The following query parameters are only available for requests made with either a `format=car` query parameter or an `Accept: application/vnd.ipld.car` request header. These parameters modify shape of the IPLD graph returned within the car file. +Only used on CAR requests, same as [dag-scope](/http-gateways/trustless-gateway/#dag-scope-request-query-parameter) from :cite[trustless-gateway] -### `car-scope` (request query parameter) +### `entity-bytes` (request query parameter) -Optional, `car-scope=(block|file|all)` with default value 'all', describes the shape of the dag fetched the terminus of the specified path whose blocks are included in the returned CAR file after the blocks required to traverse path segments. - -`block` - Only the root block at the end of the path is returned After blocks required to verify the specified path segments. - -`file` - For queries that traverse UnixFS data, `file` roughly means return blocks needed to verify the end of the path as a filesystem entity. In other words, all the blocks needed to 'cat' a UnixFS file at the end of the specified path, or to 'ls' a UnixFS directory at the end of the specified path. For all queries that do not reference non-UnixFS data, `file` is equivalent to `block` - -`all` - Transmit the entire contiguous DAG that begins at the end of the path query, after blocks required to verify path segments - -### `bytes` (request query parameter) - -Optional, `bytes=x:y` with default value `0:*`. 
When the entity at the end of the specified path can be intepreted as a contingous array of bytes (such as a UnixFS file), returns only the blocks required to verify the specified byte range of said entity. Put another way, the `bytes` parameters can serve as a trustless form of an HTTP range request. If the entity at the end of the path cannot be interpreted as a continguous array of bytes (such as a CBOR/JSON map), this parameter has no effect. Allowed values for `x` and `y` are positive integers where y >= x, which limit the return blocks to needed to satify the range [x, y]. In addition the following additional values are permitted: - -- `*` can be substituted for end-of-file - - `?bytes=0:*` is the entire file (i.e. to fulfill HTTP Range Request `x-` requests) -- Negative numbers can be used for referring to bytes from the end of a file - - `?bytes=-1024:*` is the last 1024 bytes of a file (i.e. to fulfill HTTP Range Request `-y` requests) - - It is also permissible (unlike with HTTP Range Requests) to ask for the range of 500 bytes from the beginning of the file to 1000 bytes from the end by `?bytes=499:-1000` - - +Only used on CAR requests, same as [entity-bytes](/http-gateways/trustless-gateway/#entity-bytes-request-query-parameter) from :cite[trustless-gateway] # HTTP Response diff --git a/src/http-gateways/trustless-gateway.md b/src/http-gateways/trustless-gateway.md index e184eb21d..974b68f9d 100644 --- a/src/http-gateways/trustless-gateway.md +++ b/src/http-gateways/trustless-gateway.md @@ -59,7 +59,7 @@ Same as GET, but does not return any payload. Same as in :cite[path-gateway], but with limited number of supported response types. -## HTTP Request Headers +## Request Headers ### `Accept` (request header) @@ -75,6 +75,63 @@ Below response types SHOULD to be supported: Gateway SHOULD return HTTP 400 Bad Request when running in strict trustless mode (no deserialized responses) and `Accept` header is missing. 
+## Request Query Parameters + +### :dfn[dag-scope] (request query parameter) + +Optional, `dag-scope=(block|entity|all)` with default value `all`, only available for CAR requests. + +Describes the shape of the DAG fetched at the terminus of the specified path whose blocks +are included in the returned CAR file after the blocks required to traverse +path segments. + +- `block` - Only the root block at the end of the path is returned after blocks + required to verify the specified path segments. + +- `entity` - For queries that traverse UnixFS data, `entity` roughly means return + blocks needed to verify the terminating element of the requested content path. + For UnixFS, all the blocks needed to read an entire UnixFS file, or enumerate a UnixFS directory. + For all queries that reference non-UnixFS data, `entity` is equivalent to `block` + +- `all` - Transmit the entire contiguous DAG that begins at the end of the path + query, after blocks required to verify path segments + +When present, the returned `Etag` must include a unique prefix based on the passed scope type. + +### :dfn[entity-bytes] (request query parameter) + +Optional, `entity-bytes=from:to` with the default value `0:*`, only available for CAR requests. +Serves as a trustless form of an HTTP Range Request. + +When the terminating entity at the end of the specified content path can be +interpreted as a continuous array of bytes (such as a UnixFS file), returns +only the minimal set of blocks required to verify the specified byte range of +said entity. + +Allowed values for `from` and `to` are non-negative integers where `to` >= `from`, which +limit the returned blocks to those needed to satisfy the range `[from,to]`: + +- `from` value gives the byte-offset of the first byte in a range. +- `to` value gives the byte-offset of the last byte in the range; that is, +the byte positions specified are inclusive. Byte offsets start at zero. 
+ +If the entity at the end of the path cannot be interpreted as a continuous +array of bytes (such as a DAG-CBOR/JSON map, or UnixFS directory), this +parameter has no effect. + +The following additional values are supported: + +- `*` can be substituted for end-of-file + - `entity-bytes=0:*` is the entire file (a verifiable version of HTTP request for `Range: bytes=0-`) +- Negative numbers can be used for referring to bytes from the end of a file + - `entity-bytes=-1024:*` is the last 1024 bytes of a file + (verifiable version of HTTP request for `Range: bytes=-1024`) + - It is also permissible (unlike with HTTP Range Requests) to ask for the + range of 500 bytes from the beginning of the file to 1000 bytes from the + end: `entity-bytes=499:-1000` + +When present, the returned `Etag` must include a unique prefix based on the passed range. + # HTTP Response Below MUST be implemented **in addition** to "HTTP Response" of :cite[path-gateway]. diff --git a/src/ipips/ipip-0402.md b/src/ipips/ipip-0402.md index ac4f69724..451777f9b 100644 --- a/src/ipips/ipip-0402.md +++ b/src/ipips/ipip-0402.md @@ -5,6 +5,10 @@ ipip: proposal editors: - name: Hannah Howard github: hannahhoward + - name: Adin Schmahmann + github: aschmahmann + - name: Rod Vagg + github: rvagg - name: Marcin Rataj github: lidel url: https://lidel.org/ @@ -39,11 +43,15 @@ Save round-trips, allow more efficient resume and parallel downloads. 
The solution is to allow the :cite[trustless-gateway] to support partial responses by: + - allowing for requesting sub-paths within a DAG, and getting blocks necessary for traversing all path segments for end-to-end verification -- opt-in `car-scope` parameter that allows for narrowing down returned blocks - to a `block`, `file` (aka logical IPLD entity), or `all` (default) -- opt-in `bytes` parameter that allows for returning only a subset of blocks + +- opt-in `dag-scope` parameter that allows for narrowing down returned blocks + to a `block`, `entity` (a logical IPLD entity, such as a file, directory, + CBOR document), or `all` (default) + +- opt-in `entity-bytes` parameter that allows for returning only a subset of blocks within a logical IPLD entity Details are in :cite[trustless-gateway]. @@ -66,14 +74,15 @@ Terse rationale for each feature: - The ability to narrow down CAR response based on logical scope or specific byte range within an entity comes directly from the types of requests existing path gateways need to handle. 
- - `car-scope=block` allows for resolving content paths to the final CID, and + - `dag-scope=block` allows for resolving content paths to the final CID, and learn its type (unixfs file/directory, or a custom codec) - - `car-scope=file` covers the majority of website hosting needs (returning a - file, or enumerating directory contents) - - `car-scope=all` returns all blocks in a DAG: was the existing behavior and + - `dag-scope=entity` covers the majority of website hosting needs (returning a + file, enumerating directory contents, or any other IPLD entity) + - `dag-scope=all` returns all blocks in a DAG: was the existing behavior and remains the implicit default - - `bytes=from:to` enables efficient, verifiable analog to HTTP Range Requests + - `entity-bytes=from:to` enables efficient, verifiable analog to HTTP Range Requests (resuming downloads or seeking within bigger files, such as videos) + - `from` and `to` match the behavior of HTTP Range Requests. ### User benefit @@ -121,7 +130,7 @@ introduce additional blocks required for verifying. As long the client was written in a trustless manner, and follows ring and was discarding unexpected blocks, this will be a backward-compatible change. -#### CAR format with `bytes` and `car-scope` parameters +#### CAR format with `entity-bytes` and `dag-scope` parameters These parameters are opt-in, which means no breaking changes. @@ -159,7 +168,7 @@ risks, and weak value proposition, as [discussed during IPFS Thing 2022](https:/ #### Additional "Web" Scope A request for -`/ipfs/bafybeiaysi4s6lnjev27ln5icwm6tueaw2vdykrtjkwiphwekaywqhcjze/wiki/?format=car&car-scope=file` +`/ipfs/bafybeiaysi4s6lnjev27ln5icwm6tueaw2vdykrtjkwiphwekaywqhcjze/wiki/?format=car&dag-scope=entity` returns all blocks required for enumeration of the big HAMT `/wiki` directory, and then an additional request for `index.html` needs to be issued. 
@@ -181,7 +190,7 @@ It is impossible to know if some entity on a sub-path is a file or a directory, without sending a probe for the root block, which introduces one round-trip overhead per entity. -This problem is not present in the case of `car-scope=file`, which shifts the +This problem is not present in the case of `dag-scope=entity`, which shifts the decision to the server, and allows for fetching unknown UnixFS entity with a single request. @@ -197,7 +206,7 @@ The main utility of this scope is saving round-trips when retrieving a specific entity as a member of a bigger DAG. To test, request a small file that fits in a single block from a sub-path. The -returned CAR MUST include both the block with the `file` data and blocks +returned CAR MUST include both the block with the file data and all blocks necessary for traversing from the root CID to the terminating element (all parents, root CID and a subdirectory below it). @@ -213,7 +222,7 @@ Fixtures: ::: -### Testing `car-scope=block` +### Testing `dag-scope=block` The main utility of this scope is resolving content paths. This means a CAR response with blocks related to path traversal, and the root block of the @@ -227,13 +236,13 @@ Fixtures: :::example -- TODO(gateway-conformance): `/ipfs/cid/parent/directory?format=car&car-scope=block` (UnixFS directory on a path) +- TODO(gateway-conformance): `/ipfs/cid/parent/directory?format=car&dag-scope=block` (UnixFS directory on a path) -- TODO(gateway-conformance): `/ipfs/cid/parent1/parent2/file?format=car&car-scope=block` (UnixFS file on a path) +- TODO(gateway-conformance): `/ipfs/cid/parent1/parent2/file?format=car&dag-scope=block` (UnixFS file on a path) ::: -### Testing `car-scope=file` +### Testing `dag-scope=entity` The main utility of this scope is retrieving all blocks related to a meaningful IPLD entity. 
Currently, the most popular entity types are: @@ -252,48 +261,48 @@ Fixtures: :::example -- TODO(gateway-conformance): `/ipfs/cid/chunked-dag-pb-file?format=car&car-scope=file` +- TODO(gateway-conformance): `/ipfs/cid/chunked-dag-pb-file?format=car&dag-scope=entity` - Request a `chunked-dag-pb-file` (UnixFS file encoded with `dag-pb` with more than one chunk). Returned blocks MUST be enough to deserialize the file. -- TODO(gateway-conformance): `/ipfs/cid/dag-cbor-with-link?format=car&car-scope=file` +- TODO(gateway-conformance): `/ipfs/cid/dag-cbor-with-link?format=car&dag-scope=entity` - Request a `dag-cbor-with-link` (DAG-CBOR document with CBOR Tag 42 pointing at a third-party CID). The response MUST include the terminating entity (DAG-CBOR) and MUST NOT include the CID from the Tag 42 (IPLD Link). -- TODO(gateway-conformance): `/ipfs/cid/flat-directory/file?format=car&car-scope=file` +- TODO(gateway-conformance): `/ipfs/cid/flat-directory/file?format=car&dag-scope=entity` - Request UnixFS `flat-directory`. The response MUST include the minimal set of blocks required for enumeration of directory contents, and no blocks that belong to child entities. -- TODO(gateway-conformance): `/ipfs/cid/hamt-directory/file?format=car&car-scope=file` +- TODO(gateway-conformance): `/ipfs/cid/hamt-directory/file?format=car&dag-scope=entity` - Request UnixFS `hamt-directory`. The response MUST include the minimal set of blocks required for enumeration of directory contents, and no blocks that belong to child entities. ::: -### Testing `car-scope=all` +### Testing `dag-scope=all` -This is the implicit default used when `car-scope` is not present, +This is the implicit default used when `dag-scope` is not present, and explicitly used in the context of proxy gateway supporting :cite[ipip-0288]. 
Fixtures: :::example -- TODO(gateway-conformance): `/ipfs/cid-of-a-directory?format=car&car-scope=all` +- TODO(gateway-conformance): `/ipfs/cid-of-a-directory?format=car&dag-scope=all` - Request a CID of UnixFS `directory` which contains two files. The response MUST contain all blocks that can be accessed by recursively traversing all IPLD Links from the root CID. -- TODO(gateway-conformance): `/ipfs/cid/chunked-dag-pb-file?format=car&car-scope=all` +- TODO(gateway-conformance): `/ipfs/cid/chunked-dag-pb-file?format=car&dag-scope=all` - Request a CID of UnixFS `file` encoded with `dag-pb` codec and more than one chunk. The response MUST contain blocks for all `file` chunks. ::: -### Testing `bytes=from:to` +### Testing `entity-bytes=from:to` This type of CAR response is used for facilitating HTTP Range Requests and byte seek within bigger entities. @@ -302,7 +311,7 @@ byte seek within bigger entities. Properly testing this type of response requires synthetic DAG that is only partially retrievable. This ensures systems that perform internal caching -won't pass the test due to the entire DAG being cached. +won't pass the test due to the entire DAG being precached, or fetched in full. ::: @@ -310,12 +319,17 @@ Use of the below fixture is highly recommended: :::example -- TODO(gateway-conformance): `/ipfs/dag-pb-file?format=car&bytes=40000000000-40000000002` +- TODO(gateway-conformance): `/ipfs/dag-pb-file?format=car&entity-bytes=40000000000:40000000002` - Request a byte range from the middle of a big UnixFS `file`. The response MUST contain only the minimal set of blocks necessary for fullfilling the range request. +- TODO(gateway-conformance): `/ipfs/10-bytes-cid?format=car&entity-bytes=4:-2` + + - Request a byte range from the middle of a small file, to -2 bytes from the end. + - (TODO confirm we want to keep this -- added since it was explicitly stated as a supported thing in path-gateway.md) + ::: ### Copyright