codec: pathed link #352
Comments
I take it that's just everything concatenated? Works for me! Also note: utf8( Also related: multiformats/multiformats#55. |
Oh that is awesome! |
For clarity, can you describe the sections of bytes that end up forming the final CID? I'm not quite clear on how you're getting to the end product. Is it just the |
Sure. It’s also worth pointing out that the format is essentially just a CID without the preceding Here’s a fully inline pathed CID.
I should also note that we’ll need to apply some rules to the path in order to ensure determinism (no leading or trailing slash). |
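As a rough illustration of the determinism rule mentioned above, here is a minimal Python sketch of path canonicalization. The function name `normalize_path` is hypothetical, and the choice to strip slashes (rather than reject the input) and to reject empty segments such as `a//b` are assumptions; the thread only states "no leading or trailing slash."

```python
def normalize_path(path: str) -> str:
    """Return `path` with no leading or trailing slash.

    Hypothetical helper: the thread only requires that encoded paths
    carry no leading or trailing slash. Everything else is assumed.
    """
    trimmed = path.strip("/")
    if trimmed == "":
        return ""
    segments = trimmed.split("/")
    # Assumption: an empty segment ("a//b") is an error, not something to collapse.
    if any(segment == "" for segment in segments):
        raise ValueError(f"empty path segment in {path!r}")
    return "/".join(segments)
```

With a rule like this, `/a/b/` and `a/b` encode to the same bytes, so two writers linking to the same target and path would produce the same pathed link.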
Wait, so you're not just concatenating a CID and a path? You're suggesting a new object type, stored as an "inline/identity" CID? I mean, that works, but it seems like just extending the CID format to allow tacking on a path would be cleaner. |
The goal here is to add this functionality in a generic way to IPLD (in other words, it should work for links to/from any existing block format) without actually breaking the IPLD Data Model (which extending the feature set of links would do).

This is “just a new block format” specifically for pathed links. That means it has a representation that conforms to the existing IPLD Data Model as it is today without any changes. Since it’s implemented as a block format but is intended to be a link itself, the sane thing to do is to embed it in an identity multihash. It may seem a little hacky, but it’s only 2 extra bytes of identity multihash overhead, which you actually gain back in the block format when compared to encoding the same data in CBOR.

The important thing is that there is an identifier (multicodec) in any link that you can use to identify pathed links. This would allow any IPLD user to add pathed link support to their implementation and have it work across all codecs without changing or breaking the existing data model, and it would still produce graphs that contain all the relevant linking information in just the Data Model representation.

In practice, I don’t think there’s much difference between this and “extending the CID format” other than the fact that this is backward compatible with systems that don’t understand pathed links. If you imagine extending the format, you’d end up putting bytes somewhere that say “this is a pathed link,” which we’re effectively doing with CID’s existing codec field. We’re just then eating two bytes for the identity multihash, which we might have avoided had we gone a route that wasn’t backward compatible. |
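To make the "identifier in the codec field" point concrete, here is a minimal Python sketch of how an implementation might recognize a pathed link from binary CID bytes alone. `PATHED_LINK_CODEC` is an arbitrary placeholder, not a registered multicodec value, and `is_pathed_link` is a hypothetical name; only the CIDv1 layout (version, codec, multihash, each prefixed as unsigned varints) and the identity hash code `0x00` are taken as given.

```python
PATHED_LINK_CODEC = 0x300001  # hypothetical placeholder, NOT a registered code
IDENTITY = 0x00               # identity multihash code


def decode_varint(buf: bytes, offset: int = 0) -> tuple[int, int]:
    """Decode an unsigned varint; return (value, next_offset)."""
    value = 0
    shift = 0
    while True:
        if offset >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, offset
        shift += 7


def is_pathed_link(cid: bytes) -> bool:
    """True if `cid` is a CIDv1 with the placeholder codec and an identity multihash."""
    version, pos = decode_varint(cid)
    if version != 1:
        return False
    codec, pos = decode_varint(cid, pos)
    hash_code, pos = decode_varint(cid, pos)
    return codec == PATHED_LINK_CODEC and hash_code == IDENTITY
```

A system that does not know the codec would still parse these bytes as an ordinary CID, which is the compatibility property argued for above.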
I guess... My concerns are:
In terms of not breaking things, yeah, I get that. I'm just concerned about this feature having limited use if it lives outside the core data model. |
We sort of have to pick one of these. If it changes the core data model we break everything, including the existing codec definitions, so that ship has sailed. That said, pretty much everything we’ve built with IPLD includes things beyond the data model. IPLD Schemas are the obvious example, and I’m curious to know if there’s a way that we could get pathed links into IPLD Schemas. |
We should enumerate some reasonable use-cases for this so we can figure out if this proposal would make sense for those. It seems to me that there's going to be special-casing no matter how we implement such a thing; this one has the benefit of reusing the "inline CID" pattern, which I think we've agreed needs to be baked into our stack. But there's going to be additional "is this a |
Wanted to open up a discussion about this particular idea.
We’ve had conversations for a while about how to represent a link as a (CID + Path) but haven’t agreed on anything stable yet.
One thought I had was to create a codec and simple block format for pathed links.
You could use the identity multicodec to inline the relevant data into a single CID and end up with a “pathed link.” Of course, the data model representation would not automatically traverse unless configured to do so, but that’s OK: we need the data model to remain stable anyway. This would give us a link-level indicator of how to traverse, and we could add whatever special traversal logic we might need when and where we need it and are ready for it.
We also get a very compact representation since we’re able to shave some bytes in the block format.
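As a rough sketch of the inlining idea, the Python below wraps a small block in an identity multihash and prefixes it with a CIDv1 version and codec. Several things here are assumptions rather than part of the proposal: `PATHED_LINK_CODEC` is an arbitrary placeholder and not a registered multicodec value, and the block layout (target CID bytes followed directly by the UTF-8 path) is only one possible encoding chosen for illustration.

```python
PATHED_LINK_CODEC = 0x300001  # hypothetical placeholder, NOT a registered code
IDENTITY = 0x00               # identity multihash code


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint."""
    if n < 0:
        raise ValueError("varints are unsigned")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def inline_cid(codec: int, block: bytes) -> bytes:
    """Binary CIDv1 whose multihash is the identity 'hash' of `block`."""
    multihash = encode_varint(IDENTITY) + encode_varint(len(block)) + block
    return encode_varint(1) + encode_varint(codec) + multihash


def pathed_link(target_cid: bytes, path: str) -> bytes:
    """Assumed layout: target CID bytes, then the UTF-8 path, inlined into one CID."""
    block = target_cid + path.encode("utf-8")
    return inline_cid(PATHED_LINK_CODEC, block)
```

In this sketch the identity multihash costs one byte for the hash code plus a varint length, so the overhead is two bytes as long as the inlined block is shorter than 128 bytes.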