Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Draft of Registry EIP #1

Open
wants to merge 11 commits into
base: master
Choose a base branch
from
Open

[WIP] Draft of Registry EIP #1

wants to merge 11 commits into from

Conversation

cgewecke
Copy link

@cgewecke cgewecke commented Jul 30, 2018

This is an initial draft for an EIP to support the reference registry. Have tried to reduce it to the smallest set of methods possible and keep the language simple and clear. Not sure I've been successful there. This is definitely WIP and needs lots of input.

There are 4 suggested methods - one for releasing and three getters that should allow tooling to pull all data from a registry. Additionally there is a prescription about how package names and version information should be hashed together.

Link to a human readable version

Copy link
Member

@pipermerriam pipermerriam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a single comment in this review suggesting a way to decouple the generation of the releaseHash with semver. Upon reading the remainder of the spec, I'm convinced this would result in a spec that's easy to use with or without semver which I think is important.

It might be good to go a step further:

  1. Everywhere that you have bytes32 packageNameHash, change to string packageName. The hashing can happen within the function call.
  2. Everywhere that you have bytes32 releaseVersionHash, change to string version. The parsing and hashing of the version can happen within the function call.
  3. Instead of the term releashHash we use the term bytes32 releaseId.

Each registry is free to implement their own scheme for computing a releaseId, but that scheme must generate a unique identifier for each packageName, version pair.

The resulting API looks roughly like:

  • release(string name, string version)
  • getReleaseId(string name, string version)
  • getAllReleaseIds(string name)
  • getReleaseData(bytes32 releaseId)

I think that this will greatly reduce complexity in the spec, while still maintaining all of the necessary functionality.

This EIP specifies an interface for publishing to and retrieving assets from smart contract package registries. It is a companion EIP to [1123](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-1123.md) which defined a standard for smart contract package manifests.

## Motivation
The goal is to establish a framework that allows smart contract publishers to design and deploy code registries of arbitrary complexity which expose standard endpoints to tooling that retrieves assets for contract package consumers.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe replace complexity with business logic?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

+ a **registry** is a deployed contract which manages a collection of **packages**.
+ a **package** is a collection of **releases**
+ a **package** is identified by a unique string name within a given **registry**
+ a **release** is identified by a bytes32 **releaseHash** which is the keccak256 hash of the following:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the exact mechanism for generating the release hash should be left un-specified, and instead, contracts should expose a view method like getReleaseHash(string packageName, string version) returns bytes32

This would appropriately decouple the spec from semver.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pipermerriam Yes this is a really nice simplification - seems like the perfect minimum.

And getReleaseData just returns the inputs of a release?

getReleaseData(bytes32 releaseId) returns (string name, string version)

Copy link
Author

@cgewecke cgewecke Jul 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh actually the sigs need to be like this I think...

release(string name, string version, string manifestURI);
getReleaseData(bytes32 releaseId) returns (string name, string version, string manifestURI);

@cgewecke
Copy link
Author

@pipermerriam Have made revisions (I think we're still in the early stages though... 🙂 )

Couple things:

  • Added a method requiring that registries declare whether they are semver compliant or not. Am concerned that tooling should know whether or not a registry supports range requests. If it does, this means it's possible for say web3 to pull all the releases in a package, order them and figure out how to comply. Our canonical example is:
const SimpleToken = await web3.packaging
                              .registry('packages.ethpm.eth')
                              .getPackage('SimpleToken')
                              .getVersion('^1.1.5');
  • Implemented getReleaseHash as generateReleaseId. I think this disambiguates from the 'real' getters a little?
  • Finally I'm preserving a method that lets you grab all the package names in a registry because I think that will be helpful for searching and aggregation across an eco-system of registries.

Obviously everything is up for grabs, thanks so much for looking this over.

function generateReleaseId(string packageName, string version) pure returns (bytes32);

// Declares whether a registry maintains its releases in semver compliant order.
function usesSemver() public pure returns (bool);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made this a method because I'd like it to be compatible with EIP 165. Not sure this is absolutely necessary though...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm, I'm inclined to push back on this.

  • I'm not aware of broad adoption of EIP165.
  • It should be completely valid for a registry to allow multiple version schemes.
  • Client side consumers will have to implement error handling and version string parsing whether this is present or not.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, fair enough. . . although without some guidance about how registries version aren't we telling clients to just forward whatever version string they're given? If a client fields this request:

.getVersion('^1.1.5');

then the registry needs to manage it because the client has no way of knowing what the caret means. It could be talking to a registry where 1.1.5, ^1.1.5 and orange77 or even are all legitimate versions according to the manifest spec.

Was kind of hoping that usesSemver would tell the client that it should analyze a list of releases and figure out which one matches the request. Or error if that kind of request doesn't make sense.

However this idea isn't very clearly expressed and probably misguided as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this goes directly against the intent of the spec, which is to allow arbitrary business logic. If people want to use orange77, aka, sem-fruit versioning then I suspect very few people will use their registry.

At their core, registries can be blind data stores. Some contracts will implement things like enforcing semver for version strings. Some libraries will be really good about versioning, and others may choose obscure versioning schemes. In the end, it's what the consumers of these registries support that will drive how they are used. Initially, I suspect that most libraries will ignore or throw errors when they encounter version strings that they cannot parse.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, sounds good.

+ a **package** is identified by a unique string name within a given **registry**
+ a **release** is identified by a `bytes32` **releaseId** which must be unique for a given package name and release version string pair.
+ a **releaseId** maps to a set of data that includes a **manifestURI** string which describes the location of an [EIP 1123 package manifest](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-1123.md). This manifest contains data about the release including the location of its component code assets.
+ a **manifestURI** string contains a cryptographic hash which can be used to verify the integrity of the content found at the URI. The URI format is defined in [RFC3986](https://tools.ietf.org/html/rfc3986).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably worth either:

Maybe both, or including the definition here as an excerpt. Not sure what the right thing to do, but the current description hints at what the URI is supposed to be but isn't clear enough that I think it'll be confusing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree this is opaque.

FWIW it's worth the definition is verbatim copy of the text at the glossary link - possibly flesh it out there as well?


```solidity
// Retrieves all the packages published to a registry.
function getAllPackageNames() public view returns (string[]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's worth taking some time to explore the upper bounds for how many names this is capable of returning. I suspect is is quite large, but.... there will likely be registries with alot of names. Might be good to paginate this API with a limit and an offset. This would ensure that it's possible to page through extremely large lists.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, nice.

How do you feel about using a cursor instead of pagination? Idea is:

By returning a “cursor”, the API guarantees that it will return the exactly the next entry in the list, regardless of what changes happen to the collection between API calls. Think of the cursor as permanent marker in the list that says “we left off here”.

source

getAllPackageNames(uint cursor) public view returns (string[], uint cursor);

Something like this on the client side.

registry.getAllPackageNames(0)
> { ['zeppelin', 'maker',...], 100 } // records 0-99, next index is position 100.
registry.getAllPackageNames(100)
> { ['etc', ...]}

In this schema the registry is responsible for picking the page size.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd push to merge the ideas and use cursor and limit. I do however think the idea of cursor is a bit opaque and less intuitive than offset/limit and offset can have the same functionality what you are looking for with cursor, and we can add a should recommendation to this function that using offset and limit a client library should always be able to resume scraping at a later date.

Use your best judgement, I think this is potentially bike shedding territory.

function getReleaseId(string packageName, string version) public view returns (bytes32);

// Retrieves all release ids for a package
function getAllReleaseIds(string packageName) public view returns (bytes32[]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thought about pagination through limit and offset

function generateReleaseId(string packageName, string version) pure returns (bytes32);

// Declares whether a registry maintains its releases in semver compliant order.
function usesSemver() public pure returns (bool);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm, I'm inclined to push back on this.

  • I'm not aware of broad adoption of EIP165.
  • It should be completely valid for a registry to allow multiple version schemes.
  • Client side consumers will have to implement error handling and version string parsing whether this is present or not.

@cgewecke
Copy link
Author

cgewecke commented Aug 3, 2018

Hii @pipermerriam, round 3. Apologies for the delay.

Have removed the semver stuff, added pagination per your original recommendation and added notes about:

  • pagination
  • cryptographic hash integrity checks, using GitHub SHA-1 as an example.

@gnidan said he might peek at this tomorrow (Friday) as well.

Please continue to identify defects - complaints about gaps in clarity or stuff that doesn't read well etc are more than welcome.

Also @pipermerriam - could I ask your advice about what direction to take the reference implementation of the registry in? There's a bit of a fork in the road here.

We could simplify it quite a bit (which might make it more legible to someone looking for a reference). Or we could go the other direction and make it an example of the kind of fully semver compliant registry that can be spun out of the simplicity of the spec. Which of these do you think is best?

@cgewecke cgewecke force-pushed the registry-eip-draft branch from 5befdf4 to 3350338 Compare August 3, 2018 07:03
[ Needs table header ... ]

## Abstract
This EIP specifies an interface for publishing to and retrieving assets from smart contract package registries. It is a companion EIP to [1123](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-1123.md) which defines a standard for smart contract package manifests.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Minor thought)

I'm a bit ambivalent about how to word this abstract. Part of me prefers simply to write this EIP as saying it specifies "a package registry interface", vs. how you have it. I see the value in this canonical definition as "what the interface does", but I also think there's value in expressing the contents as the noun "registry". I don't know. This is just muttering.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not thrilled by the wording here either although it does help to define the EIP's scope. Some of your comments below about further work allude the possibility later EIPs might address things like how registries are linked together etc.

This EIP specifies an interface for publishing to and retrieving assets from smart contract package registries. It is a companion EIP to [1123](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-1123.md) which defines a standard for smart contract package manifests.

## Motivation
The goal is to establish a framework that allows smart contract publishers to design and deploy code registries with arbitrary business logic while exposing a set of common endpoints that tooling can use to retrieve assets for contract package consumers.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/code registries/package registries I think is more accurate

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opted for contract consumers

**Pagination**

`getAllPackages` and `getAllReleaseIds` support paginated requests because it's possible that the return values for these methods could become quite large. The methods should return an `offset` that is a pointer to the next available item in a list of all items such that a caller can use it to pick up from where the previous request left off. (See [here](https://mixmax.com/blog/api-paging-built-the-right-way) for a discussion of the merits and demerits of various pagination strategies.) The `limit` parameter defines the maximum number of items a registry should return per request.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there is value in adding a new interface section Supported Interfaces API (ie. heading at the same level as Write API and Read API), to require the use of EIP-165.

My reasoning for this is mainly for forward-compatibility reasons. Due to this effort's nature of aiming to be functionally unopinionated, it seems plausible to expect that future EIPs will build on (and possibly supersede) this standard. EIP-165 provides a lightweight mechanism that allows a contract the capability of stating which standard interfaces it supports.

By requiring the use of EIP-165 in this standard, it simultaneously sets a clear/consistent precedent for future registry standards, as well as ensuring that all standard registries adhere to this practice for capability expressiveness from the very beginning. Should future work to improve this standard, then this adherence will make implementation easier for all registry consumers.

I understand that it's prescriptive to require this behavior, and adds coupling to 165, but 165 is already finalized, and it does not add a significant gas or implementation time overhead. If we merely "recommend" 165 instead of requiring it (should instead of must), we lose our easiest opportunity to ensure conformance across the board.


## Backwards Compatibility
The standard modifies the interface of the existing EthPM package registry in such a way that the currently deployed version would not comply with standard since it implements only one of the method signatures described in the specification.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed, it might be worth adding a "future work" section, to "bait" the community into pursuing advances in this area.

Some ideas that come to mind that I think would be cool:

  • If there's a SemVer standard ERC at some point, there could be a SemVerRegistry standard
  • Standard(s) for authorized registries
  • Standard(s) for registry federation

@cgewecke
Copy link
Author

cgewecke commented Aug 7, 2018

After working up a PR for the bulk getters at the reference registry, have made some changes to function signatures and added a required method. It looks like returning string arrays is problematic - they're only supported the experimental V2 encoder and not recommended for production contracts. There are also issues extracting these arrays using web3.

I think these methods are important - tooling needs to be able to list a registry. Would welcome discussion about it.

The change is from getAllPackageNames to getAllPackageIds and another method (getPackageName) is required to associate the id with the string name.

Also made some small wording changes. Have not addressed the 'Further work' or EIP 165 suggestions in the most recent changes.

Copy link
Member

@pipermerriam pipermerriam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

```javascript
const SimpleToken = await web3.packaging
.registry('packages.ethpm.eth')
.getPackage('SimpleToken')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This package name should probably updated to be compliant with the package naming scheme as simple-token

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

+ a **package** is identified by a unique string name and unique bytes32 **packageId** within a given **registry**
+ a **release** is identified by a `bytes32` **releaseId** which must be unique for a given package name and release version string pair.
+ a **releaseId** maps to a set of data that includes a **manifestURI** string which describes the location of an [EIP 1123 package manifest](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-1123.md). This manifest contains data about the release including the location of its component code assets.
+ a **manifestURI** string contains a cryptographic hash which can be used to verify the integrity of the content found at the URI. The URI format is defined in [RFC3986](https://tools.ietf.org/html/rfc3986).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a manifestURI is a URI as defined by RFC3986 which can be used to retrieve the contents of the package manifest. In addition to validation against RFC3986, each manifestURI must also be contain a hash of the content as specified in the EIP1123

I don't know if this is any easier to grok, but I tried. Your call.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yours is definitely clearer, nice.

**Package Names / Release Versions**

```shell
"SimpleToken" # package name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be updated to be a valid package name: simple-token

version identifier for the release, and a URI specifying the location of a manifest which
details the contents of the release.
```solidity
function release(string packageName, string version, string manifestURI) public;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this return the releaseId?

Copy link
Author

@cgewecke cgewecke Aug 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pipermerriam @gnidan has made a couple of suggestions as well:

  • mandating ERC 165 as part of the EIP (basically to help with future compatibility - case is made here)
  • adding a 'future work' section that mentions the possibility of extending the interface at some point for more complex registry implementations (case here)

Do you have any views about either of those?

Also do you have any views about the direction to go in with the reference registry? Would you prefer to see it simplified? Are you comfortable preserving it as relatively fully featured? Other...?

---
eip: unknown
title: Contract Package Registry Interface
author: Piper Merriam <[email protected]>, Christopher Gewecke <[email protected]>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chris your name should be first

@pipermerriam
Copy link
Member

could I ask your advice about what direction to take the reference implementation of the registry in? There's a bit of a fork in the road here.

We could simplify it quite a bit (which might make it more legible to someone looking for a reference). Or we could go the other direction and make it an example of the kind of fully semver compliant registry that can be spun out of the simplicity of the spec. Which of these do you think is best?

I think it may be best to start with a very simple registry that accepts whatever. As much as the idea of on-chain validation of semver appeals to the part of me that wants exactness, I don't think it actually adds a lot of business value. If we are assuming that each team will have their own registry then it will be up to each team to either enforce this in their own internal processes, or to use a registry implementation which has semver enforcement built-in. In my experience, the former works quite well and I'm not seeing any critical problems that would arise from that. Thoughts?

Basically, a dead simple registry with absolute minimal validation of version strings (like maybe that they are non-empty?).

@cgewecke
Copy link
Author

@pipermerriam Sorry for the delay getting back to you on this.

Simple sounds good to me! I'll line up a string of PRs doing that at the registry over the next few days and probably ping you next week for review.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants