[WIP] Draft of Registry EIP #1

Open
wants to merge 11 commits into
base: master
from
110 changes: 110 additions & 0 deletions EIPS/eip-ethpm-registry.md
@@ -0,0 +1,110 @@
[ Needs table header ... ]

## Abstract
This EIP specifies an interface for publishing to and retrieving assets from smart contract package registries. It is a companion EIP to [1123](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-1123.md) which defines a standard for smart contract package manifests.

(Minor thought)

I'm a bit ambivalent about how to word this abstract. Part of me prefers simply to write this EIP as saying it specifies "a package registry interface", vs. how you have it. I see the value in this canonical definition as "what the interface does", but I also think there's value in expressing the contents as the noun "registry". I don't know. This is just muttering.

Author

Not thrilled by the wording here either, although it does help to define the EIP's scope. Some of your comments below about further work allude to the possibility that later EIPs might address things like how registries are linked together.


## Motivation
The goal is to establish a framework that allows smart contract publishers to design and deploy code registries with arbitrary business logic while exposing a set of common endpoints that tooling can use to retrieve assets for contract package consumers.

s/code registries/package registries I think is more accurate

Author

Opted for contract consumers


A clear standard would help the existing EthPM Package Registry evolve from a centralized, single-project community resource into a decentralized multi-registry system whose constituents are bound together by the proposed interface. In turn, these registries could be ENS name-spaced, enabling installation conventions familiar to users of `npm` and other package managers.

**Examples**
```shell
$ ethpm install packages.zeppelin.eth/Ownership
```

```javascript
const SimpleToken = await web3.packaging
  .registry('packages.ethpm.eth')
  .getPackage('SimpleToken')
Member

This package name should probably be updated to `simple-token` so that it complies with the package naming scheme.

Author

👍

  .getVersion('^1.1.5');
```

## Specification
The specification describes a small read/write API whose components are mandatory. It allows registries to manage versioned releases using the conventions of [semver](https://semver.org/) without imposing this as a requirement. It assumes registries will share the following structure and conventions:

+ a **registry** is a deployed contract which manages a collection of **packages**.
+ a **package** is a collection of **releases**
+ a **package** is identified by a unique string name within a given **registry**
+ a **release** is identified by a `bytes32` **releaseId** which must be unique for a given package name and release version string pair.
+ a **releaseId** maps to a set of data that includes a **manifestURI** string which describes the location of an [EIP 1123 package manifest](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-1123.md). This manifest contains data about the release including the location of its component code assets.
+ a **manifestURI** string contains a cryptographic hash which can be used to verify the integrity of the content found at the URI. The URI format is defined in [RFC3986](https://tools.ietf.org/html/rfc3986).
Member

This is probably worth either:

Maybe both, or including the definition here as an excerpt. Not sure what the right thing to do is, but the current description only hints at what the URI is supposed to be, and I think it'll be confusing as written.

Author

Agree this is opaque.

FWIW the definition is a verbatim copy of the text at the glossary link. Possibly flesh it out there as well?

Member

a manifestURI is a URI as defined by RFC3986 which can be used to retrieve the contents of the package manifest. In addition to validation against RFC3986, each manifestURI must also contain a hash of the content as specified in EIP 1123.

I don't know if this is any easier to grok, but I tried. Your call.

Author

Yours is definitely clearer, nice.


An example of a package name and release version string pair is:
```shell
"SimpleToken" # package name
Member

This should be updated to be a valid package name: simple-token

"1.0.1" # version string
```

Implementations are free to choose any scheme for generating a **releaseId**. A common approach would be to hash the strings together as below.

```solidity
// Hashes package name and a release version string
function generateReleaseId(string packageName, string version)
  public
  pure
  returns (bytes32)
{
  return keccak256(abi.encodePacked(packageName, version));
}
```

Implementations **must** expose this id generation logic as part of their public `read` API so
tooling can easily map a string based release query to the registry's unique identifier for that release.
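
One caveat with the hashing approach sketched above may be worth illustrating. `abi.encodePacked` writes dynamic strings back to back with no length prefix or separator, so two different name and version pairs can produce the same bytes and therefore the same hash. The snippet below is only an illustration in JavaScript: `packedPreimage` is a hypothetical helper that mimics that byte layout, it does not compute `keccak256`, and whether a registry would accept strings like these depends on its own validation.

```javascript
// Illustration only: mimics how abi.encodePacked lays out two dynamic strings
// (raw UTF-8 bytes, concatenated, with no length prefix or separator).
function packedPreimage(packageName, version) {
  return Buffer.concat([
    Buffer.from(packageName, 'utf8'),
    Buffer.from(version, 'utf8'),
  ]);
}

// Two different (name, version) pairs...
const a = packedPreimage('simple-token1', '.0.1');
const b = packedPreimage('simple-token', '1.0.1');

// ...yield identical bytes, so any hash over them would be identical too.
const collides = a.equals(b);
console.log(collides); // true
```

A registry that wants ids to be unique per pair could validate its inputs or use an encoding that preserves string boundaries; the draft leaves that choice to implementations.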

**Write API Specification**
The write API consists of a single method, `release`. It passes the registry the package name, a
version identifier for the release, and a URI specifying the location of a manifest which
details the contents of the release.
```solidity
function release(string packageName, string version, string manifestURI) public;
Member

Should this return the releaseId?

Author

@cgewecke cgewecke Aug 8, 2018

@pipermerriam @gnidan has made a couple of suggestions as well:

  • mandating ERC 165 as part of the EIP (basically to help with future compatibility - case is made here)
  • adding a 'future work' section that mentions the possibility of extending the interface at some point for more complex registry implementations (case here)

Do you have any views about either of those?

Also do you have any views about the direction to go in with the reference registry? Would you prefer to see it simplified? Are you comfortable preserving it as relatively fully featured? Other...?

```
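
To make the write path concrete, here is a minimal in-memory sketch of how `release`, `generateReleaseId` and `getReleaseData` could relate to one another. It is a JavaScript stand-in, not the reference implementation: the `InMemoryRegistry` class, its string-based id scheme, the duplicate check, the returned id, and the placeholder manifest URI are all assumptions made for illustration.

```javascript
// Hypothetical in-memory stand-in for a registry contract.
class InMemoryRegistry {
  constructor() {
    this.releases = new Map(); // releaseId -> { packageName, version, manifestURI }
    this.packages = new Map(); // packageName -> array of releaseIds
  }

  // Assumed id scheme: a length-prefixed join, so ids stay unique per pair.
  // A real registry would return a bytes32 value instead of a string.
  generateReleaseId(packageName, version) {
    return `${packageName.length}:${packageName}@${version}`;
  }

  // Assumption: duplicates are rejected and the new id is returned.
  // The draft signature returns nothing and does not specify either behavior.
  release(packageName, version, manifestURI) {
    const releaseId = this.generateReleaseId(packageName, version);
    if (this.releases.has(releaseId)) {
      throw new Error(`release already exists: ${packageName}@${version}`);
    }
    this.releases.set(releaseId, { packageName, version, manifestURI });
    if (!this.packages.has(packageName)) this.packages.set(packageName, []);
    this.packages.get(packageName).push(releaseId);
    return releaseId;
  }

  getReleaseData(releaseId) {
    return this.releases.get(releaseId);
  }
}

const registry = new InMemoryRegistry();
const id = registry.release('simple-token', '1.0.1', 'ipfs://<manifest-hash>');

// Tooling maps a string query to the registry's id, then reads the release.
const data = registry.getReleaseData(registry.generateReleaseId('simple-token', '1.0.1'));
console.log(data.manifestURI);
```

The point of the sketch is the last step: tooling does not need to know the id scheme, because it asks the registry to generate the id for a name and version pair.
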
**Read API Specification**

The read API consists of a set of methods that allows tooling to extract all consumable data from a registry.

```solidity
// Retrieves all the packages published to a registry.
function getAllPackageNames() public view returns (string[]);
Member

It's worth taking some time to explore the upper bound on how many names this is capable of returning. I suspect it is quite large, but there will likely be registries with a lot of names. It might be good to paginate this API with a limit and an offset. This would ensure that it's possible to page through extremely large lists.

Author

Yes, nice.

How do you feel about using a cursor instead of pagination? Idea is:

By returning a “cursor”, the API guarantees that it will return exactly the next entry in the list, regardless of what changes happen to the collection between API calls. Think of the cursor as a permanent marker in the list that says “we left off here”.

source

getAllPackageNames(uint cursor) public view returns (string[], uint cursor);

Something like this on the client side.

registry.getAllPackageNames(0)
> { ['zeppelin', 'maker',...], 100 } // records 0-99, next index is position 100.
registry.getAllPackageNames(100)
> { ['etc', ...]}

In this schema the registry is responsible for picking the page size.

Member

I'd push to merge the ideas and use cursor and limit. I do, however, think the idea of a cursor is a bit opaque and less intuitive than offset/limit. An offset can provide the same functionality you are looking for with a cursor, and we can add a "should" recommendation to this function stating that, using offset and limit, a client library should always be able to resume scraping at a later date.

Use your best judgement, I think this is potentially bike shedding territory.


// Retrieves the registry's unique identifier for an existing release of a package.
function getReleaseId(string packageName, string version) public view returns (bytes32);

// Retrieves all release ids for a package
function getAllReleaseIds(string packageName) public view returns (bytes32[]);
Member

Same thought about pagination through limit and offset


// Retrieves package name, release version and URI location data for a release id.
function getReleaseData(bytes32 releaseId) public view
  returns (
    string packageName,
    string version,
    string manifestURI
  );

// Retrieves the release id a registry *would* generate for a package name and version pair
// when executing a release.
function generateReleaseId(string packageName, string version) public pure returns (bytes32);

// Declares whether a registry maintains its releases in semver compliant order.
function usesSemver() public pure returns (bool);
Author

I made this a method because I'd like it to be compatible with EIP 165. Not sure this is absolutely necessary though...

Member

Hrm, I'm inclined to push back on this.

  • I'm not aware of broad adoption of EIP165.
  • It should be completely valid for a registry to allow multiple version schemes.
  • Client side consumers will have to implement error handling and version string parsing whether this is present or not.

Author

Ok, fair enough... although without some guidance about how registries version, aren't we telling clients to just forward whatever version string they're given? If a client fields this request:

.getVersion('^1.1.5');

then the registry needs to manage it because the client has no way of knowing what the caret means. It could be talking to a registry where 1.1.5, ^1.1.5 and orange77 or even are all legitimate versions according to the manifest spec.

Was kind of hoping that usesSemver would tell the client that it should analyze a list of releases and figure out which one matches the request. Or error if that kind of request doesn't make sense.

However this idea isn't very clearly expressed and probably misguided as well.

Member

I think this goes directly against the intent of the spec, which is to allow arbitrary business logic. If people want to use orange77, aka, sem-fruit versioning then I suspect very few people will use their registry.

At their core, registries can be blind data stores. Some contracts will implement things like enforcing semver for version strings. Some libraries will be really good about versioning, and others may choose obscure versioning schemes. In the end, it's what the consumers of these registries support that will drive how they are used. Initially, I suspect that most libraries will ignore or throw errors when they encounter version strings that they cannot parse.

Author

Yep, sounds good.

```
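
The offset and limit idea raised in the review thread on `getAllPackageNames` could look roughly like this from the client side. This is a sketch under stated assumptions: `getPackageNames(offset, limit)` is a hypothetical paged getter that does not exist in the draft, `makeRegistry` fakes a registry with a plain array, and the calls are synchronous here although a real client would await a contract call.

```javascript
// Hypothetical paged getter: returns up to `limit` names starting at `offset`.
// A real client would await a contract call; this stand-in reads an array.
function makeRegistry(names) {
  return {
    getPackageNames(offset, limit) {
      return names.slice(offset, offset + limit);
    },
  };
}

// Pages through the registry until a short (or empty) page signals the end.
function fetchAllPackageNames(registry, limit) {
  const all = [];
  let offset = 0;
  for (;;) {
    const page = registry.getPackageNames(offset, limit);
    all.push(...page);
    if (page.length < limit) break;
    offset += page.length;
  }
  return all;
}

const names = fetchAllPackageNames(
  makeRegistry(['zeppelin', 'maker', 'simple-token']),
  2
);
console.log(names);
```

Because the client tracks `offset` itself, it can stop and later resume from the last offset it reached, which is the resumability property discussed in the thread.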

I believe there is value in adding a new interface section Supported Interfaces API (ie. heading at the same level as Write API and Read API), to require the use of EIP-165.

My reasoning for this is mainly for forward-compatibility reasons. Due to this effort's nature of aiming to be functionally unopinionated, it seems plausible to expect that future EIPs will build on (and possibly supersede) this standard. EIP-165 provides a lightweight mechanism that allows a contract the capability of stating which standard interfaces it supports.

By requiring the use of EIP-165 in this standard, it simultaneously sets a clear and consistent precedent for future registry standards and ensures that all standard registries adhere to this practice for capability expressiveness from the very beginning. Should future work improve this standard, this adherence will make implementation easier for all registry consumers.

I understand that it's prescriptive to require this behavior, and adds coupling to 165, but 165 is already finalized, and it does not add a significant gas or implementation time overhead. If we merely "recommend" 165 instead of requiring it (should instead of must), we lose our easiest opportunity to ensure conformance across the board.

## Rationale
The proposal hopes to accomplish the following:

+ Define the smallest set of inputs necessary to allow registries to map package names to a set of
release versions while allowing them to use any versioning schema they choose.
+ Provide the minimum set of getter methods needed to retrieve package data from a registry so that registry aggregators can read all of their data.
+ Define a standard query that synthesizes a release identifier from a package name and version pair so that tooling can resolve specific package version requests without needing to query a registry about all of a package's releases.
+ Mandate that registries indicate whether or not they enforce semver so that tooling can determine whether consumer requests for packages at a semver range are possible.

Registries may offer more complex `read` APIs that manage requests for packages within a semver range or at `latest` etc. This EIP is agnostic about how tooling or registries might implement these. It recommends that registries implement [EIP 165](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-165.md) and avail themselves of resources to publish more complex interfaces such as [EIP 926](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-926.md).
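
As one illustration of how tooling might handle a range request on its own, the sketch below resolves a caret range against a list of version strings it has already read from a registry. It is deliberately simplified and not a full semver implementation: `parseVersion`, `compare` and `resolveCaret` are hypothetical helpers, only plain `MAJOR.MINOR.PATCH` strings with a major version of 1 or higher are supported, and version strings that do not parse are skipped.

```javascript
// Parses "MAJOR.MINOR.PATCH" into numbers; returns null for anything else
// (pre-release tags and build metadata are deliberately not handled).
function parseVersion(v) {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(v);
  return m ? m.slice(1).map(Number) : null;
}

// Orders two parsed versions: negative if a < b, positive if a > b, else 0.
function compare(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Resolves a caret range such as "^1.1.5" to the highest matching version.
// Simplification: for major >= 1, ^X.Y.Z means >= X.Y.Z and < (X+1).0.0.
function resolveCaret(range, versions) {
  if (!range.startsWith('^')) throw new Error(`not a caret range: ${range}`);
  const base = parseVersion(range.slice(1));
  if (!base || base[0] < 1) throw new Error(`unsupported range: ${range}`);
  const candidates = versions
    .map((v) => parseVersion(v))
    .filter((v) => v && v[0] === base[0] && compare(v, base) >= 0)
    .sort(compare);
  return candidates.length ? candidates[candidates.length - 1].join('.') : null;
}

const best = resolveCaret('^1.1.5', ['1.0.1', '1.1.5', '1.4.0', '2.0.0', 'orange77']);
console.log(best); // 1.4.0
```

Once a version is chosen, tooling can pass it to `generateReleaseId` and `getReleaseData` as usual.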

## Backwards Compatibility
The standard modifies the interface of the existing EthPM package registry in such a way that the currently deployed version would not comply with the standard, since it implements only one of the method signatures described in the specification.

As we discussed, it might be worth adding a "future work" section, to "bait" the community into pursuing advances in this area.

Some ideas that come to mind that I think would be cool:

  • If there's a SemVer standard ERC at some point, there could be a SemVerRegistry standard
  • Standard(s) for authorized registries
  • Standard(s) for registry federation

## Implementation
A reference implementation of this proposal can be found at the EthPM organization on GitHub [here](https://github.com/ethpm/escape-truffle).

## Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).