[WIP] Draft of Registry EIP #1

Open
wants to merge 11 commits into
base: master
from
124 changes: 124 additions & 0 deletions EIPS/eip-ethpm-registry.md
@@ -0,0 +1,124 @@
[ Needs table header ... ]

## Abstract
This EIP specifies an interface for publishing to and retrieving assets from smart contract package registries. It is a companion EIP to [1123](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-1123.md), which defines a standard for smart contract package manifests.

## Motivation
The goal is to establish a framework that allows smart contract publishers to design and deploy code registries of arbitrary complexity which expose standard endpoints to tooling that retrieves assets for contract package consumers.
Member

Maybe replace complexity with business logic?

Author

👍


A clear standard would help the existing EthPM Package Registry evolve from a centralized, single-project community resource into a decentralized multi-registry system whose constituents are bound together by the proposed interface. In turn, these registries could be ENS name-spaced, enabling installation conventions familiar to users of `npm` and other package managers.

**Examples**
```shell
$ ethpm install packages.zeppelin.eth/Ownership
```

```javascript
const SimpleToken = await web3.packaging
  .registry('packages.ethpm.eth')
  .getPackage('SimpleToken')
Member

This package name should probably be updated to comply with the package naming scheme, i.e. `simple-token`.

Author

👍

  .getVersion('^1.1.5');
```

## Specification
The specification describes a small read/write API whose components are mandatory. It encourages (without enforcing) the management of versioned releases using the conventions of [semver](https://semver.org/). It assumes registries will share the following structure and encoding conventions:

+ a **registry** is a deployed contract which manages a collection of **packages**.
+ a **package** is a collection of **releases**
+ a **package** is identified by a unique string name within a given **registry**
+ a **release** is identified by a bytes32 **releaseHash** which is the keccak256 hash of the following:
Member

Maybe the exact mechanism for generating the release hash should be left unspecified, and instead contracts should expose a view method like `getReleaseHash(string packageName, string version) returns (bytes32)`.

This would appropriately decouple the spec from semver.

Author

@pipermerriam Yes this is a really nice simplification - seems like the perfect minimum.

And getReleaseData just returns the inputs of a release?

getReleaseData(bytes32 releaseId) returns (string name, string version)

Author

@cgewecke Jul 30, 2018

Oh actually the sigs need to be like this I think...

release(string name, string version, string manifestURI);
getReleaseData(bytes32 releaseId) returns (string name, string version, string manifestURI);

    + the keccak256 hash of a package's string name *WITH*
    + the keccak256 hash of its semver components
        + uint32 major
        + uint32 minor
        + uint32 patch
        + string preRelease
        + string build
+ a **releaseHash** maps to a set of data that includes a **manifestURI** string which describes the location of an [EIP 1123 package manifest](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-1123.md). This manifest contains data about the release including the location of its component code assets.
+ a **manifestURI** string contains a cryptographic hash which can be used to verify the integrity of the content found at the URI. The URI format is defined in [RFC3986](https://tools.ietf.org/html/rfc3986).
Member

This is probably worth either:

Maybe both, or including the definition here as an excerpt. I'm not sure what the right thing to do is, but the current description only hints at what the URI is supposed to be, and I think it will be confusing.

Author

Agree this is opaque.

FWIW the definition is a verbatim copy of the text at the glossary link. Possibly flesh it out there as well?

Member

a manifestURI is a URI as defined by RFC3986 which can be used to retrieve the contents of the package manifest. In addition to validation against RFC3986, each manifestURI must also contain a hash of the content as specified in EIP 1123.

I don't know if this is any easier to grok, but I tried. Your call.

Author

Yours is definitely clearer, nice.


The canonical way to generate a **releaseHash** is to use the following methods (in Solidity):
```solidity
// Hashes package name
function hashPackageName(string packageName)
    public
    pure
    returns (bytes32)
{
    return keccak256(abi.encodePacked(packageName));
}

// Hashes version components
function hashReleaseVersion(
    uint32 major,
    uint32 minor,
    uint32 patch,
    string preRelease,
    string build
)
    public
    pure
    returns (bytes32)
{
    return keccak256(abi.encodePacked(major, minor, patch, preRelease, build));
}

// Hashes package name hash and version components hash together
function hashRelease(bytes32 packageNameHash, bytes32 releaseVersionHash)
    public
    pure
    returns (bytes32)
{
    return keccak256(abi.encodePacked(packageNameHash, releaseVersionHash));
}
```
(See *Rationale* below for more information about the purpose of this hashing strategy.)
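
For tooling that wants to reproduce this hash off-chain, the following JavaScript is a minimal sketch rather than a reference implementation. It assumes the caller supplies a `keccak256` function (for example from an Ethereum utility library) that maps a byte buffer to a 32-byte buffer. The helper names `packString`, `packUint32`, `packVersion` and `computeReleaseHash` are hypothetical. The sketch relies on `abi.encodePacked` concatenating each `uint32` as 4 big-endian bytes and each string as its raw UTF-8 bytes with no length prefix.

```javascript
// Sketch only: mirrors the Solidity helpers above on the client side.
// `keccak256` is injected by the caller (Buffer in, 32-byte Buffer out).
// Note: Ethereum's keccak256 is not the same function as standardized SHA3-256.

// abi.encodePacked(string) is the raw UTF-8 bytes, with no length prefix.
function packString(value) {
  return Buffer.from(value, 'utf8');
}

// abi.encodePacked(uint32) is 4 big-endian bytes, with no padding to 32 bytes.
function packUint32(value) {
  if (!Number.isInteger(value) || value < 0 || value > 0xffffffff) {
    throw new RangeError(`not a uint32: ${value}`);
  }
  const out = Buffer.alloc(4);
  out.writeUInt32BE(value, 0);
  return out;
}

// Byte-for-byte equivalent of abi.encodePacked(major, minor, patch, preRelease, build).
function packVersion(major, minor, patch, preRelease, build) {
  return Buffer.concat([
    packUint32(major),
    packUint32(minor),
    packUint32(patch),
    packString(preRelease),
    packString(build),
  ]);
}

// Two-phase hash: hash(name) and hash(version) are hashed together.
function computeReleaseHash(keccak256, name, major, minor, patch, preRelease, build) {
  const nameHash = keccak256(packString(name));
  const versionHash = keccak256(packVersion(major, minor, patch, preRelease, build));
  return keccak256(Buffer.concat([nameHash, versionHash]));
}
```

Injecting the hash function keeps the sketch free of any particular library; only the byte layout passed to it is specific to this proposal.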

**Write API**
The write API consists of a single method, `release`, which passes the registry the release information described above and allows it to create a unique, retrievable, semver-compliant record of a versioned code package.
```solidity
function release(
    string name,
    uint32 major,
    uint32 minor,
    uint32 patch,
    string preRelease,
    string build,
    string manifestURI
)
    public
    returns (bool);
```
**Read API Specification**

The read API consists of a minimal set of methods that allows tooling to extract all consumable data from a registry.

```solidity
// Retrieves all packages published to a registry
function getAllPackageNames() public view returns (string[]);
Member

It's worth taking some time to explore the upper bound on how many names this is capable of returning. I suspect it is quite large, but there will likely be registries with a lot of names. It might be good to paginate this API with a limit and an offset. This would ensure that it's possible to page through extremely large lists.

Author

Yes, nice.

How do you feel about using a cursor instead of pagination? Idea is:

By returning a “cursor”, the API guarantees that it will return exactly the next entry in the list, regardless of what changes happen to the collection between API calls. Think of the cursor as a permanent marker in the list that says “we left off here”.

source

getAllPackageNames(uint cursor) public view returns (string[], uint cursor);

Something like this on the client side.

registry.getAllPackageNames(0)
> { ['zeppelin', 'maker',...], 100 } // records 0-99, next index is position 100.
registry.getAllPackageNames(100)
> { ['etc', ...]}

In this schema the registry is responsible for picking the page size.

Member

I'd push to merge the ideas and use both a cursor and a limit. I do, however, think the idea of a cursor is a bit opaque and less intuitive than offset/limit. An offset can provide the same functionality you are looking for with a cursor, and we can add a "should" recommendation to this function: using offset and limit, a client library should always be able to resume scraping at a later date.

Use your best judgement, I think this is potentially bike shedding territory.


// Retrieves all releases for a given package
function getAllPackageReleaseHashes(string name) public view returns (bytes32[]);

// Retrieves version and manifestURI data for a given release hash
function getReleaseData(bytes32 releaseHash) public view
    returns (
        uint32 major,
        uint32 minor,
        uint32 patch,
        string preRelease,
        string build,
        string manifestURI
    );
```
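
The offset/limit pagination raised in the review thread on `getAllPackageNames` is not part of the draft as written. As an illustration of how a client might page through names if it were adopted, here is a minimal sketch. The method `getPackageNames(offset, limit)`, the helper `fetchAllPackageNames` and the in-memory `makeMockRegistry` are all hypothetical names introduced for this example.

```javascript
// Hypothetical: `registry.getPackageNames(offset, limit)` is NOT in the draft.
// It stands in for the offset/limit variant proposed in review.
async function fetchAllPackageNames(registry, pageSize = 100) {
  if (!Number.isInteger(pageSize) || pageSize <= 0) {
    throw new RangeError('pageSize must be a positive integer');
  }
  const names = [];
  let offset = 0;
  for (;;) {
    const page = await registry.getPackageNames(offset, pageSize);
    names.push(...page);
    // A short (or empty) page means the end of the list was reached.
    if (page.length < pageSize) return names;
    offset += page.length;
  }
}

// In-memory stand-in for a deployed registry, for illustration only.
function makeMockRegistry(allNames) {
  return {
    async getPackageNames(offset, limit) {
      return allNames.slice(offset, offset + limit);
    },
  };
}
```

Because the client tracks `offset` itself, it can persist that value and resume a scrape later, which is the property the thread asks for.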

I believe there is value in adding a new interface section, Supported Interfaces API (i.e. a heading at the same level as Write API and Read API), to require the use of EIP-165.

My reasoning for this is mainly for forward-compatibility reasons. Due to this effort's nature of aiming to be functionally unopinionated, it seems plausible to expect that future EIPs will build on (and possibly supersede) this standard. EIP-165 provides a lightweight mechanism that allows a contract the capability of stating which standard interfaces it supports.

Requiring EIP-165 in this standard simultaneously sets a clear, consistent precedent for future registry standards and ensures that all standard registries adhere to this practice for capability expressiveness from the very beginning. Should future work improve this standard, this adherence will make implementation easier for all registry consumers.

I understand that it's prescriptive to require this behavior, and adds coupling to 165, but 165 is already finalized, and it does not add a significant gas or implementation time overhead. If we merely "recommend" 165 instead of requiring it (should instead of must), we lose our easiest opportunity to ensure conformance across the board.

## Rationale
The proposal is meant to accomplish the following:
+ Establish a publication norm that helps registries implement semver and store package data in mappings that reflect a two-tiered hierarchy of packages that are collections of releases. This is the rationale behind the two-phased hashing of package and version components together into a single release hash identifier.
+ Provide the minimum set of getter methods needed to retrieve all package data from a registry so that registry aggregators can read all of their data.
+ Define a standard way of generating a release hash so that tooling can resolve specific package version requests *without* needing to query a registry about its entire contents.
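
The last two goals can be contrasted in a short sketch. It assumes `registry` is any object exposing the three read methods from the Read API as async functions, and that `getReleaseData` resolves to an object with named fields. How such an object is bound to a deployed contract is out of scope, and the function names `crawlRegistry` and `lookupManifestURI` are hypothetical.

```javascript
// Aggregator path: enumerate every package and release the registry holds.
async function crawlRegistry(registry) {
  const releases = [];
  for (const name of await registry.getAllPackageNames()) {
    for (const releaseHash of await registry.getAllPackageReleaseHashes(name)) {
      const data = await registry.getReleaseData(releaseHash);
      releases.push({ name, releaseHash, ...data });
    }
  }
  return releases;
}

// Tooling path: given a release hash computed off-chain from a known
// name and version, a single call retrieves the manifest location.
async function lookupManifestURI(registry, releaseHash) {
  const { manifestURI } = await registry.getReleaseData(releaseHash);
  return manifestURI;
}
```

The crawl costs one call per package plus one per release, while the direct lookup costs a single call regardless of registry size.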

In practice, registries may offer more complex `read` APIs that manage consumers' requests for packages within a semver range, at `latest`, etc. This EIP is agnostic about how tooling or registry contracts implement these. It recommends that registries implement [EIP 165](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-165.md) and avail themselves of resources to publish more complex interfaces such as [EIP 926](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-926.md).
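
As one illustration of range handling done in tooling rather than in the registry, the sketch below resolves an npm-style caret range (such as the `^1.1.5` in the earlier example) against release data already read from a registry. It is a simplification under stated assumptions: only the `^major.minor.patch` form is parsed, the `preRelease` and `build` fields are ignored, and all function names are hypothetical.

```javascript
// Orders two versions by major, then minor, then patch.
function compareVersions(a, b) {
  return a.major - b.major || a.minor - b.minor || a.patch - b.patch;
}

// Parses '^1.1.5' into its numeric lower bound. Other range forms are rejected.
function parseCaret(range) {
  const m = /^\^(\d+)\.(\d+)\.(\d+)$/.exec(range);
  if (!m) throw new Error(`unsupported range: ${range}`);
  const [major, minor, patch] = m.slice(1).map(Number);
  return { major, minor, patch };
}

// Caret rule: the left-most non-zero component of the lower bound must not change.
function satisfiesCaret(version, min) {
  if (compareVersions(version, min) < 0) return false;
  if (min.major > 0) return version.major === min.major;
  if (version.major !== 0) return false;
  if (min.minor > 0) return version.minor === min.minor;
  return version.minor === 0 && version.patch === min.patch;
}

// Returns the highest release satisfying the range, or null if none does.
function maxSatisfying(releases, range) {
  const min = parseCaret(range);
  const matches = releases.filter((r) => satisfiesCaret(r, min));
  matches.sort(compareVersions);
  return matches.length ? matches[matches.length - 1] : null;
}
```

A registry contract could expose equivalent logic on-chain instead; the EIP leaves that choice open.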

## Backwards Compatibility
The standard simplifies the interface of the existing EthPM package registry in such a way that the currently deployed version would not comply with the proposed standard. Specifically, the deployed version lacks the `getAllPackageNames` method.

As we discussed, it might be worth adding a "future work" section, to "bait" the community into pursuing advances in this area.

Some ideas that come to mind that I think would be cool:

  • If there's a SemVer standard ERC at some point, there could be a SemVerRegistry standard
  • Standard(s) for authorized registries
  • Standard(s) for registry federation

## Implementation
A reference implementation of the proposed standard can be found at the EthPM organization on GitHub [here](https://github.com/ethpm/escape-truffle).

## Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).