Generalize contentIdentifier for SoftwareArtifact integrity verification #611

Closed
wants to merge 1 commit into from
30 changes: 30 additions & 0 deletions model/Software/Classes/ContentIdentifier.md
@@ -0,0 +1,30 @@
SPDX-License-Identifier: Community-Spec-1.0

# ContentIdentifier

## Summary

A canonical, unique, immutable identifier of the content of a software artifact.

## Description

A ContentIdentifier is a canonical, unique, immutable identifier of the content of a software artifact, such as a package, a file, or a snippet.
It can be used to verify the artifact's identity and integrity.

## Metadata

- name: ContentIdentifier
- SubclassOf: /Core/IntegrityMethod
- Instantiability: Concrete

## Properties

- contentIdentifierValue
  - type: xsd:string
  - minCount: 1
  - maxCount: 1
- contentIdentifierType
  - type: ContentIdentifierType
  - minCount: 1
  - maxCount: 1
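
As an illustration of the class shape above, here is a minimal Python sketch of how a tool might hold a ContentIdentifier in memory. The Python class and field names are hypothetical and are not part of the SPDX model; treating an empty string as a violation of `minCount: 1` is also an assumption of this sketch.

```python
from dataclasses import dataclass
from enum import Enum


class ContentIdentifierType(Enum):
    """Mirrors the two entries of the ContentIdentifierType vocabulary."""
    GITOID = "gitoid"
    SWHID = "swhid"


@dataclass(frozen=True)
class ContentIdentifier:
    """Hypothetical in-memory form of the ContentIdentifier class."""
    content_identifier_type: ContentIdentifierType
    content_identifier_value: str

    def __post_init__(self) -> None:
        # Both properties have minCount 1 and maxCount 1, so each must be
        # present exactly once. Rejecting an empty value is an assumption.
        if not isinstance(self.content_identifier_type, ContentIdentifierType):
            raise TypeError("contentIdentifierType must be a ContentIdentifierType")
        if not self.content_identifier_value:
            raise ValueError("contentIdentifierValue must be a non-empty string")


example = ContentIdentifier(
    ContentIdentifierType.SWHID,
    "swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2",
)
```

The dataclass is frozen so that an identifier cannot be altered after creation, which matches the "immutable" wording in the description.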

4 changes: 1 addition & 3 deletions model/Software/Classes/SoftwareArtifact.md
@@ -20,9 +20,7 @@ such as a package, a file, or a snippet.
## Properties

- contentIdentifier
  - type: xsd:anyURI
  - minCount: 0
  - maxCount: 1
  - type: ContentIdentifier
- primaryPurpose
  - type: SoftwarePurpose
  - minCount: 0
17 changes: 4 additions & 13 deletions model/Software/Properties/contentIdentifier.md
@@ -4,25 +4,16 @@ SPDX-License-Identifier: Community-Spec-1.0

## Summary

Provides a place to record the canonical, unique, immutable identifier for each software artifact using the artifact's gitoid.
A canonical, unique, immutable identifier of the artifact content that can be used to verify the artifact's identity and integrity.

## Description

The contentIdentifier provides a canonical, unique, immutable artifact identifier for each software artifact. SPDX 3.0 describes software artifacts as Snippet, File, or Package Elements. The ContentIdentifier can be calculated for any software artifact and can be recorded for any of these SPDX 3.0 Elements using Omnibor, an attempt to standardize how software artifacts are identified independent of which programming language, version control system, build tool, package manager, or software distribution mechanism is in use.

The contentIdentifier is defined as the [Git Object Identifier](https://git-scm.com/book/en/v2/Git-Internals-Git-Objects) (gitoid) of type `blob` of the software artifact. The use of a git-based version control system is not necessary to calculate a contentIdentifier for any software artifact.

The gitoid is expressed in the ContentIdentifier property by using the IANA [gitoid URI scheme](https://www.iana.org/assignments/uri-schemes/prov/gitoid).

```
Scheme syntax: gitoid":"<git object type>":"<hash algorithm>":"<hash value>
```
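
To make the scheme concrete, the following Python sketch computes a `blob` gitoid URI without using git, hashing the same `blob <length>\0` header followed by the content that git uses for blob objects. The function name `gitoid_blob` is hypothetical, and restricting the algorithm to `sha1` and `sha256` is an assumption of this sketch.

```python
import hashlib


def gitoid_blob(content: bytes, algorithm: str = "sha1") -> str:
    """Return a gitoid URI of object type blob for the given bytes."""
    if algorithm not in ("sha1", "sha256"):
        raise ValueError(f"unsupported hash algorithm: {algorithm}")
    # Git hashes the header "blob <byte length>\0" followed by the content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    digest = hashlib.new(algorithm, header + content).hexdigest()
    return f"gitoid:blob:{algorithm}:{digest}"


print(gitoid_blob(b""))
# gitoid:blob:sha1:e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

The value printed for empty content matches the well-known git object ID of the empty blob, which shows that no git repository is needed to calculate the identifier.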

The OmniBOR ID for the OmniBOR Document associated with a software artifact should not be recorded in this field. Rather, OmniBOR IDs should be recorded in the SPDX Element's ExternalIdentifier property. See [https://omnibor.io](https://omnibor.io) for more details.
A contentIdentifier is a canonical, unique, immutable identifier of the content of a software artifact, such as a package, a file, or a snippet.
It can be used to verify the artifact's identity and integrity.

## Metadata

- name: contentIdentifier
- Nature: DataProperty
- Range: xsd:anyURI
- Range: ContentIdentifier

18 changes: 18 additions & 0 deletions model/Software/Properties/contentIdentifierType.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
SPDX-License-Identifier: Community-Spec-1.0

# contentIdentifierType

## Summary

Specifies the type of the content identifier.

## Description

A contentIdentifierType specifies the type of the content identifier.

## Metadata

- name: contentIdentifierType
- Nature: ObjectProperty
- Range: ContentIdentifierType

18 changes: 18 additions & 0 deletions model/Software/Properties/contentIdentifierValue.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
SPDX-License-Identifier: Community-Spec-1.0

# contentIdentifierValue

## Summary

Specifies the value of the content identifier.

## Description

A contentIdentifierValue specifies the value of a content identifier.

## Metadata

- name: contentIdentifierValue
- Nature: DataProperty
- Range: xsd:string

21 changes: 21 additions & 0 deletions model/Software/Vocabularies/ContentIdentifierType.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
SPDX-License-Identifier: Community-Spec-1.0

# ContentIdentifierType

## Summary

Specifies the type of a content identifier.

## Description

ContentIdentifierType specifies the type of a content identifier.

## Metadata

- name: ContentIdentifierType

## Entries

- gitoid: https://www.iana.org/assignments/uri-schemes/prov/gitoid Gitoid stands for [Git Object ID](https://git-scm.com/book/en/v2/Git-Internals-Git-Objects), and a gitoid of type blob is a unique hash of a binary artifact. A gitoid may represent the software [Artifact ID](https://github.com/omnibor/spec/blob/main/spec/SPEC.md#artifact-id) or the [OmniBOR Identifier](https://github.com/omnibor/spec/blob/main/spec/SPEC.md#omnibor-identifier) for the software artifact's associated [OmniBOR Document](https://github.com/omnibor/spec/blob/main/spec/SPEC.md#omnibor-document); this ambiguity exists because the OmniBOR Document is itself an artifact, and the gitoid of that artifact is its valid identifier. OmniBOR is a minimalistic schema to describe software [Artifact Dependency Graphs](https://github.com/omnibor/spec/blob/main/spec/SPEC.md#artifact-dependency-graph-adg). Gitoids calculated on software artifacts (Snippet, File, or Package Elements) should be recorded in the SPDX 3.0 SoftwareArtifact's ContentIdentifier property. Gitoids calculated on the OmniBOR Document (OmniBOR Identifiers) should be recorded in the SPDX 3.0 Element's ExternalIdentifier property.
- swhid: SoftWare Hash IDentifier, a persistent intrinsic identifier for digital artifacts. The identifier syntax is defined in the [SWHID specification](https://www.swhid.org/specification/v1.1/4.Syntax); a typical identifier looks like `swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2`.
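
As a rough illustration of how a consumer might check a `swhid` value before recording it, the Python sketch below validates only the core identifier (scheme, version 1, object type, 40 hexadecimal digits) and ignores any `;`-separated qualifiers. The function name `parse_swhid` is hypothetical, and the SWHID specification remains the authoritative source for the grammar.

```python
import re

# Core SWHID form: swh:1:<object type>:<40 lowercase hex digits>.
_SWHID_CORE = re.compile(r"swh:1:(snp|rel|rev|dir|cnt):([0-9a-f]{40})")


def parse_swhid(value: str) -> tuple[str, str]:
    """Return (object type, hex digest) for a SWHID, or raise ValueError."""
    # Qualifiers such as ";origin=..." follow the core identifier; this
    # sketch discards them instead of validating them.
    core = value.split(";", 1)[0]
    match = _SWHID_CORE.fullmatch(core)
    if match is None:
        raise ValueError(f"not a valid SWHID core identifier: {value!r}")
    return match.group(1), match.group(2)


print(parse_swhid("swh:1:cnt:94a9ed024d3859793618152ea559a168bbcbb5e2"))
# ('cnt', '94a9ed024d3859793618152ea559a168bbcbb5e2')
```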