-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
improve software integrity and verification documentation #602
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One suggestion and one question for @zvr on spec parser support for the subheadings.
Bob and Dick to take a pass through this, per call today. |
I concur with Gary's proposed language. It may be worth noting that component ordering can effect OmniBor results. SBOM's have no implied ordering requirement and may not be easily "verifiable" using OmniBor due to this mismatch in ordering requirements. |
@@ -4,11 +4,14 @@ SPDX-License-Identifier: Community-Spec-1.0 | |||
|
|||
## Summary | |||
|
|||
Provides a place to record the canonical, unique, immutable identifier for each software artifact using the artifact's gitoid. | |||
Used by SPDX producers to record the artifact’s gitoid: a canonical, unique, immutable identifier that can be used for software integrity verification. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Obviously is not only a "gitoid", but any content identifier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be more appropriate to include an identifier tag in the syntax, like IANA uses to qualify identifier types - similar to what SPDX does today : dns:www.foo.com SHA256:.......
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are other content identifiers out there, as Alexios points out. Let's handle the broadening as a separate PR though, and move this forward. The original had gitoid in it, and is just restructured here.
@@ -4,11 +4,14 @@ SPDX-License-Identifier: Community-Spec-1.0 | |||
|
|||
## Summary | |||
|
|||
Provides a place to record the canonical, unique, immutable identifier for each software artifact using the artifact's gitoid. | |||
Used by SPDX producers to record the artifact’s gitoid: a canonical, unique, immutable identifier that can be used for software integrity verification. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used by SPDX producers to record the artifact’s gitoid: a canonical, unique, immutable identifier that can be used for software integrity verification. | |
Used to record a canonical, unique, immutable identifier of the artifact, that can be used for verification its identity and integrity. |
Provides a place to record the canonical, unique, immutable identifier for each software artifact using the artifact's gitoid. | ||
Used by SPDX producers to record the artifact’s gitoid: a canonical, unique, immutable identifier that can be used for software integrity verification. | ||
|
||
Used by SPDX consumers to verify the integrity of a software artifact they received. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used by SPDX consumers to verify the integrity of a software artifact they received. |
No need to have it separate for producers/consumers.
|
||
## Description | ||
|
||
The contentIdentifier provides a canonical, unique, immutable artifact identifier for each software artifact. SPDX 3.0 describes software artifacts as Snippet, File, or Package Elements. The ContentIdentifier can be calculated for any software artifact and can be recorded for any of these SPDX 3.0 Elements using Omnibor, an attempt to standardize how software artifacts are identified independent of which programming language, version control system, build tool, package manager, or software distribution mechanism is in use. | ||
### SPDX Producers | ||
The contentIdentifier is a canonical, unique, immutable artifact identifier for each software artifact. The ContentIdentifier for any software artifact can be calculated and recorded in SPDX 3.0 Snippet, File, or Package Elements. For additional information, see [OmniBOR](https://omnibor.io): an attempt to standardize how software artifacts are identified independent of which programming language, version control system, build tool, package manager, or software distribution mechanism is in use. | ||
|
||
The contentIdentifier is defined as the [Git Object Identifier](https://git-scm.com/book/en/v2/Git-Internals-Git-Objects) (gitoid) of type `blob` of the software artifact. The use of a git-based version control system is not necessary to calculate a contentIdentifier for any software artifact. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree that contentIdentifier
has to be a gitoid. We should leave it for any kind of unique identifier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that not all content identifiers support use in verification. We would also need to somehow specify which content identifier is used (e.g. the enumeration of supported content identifiers).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That appears similar to how SPDX handles this today with the SHA* identifying tags accompanying a hash value in an SBOM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zvr - Let's move forward with this definition and open a separate issue to include better support for SWHID after RC2 - just trying to get RC2 released.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove gitoid and leave generic "unique, immutable artifact identifier" or we can mention both gitoid and swhid that we already know of.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not objecting just because I don't want SWHIDs to be forgotten:
The most important part is, given that we know there might be more than one format, modeling it as a single value (of type URI) will not work. There has to be a type defined somewhere (maybe with a vocabulary), etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My recollection of the discussions on this topic from last year were that for v3.0 we would specify gitoid as the default but that the structures and definitions would be generalized (as @zvr is suggesting above) to enable addition of other possible types post v3.0 without breaking backward compatibility.
In other words, I agree that the definitions should be generalized and not locked to one approach.
Signed-off-by: Jeff Schutt <[email protected]> Co-authored-by: Gary O'Neall <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the updated changes - all looks good
Agree that the changes addressed the concerns for RC2. |
@@ -4,11 +4,14 @@ SPDX-License-Identifier: Community-Spec-1.0 | |||
|
|||
## Summary | |||
|
|||
Provides a place to record the canonical, unique, immutable identifier for each software artifact using the artifact's gitoid. | |||
Used by SPDX producers to record the artifact’s gitoid: a canonical, unique, immutable identifier that can be used for software integrity verification. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are other content identifiers out there, as Alexios points out. Let's handle the broadening as a separate PR though, and move this forward. The original had gitoid in it, and is just restructured here.
|
||
## Description | ||
|
||
The contentIdentifier provides a canonical, unique, immutable artifact identifier for each software artifact. SPDX 3.0 describes software artifacts as Snippet, File, or Package Elements. The ContentIdentifier can be calculated for any software artifact and can be recorded for any of these SPDX 3.0 Elements using Omnibor, an attempt to standardize how software artifacts are identified independent of which programming language, version control system, build tool, package manager, or software distribution mechanism is in use. | ||
### SPDX Producers | ||
The contentIdentifier is a canonical, unique, immutable artifact identifier for each software artifact. The ContentIdentifier for any software artifact can be calculated and recorded in SPDX 3.0 Snippet, File, or Package Elements. For additional information, see [OmniBOR](https://omnibor.io): an attempt to standardize how software artifacts are identified independent of which programming language, version control system, build tool, package manager, or software distribution mechanism is in use. | ||
|
||
The contentIdentifier is defined as the [Git Object Identifier](https://git-scm.com/book/en/v2/Git-Internals-Git-Objects) (gitoid) of type `blob` of the software artifact. The use of a git-based version control system is not necessary to calculate a contentIdentifier for any software artifact. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not objecting just because I don't want SWHIDs to be forgotten:
The most important part is, given that we know there might be more than one format, modeling it as a single value (of type URI) will not work. There has to be a type defined somewhere (maybe with a vocabulary), etc.
@zvr - Since the |
This should be resolved before RC2, since it most probably will result in changes in the model. |
In an effort to resolve the different discussions here, I've broken down the different parts and submitted a number of PRs:
and then, an option of two approaches:
Note that there is no mention of "preferred", "recommended", "suggested", etc. anywhere. Policy is something different than model definition and can be decided later -- and expressed with various conformance tests, profile definitions, etc. @goneall , @jeff-schutt , @kestewart and everyone else, please review these new PRs. |
Thanks @zvr - having the specific pull requests definitely makes it easier to compare / review / decide. I'm OK with either of your two choices in the model changes. I'm also OK with the current state, but you do raise a good point about I do like PR #602 as a better starting point for the documentation for reasons noted in the review comments. We should probably decide on the 3 choices of the model -1) keep as is; 2) PR #610 rename the property or 3) PR #611 generalize @jeff-schutt @zvr Since I'm OK with all three approaches, perhaps you can connect and decide which works OK for both of you. |
that maps data of arbitrary size to a bit string (the hash) and is a one-way function, that is, | ||
a function which is practically infeasible to invert. This is commonly used for integrity checking of data. | ||
|
||
The recommended method to verify the integrity of `SoftwareArtifacts` Elements (including `Files`, `Snippets`, and `Packages`) is to use the SoftwareArtifact’s `contentIdentifier` property. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think that these sort of case specific details should be placed here.
This is the general structure to be used across SPDX.
Placing case specific variations here is likely to lead to an eventual laundry list of such specializations.
I propose that it would be more appropriate to add this note to the markdown files for the case specific structures (SoftwareArtifact, File, Package, Snippet).
@@ -13,6 +13,8 @@ of a specific Element that correlates to the data in this SPDX document. This id | |||
a recipient to determine if anything in the original Element has been changed and eliminates | |||
confusion over which version or modification of a specific Element is referenced. | |||
|
|||
The recommended method to verify the integrity of `SoftwareArtifacts` Elements (including `Files`, `Snippets`, and `Packages`) is to use the SoftwareArtifact’s `contentIdentifier` property. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think that these sort of case specific details should be placed here.
This is the general structure to be used across SPDX.
Placing case specific variations here is likely to lead to an eventual laundry list of such specializations.
I propose that it would be more appropriate to add this note to the markdown files for the case specific structures (SoftwareArtifact, File, Package, Snippet).
@@ -10,6 +10,8 @@ Provides an IntegrityMethod with which the integrity of an Element can be assert | |||
|
|||
VerifiedUsing provides an IntegrityMethod with which the integrity of an Element can be asserted. | |||
|
|||
The recommended method to verify the integrity of `SoftwareArtifacts` Elements (including `Files`, `Snippets`, and `Packages`) is to use the SoftwareArtifact’s `contentIdentifier` property. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think that these sort of case specific details should be placed here.
This is the general structure to be used across SPDX.
Placing case specific variations here is likely to lead to an eventual laundry list of such specializations.
I propose that it would be more appropriate to add this note to the markdown files for the case specific structures (SoftwareArtifact, File, Package, Snippet).
This is a bit stale with the renaming to gitoid - retargeting to post RC2 for discussion |
This is now quite stale and there are other PR's which overlap. I'm going to go ahead and close this PR. @jeff-schutt @zvr - Please feel free to open a fresh PR if you think the documentation needs to be updated. |
Improves documentation in the SPDX 3 model to direct users towards existing support for package verification.
RE: #345 (comment)
CC: @goneall