Check all rpms in sboms have a known repo id
For all components in all sboms that are rpms, require they have a
repository_id value in their purl, and require that the
repository_id value is in the big list of known repository_ids.
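
The check described above can be sketched in Python. This is only an illustration of the idea, not the Rego implementation in this commit: the function names, the result labels, and the example purls in the usage note are hypothetical, and the qualifier parsing uses the standard library rather than the `ec.purl` built-ins.

```python
from typing import Iterable, Optional
from urllib.parse import parse_qs


def repository_id(purl: str) -> Optional[str]:
    """Return the repository_id qualifier of a purl, or None if it has none."""
    # Qualifiers follow '?'; an optional '#subpath' may come after them.
    _, _, rest = purl.partition("?")
    query = rest.partition("#")[0]
    values = parse_qs(query).get("repository_id")
    return values[0] if values else None


def classify(purl: str, known_ids: Iterable[str]) -> str:
    """Label a purl the way the two failure cases in this commit distinguish them."""
    # Same cheap prefix test as the policy; anything else is not an rpm.
    if not purl.startswith("pkg:rpm"):
        return "not_rpm"
    repo_id = repository_id(purl)
    if repo_id is None:
        return "missing_repo_id"
    if repo_id not in set(known_ids):
        return "unknown_repo_id"
    return "ok"
```

For example, with `known = {"rhel-9-for-x86_64-baseos-rpms"}`, a made-up purl such as `pkg:rpm/redhat/examplepkg?arch=amd64` would be labelled `missing_repo_id`, while one ending in `&repository_id=rhel-9-for-x86_64-baseos-rpms` would be labelled `ok`.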

Includes a bit of extra work to limit the number of violations
produced, since I'm expecting there to be either none or many
hundreds.
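
As a rough Python sketch of that truncation behaviour (mirroring `_truncated_msg_list` from the Rego file in this commit, with its defaults of 10 and 4; the Python function and parameter names are made up for illustration):

```python
from typing import List, Sequence


def truncated_msg_list(
    msgs: Sequence[str], threshold: int = 10, min_remainder: int = 4
) -> List[str]:
    """Keep the first `threshold` messages and summarize the rest in one line.

    Truncation only happens when at least `min_remainder` messages would be
    hidden, so a list barely over the threshold is still shown in full.
    """
    remainder = len(msgs) - threshold
    if remainder >= min_remainder:
        summary = f"{remainder} additional similar violations not listed explicitly"
        return list(msgs[:threshold]) + [summary]
    return list(msgs)
```

With the defaults, 13 messages are returned unchanged (only 3 would be hidden), while 14 messages become 10 messages plus one summary line.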

As mentioned in a Todo in the comments, this is not yet added to the
redhat collection, but it will be added soon in an upcoming PR.

Ref: https://issues.redhat.com/browse/EC-848
simonbaird committed Sep 17, 2024
1 parent 26792e1 commit ed2aef0
Showing 5 changed files with 366 additions and 0 deletions.
33 changes: 33 additions & 0 deletions antora/docs/modules/ROOT/pages/release_policy.adoc
@@ -138,6 +138,7 @@ Rules included:
* xref:release_policy.adoc#provenance_materials__git_clone_source_matches_provenance[Provenance Materials: Git clone source matches materials provenance]
* xref:release_policy.adoc#provenance_materials__git_clone_task_found[Provenance Materials: Git clone task found]
* xref:release_policy.adoc#quay_expiration__expires_label[Quay expiration: Expires label]
* xref:release_policy.adoc#rpm_repos__rule_data_provided[RPM Repos: Known repo id list provided]
* xref:release_policy.adoc#rpm_signature__allowed[RPM Signature: Allowed RPM signature key]
* xref:release_policy.adoc#rpm_signature__result_format[RPM Signature: Result format]
* xref:release_policy.adoc#rpm_signature__rule_data_provided[RPM Signature: Rule data provided]
@@ -1009,6 +1010,38 @@ Verify an attestation created by the RHTAP Jenkins build pipeline is present.
* Code: `rhtap_jenkins.attestation_found`
* https://github.com/enterprise-contract/ec-policies/blob/{page-origin-refhash}/policy/release/rhtap_jenkins.rego#L17[Source, window="_blank"]

[#rpm_repos_package]
== link:#rpm_repos_package[RPM Repos]

This package defines rules to confirm that all RPM packages listed in SBOMs specify a known and permitted repository id.

* Package name: `rpm_repos`
* Package full path: `policy.release.rpm_repos`

[#rpm_repos__ids_known]
=== link:#rpm_repos__ids_known[All rpms have known repo ids]

Each RPM package listed in an SBOM must specify the repository id that it comes from, and that repository id must be present in the list of known and permitted repository ids.

*Solution*: Ensure every rpm comes from a known and permitted repository, and that the data in the SBOM correctly records that.

* Rule type: [rule-type-indicator failure]#FAILURE#
* FAILURE message: `RPM repo id check failed: %s`
* Code: `rpm_repos.ids_known`
* https://github.com/enterprise-contract/ec-policies/blob/{page-origin-refhash}/policy/release/rpm_repos.rego#L32[Source, window="_blank"]

[#rpm_repos__rule_data_provided]
=== link:#rpm_repos__rule_data_provided[Known repo id list provided]

A list of known and permitted repository ids should be available in the rule data.

*Solution*: Include a data source that provides a list of known repository ids under the 'known_rpm_repositories' key under the top level 'rule_data' key.

* Rule type: [rule-type-indicator failure]#FAILURE#
* FAILURE message: `Rule data '%s' has unexpected format: %s`
* Code: `rpm_repos.rule_data_provided`
* https://github.com/enterprise-contract/ec-policies/blob/{page-origin-refhash}/policy/release/rpm_repos.rego#L14[Source, window="_blank"]

[#rpm_signature_package]
== link:#rpm_signature_package[RPM Signature]

3 changes: 3 additions & 0 deletions antora/docs/modules/ROOT/partials/release_policy_nav.adoc
@@ -79,6 +79,9 @@
*** xref:release_policy.adoc#rhtap_jenkins_package[RHTAP Jenkins]
**** xref:release_policy.adoc#rhtap_jenkins__invocation_id_found[RHTAP Jenkins SLSA Invocation ID present]
**** xref:release_policy.adoc#rhtap_jenkins__attestation_found[RHTAP Jenkins SLSA Provenance Attestation Found]
*** xref:release_policy.adoc#rpm_repos_package[RPM Repos]
**** xref:release_policy.adoc#rpm_repos__ids_known[All rpms have known repo ids]
**** xref:release_policy.adoc#rpm_repos__rule_data_provided[Known repo id list provided]
*** xref:release_policy.adoc#rpm_signature_package[RPM Signature]
**** xref:release_policy.adoc#rpm_signature__allowed[Allowed RPM signature key]
**** xref:release_policy.adoc#rpm_signature__result_format[Result format]
24 changes: 24 additions & 0 deletions example/data/known_rpm_repositories.yml
@@ -0,0 +1,24 @@
---
# Copyright The Enterprise Contract Contributors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# SPDX-License-Identifier: Apache-2.0

# See also https://github.com/release-engineering/rhtap-ec-policy/blob/main/data/known_rpm_repositories.yml
rule_data:
  known_rpm_repositories:
    - "rhel-9-for-x86_64-appstream-rpms"
    - "rhel-9-for-x86_64-appstream-source-rpms"
    - "rhel-9-for-x86_64-baseos-rpms"
    - "rhel-9-for-x86_64-baseos-source-rpms"
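
The `rule_data_provided` rule in this commit validates that list against a JSON schema: an array of unique strings with at least one item. A minimal Python sketch of the same constraints follows, for checking a data file before publishing it. The function name and the error wording are hypothetical and do not match the messages produced by `json.match_schema`.

```python
from typing import Any, List


def rule_data_errors(value: Any) -> List[str]:
    """Return a list of problems with a known_rpm_repositories value."""
    if not isinstance(value, list):
        return [f"expected an array, got {type(value).__name__}"]

    errors = []
    if len(value) < 1:
        errors.append("array must have at least 1 item")

    for index, item in enumerate(value):
        if not isinstance(item, str):
            errors.append(f"item {index}: expected a string, got {type(item).__name__}")

    # Only compare the string items, so unhashable values cannot break the check.
    strings = [item for item in value if isinstance(item, str)]
    if len(set(strings)) != len(strings):
        errors.append("array items must be unique")

    return errors
```

An empty result means the value has the shape the policy expects; any other result lists what to fix.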
151 changes: 151 additions & 0 deletions policy/release/rpm_repos.rego
@@ -0,0 +1,151 @@
#
# METADATA
# title: RPM Repos
# description: >-
#   This package defines rules to confirm that all RPM packages listed
#   in SBOMs specify a known and permitted repository id.
#
package policy.release.rpm_repos

import rego.v1

import data.lib

# METADATA
# title: Known repo id list provided
# description: >-
#   A list of known and permitted repository ids should be available in the rule data.
# custom:
#   short_name: rule_data_provided
#   failure_msg: "Rule data '%s' has unexpected format: %s"
#   solution: >-
#     Include a data source that provides a list of known repository ids under the
#     'known_rpm_repositories' key under the top level 'rule_data' key.
#   collections:
#   - redhat
#
deny contains result if {
	some error in _rule_data_errors
	result := lib.result_helper(rego.metadata.chain(), [_rule_data_key, error])
}

# METADATA
# title: All rpms have known repo ids
# description: >-
#   Each RPM package listed in an SBOM must specify the repository id that it comes from,
#   and that repository id must be present in the list of known and permitted repository ids.
# custom:
#   short_name: ids_known
#   failure_msg: 'RPM repo id check failed: %s'
#   solution: >-
#     Ensure every rpm comes from a known and permitted repository, and that the data in the
#     SBOM correctly records that.
#   # Todo: Until the sbom generation is updated this will always fail, so don't include it
#   # in the redhat collection yet. See https://issues.redhat.com/browse/STONEBLD-2638
#   #collections:
#   #- redhat
#
deny contains result if {
	# Don't bother with this unless we have valid rule data
	count(_rule_data_errors) == 0

	some error in _repo_id_errors
	result := lib.result_helper(rego.metadata.chain(), [error])
}

_rule_data_errors contains msg if {
	# match_schema expects either a marshaled JSON resource (String) or an Object. It doesn't
	# handle an Array directly.
	value := json.marshal(_known_repo_ids)
	some violation in json.match_schema(
		value,
		{
			"$schema": "http://json-schema.org/draft-07/schema#",
			"type": "array",
			"items": {"type": "string"},
			"uniqueItems": true,
			"minItems": 1,
		},
	)[1]
	msg := violation.error
}

_repo_id_errors contains msg if {
	bad_purls := all_rpm_purls - _plain_purls(all_purls_with_repo_ids)
	count(bad_purls) > 0

	some bad_purl in _truncated_msg_list(bad_purls)
	msg := sprintf("An RPM component in the SBOM did not specify a repository_id value in its purl: %s", [bad_purl])
}

_repo_id_errors contains msg if {
	bad_purls := all_purls_with_repo_ids - all_purls_with_known_repo_ids
	count(bad_purls) > 0

	some bad_purl in _truncated_msg_list(_plain_purls(bad_purls))
	msg := sprintf("An RPM component in the SBOM specified an unknown or disallowed repository_id: %s", [bad_purl])
}

all_purls_with_known_repo_ids contains p if {
	some p in all_purls_with_repo_ids
	p.repo_id in _known_repo_ids
}

# Assemble the purl and repo id in an object with two keys.
# Purls that are invalid or have no repository_id qualifier are left out of this set.
all_purls_with_repo_ids contains p if {
	some purl in all_rpm_purls
	ec.purl.is_valid(purl)
	parsed_purl := ec.purl.parse(purl)

	# _purl_qualifier is undefined if the repo id is not present, in which case
	# no object is produced for this purl
	p := {
		"purl": purl,
		"repo_id": _purl_qualifier("repository_id", parsed_purl),
	}
}

all_rpm_purls contains p if {
	some sbom in _all_sboms
	some component in sbom.components
	p := component.purl

	# I'm assuming this is faster than parsing it and checking the type
	startswith(p, "pkg:rpm")
}

# In future there will be SPDX sboms also
_all_sboms := lib.sbom.cyclonedx_sboms

_known_repo_ids := lib.rule_data(_rule_data_key)

_rule_data_key := "known_rpm_repositories"

# Conveniently convert a list of objects back to a plain list of purl strings
_plain_purls(objects_with_purl_key) := {p.purl | some p in objects_with_purl_key}

# Extract any named qualifier from a parsed purl
_purl_qualifier(key, parsed_purl) := result if {
	some qualifier in parsed_purl.qualifiers
	qualifier.key == key
	result := qualifier.value
}

# SBOMs often list many hundreds of components. Let's avoid producing that
# many violations if none of the purls are passing this test. (In future we
# might move this to a shared library or to the ec-cli.)

# If there are more than this then truncate the list
_truncate_threshold := 10

# ...but not if the N in the "N more" is less than this
_min_remainder_count := 4

_truncated_msg_list(all_msgs) := truncated_msgs if {
	remainder_count := count(all_msgs) - _truncate_threshold
	remainder_count >= _min_remainder_count
	truncated_msgs := array.concat(
		array.slice(lib.to_array(all_msgs), 0, _truncate_threshold),
		[sprintf("%d additional similar violations not listed explicitly", [remainder_count])],
	)
} else := all_msgs
155 changes: 155 additions & 0 deletions policy/release/rpm_repos_test.rego
@@ -0,0 +1,155 @@
package policy.release.rpm_repos_test

import rego.v1

import data.lib
import data.policy.release.rpm_repos

test_repo_id_data_empty if {
	expected := {
		"code": "rpm_repos.rule_data_provided",
		"msg": "Rule data 'known_rpm_repositories' has unexpected format: (Root): Array must have at least 1 items",
	}

	lib.assert_equal_results({expected}, rpm_repos.deny) with data.rule_data.known_rpm_repositories as []
}

test_repo_id_data_not_an_array if {
	expected := {
		"code": "rpm_repos.rule_data_provided",
		"msg": sprintf("%s %s", [
			"Rule data 'known_rpm_repositories' has unexpected format:",
			"(Root): Invalid type. Expected: array, given: object",
		]),
	}

	lib.assert_equal_results({expected}, rpm_repos.deny) with data.rule_data.known_rpm_repositories as {"chunky": "bacon"}
}

test_repo_id_data_not_strings if {
	expected := {
		"code": "rpm_repos.rule_data_provided",
		"msg": "Rule data 'known_rpm_repositories' has unexpected format: 1: Invalid type. Expected: string, given: integer",
	}

	lib.assert_equal_results({expected}, rpm_repos.deny) with data.rule_data.known_rpm_repositories as ["spam", 42]
}

test_repo_id_all if {
	lib.assert_equal_results(
		{p1, p2, p3, p4, p5},
		rpm_repos.all_rpm_purls,
	) with rpm_repos._all_sboms as fake_sboms
}

test_repo_id_all_with_repo_id if {
	lib.assert_equal_results(
		{p1, p2, p3},
		rpm_repos._plain_purls(rpm_repos.all_purls_with_repo_ids),
	) with rpm_repos._all_sboms as fake_sboms
}

test_repo_id_all_known if {
	lib.assert_equal_results(
		{p1, p2},
		rpm_repos._plain_purls(rpm_repos.all_purls_with_known_repo_ids),
	) with rpm_repos._all_sboms as fake_sboms with data.rule_data.known_rpm_repositories as fake_repo_id_list
}

test_repo_id_purls_missing_repo_ids if {
	expected := {
		{
			"code": "rpm_repos.ids_known",
			"msg": sprintf("%s %s", [
				"RPM repo id check failed: An RPM component in the SBOM did not specify a repository_id value in its purl:",
				"pkg:rpm/redhat/[email protected]?arch=amd64&pastry_id=unknown",
			]),
		},
		{
			"code": "rpm_repos.ids_known",
			"msg": sprintf("%s %s", [
				"RPM repo id check failed: An RPM component in the SBOM did not specify a repository_id value in its purl:",
				"pkg:rpm_borken",
			]),
		},
	}

	lib.assert_equal_results(expected, rpm_repos.deny) with rpm_repos._all_sboms as [fake_sbom({p1, p2, p4, p5, p6})]
		with data.rule_data.known_rpm_repositories as fake_repo_id_list
}

test_repo_id_purls_missing_repo_ids_truncated if {
	expected := {
		{
			"code": "rpm_repos.ids_known",
			"msg": sprintf("%s %s", [
				"RPM repo id check failed: An RPM component in the SBOM did not specify a repository_id value in its purl:",
				"pkg:rpm/redhat/[email protected]?arch=amd64&pastry_id=unknown",
			]),
		},
		{
			"code": "rpm_repos.ids_known",
			"msg": sprintf("%s %s", [
				"RPM repo id check failed: An RPM component in the SBOM did not specify a repository_id value in its purl:",
				"1 additional similar violations not listed explicitly",
			]),
		},
	}

	lib.assert_equal_results(expected, rpm_repos.deny) with rpm_repos._all_sboms as [fake_sbom({p1, p2, p4, p5, p6})]
		with data.rule_data.known_rpm_repositories as fake_repo_id_list
		with rpm_repos._truncate_threshold as 1 with rpm_repos._min_remainder_count as 0
}

test_repo_id_purls_unknown_repo_ids if {
	expected := {
		"code": "rpm_repos.ids_known",
		"msg": sprintf("%s %s", [
			"RPM repo id check failed: An RPM component in the SBOM specified an unknown or disallowed repository_id:",
			"pkg:rpm/redhat/[email protected]?arch=amd64&repository_id=rhel-23-unrecognized-2-rpms",
		]),
	}

	lib.assert_equal_results({expected}, rpm_repos.deny) with rpm_repos._all_sboms as [fake_sbom({p1, p2, p3, p6})]
		with data.rule_data.known_rpm_repositories as fake_repo_id_list
}

test_clamp_violation_strings if {
	lib.assert_equal(
		["a", "b", "c", "2 additional similar violations not listed explicitly"],
		rpm_repos._truncated_msg_list(["a", "b", "c", "d", "e"]),
	) with rpm_repos._truncate_threshold as 3 with rpm_repos._min_remainder_count as 0

	lib.assert_equal(
		["a", "b", "c", "d", "e"],
		rpm_repos._truncated_msg_list(["a", "b", "c", "d", "e"]),
	) with rpm_repos._truncate_threshold as 5

	lib.assert_equal(
		["a", "b", "3 additional similar violations not listed explicitly"],
		rpm_repos._truncated_msg_list(["a", "b", "c", "d", "e"]),
	) with rpm_repos._truncate_threshold as 2 with rpm_repos._min_remainder_count as 3
}

test_all_sboms if {
	# (Needed for 100% coverage)
	lib.assert_equal("spam-1000", rpm_repos._all_sboms) with lib.sbom.cyclonedx_sboms as "spam-1000"
}

fake_sboms := [fake_sbom({p1, p2, p3, p4, p5, p6})]

fake_sbom(fake_purls) := {"components": [{"purl": p} | some p in fake_purls]}

fake_repo_id_list := ["rhel-23-for-spam-9-rpms", "rhel-42-for-bacon-12-rpms"]

p1 := "pkg:rpm/redhat/[email protected]?arch=amd64&repository_id=rhel-23-for-spam-9-rpms"

p2 := "pkg:rpm/redhat/[email protected]?arch=amd64&repository_id=rhel-42-for-bacon-12-rpms"

p3 := "pkg:rpm/redhat/[email protected]?arch=amd64&repository_id=rhel-23-unrecognized-2-rpms"

p4 := "pkg:rpm/redhat/[email protected]?arch=amd64&pastry_id=unknown"

p5 := "pkg:rpm_borken"

p6 := "pkg:golang/gitplanet.com/[email protected]?arch=amd64"
