Refine CVE check in check script for k8s version policy #779

Open
wants to merge 22 commits into base: main
Changes from 5 commits

Commits (22)
d96edd7
adding functions for getting more information & debug
piobig2871 Oct 10, 2024
0caf99d
Merge branch 'main' into 526-refine-cve-check-in-scs-0210-v2-test-script
piobig2871 Oct 11, 2024
ebfe951
solving conflicts
piobig2871 Oct 16, 2024
e635411
fixing git inset
piobig2871 Oct 16, 2024
a298678
Merge branch 'main' into 526-refine-cve-check-in-scs-0210-v2-test-script
piobig2871 Oct 16, 2024
cfc3dc6
Merge branch 'main' into 526-refine-cve-check-in-scs-0210-v2-test-script
piobig2871 Oct 16, 2024
54ee694
feat: Add Kubernetes pod image scanning and improve error handling
piobig2871 Oct 18, 2024
cc87097
Merge branch 'main' into 526-refine-cve-check-in-scs-0210-v2-test-script
piobig2871 Oct 21, 2024
38921f1
Merge branch 'main' into 526-refine-cve-check-in-scs-0210-v2-test-script
piobig2871 Oct 22, 2024
12987e5
removing comments
piobig2871 Oct 23, 2024
1d332ae
reseting standard to it's original form
piobig2871 Nov 4, 2024
11daeac
reverting ClusterInfo to its original shape, removing kubeconfig fiel…
piobig2871 Nov 4, 2024
e62b347
removing unused kubeconfig variable
piobig2871 Nov 4, 2024
5ab2bd0
fixing pylint and docstring formatting
piobig2871 Nov 4, 2024
5e2b764
Update Tests/kaas/k8s-version-policy/k8s_version_policy.py
piobig2871 Nov 4, 2024
3348960
fixing pylint and resolving conflict which appeard after review
piobig2871 Nov 4, 2024
4a59878
fixing the script with providing proper images list to check out
piobig2871 Nov 7, 2024
cea9840
femoving unused lines
piobig2871 Nov 7, 2024
c676f5f
Merge branch 'main' into 526-refine-cve-check-in-scs-0210-v2-test-script
piobig2871 Nov 13, 2024
f1674fe
Update Tests/kaas/k8s-version-policy/k8s_version_policy.py
piobig2871 Nov 21, 2024
f51a46c
Merge branch 'main' into 526-refine-cve-check-in-scs-0210-v2-test-script
piobig2871 Nov 21, 2024
48ba4d7
Merge branch 'main' into 526-refine-cve-check-in-scs-0210-v2-test-script
piobig2871 Nov 28, 2024
33 changes: 25 additions & 8 deletions Standards/scs-0210-v2-k8s-version-policy.md
@@ -55,14 +55,31 @@ window period.
In order to keep up-to-date with the latest Kubernetes features, bug fixes and security improvements,
the provided Kubernetes versions should be kept up-to-date with new upstream releases:

-- The latest minor version MUST be provided no later than 4 months after release.
-- The latest patch version MUST be provided no later than 2 weeks after release.
-- This time period MUST be even shorter for patches that fix critical CVEs.
-In this context, a critical CVE is a CVE with a CVSS base score >= 8 according
-to the CVSS version used in the original CVE record (e.g., CVSSv3.1).
-It is RECOMMENDED to provide a new patch version in a 2-day time period after their release.
-- New versions MUST be tested before being rolled out on productive infrastructure;
-at least the [CNCF E2E tests][cncf-conformance] should be passed beforehand.
+1. Minor Versions:
+- The latest minor version MUST be provided no later than 4 months after release.
+
+2. Patch Versions:
+- The latest patch version MUST be provided no later than 1 week after release.
+- This time period MUST be even shorter for patches that fix critical CVEs.
+In this context, a critical CVE is a CVE with a CVSS base score >= 8 according
+to the CVSS version used in the original CVE record (e.g., CVSSv3.1).
+It is RECOMMENDED to provide a new patch version in a 2-day time period after their release.
+- New versions MUST be tested before being rolled out on productive infrastructure;
+at least the [CNCF E2E tests][cncf-conformance] should be passed beforehand.
+
+3. CI Integration
+* Trivy
+- Providers should integrate Trivy into their CI pipeline to automatically scan Kubernetes cluster components,
+including kubelet, apiserver, and others.
+- The CI job MUST fail if critical vulnerabilities (CVSS >= 8) are detected in the cluster components.
+- JSON reports from Trivy scans should be reviewed, and Trivy's experimental status should be monitored for changes
+in output formats.
+* nvdlib (Fallback):
+- If Trivy fails or cannot meet requirements, nvdlib MUST be used as a fallback to query CVE data for Kubernetes
+versions, leveraging CPE-based searches to track vulnerabilities for specific versions.
+- Providers using nvdlib MUST periodically query for critical vulnerabilities affecting the Kubernetes version in production.
+
+4. TBD
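
Editorial note, not part of the PR: the patch cadence rules in the hunk above can be sketched as a small deadline calculation. This is a minimal illustration only. The constant and function names are hypothetical, the one-week window is the value proposed in this diff (the original text says two weeks), and the two-day window is only RECOMMENDED by the text, not required.

```python
from datetime import date, timedelta
from typing import Optional

# Assumed values, read off the text in the diff above; not taken from the test script.
PATCH_CADENCE = timedelta(weeks=1)         # proposed here; the original text says 2 weeks
CRITICAL_CVE_CADENCE = timedelta(days=2)   # RECOMMENDED window for critical CVE fixes
CRITICAL_CVSS = 8.0                        # "CVSS base score >= 8"


def patch_deadline(release_date: date, max_cvss_fixed: Optional[float] = None) -> date:
    """Return the last day on which a patch release may still be rolled out.

    `max_cvss_fixed` is the highest CVSS base score among the CVEs the patch
    fixes, or None if the patch fixes no scored CVE.
    """
    if max_cvss_fixed is not None and max_cvss_fixed >= CRITICAL_CVSS:
        return release_date + CRITICAL_CVE_CADENCE
    return release_date + PATCH_CADENCE


def is_overdue(release_date: date, today: date,
               max_cvss_fixed: Optional[float] = None) -> bool:
    """True if the provider should already have rolled out the patch."""
    return today > patch_deadline(release_date, max_cvss_fixed)


if __name__ == "__main__":
    released = date(2024, 10, 1)
    print(patch_deadline(released))        # ordinary patch release
    print(patch_deadline(released, 9.8))   # patch fixing a critical CVE
```

Keeping the windows as module-level constants mirrors how the test script appears to name its cadences (`PATCH_VERSION_CADENCE`, `CVE_VERSION_CADENCE`), so the values can change without touching the logic.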
Contributor

This standard has been stabilized already, for better or worse. New requirements can only be introduced in a new major version (then v3). However, I'm not sure that this was the original objective of this PR; here, we mainly wanted some tooling for the compliance check, and the providers are free to use whatever tools they want. (We can put these items into the implementation notes though, but only as non-authoritative recommendation!)

Author @piobig2871, Oct 16, 2024

Does that mean I should restore the original version of the standard?

Contributor

It would be good to make your research results available. We should just reframe them as guidelines for operators. We could write a blog post. I would then ask you to get feedback from Team Container. It would be good to talk to people who already use Trivy.

Author

What I have done right now is restore the original standard text and drop the changes.

As for the code, several changes were made:

  • Integrated Trivy to scan Kubernetes pod images for security vulnerabilities.
  • Fixed an issue where a ClusterInfo object was passed where a kubeconfig path was expected.
  • Improved logging to give clearer insight during version compliance checks.
  • Restructured the code to handle K8s image scanning and cluster version checks asynchronously.
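
Editorial note, not part of the PR: a minimal sketch of how a Trivy JSON report (for example from `trivy image --format json`) could be filtered for findings at or above the CVSS threshold. It assumes the report layout `Results[].Vulnerabilities[].CVSS.<source>.V3Score`; the function name, the sample report, and the CVE IDs are made up for illustration, and the layout should be checked against the Trivy version in use.

```python
import json
from typing import List

CRITICAL_CVSS = 8.0  # threshold from the standard: CVSS base score >= 8


def critical_findings(report: dict, threshold: float = CRITICAL_CVSS) -> List[str]:
    """Return the sorted IDs of vulnerabilities whose highest CVSS v3 score
    (across all data sources in the report) reaches the threshold."""
    found = set()
    for result in report.get("Results") or []:
        # "Vulnerabilities" can be missing or null when nothing was found.
        for vuln in result.get("Vulnerabilities") or []:
            scores = [src.get("V3Score", 0.0)
                      for src in (vuln.get("CVSS") or {}).values()]
            if scores and max(scores) >= threshold:
                found.add(vuln["VulnerabilityID"])
    return sorted(found)


# Hypothetical, heavily trimmed report; the IDs are placeholders, not real CVEs.
sample = json.loads("""
{
  "Results": [
    {
      "Target": "registry.example/kube-apiserver:v0.0.0",
      "Vulnerabilities": [
        {"VulnerabilityID": "CVE-0000-0001", "CVSS": {"nvd": {"V3Score": 9.8}}},
        {"VulnerabilityID": "CVE-0000-0002", "CVSS": {"nvd": {"V3Score": 5.3}}}
      ]
    },
    {"Target": "registry.example/pause:0.0"}
  ]
}
""")

if __name__ == "__main__":
    print(critical_findings(sample))
```

A compliance check could then report a failure whenever the returned list is non-empty.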

Contributor

> What I have done right now is restore the original standard text and drop the changes

This is not what I see.

Contributor

> I would then ask you to get feedback from Team Container

Have you done that?

Author

> What I have done right now is restore the original standard text and drop the changes

@mbuechse I apologize; I have reverted it now. The revert got lost somewhere in my local Git in the mess with the branches.

Author @piobig2871, Nov 4, 2024

> I would then ask you to get feedback from Team Container
>
> Have you done that?

I have not. I will bring this up at the next container call (there was no container call last week).

Contributor

Sounds good!


At the same time, providers must support Kubernetes versions at least as long as the
official sources as described in [Kubernetes Support Period][k8s-support-period]:
93 changes: 79 additions & 14 deletions Tests/kaas/k8s-version-policy/k8s_version_policy.py
@@ -24,6 +24,7 @@
 (c) Hannes Baum <[email protected]>, 6/2023
 (c) Martin Morgenstern <[email protected]>, 2/2024
 (c) Matthias Büchse <[email protected]>, 3/2024
+(c) Piotr Bigos <[email protected]>, 10/2024
 SPDX-License-Identifier: CC-BY-SA-4.0
 """

@@ -188,7 +189,6 @@ class K8sBranch:

     def previous(self):
         if self.minor == 0:
-            # FIXME: this is ugly
             return self
         return K8sBranch(self.major, self.minor - 1)

@@ -257,19 +257,26 @@ class VersionRange:
upper_version: K8sVersion = None
inclusive: bool = False

def __post_init__(self):
if self.lower_version is None:
raise ValueError("lower_version must not be None")
if self.upper_version and self.upper_version < self.lower_version:
raise ValueError("lower_version must be lower than upper_version")

def __contains__(self, version: K8sVersion) -> bool:
if self.upper_version is None:
return self.lower_version == version
if self.inclusive:
return self.lower_version <= version <= self.upper_version
return self.lower_version <= version < self.upper_version

# def __post_init__(self):
# if self.lower_version is None:
# raise ValueError("lower_version must not be None")
# if self.upper_version and self.upper_version < self.lower_version:
# raise ValueError("lower_version must be lower than upper_version")
#
# def __contains__(self, version: K8sVersion) -> bool:
# if self.upper_version is None:
# return self.lower_version == version
# if self.inclusive:
# return self.lower_version <= version <= self.upper_version
# return self.lower_version <= version < self.upper_version


@dataclass
class ClusterInfo:
@@ -337,19 +344,32 @@ async def collect_cve_versions(session: aiohttp.ClientSession) -> set:

# CVE fix versions
cfvs = set()
cve_patch_data = dict()

# # Request latest version
# async with session.get(
# "https://kubernetes.io/docs/reference/issues-security/official-cve-feed/index.json",
# headers={"Accept": "application/json"}
# ) as resp:
# cve_list = await resp.json()

# Request latest version
async with session.get(
"https://kubernetes.io/docs/reference/issues-security/official-cve-feed/index.json",
headers={"Accept": "application/json"}
"https://kubernetes.io/docs/reference/issues-security/official-cve-feed/index.json",
headers={"Accept": "application/json"}
) as resp:
if resp.status != 200:
logger.error(f"Failed to fetch CVE data, status code: {resp.status}")
return cve_patch_data

cve_list = await resp.json()

tasks = [request_cve_data(session=session, cveid=cve['id'])
for cve in cve_list['items']]
for cve in cve_list.get('items', [])]

cve_data_list = await asyncio.gather(*tasks, return_exceptions=True)

cve_data_list = [data for data in cve_data_list if not isinstance(data, Exception)]

for cve_data in cve_data_list:
try:
cve_cna = cve_data['containers']['cna']
@@ -381,6 +401,44 @@ async def collect_cve_versions(session: aiohttp.ClientSession) -> set:
     return cfvs


+def is_critical_cve(cve_metrics: list) -> bool:
+    """Checks if the CVE is considered critical based on CVSS score."""
+    for metric in cve_metrics:
+        if metric.get('cvssV3', {}).get('baseScore', 0) >= CVE_SEVERITY:
+            return True
+    return False
+
+
+def parse_cve_version_information_new(version_info: dict) -> str:
+    """Extracts the affected Kubernetes version from CVE data."""
+    return version_info.get('version')
+
+
+def parse_patch_release_date(version_info: dict) -> datetime:
+    """Extracts the release date of the patch from the CVE version info."""
+    patch_release_str = version_info.get('patchReleaseDate', None)
+    if patch_release_str:
+        return datetime.strptime(patch_release_str, "%Y-%m-%d")
+    return None
+
+
+async def check_patch_deployment(session: aiohttp.ClientSession, current_version: str, deployed_date: datetime) -> None:
+    """Check if the latest patch targeting a critical CVE was deployed within the allowed time."""
+    cve_patch_data = await collect_cve_versions(session)
+
+    for cve_id, version_data in cve_patch_data.items():
+        for version, patch_release_date in version_data:
+            if version == current_version:
+                if patch_release_date:
+                    allowed_timeframe = patch_release_date + CVE_VERSION_CADENCE
+                    if deployed_date > allowed_timeframe:
+                        logger.error(f"Patch for {cve_id} affecting version {version} was not deployed in time!")
+                    else:
+                        logger.info(f"Patch for {cve_id} affecting version {version} deployed in time.")
+                else:
+                    logger.warning(f"Patch release date for {cve_id} affecting version {version} is missing.")
+
+
 async def get_k8s_cluster_info(kubeconfig, context=None) -> ClusterInfo:
     """Get the k8s version of the cluster under test."""
     cluster_config = await kubernetes_asyncio.config.load_kube_config(kubeconfig, context)
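
Editorial note, not part of the PR: `is_critical_cve` in the hunk above looks up the key `cvssV3`. As far as I can tell, the CVE JSON 5 record format names its metric entries by CVSS version (`cvssV3_1`, `cvssV3_0`, `cvssV4_0`, `cvssV2_0`), so that lookup may never match. A minimal version-agnostic sketch follows. The function names are hypothetical, and `CVE_SEVERITY = 8` is assumed to match the constant in the script.

```python
CVE_SEVERITY = 8  # assumed to mirror the module-level constant in the test script


def max_base_score(cve_metrics: list) -> float:
    """Return the highest CVSS base score found in a list of metric entries.

    Each entry is expected to be a dict such as
    {"cvssV3_1": {"baseScore": 8.8, ...}}; keys that do not start with
    "cvssV" (e.g. "other", "format", "scenarios") are ignored.
    """
    best = 0.0
    for metric in cve_metrics:
        for key, value in metric.items():
            if key.startswith("cvssV") and isinstance(value, dict):
                best = max(best, float(value.get("baseScore", 0)))
    return best


def is_critical(cve_metrics: list) -> bool:
    """True if any CVSS entry reaches the severity threshold."""
    return max_base_score(cve_metrics) >= CVE_SEVERITY


if __name__ == "__main__":
    print(is_critical([{"cvssV3_1": {"baseScore": 8.8}}]))
    print(is_critical([{"cvssV3_0": {"baseScore": 6.5}}, {"other": {"type": "x"}}]))
```

Matching on the `cvssV` prefix follows the standard's wording that the score is taken "according to the CVSS version used in the original CVE record".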
@@ -416,9 +474,16 @@ def check_k8s_version_recency(
         if my_version.patch >= release.version.patch:
             continue
         # at this point `release` has the same major.minor, but higher patch than `my_version`
-        if release.age > PATCH_VERSION_CADENCE:
-            # whoops, the cluster should have been updated to this (or a higher version) already!
-            return False
+        if my_version == release.version:
+            if release.age <= MINOR_VERSION_CADENCE:
+                logger.info(f"Version {my_version} is recent (within cadence).")
+                return True
+            else:
+                logger.error(f"Version {my_version} is too old.")
+                return False
+        # if release.age > PATCH_VERSION_CADENCE:
+        #     # whoops, the cluster should have been updated to this (or a higher version) already!
+        #     return False
         ranges = [_range for _range in cve_affected_ranges if my_version in _range]
         if ranges and release.age > CVE_VERSION_CADENCE:
             # -- two FIXMEs: