Draft proposal on aggregation API and extension API server for policy reports #50

Closed


vishal-chdhry
Member

No description provided.

Signed-off-by: Vishal Choudhary <[email protected]>
Signed-off-by: Vishal Choudhary <[email protected]>
Contributor

@chipzoller chipzoller left a comment

This is a draft but you still requested my review so here are some suggestions.

proposals/policy_reports_api_aggregation.md (outdated)

**etcd**: etcd (pronounced et-see-dee) is an open source, distributed, consistent key-value store for shared configuration, service discovery, and scheduler coordination of distributed systems or clusters of machines. It is the primary datastore of Kubernetes.

**Kine**: Kine is the component of k3s that allows it to use various RDBMS as an etcd replacement. It provides an implementation of the GRPC functions that Kubernetes relies upon.
Contributor

Please link.

proposals/policy_reports_api_aggregation.md (outdated)

# Migration (OPTIONAL)

// TODO
Contributor

This is important. I have questions:

  1. How will upgrades work especially on large clusters? Will Kyverno migrate all the data from etcd into the aggregated API service?
  2. Will this be the only implementation of the Policy Reports infrastructure or can users still elect to use the conventional method of storing Policy Reports (i.e., etcd)?

Member

> This is important. I have questions:
>
> 1. How will upgrades work especially on large clusters? Will Kyverno migrate all the data from etcd into the aggregated API service?
> 2. Will this be the only implementation of the Policy Reports infrastructure or can users still elect to use the conventional method of storing Policy Reports (i.e., etcd)?

This will most likely be a sub-project of Kyverno.

Users will need to install Kyverno without reporting enabled, and then install this project for reporting across Kyverno and other tools.

There will continue to be a built-in reporting option.

There may not be a direct upgrade path; enabling this option will more likely be a multi-step process.

JimBugwadia and others added 5 commits November 2, 2023 20:54
Co-authored-by: Chip Zoller <[email protected]>
Signed-off-by: Jim Bugwadia <[email protected]>
Co-authored-by: Chip Zoller <[email protected]>
Signed-off-by: Jim Bugwadia <[email protected]>
Co-authored-by: Chip Zoller <[email protected]>
Signed-off-by: Jim Bugwadia <[email protected]>
Co-authored-by: Chip Zoller <[email protected]>
Signed-off-by: Jim Bugwadia <[email protected]>
Co-authored-by: Chip Zoller <[email protected]>
Signed-off-by: Jim Bugwadia <[email protected]>
# Overview
[overview]: #overview

Policy reports are used by Kyverno and serveral other data sources as a source of information. But, in large Kubernetes clusters, we run up against storage limitations of etcd. This Proposal aims to solve the issue related to limitations in etcd with relations to policy reports.
Member

Suggested change:
PolicyReport resources can be produced by Kyverno and other data sources for reporting security and compliance findings. But, in large Kubernetes clusters, we run up against storage limitations of etcd. This proposal aims to solve the issue related to limitations in etcd with relations to policy reports.


The database will not act as the store of historic data, only current information. This database can also be used by policy reporter so that it does not have to copy data from etcd.

![aggregation-api-architecture](./images/policy-reports-aggregation-api-architecure.png)
Member

add an image credit for the slides, "Presented by Zach Stone at a Kyverno contributor's meeting."


# Proposal

The API server will proxy the request to an extension API server which will have its own database to store policy reports. Kine can be used as the backend behind an extension API server. Kine will expose an etcd-like interface and use a relational database to store data.
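
The proxying described in this proposal line can be pictured with a small sketch. It is only an illustration of path-based routing in an aggregation layer, not Kubernetes or Kyverno code: the `AGGREGATED_APIS` table, the `route` helper, and the service address are hypothetical names, and the `wgpolicyk8s.io`/`v1alpha2` group and version are assumed to be the ones served for policy reports.

```python
from typing import Optional

# Hypothetical registry: (API group, version) -> extension API server address.
# The group/version pair and the service address are illustrative assumptions.
AGGREGATED_APIS = {
    ("wgpolicyk8s.io", "v1alpha2"): "https://reports-server.kyverno.svc:443",
}


def route(path: str) -> Optional[str]:
    """Return the extension API server that should handle `path`.

    Returns None when no extension server is registered for the path,
    meaning the main API server would serve the request itself.
    """
    parts = path.strip("/").split("/")
    # Aggregated resources are addressed as /apis/<group>/<version>/...
    if len(parts) >= 3 and parts[0] == "apis":
        return AGGREGATED_APIS.get((parts[1], parts[2]))
    return None
```

In this sketch a request for `/apis/wgpolicyk8s.io/v1alpha2/...` is forwarded to the extension server, which would read and write its own database, while core paths such as `/api/v1/pods` stay with the main API server and etcd.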
Member

@JimBugwadia JimBugwadia Nov 3, 2023

Let's add a requirement for a pluggable storage layer.

We can explore if Kine can help with that.
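
One way to read the pluggable storage requirement is as a narrow interface that the extension API server codes against, with the concrete backend swapped in behind it. The sketch below is a minimal illustration under that assumption; `ReportStore` and `InMemoryReportStore` are hypothetical names, not part of Kyverno or Kine, and a real backend (for example a relational database reached through Kine) would replace the in-memory dictionary.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional, Tuple


class ReportStore(ABC):
    """Hypothetical storage contract for policy reports."""

    @abstractmethod
    def put(self, namespace: str, name: str, report: dict) -> None: ...

    @abstractmethod
    def get(self, namespace: str, name: str) -> Optional[dict]: ...

    @abstractmethod
    def list_reports(self, namespace: str) -> List[dict]: ...

    @abstractmethod
    def delete(self, namespace: str, name: str) -> None: ...


class InMemoryReportStore(ReportStore):
    """Toy backend: keeps only the current version of each report."""

    def __init__(self) -> None:
        self._reports: Dict[Tuple[str, str], dict] = {}

    def put(self, namespace: str, name: str, report: dict) -> None:
        # Overwrites any earlier version, so no history is retained.
        self._reports[(namespace, name)] = report

    def get(self, namespace: str, name: str) -> Optional[dict]:
        return self._reports.get((namespace, name))

    def list_reports(self, namespace: str) -> List[dict]:
        return [r for (ns, _), r in self._reports.items() if ns == namespace]

    def delete(self, namespace: str, name: str) -> None:
        # Deleting a missing report is a no-op.
        self._reports.pop((namespace, name), None)
```

Because `put` overwrites in place, the toy backend matches the stated goal of holding current information only, not historic data.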

@stone-z

stone-z commented Nov 3, 2023

Hi folks,
First -- obligatory sorry for the delay. I wanted to do this for you but I cooked it too long.
This draft summarizes my current understanding and the presentation at the contributors' meeting quite nicely -- thanks, Vishal 🙏

Some of the review comments may be addressed in my version but I didn't want to make a huge PR review diff, so I opened #51 just to submit the state of my draft. Please take whatever elements make sense.

I'll otherwise be following this PR and happy to address any questions that arise. (And maybe see some of you at kubecon?)

@vishal-chdhry
Member Author

Thanks @stone-z for the KDP. I am closing this PR, as it's better to have one PR containing the entire discussion.
