1. Add CRDs for MetadataProfile #1434

Merged
merged 8 commits into from
Jan 24, 2025
81 changes: 81 additions & 0 deletions design/MetadataProfile.md
@@ -0,0 +1,81 @@
# Metadata Profile

The metadata profile contains a list of queries used to retrieve datasource metadata, such as the list of namespaces, workloads
and containers. Users can create metadata profiles based on their cluster or datasource provider, such as Prometheus or
Thanos. These profiles can be tagged to the import metadata API, which then fetches metadata according to the metadata
profile. The fetched metadata is in turn used to create experiments and generate recommendations.

Contributor (review comment):

Also, describe the user flow for modifying queries and utilizing API requests and responses, or provide a reference link if this information is documented elsewhere.

Contributor Author (reply):

Thanks for pointing this out. The MetadataProfile.md readme aims to explain what the profile is about and which fields and metadata queries are supported. Once the REST APIs are merged, I'll add a new readme, MetadataProfileAPI.md, which will cover the user flow for modifying queries as well as API requests and responses.

This document describes the fields of a Metadata Profile and the different sets of queries supported by Kruize.
Documentation is still in progress; stay tuned.

## Attributes

- **apiVersion** \
  A string representing the version of the Kubernetes API used to create the metadata profile.
- **kind** \
  A string representing the type of Kubernetes object.
- **metadata** \
  A JSON object containing data that helps to uniquely identify the metadata profile, including a name string.
    - **name** \
      A unique string name for identifying each metadata profile.
- **profile_version** \
  A double value specifying the current version of the profile.
- **k8s_type** \
  A string representing the Kubernetes type: minikube or openshift.
- **datasource** \
  A string representing the datasource to import metadata from.
- **query_variables** \
  Defines the query variables to be used.
    - **name** \
      Name of the variable.
    - **datasource** \
      Datasource of the query.
    - **value_type** \
      Can be double or integer.
    - **query** \
      One of _query_ or _aggregation_functions_ is mandatory. Both can be present.
    - **kubernetes_object** \
      Kubernetes object that this query is tied to: "_deployment_", "_pod_", "_namespace_" or "_container_".
    - **aggregation_functions** \
      Aggregate functions associated with this variable.
        - **function** \
          Can be '_avg_', '_sum_', '_min_' or '_max_'.
        - **query** \
          The corresponding query.
        - **version** \
          Any specific version that this query is tied to.
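
The rules in the attribute list can be checked mechanically before a profile is submitted. The following is a minimal sketch, not Kruize's own validation: the function name `validate_query_variable` and the error messages are hypothetical, and the required fields (`name`, `datasource`, `value_type`) are taken from the CRD schema in this PR.

```python
ALLOWED_FUNCTIONS = {"avg", "sum", "min", "max"}
ALLOWED_VALUE_TYPES = {"double", "integer"}


def validate_query_variable(var: dict) -> list[str]:
    """Return a list of problems found in one query_variables entry."""
    problems = []

    # name, datasource and value_type are listed as required in the CRD.
    for field in ("name", "datasource", "value_type"):
        if not var.get(field):
            problems.append(f"missing required field: {field}")

    value_type = var.get("value_type")
    if value_type is not None and value_type not in ALLOWED_VALUE_TYPES:
        problems.append(f"unsupported value_type: {value_type}")

    # One of query or aggregation_functions is mandatory; both may be present.
    functions = var.get("aggregation_functions") or []
    if not var.get("query") and not functions:
        problems.append("one of query or aggregation_functions is required")

    for entry in functions:
        if entry.get("function") not in ALLOWED_FUNCTIONS:
            problems.append(f"unsupported function: {entry.get('function')}")
        if not entry.get("query"):
            problems.append("aggregation function has no query")

    return problems
```

An empty list means the entry satisfies the documented constraints; it says nothing about whether the PromQL itself is valid.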

### Different set of metadata queries

#### Queries to import metadata across the cluster

This set of queries fetches the list of all namespaces, workloads and containers present across the cluster.

| Name | Query |
|-------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| namespacesAcrossCluster | sum by (namespace) (avg_over_time(kube_namespace_status_phase{namespace!=""}[$MEASUREMENT_DURATION_IN_MIN$m])) |
| workloadsAcrossCluster | sum by (namespace, workload, workload_type) (avg_over_time(namespace_workload_pod:kube_pod_owner:relabel{workload!=""}[$MEASUREMENT_DURATION_IN_MIN$m])) |
| containersAcrossCluster | sum by (container, image, workload, workload_type, namespace) (avg_over_time(kube_pod_container_info{container!=""}[$MEASUREMENT_DURATION_IN_MIN$m])<br/> * on (pod, namespace) group_left(workload, workload_type) avg_over_time(namespace_workload_pod:kube_pod_owner:relabel{workload!=""}[$MEASUREMENT_DURATION_IN_MIN$m])) |
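
Each query above carries a `$MEASUREMENT_DURATION_IN_MIN$` placeholder directly followed by the PromQL minute unit `m`. How Kruize fills it in is not described in this document; the sketch below assumes plain string replacement, and the helper name `with_duration` is hypothetical.

```python
PLACEHOLDER = "$MEASUREMENT_DURATION_IN_MIN$"


def with_duration(query: str, minutes: int) -> str:
    """Replace the duration placeholder with a number of minutes."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return query.replace(PLACEHOLDER, str(minutes))


query = (
    "sum by (namespace) (avg_over_time("
    'kube_namespace_status_phase{namespace!=""}'
    "[$MEASUREMENT_DURATION_IN_MIN$m]))"
)

# The range selector becomes [15m] once the placeholder is filled in.
print(with_duration(query, 15))
```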


<br>

#### Queries to import metadata for specific org_id and cluster_id

This set of queries fetches the list of namespaces, workloads and containers for a specific `org_id` and `cluster_id`.

| Name | Query |
|------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| namespacesForOrgAndClusterId | sum by (namespace) (avg_over_time(kube_namespace_status_phase{namespace!="", org_id="$ORG_ID$", cluster_id="$CLUSTER_ID$"}[$MEASUREMENT_DURATION_IN_MIN$m])) |
| workloadsForOrgAndClusterId | sum by (namespace, workload, workload_type) (avg_over_time(namespace_workload_pod:kube_pod_owner:relabel{workload!="", org_id="$ORG_ID$", cluster_id="$CLUSTER_ID$"}[$MEASUREMENT_DURATION_IN_MIN$m])) |
| containersForOrgAndClusterId | sum by (container, image, workload, workload_type, namespace) (avg_over_time(kube_pod_container_info{container!="", org_id="$ORG_ID$", cluster_id="$CLUSTER_ID$"}[$MEASUREMENT_DURATION_IN_MIN$m]) <br/> * on (pod, namespace) group_left(workload, workload_type) avg_over_time(namespace_workload_pod:kube_pod_owner:relabel{workload!="", org_id="$ORG_ID$", cluster_id="$CLUSTER_ID$"}[$MEASUREMENT_DURATION_IN_MIN$m])) |
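
These queries use three `$NAME$` placeholders at once. As an illustration only, the sketch below fills them with a regular expression and refuses to render a query when a value is missing. The `render` function and the sample values `org-1` and `cluster-1` are assumptions, not part of Kruize.

```python
import re

# Matches placeholders of the form $UPPER_CASE_NAME$.
PLACEHOLDER_PATTERN = re.compile(r"\$([A-Z_]+)\$")


def render(query: str, values: dict[str, str]) -> str:
    """Substitute every $NAME$ placeholder, failing on unknown names."""
    names = PLACEHOLDER_PATTERN.findall(query)
    missing = sorted({name for name in names if name not in values})
    if missing:
        raise KeyError(f"no value for placeholders: {', '.join(missing)}")
    return PLACEHOLDER_PATTERN.sub(lambda match: values[match.group(1)], query)


query = (
    "sum by (namespace) (avg_over_time(kube_namespace_status_phase"
    '{namespace!="", org_id="$ORG_ID$", cluster_id="$CLUSTER_ID$"}'
    "[$MEASUREMENT_DURATION_IN_MIN$m]))"
)

rendered = render(
    query,
    {
        "ORG_ID": "org-1",  # hypothetical value
        "CLUSTER_ID": "cluster-1",  # hypothetical value
        "MEASUREMENT_DURATION_IN_MIN": "15",
    },
)
print(rendered)
```

Failing early on a missing value avoids sending a query with a literal `$ORG_ID$` matcher, which would silently return no series.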

<br>

#### Queries to import metadata for ADDITIONAL_LABEL

This set of queries fetches the list of namespaces, workloads and containers for a specific `ADDITIONAL_LABEL`. It is currently used by the bulk and Thanos demos.

| Name | Query |
|------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| namespacesForAdditionalLabel | sum by (namespace) (avg_over_time(kube_namespace_status_phase{namespace!="" ADDITIONAL_LABEL}[$MEASUREMENT_DURATION_IN_MIN$m])) |
| workloadsForAdditionalLabel  | sum by (namespace, workload, workload_type) (avg_over_time(namespace_workload_pod:kube_pod_owner:relabel{workload!="" ADDITIONAL_LABEL}[$MEASUREMENT_DURATION_IN_MIN$m])) |
| containersForAdditionalLabel | sum by (container, image, workload, workload_type, namespace) (avg_over_time(kube_pod_container_info{container!="" ADDITIONAL_LABEL}[$MEASUREMENT_DURATION_IN_MIN$m]) <br/> * on (pod, namespace) group_left(workload, workload_type) avg_over_time(namespace_workload_pod:kube_pod_owner:relabel{workload!="" ADDITIONAL_LABEL}[$MEASUREMENT_DURATION_IN_MIN$m])) |
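
In these queries `ADDITIONAL_LABEL` sits inside the label matcher with no comma before it, so whatever replaces it has to supply its own separator or be empty. The sketch below shows one way this could work. It is an assumption about the substitution, not Kruize's documented behaviour, and the label `job="demo"` is purely illustrative.

```python
def additional_label(labels: dict[str, str]) -> str:
    """Build a matcher fragment such as ', job="demo"' (empty for no labels)."""
    return "".join(f', {key}="{value}"' for key, value in labels.items())


def apply_additional_label(query: str, labels: dict[str, str]) -> str:
    """Replace the ADDITIONAL_LABEL token, including the space before it."""
    return query.replace(" ADDITIONAL_LABEL", additional_label(labels))


query = (
    "sum by (namespace) (avg_over_time(kube_namespace_status_phase"
    '{namespace!="" ADDITIONAL_LABEL}[$MEASUREMENT_DURATION_IN_MIN$m]))'
)

# With a label, the matcher gains a second condition.
print(apply_additional_label(query, {"job": "demo"}))

# With no labels, the token disappears and the matcher stays valid.
print(apply_additional_label(query, {}))
```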
@@ -0,0 +1,48 @@
{
  "apiVersion": "recommender.com/v1",
  "kind": "KruizeMetadataProfile",
  "metadata": {
    "name": "cluster-metadata-local-monitoring"
  },
  "profile_version": 1,
  "k8s_type": "openshift",
  "datasource": "prometheus",
  "query_variables": [
    {
      "name": "namespacesForAdditionalLabel",
      "datasource": "prometheus",
      "value_type": "double",
      "kubernetes_object": "container",
      "aggregation_functions": [
        {
          "function": "sum",
          "query": "sum by (namespace) (avg_over_time(kube_namespace_status_phase{namespace!=\"\" ADDITIONAL_LABEL}[$MEASUREMENT_DURATION_IN_MIN$m]))"
        }
      ]
    },
    {
      "name": "workloadsForAdditionalLabel",
      "datasource": "prometheus",
      "value_type": "double",
      "kubernetes_object": "container",
      "aggregation_functions": [
        {
          "function": "sum",
          "query": "sum by (namespace, workload, workload_type) (avg_over_time(namespace_workload_pod:kube_pod_owner:relabel{workload!=\"\" ADDITIONAL_LABEL}[$MEASUREMENT_DURATION_IN_MIN$m]))"
        }
      ]
    },
    {
      "name": "containersForAdditionalLabel",
      "datasource": "prometheus",
      "value_type": "double",
      "kubernetes_object": "container",
      "aggregation_functions": [
        {
          "function": "sum",
          "query": "sum by (container, image, workload, workload_type, namespace) (avg_over_time(kube_pod_container_info{container!=\"\" ADDITIONAL_LABEL}[$MEASUREMENT_DURATION_IN_MIN$m]) * on (pod, namespace) group_left(workload, workload_type) avg_over_time(namespace_workload_pod:kube_pod_owner:relabel{workload!=\"\" ADDITIONAL_LABEL}[$MEASUREMENT_DURATION_IN_MIN$m]))"
        }
      ]
    }
  ]
}
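
A consumer of a profile like the one above typically needs the queries indexed by variable name. As a rough sketch using only the standard library, the helper below (hypothetical name `queries_by_name`) builds that index from a trimmed copy of the profile. It is not taken from the Kruize code base.

```python
import json

# Trimmed copy of the profile; only the fields used below are kept.
PROFILE_JSON = r'''
{
  "metadata": {"name": "cluster-metadata-local-monitoring"},
  "query_variables": [
    {
      "name": "namespacesForAdditionalLabel",
      "aggregation_functions": [
        {
          "function": "sum",
          "query": "sum by (namespace) (avg_over_time(kube_namespace_status_phase{namespace!=\"\" ADDITIONAL_LABEL}[$MEASUREMENT_DURATION_IN_MIN$m]))"
        }
      ]
    }
  ]
}
'''


def queries_by_name(profile: dict) -> dict[str, dict[str, str]]:
    """Map each query variable name to its {function: query} pairs."""
    return {
        var["name"]: {
            entry["function"]: entry["query"]
            for entry in var.get("aggregation_functions", [])
        }
        for var in profile["query_variables"]
    }


profile = json.loads(PROFILE_JSON)
index = queries_by_name(profile)
print(index["namespacesForAdditionalLabel"]["sum"])
```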
@@ -0,0 +1,32 @@
apiVersion: "recommender.com/v1"
kind: "KruizeMetadataProfile"
metadata:
  name: "cluster-metadata-local-monitoring"
profile_version: 1.0
k8s_type: openshift
datasource: prometheus
query_variables:

  - name: namespacesForAdditionalLabel
    datasource: prometheus
    value_type: "double"
    kubernetes_object: "namespace"
    aggregation_functions:
      - function: sum
        query: 'sum by (namespace) (avg_over_time(kube_namespace_status_phase{namespace!="" ADDITIONAL_LABEL}[$MEASUREMENT_DURATION_IN_MIN$m]))'

  - name: workloadsForAdditionalLabel
    datasource: prometheus
    value_type: "double"
    kubernetes_object: "container"
    aggregation_functions:
      - function: sum
        query: 'sum by (namespace, workload, workload_type) (avg_over_time(namespace_workload_pod:kube_pod_owner:relabel{workload!="" ADDITIONAL_LABEL}[$MEASUREMENT_DURATION_IN_MIN$m]))'

  - name: containersForAdditionalLabel
    datasource: prometheus
    value_type: "double"
    kubernetes_object: "container"
    aggregation_functions:
      - function: sum
        query: 'sum by (container, image, workload, workload_type, namespace) (avg_over_time(kube_pod_container_info{container!="" ADDITIONAL_LABEL}[$MEASUREMENT_DURATION_IN_MIN$m]) * on (pod, namespace) group_left(workload, workload_type) avg_over_time(namespace_workload_pod:kube_pod_owner:relabel{workload!="" ADDITIONAL_LABEL}[$MEASUREMENT_DURATION_IN_MIN$m]))'
@@ -0,0 +1,48 @@
{
  "apiVersion": "recommender.com/v1",
  "kind": "KruizeMetadataProfile",
  "metadata": {
    "name": "cluster-metadata-local-monitoring"
  },
  "profile_version": 1,
  "k8s_type": "openshift",
  "datasource": "prometheus",
  "query_variables": [
    {
      "name": "namespacesAcrossCluster",
      "datasource": "prometheus",
      "value_type": "double",
      "kubernetes_object": "container",
      "aggregation_functions": [
        {
          "function": "sum",
          "query": "sum by (namespace) (avg_over_time(kube_namespace_status_phase{namespace!=\"\"}[$MEASUREMENT_DURATION_IN_MIN$m]))"
        }
      ]
    },
    {
      "name": "workloadsAcrossCluster",
      "datasource": "prometheus",
      "value_type": "double",
      "kubernetes_object": "container",
      "aggregation_functions": [
        {
          "function": "sum",
          "query": "sum by (namespace, workload, workload_type) (avg_over_time(namespace_workload_pod:kube_pod_owner:relabel{workload!=\"\"}[$MEASUREMENT_DURATION_IN_MIN$m]))"
        }
      ]
    },
    {
      "name": "containersAcrossCluster",
      "datasource": "prometheus",
      "value_type": "double",
      "kubernetes_object": "container",
      "aggregation_functions": [
        {
          "function": "sum",
          "query": "sum by (container, image, workload, workload_type, namespace) (avg_over_time(kube_pod_container_info{container!=\"\"}[$MEASUREMENT_DURATION_IN_MIN$m]) * on (pod, namespace) group_left(workload, workload_type) avg_over_time(namespace_workload_pod:kube_pod_owner:relabel{workload!=\"\"}[$MEASUREMENT_DURATION_IN_MIN$m]))"
        }
      ]
    }
  ]
}
@@ -0,0 +1,32 @@
apiVersion: "recommender.com/v1"
kind: "KruizeMetadataProfile"
metadata:
  name: "cluster-metadata-local-monitoring"
profile_version: 1.0
k8s_type: openshift
datasource: prometheus
query_variables:

  - name: namespacesAcrossCluster
    datasource: prometheus
    value_type: "double"
    kubernetes_object: "namespace"
    aggregation_functions:
      - function: sum
        query: 'sum by (namespace) (avg_over_time(kube_namespace_status_phase{namespace!=""}[$MEASUREMENT_DURATION_IN_MIN$m]))'

  - name: workloadsAcrossCluster
    datasource: prometheus
    value_type: "double"
    kubernetes_object: "container"
    aggregation_functions:
      - function: sum
        query: 'sum by (namespace, workload, workload_type) (avg_over_time(namespace_workload_pod:kube_pod_owner:relabel{workload!=""}[$MEASUREMENT_DURATION_IN_MIN$m]))'

  - name: containersAcrossCluster
    datasource: prometheus
    value_type: "double"
    kubernetes_object: "container"
    aggregation_functions:
      - function: sum
        query: 'sum by (container, image, workload, workload_type, namespace) (avg_over_time(kube_pod_container_info{container!=""}[$MEASUREMENT_DURATION_IN_MIN$m]) * on (pod, namespace) group_left(workload, workload_type) avg_over_time(namespace_workload_pod:kube_pod_owner:relabel{workload!=""}[$MEASUREMENT_DURATION_IN_MIN$m]))'
@@ -0,0 +1,88 @@
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # name must match the spec fields below, and be in the form: <plural>.<group>
  name: kruizemetadataprofiles.recommender.com
spec:
  # group name to use for REST API: /apis/<group>/<version>
  group: "recommender.com"
  names:
    plural: kruizemetadataprofiles
    singular: kruizemetadataprofile
    # types can be identified with this tag
    kind: KruizeMetadataProfile
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            apiVersion:
              description: 'APIVersion defines the versioned schema of this representation
                of an object. Servers should convert recognized schemas to the latest
                internal value, and may reject unrecognized values. More info:
                https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
              type: string
            kind:
              description: 'Kind is a string value representing the REST resource this
                object represents. Servers may infer this from the endpoint the client
                submits requests to. Cannot be updated. In CamelCase. More info:
                https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
              type: string
            profile_version:
              description: 'Version of the profile'
              type: number
            k8s_type:
              description: 'minikube or openshift'
              type: string
            datasource:
              description: 'datasource to import metadata from, e.g. Prometheus, Thanos, Datadog etc.'
              type: string
            query_variables:
              description: 'Query variables to be used'
              type: array
              items:
                type: object
                properties:
                  name:
                    description: 'name of the variable'
                    type: string
                  datasource:
                    description: 'datasource of the query'
                    type: string
                  value_type:
                    description: 'can be double or integer'
                    type: string
                  kubernetes_object:
                    description: 'k8s object that this query is tied to: "deployment", "pod", "namespace" or "container"'
                    type: string
                  query:
                    description: 'one of the query or aggregation_functions is mandatory'
                    type: string
                  aggregation_functions:
                    description: 'one of the query or aggregation_functions is mandatory'
                    type: array
                    items:
                      type: object
                      properties:
                        function:
                          description: 'aggregate functions associated with this variable'
                          type: string
                        query:
                          description: 'query'
                          type: string
                        version:
                          description: 'Any specific version that this query is tied to'
                          type: string
                      required:
                        - function
                        - query
                required:
                  - name
                  - datasource
                  - value_type
          required:
            - query_variables
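
The comment at the top of the CRD states that `metadata.name` must take the form `<plural>.<group>`. As a small sketch of that rule, the check below runs against a dictionary mirroring the relevant fields. The function names are hypothetical and the dictionary is hand-written rather than parsed from the manifest.

```python
def expected_crd_name(plural: str, group: str) -> str:
    """Compose the CRD name from its plural resource name and API group."""
    return f"{plural}.{group}"


def crd_name_is_consistent(crd: dict) -> bool:
    """Check that metadata.name equals <plural>.<group> from the spec."""
    spec = crd["spec"]
    expected = expected_crd_name(spec["names"]["plural"], spec["group"])
    return crd["metadata"]["name"] == expected


# Hand-copied subset of the CRD fields relevant to the naming rule.
crd = {
    "metadata": {"name": "kruizemetadataprofiles.recommender.com"},
    "spec": {
        "group": "recommender.com",
        "names": {"plural": "kruizemetadataprofiles"},
    },
}

print(crd_name_is_consistent(crd))
```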