feat(examples,metrics,kube-state-metrics): add configmap and promethe… #10919

Open: wants to merge 15 commits into base: main
53 changes: 29 additions & 24 deletions documentation/assemblies/metrics/assembly-metrics-config-files.adoc
@@ -27,37 +27,42 @@ metrics
 │ └── strimzi-zookeeper.json
 ├── grafana-install
 │ └── grafana.yaml <2>
+├── kube-state-metrics <3>
+│ ├── configmap.yaml
+│ ├── ksm.yaml
+│ └── prometheus-rules.yaml
 ├── prometheus-additional-properties
-│ └── prometheus-additional.yaml <3>
+│ └── prometheus-additional.yaml <4>
 ├── prometheus-alertmanager-config
-│ └── alert-manager-config.yaml <4>
+│ └── alert-manager-config.yaml <5>
 ├── prometheus-install
-│ ├── alert-manager.yaml <5>
-│ ├── prometheus-rules.yaml <6>
-│ ├── prometheus.yaml <7>
-│ └── strimzi-pod-monitor.yaml <8>
-├── kafka-bridge-metrics.yaml <9>
-├── kafka-connect-metrics.yaml <10>
-├── kafka-cruise-control-metrics.yaml <11>
-├── kafka-metrics.yaml <12>
-├── kafka-mirror-maker-2-metrics.yaml <13>
-└── oauth-metrics.yaml <14>
+│ ├── alert-manager.yaml <6>
+│ ├── prometheus-rules.yaml <7>
+│ ├── prometheus.yaml <8>
+│ └── strimzi-pod-monitor.yaml <9>
+├── kafka-bridge-metrics.yaml <10>
+├── kafka-connect-metrics.yaml <11>
+├── kafka-cruise-control-metrics.yaml <12>
+├── kafka-metrics.yaml <13>
+├── kafka-mirror-maker-2-metrics.yaml <14>
+└── oauth-metrics.yaml <15>

 --
 <1> Example Grafana dashboards for the different Strimzi components.
 <2> Installation file for the Grafana image.
-<3> Additional configuration to scrape metrics for CPU, memory and disk volume usage, which comes directly from the Kubernetes cAdvisor agent and kubelet on the nodes.
-<4> Hook definitions for sending notifications through Alertmanager.
-<5> Resources for deploying and configuring Alertmanager.
-<6> Alerting rules examples for use with Prometheus Alertmanager (deployed with Prometheus).
-<7> Installation resource file for the Prometheus image.
-<8> PodMonitor definitions translated by the Prometheus Operator into jobs for the Prometheus server to be able to scrape metrics data directly from pods.
-<9> Kafka Bridge resource with metrics enabled.
-<10> Metrics configuration that defines Prometheus JMX Exporter relabeling rules for Kafka Connect.
-<11> Metrics configuration that defines Prometheus JMX Exporter relabeling rules for Cruise Control.
-<12> Metrics configuration that defines Prometheus JMX Exporter relabeling rules for Kafka and ZooKeeper.
-<13> Metrics configuration that defines Prometheus JMX Exporter relabeling rules for MirrorMaker 2.
-<14> Metrics configuration that defines Prometheus JMX Exporter relabeling rules for OAuth 2.0.
+<3> Kube-state-metrics configuration for custom resource monitoring.
+<4> Additional configuration to scrape metrics for CPU, memory and disk volume usage, which comes directly from the Kubernetes cAdvisor agent and kubelet on the nodes.
+<5> Hook definitions for sending notifications through Alertmanager.
+<6> Resources for deploying and configuring Alertmanager.
+<7> Alerting rules examples for use with Prometheus Alertmanager (deployed with Prometheus).
+<8> Installation resource file for the Prometheus image.
+<9> PodMonitor definitions translated by the Prometheus Operator into jobs for the Prometheus server to be able to scrape metrics data directly from pods.
+<10> Kafka Bridge resource with metrics enabled.
+<11> Metrics configuration that defines Prometheus JMX Exporter relabeling rules for Kafka Connect.
+<12> Metrics configuration that defines Prometheus JMX Exporter relabeling rules for Cruise Control.
+<13> Metrics configuration that defines Prometheus JMX Exporter relabeling rules for Kafka and ZooKeeper.
+<14> Metrics configuration that defines Prometheus JMX Exporter relabeling rules for MirrorMaker 2.
+<15> Metrics configuration that defines Prometheus JMX Exporter relabeling rules for OAuth 2.0.

 //Example Prometheus metrics files
 include::../../modules/metrics/ref-prometheus-metrics-config.adoc[leveloffset=+1]
4 changes: 3 additions & 1 deletion documentation/assemblies/metrics/assembly-metrics.adoc
@@ -59,4 +59,6 @@ include::assembly-metrics-config-files.adoc[leveloffset=+1]
 //How to set up Prometheus
 include::assembly_metrics-prometheus-setup.adoc[leveloffset=+1]
 //How to add Grafana dashboards
-include::../../modules/metrics/proc_metrics-grafana-dashboard.adoc[leveloffset=+1]
+include::../../modules/metrics/proc_metrics-grafana-dashboard.adoc[leveloffset=+1]
+//How to monitor custom resources managed by Strimzi
+include::../../modules/metrics/proc_metrics-custom-resource-monitoring.adoc[leveloffset=+1]
@@ -0,0 +1,68 @@
// This module is included in the following assemblies:
//
// metrics/assembly-metrics.adoc

[id='proc-metrics-custom-resource-monitoring-{context}']

= Custom resource monitoring

[role="_abstract"]
Use kube-state-metrics to provide custom resource monitoring.
link:https://github.com/kubernetes/kube-state-metrics/[Kube-state-metrics^] (KSM) is a scalable, Kubernetes-native service that listens to the Kubernetes API server and generates metrics about the state of its objects.
Strimzi provides monitoring for the following custom resources via KSM: `KafkaUser` and `KafkaTopic`.

You can use your own KSM deployment or deploy KSM using the xref:assembly-metrics-config-files-{context}[example metrics configuration files] provided by Strimzi.
The example files include a configuration file for a KSM deployment:

* `examples/metrics/kube-state-metrics/ksm.yaml`

Strimzi also provides an xref:ref-metrics-custom-resource-monitoring-{context}[example ConfigMap for the KSM configuration]:

* `examples/metrics/kube-state-metrics/configmap.yaml`

This procedure uses the example KSM deployment and configuration file.

.Prerequisites
* xref:assembly-metrics-prometheus-{context}[Prometheus and Prometheus Alertmanager are deployed]

.Procedure

. Deploy KSM.
+
[source,shell,subs="+quotes,attributes"]
kubectl apply -f configmap.yaml
kubectl apply -f ksm.yaml

. Get the details of the KSM service.
+
[source,shell]
----
kubectl get service strimzi-kube-state-metrics
----
+
For example:
+
[table,stripes=none]
|===
|NAME |TYPE |CLUSTER-IP |PORT(S)

|strimzi-kube-state-metrics |ClusterIP |172.40.156.40 |8080/TCP
|===
+
Note the port number for port forwarding.

. Use `port-forward` to redirect the KSM metrics endpoint to `localhost:8080`:
+
[source,shell]
----
kubectl port-forward svc/strimzi-kube-state-metrics 8080:8080
----

. In a web browser, access the KSM metrics page using the URL `http://localhost:8080/metrics`.
+
The Prometheus endpoint page appears.
Prometheus also scrapes these metrics through the provided `ServiceMonitor`, so you can query and alert on them in Prometheus.

For alerting on these metrics, see the example `PrometheusRule` resources:

* `examples/metrics/kube-state-metrics/prometheus-rules.yaml`
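
As an alternative to the browser, you can check the endpoint from a terminal.
The following is a minimal sketch and not part of the Strimzi examples: the `not_ready` and `sample_metrics` function names are hypothetical, the sample metric lines are made up, and the labels in real output depend on your KSM version and the resources in your cluster.
It keeps only the `resource_info` series whose `ready` label is not `True`, which mirrors the `ready!="True"` selector used by the example alerting rules.

```shell
#!/usr/bin/env bash

# Print resource_info series whose ready label is anything other than "True".
# Reads Prometheus text-format metrics on stdin. Series without a ready label
# are also printed, as they would match ready!="True" in PromQL.
not_ready() {
  grep -E '^strimzi_kafka_(topic|user)_resource_info\{' | grep -v 'ready="True"'
}

# Hypothetical sample of endpoint output (names and values are made up).
sample_metrics() {
  cat <<'EOF'
strimzi_kafka_topic_resource_info{exported_namespace="myproject",name="my-topic",partitions="3",ready="True"} 1
strimzi_kafka_topic_resource_info{exported_namespace="myproject",name="broken-topic",partitions="3",ready="False"} 1
strimzi_kafka_user_resource_info{exported_namespace="myproject",name="my-user",ready="True"} 1
EOF
}

# Against the sample data, only the broken-topic series is printed.
sample_metrics | not_ready

# Against a live endpoint, with the port-forward from the procedure running:
# curl -s http://localhost:8080/metrics | not_ready
```

Filtering on the client side like this is only a quick check; the `PrometheusRule` resources remain the intended way to alert on these metrics.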
9 changes: 9 additions & 0 deletions packaging/examples/metrics/kube-state-metrics/README.md
@@ -0,0 +1,9 @@
# Kube-state-metrics

This folder contains examples of how Strimzi integrates [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics) (KSM) for custom resource monitoring and demonstrates how the examples can be used.

[ConfigMap](./configmap.yaml):
* Contains the KSM configuration, provided as a `ConfigMap`

[PrometheusRules](./prometheus-rules.yaml):
* Contains alerting rules based on metrics produced by KSM and collected by Prometheus
* Compatible with [Prometheus-Operator](https://github.com/prometheus-operator/prometheus-operator)
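
The `ClusterRoleBinding` in `ksm.yaml` references the service account in the `myproject` namespace, so the subject has to be adjusted when deploying elsewhere. The following is a minimal sketch, not part of the examples: `set_namespace` is a hypothetical helper and `my-namespace` is a placeholder for your target namespace.

```shell
#!/usr/bin/env bash

# Rewrite every "namespace: <value>" line read on stdin to the namespace in $1.
# The pattern is anchored to the start of the key, so keys such as
# exported_namespace are left untouched.
set_namespace() {
  sed -E "s/^([[:space:]]*namespace:[[:space:]]*).*/\1$1/"
}

# Example on an inline fragment shaped like the ClusterRoleBinding subject:
printf 'subjects:\n  - kind: ServiceAccount\n    name: strimzi-kube-state-metrics\n    namespace: myproject\n' \
  | set_namespace my-namespace

# Against the real files (requires cluster access):
# kubectl apply -n my-namespace -f configmap.yaml
# set_namespace my-namespace < ksm.yaml | kubectl apply -n my-namespace -f -
```

Piping through the helper leaves the files on disk unchanged, which keeps the examples reusable for other namespaces.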
51 changes: 51 additions & 0 deletions packaging/examples/metrics/kube-state-metrics/configmap.yaml
@@ -0,0 +1,51 @@
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: strimzi-kube-state-metrics-config
data:
  config.yaml: |
    spec:
      resources:
        - groupVersionKind:
            group: kafka.strimzi.io
            version: v1beta2
            kind: KafkaTopic
          metricNamePrefix: strimzi_kafka_topic
          metrics:
            - name: resource_info
              help: "The current state of a Strimzi kafka topic resource"
              each:
                type: Info
                info:
                  labelsFromPath:
                    name: [ metadata, name ]
              labelsFromPath:
                exported_namespace: [ metadata, namespace ]
                partitions: [ spec, partitions ]
                replicas: [ spec, replicas ]
                ready: [ status, conditions, "[type=Ready]", status ]
                deprecated: [ status, conditions, "[reason=DeprecatedFields]", type ]
                generation: [ status, observedGeneration ]
                topicId: [ status, topicId ]
                topicName: [ status, topicName ]
        - groupVersionKind:
            group: kafka.strimzi.io
            version: v1beta2
            kind: KafkaUser
          metricNamePrefix: strimzi_kafka_user
          metrics:
            - name: resource_info
              help: "The current state of a Strimzi kafka user resource"
              each:
                type: Info
                info:
                  labelsFromPath:
                    name: [ metadata, name ]
              labelsFromPath:
                exported_namespace: [ metadata, namespace ]
                ready: [ status, conditions, "[type=Ready]", status ]
                deprecated: [ status, conditions, "[reason=DeprecatedFields]", type ]
                secret: [ status, secret ]
                generation: [ status, observedGeneration ]
                username: [ status, username ]
141 changes: 141 additions & 0 deletions packaging/examples/metrics/kube-state-metrics/ksm.yaml
@@ -0,0 +1,141 @@
---
apiVersion: v1
kind: ServiceAccount
automountServiceAccountToken: true
metadata:
  name: strimzi-kube-state-metrics
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: strimzi-kube-state-metrics
rules:
  - apiGroups: ["apiextensions.k8s.io"]
    resources:
      - customresourcedefinitions
    verbs: ["list", "watch"]
  - apiGroups:
      - kafka.strimzi.io
    resources:
      - kafkatopics
      - kafkausers
    verbs: ["list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: strimzi-kube-state-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: strimzi-kube-state-metrics
subjects:
  - kind: ServiceAccount
    name: strimzi-kube-state-metrics
    namespace: myproject
---
apiVersion: v1
kind: Service
metadata:
  name: strimzi-kube-state-metrics
spec:
  type: "ClusterIP"
  ports:
    - name: "http"
      protocol: TCP
      port: 8080
      targetPort: 8080
  selector:
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/instance: strimzi-kube-state-metrics
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: strimzi-kube-state-metrics
  labels:
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/instance: strimzi-kube-state-metrics
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-state-metrics
      app.kubernetes.io/instance: strimzi-kube-state-metrics
  replicas: 1
  template:
    metadata:
      labels:
        app.kubernetes.io/name: kube-state-metrics
        app.kubernetes.io/instance: strimzi-kube-state-metrics
    spec:
      automountServiceAccountToken: true
      serviceAccountName: strimzi-kube-state-metrics
      securityContext:
        fsGroup: 65534
        runAsGroup: 65534
        runAsNonRoot: true
        runAsUser: 65534
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: kube-state-metrics
          args:
            - --custom-resource-state-only=true
            - --port=8080
            - --custom-resource-state-config-file=/etc/customresourcestate/config.yaml
          volumeMounts:
            - name: strimzi-kube-state-metrics-config
              mountPath: /etc/customresourcestate
              readOnly: true
          imagePullPolicy: IfNotPresent
          image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.14.0
          ports:
            - containerPort: 8080
              name: "http"
          livenessProbe:
            failureThreshold: 3
            httpGet:
              httpHeaders:
              path: /livez
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 5
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 5
          readinessProbe:
            failureThreshold: 3
            httpGet:
              httpHeaders:
              path: /readyz
              port: 8081
              scheme: HTTP
            initialDelaySeconds: 5
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 5
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
            readOnlyRootFilesystem: true
      volumes:
        - name: strimzi-kube-state-metrics-config
          configMap:
            name: strimzi-kube-state-metrics-config
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: strimzi-kube-state-metrics
  labels:
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/instance: strimzi-kube-state-metrics
spec:
  jobLabel: app.kubernetes.io/name
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-state-metrics
      app.kubernetes.io/instance: strimzi-kube-state-metrics
  endpoints:
    - port: http
@@ -0,0 +1,37 @@
---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: strimzi-kube-state-metrics
spec:
  groups:
    - name: strimzi-kube-state-metrics
      rules:
        - alert: KafkaUserDeprecated
          expr: strimzi_kafka_user_resource_info{deprecated="Warning"}
          for: 15m
          labels:
            severity: warning
          annotations:
            message: "Strimzi KafkaUser {{ $labels.username }} has a deprecated configuration"
        - alert: KafkaUserNotReady
          expr: strimzi_kafka_user_resource_info{ready!="True"}
          for: 15m
          labels:
            severity: warning
          annotations:
            message: "Strimzi KafkaUser {{ $labels.username }} is not ready"
        - alert: KafkaTopicDeprecated
          expr: strimzi_kafka_topic_resource_info{deprecated="Warning"}
          for: 15m
          labels:
            severity: warning
          annotations:
            message: "Strimzi KafkaTopic {{ $labels.topicName }} has a deprecated configuration"
        - alert: KafkaTopicNotReady
          expr: strimzi_kafka_topic_resource_info{ready!="True"}
          for: 15m
          labels:
            severity: warning
          annotations:
            message: "Strimzi KafkaTopic {{ $labels.topicName }} is not ready"