Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[release-4.18] Enable NROP metrics to be to scraped securely by Prometheus #1140

Conversation

rbaturov
Copy link
Contributor

@rbaturov rbaturov commented Jan 7, 2025

This PR is a combined backport of #1106 and #1133.

This PR encompasses all the required changes to reintegrate NROP metrics with Prometheus. This implementation uses the metrics server API from the controller-runtime.
The majority of the changes in this PR follow the guidelines outlined in the following guide:
https://rhobs-handbook.netlify.app/products/openshiftmonitoring/collecting_metrics.md/

The difference between this implementation and the guide's is that we use a bearerTokenFile for Prometheus authentication instead of tls.crt and tls.key. This approach uses TLS but does not implement mTLS.

A follow-up PR will be issued to ensure we implement this for RTE metrics as well.

To validate that this PR is functioning correctly, please follow these steps:

  1. build image of the operator (make docker-build docker-push)
  2. run: make deploy
  3. Attach to one of the prometheus pods oc exec -it prometheus-k8s-0 -n openshift-monitoring /bin/bash
  4. run:
curl -v \
--cacert /etc/prometheus/configmaps/serving-certs-ca-bundle/service-ca.crt \
-H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" \
https://numaresources-controller-manager-metrics-service.numaresources.svc:8080/metrics

@openshift-ci openshift-ci bot requested review from ffromani and shajmakh January 7, 2025 08:07
@rbaturov rbaturov changed the title Enable secure metrics 4.18 [release-4.18] Enable NROP metrics to be to scraped securely by Prometheus Jan 7, 2025
@ffromani
Copy link
Member

ffromani commented Jan 7, 2025

/approve

Copy link
Contributor

openshift-ci bot commented Jan 7, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ffromani, rbaturov

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 7, 2025
@ffromani
Copy link
Member

ffromani commented Jan 7, 2025

needs rebase

Signed-off-by: Ronny Baturov <[email protected]>
This commit consist of the following changes:

* Enabled secureserving in the metrics server.
* Added a Service for the deployment with a special annotation, which is monitored by the Service CA operator. This operator will generate the tls.key and tls.crt files inside the metrics-service-cert secret, which is used by manager container.
* Added volume and volumeMount to enable the deployment to consume the metrics secret generated with the tls.key and tls.crt.
* Added Role and RoleBinding resources to grant the necessary permissions to the Prometheus service account.
* modified ServiceMonitor resource to enable Prometheus pods to scrape metrics.

Most of this configuration was based on this guide:
https://rhobs-handbook.netlify.app/products/openshiftmonitoring/collecting_metrics.md/

Signed-off-by: Ronny Baturov <[email protected]>
* Added the openshift.io/cluster-monitoring: "true" label to the NROP namespace. This label is required to enable Prometheus to scrape metrics from the namespace.
* Added the annotation operatorframework.io/cluster-monitoring=true. This will cause the OpenShift console to display a checkbox during installation, allowing users to enable cluster monitoring for the operator.

Signed-off-by: Ronny Baturov <[email protected]>
Signed-off-by: Ronny Baturov <[email protected]>
* Added a Role and RoleBinding for the manager pod, allowing the operator to manage Services exclusively within the NROP namespace.
* Removed the Service-related RBAC permissions previously granted in the ClusterRole.

Signed-off-by: Ronny Baturov <[email protected]>
This ensures the informer can successfully watch and list Service objects.
  Without this restriction, the operator will encounter the following failures:
   reflector.go:561] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:106: failed to list *v1.Service: services is forbidden: User "system:serviceaccount:numaresources:numaresources-controller-manager" cannot list resource "services" in API group "" at the cluster scope
   reflector.go:158] "Unhandled Error" err="sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:106: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:numaresources:numaresources-controller-manager\" cannot list resource \"services\" in API group \"\" at the cluster scope" logger="UnhandledError"

Signed-off-by: Ronny Baturov <[email protected]>
@rbaturov rbaturov force-pushed the enable-secure-metrics-4.18 branch from e990e3b to 1904cfc Compare January 7, 2025 19:15
Copy link
Member

@ffromani ffromani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jan 8, 2025
@openshift-merge-bot openshift-merge-bot bot merged commit 93834c9 into openshift-kni:release-4.18 Jan 8, 2025
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants