[lownodeutilization]: Actual utilization: integration with Prometheus #1533

ingvagabund · 2024-10-11T15:20:33Z

Extend the actual utilization awareness with Prometheus integration.

For testing purposes:

    apiVersion: "descheduler/v1alpha2"
    kind: "DeschedulerPolicy"
    prometheus:
      url: http://prometheus-kube-prometheus-prometheus.prom.svc.cluster.local
        authToken:
          secretReference:
            namespace: "kube-system"
            name: "authtoken"
    profiles:
      - name: ProfileName
        pluginConfig:
        - name: "LowNodeUtilization"
          args:
            thresholds:
              "MetricResource": 20
            targetThresholds:
              "MetricResource": 70
            metricsUtilization:
              prometheus:
                query: instance:node_cpu:rate:sum
        plugins:
          balance:
            enabled:
              - "LowNodeUtilization"

k8s-ci-robot · 2024-10-11T15:20:39Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from ingvagabund. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

fanhaouu · 2024-10-12T02:59:04Z

Hello, master. Due to the company's busy schedule previously, I only managed to complete half of the related KEP. I'm glad to see that you're working on this. It looks like you're aiming to reuse the current Node utilization logic. I have a few suggestions:

It should support different data sources, similar to PayPal's load-watcher.
It should support various real-time data processing algorithms. For instance, real-time calculations, using rate averages, or predictions based on EWMA + P95, similar to the approach used by autoscaler.
If the goal is to address real-time CPU hotspots, perhaps there’s no need to calculate the number of nodes below or above a certain threshold. Of course, you could also provide a switch to control this behavior.

Hope these suggestions help!

ingvagabund · 2024-10-14T11:47:53Z

Hello sir :)

thank you for taking part in composing the out-of-tree descheduling plugin KEP.

It should support different data sources, similar to PayPal's load-watcher.

You are on the right track here. I'd like to get in touch with load-watcher maintainers and extend the codebase to provide a generic interface for accessing metrics related to pod utilization as well. Currently, only actual node utilization gets collected. Meantime, I am forming the code here to be able to better integrate with other utilization sources like metrics.

It should support various real-time data processing algorithms. For instance, real-time calculations, using rate averages, or predictions based on EWMA + P95, similar to the approach used by autoscaler.

This is where we can debate more. Thank you for sharing the specifics. There's an open issue for the pod autoscaler suggesting to introduce EMA: kubernetes/kubernetes#62235. Are you aware if there's a similar issue or a discussion for the cluster autoscaler? I'd love to learn more about how it's implemented there. Ultimately, the current plugin just needs to know which pod, when evicted, will improve the overall node/workload utilization when properly re-scheduled. I could see various ways to produce the utilization snapshot using various methods.

If the goal is to address real-time CPU hotspots, perhaps there’s no need to calculate the number of nodes below or above a certain threshold. Of course, you could also provide a switch to control this behavior.

I can see how evicting hotspot pods is related to consuming the metrics/real-time node utilization. In the current plugin context this is more suitable for a new/different plugin. I can also see how RemoveDuplicates can be extended to evict based on overall node utilization instead of the current counting approach. Not every plugin will need to consume metrics. Though, there can be common pieces shared across them through the descheduling framework.

ingvagabund · 2024-11-07T13:27:32Z

kubernetes/kubernetes#128663 to address the discrepancy in the fake metrics client node/pod metricses resource name.

ingvagabund · 2024-11-08T16:50:29Z

/test pull-descheduler-verify-master

ingvagabund · 2024-11-15T15:10:25Z

Integration with kubernetes metrics in #1555.

ingvagabund · 2024-11-17T14:28:05Z

/retest-required

ingvagabund · 2024-11-17T15:31:06Z

/retest-required

ingvagabund · 2024-11-18T15:09:16Z

/hold

ingvagabund · 2024-11-19T15:24:56Z

Waiting for #1555 to merge

ingvagabund · 2024-11-20T15:20:35Z

/hold cancel

atiratree · 2024-11-25T19:11:29Z

pkg/api/types.go

 	MetricsCollector MetricsCollector
+
+	// Prometheus enables metrics collection through Prometheus
+	Prometheus Prometheus


Shouldn't new fields be optional and allow nil value?

might be good to also consider changing fields introduced in #1555

This is tricky. The versioned type adds the omitempty tag. Yet given Prometheus is a struct one needs to turn the field into a pointer to get it omitted in the serialized json. For the actual setting you can ignore the field. The field is not required.

atiratree · 2024-11-25T19:20:24Z

pkg/framework/plugins/nodeutilization/types.go

@@ -57,4 +57,14 @@ type MetricsUtilization struct {
 	// metricsServer enables metrics from a kubernetes metrics server.


same question about MetricsUtilization field: shouldn't it be nullable?

atiratree · 2024-11-25T19:21:14Z

pkg/framework/plugins/nodeutilization/types.go

@@ -57,4 +57,14 @@ type MetricsUtilization struct {
 	// metricsServer enables metrics from a kubernetes metrics server.
 	// Please see https://kubernetes-sigs.github.io/metrics-server/ for more.
 	MetricsServer bool `json:"metricsServer,omitempty"`


This seems to be better expressed by an enum

You are right. Enums can be easily extended with other values besides true/false.

atiratree · 2024-11-25T19:24:51Z

pkg/api/types.go

+
+type AuthToken struct {
+	// raw for a raw authentication token
+	Raw string


Have we considered disadvantages of using Raw tokens? Wouldn't it be better to start with the secret ref? And revisit the raw token topic in the future?

atiratree · 2024-11-25T19:25:24Z

pkg/api/types.go

+type Prometheus struct {
+	URL                string
+	AuthToken          AuthToken
+	InsecureSkipVerify bool


Same question about supporting the InsecureSkipVerify.

atiratree · 2024-11-25T21:20:04Z

pkg/framework/plugins/nodeutilization/usageclients.go

+	client._nodeUtilization = make(map[string]map[v1.ResourceName]*resource.Quantity)
+	client._pods = make(map[string][]*v1.Pod)
+
+	results, warnings, err := promv1.NewAPI(client.promClient).Query(context.TODO(), client.promQuery, time.Now())


it would be nicer to have the context from above

atiratree · 2024-11-25T21:25:33Z

pkg/framework/plugins/nodeutilization/usageclients.go

+func (client *prometheusUsageClient) resourceNames() []v1.ResourceName {
+	return []v1.ResourceName{ResourceMetrics}
+}


do we need this method?

atiratree · 2024-11-25T21:28:55Z

pkg/framework/plugins/nodeutilization/nodeutilization.go

@@ -34,6 +34,8 @@ import (
 	"sigs.k8s.io/descheduler/pkg/utils"
 )

+const ResourceMetrics = v1.ResourceName("MetricResource")


I would expect the value to be the same as the variable name. Any reason why is it different?

atiratree · 2024-11-25T21:34:12Z

pkg/framework/plugins/nodeutilization/nodeutilization.go

+		if len(resourceNames) == 1 && resourceNames[0] == ResourceMetrics {
+			// Make ResourceMetrics 100% => 100 points
+			nodeCapacity[ResourceMetrics] = *resource.NewQuantity(int64(100), resource.DecimalSI)


nit: might be safer to double fail here. It expects that the plugin above does the validation correctly.

atiratree · 2024-11-25T21:40:28Z

pkg/framework/plugins/nodeutilization/lownodeutilization.go

@@ -95,6 +106,11 @@ func NewLowNodeUtilization(args runtime.Object, handle frameworktypes.Handle) (f
 			return nil, fmt.Errorf("metrics client not initialized")
 		}
 		usageClient = newActualUsageClient(resourceNames, handle.GetPodsAssignedToNodeFunc(), handle.MetricsCollector())
+	} else if lowNodeUtilizationArgsArgs.MetricsUtilization.Prometheus.Query != "" {


as mentioned, enum would look better here

k8s-ci-robot · 2024-12-11T13:44:48Z

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot requested review from JaneLiuL and jklaw90 October 11, 2024 15:20

k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 11, 2024

ingvagabund force-pushed the node-utilization-util-snapshot branch from c889a53 to 1f55c4d Compare October 15, 2024 10:18

k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 5, 2024

ingvagabund force-pushed the node-utilization-util-snapshot branch 3 times, most recently from d744a96 to 800c92c Compare November 6, 2024 18:34

ingvagabund force-pushed the node-utilization-util-snapshot branch from f30f8a1 to 2e63411 Compare November 7, 2024 15:40

k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 7, 2024

ingvagabund force-pushed the node-utilization-util-snapshot branch 4 times, most recently from 0330902 to baa6650 Compare November 8, 2024 15:52

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 13, 2024

ingvagabund force-pushed the node-utilization-util-snapshot branch from baa6650 to 2442967 Compare November 16, 2024 09:09

k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 16, 2024

ingvagabund force-pushed the node-utilization-util-snapshot branch 4 times, most recently from 668f0fb to 477104c Compare November 16, 2024 17:11

ingvagabund force-pushed the node-utilization-util-snapshot branch 4 times, most recently from 425f449 to d16e2b0 Compare November 17, 2024 13:41

ingvagabund mentioned this pull request Nov 17, 2024

Sync with upstream openshift/descheduler#121

Merged

ingvagabund force-pushed the node-utilization-util-snapshot branch 6 times, most recently from fbf6fb2 to 3ec47f7 Compare November 18, 2024 14:47

k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 18, 2024

ingvagabund mentioned this pull request Nov 19, 2024

Sync with upstream openshift/descheduler#120

Merged

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 19, 2024

ingvagabund force-pushed the node-utilization-util-snapshot branch from 4a21180 to a4186f1 Compare November 19, 2024 16:00

k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 19, 2024

ingvagabund force-pushed the node-utilization-util-snapshot branch from d30044e to c478ce8 Compare November 19, 2024 17:18

ingvagabund added 2 commits November 20, 2024 15:18

Update vendor for prometheus deps

a36484e

[nodeutilization]: prometheus usage client with prometheus metrics

d143aad

ingvagabund force-pushed the node-utilization-util-snapshot branch from c478ce8 to d143aad Compare November 20, 2024 14:19

ingvagabund changed the title ~~Node utilization util snapshot~~ [lownodeutilization]: Actual utilization: integration with Prometheus Nov 20, 2024

k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 20, 2024

atiratree reviewed Nov 25, 2024

View reviewed changes

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 11, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[lownodeutilization]: Actual utilization: integration with Prometheus #1533

[lownodeutilization]: Actual utilization: integration with Prometheus #1533

ingvagabund commented Oct 11, 2024 •

edited

Loading

k8s-ci-robot commented Oct 11, 2024

fanhaouu commented Oct 12, 2024

ingvagabund commented Oct 14, 2024

ingvagabund commented Nov 7, 2024

ingvagabund commented Nov 8, 2024

ingvagabund commented Nov 15, 2024

ingvagabund commented Nov 17, 2024

ingvagabund commented Nov 17, 2024

ingvagabund commented Nov 18, 2024

ingvagabund commented Nov 19, 2024

ingvagabund commented Nov 20, 2024

atiratree Nov 25, 2024

atiratree Nov 25, 2024

ingvagabund Dec 3, 2024

atiratree Nov 25, 2024

atiratree Nov 25, 2024

ingvagabund Dec 3, 2024

atiratree Nov 25, 2024

atiratree Nov 25, 2024

atiratree Nov 25, 2024

atiratree Nov 25, 2024

atiratree Nov 25, 2024

atiratree Nov 25, 2024

atiratree Nov 25, 2024

k8s-ci-robot commented Dec 11, 2024

		@@ -57,4 +57,14 @@ type MetricsUtilization struct {
		// metricsServer enables metrics from a kubernetes metrics server.

[lownodeutilization]: Actual utilization: integration with Prometheus #1533

Are you sure you want to change the base?

[lownodeutilization]: Actual utilization: integration with Prometheus #1533

Conversation

ingvagabund commented Oct 11, 2024 • edited Loading

k8s-ci-robot commented Oct 11, 2024

fanhaouu commented Oct 12, 2024

ingvagabund commented Oct 14, 2024

ingvagabund commented Nov 7, 2024

ingvagabund commented Nov 8, 2024

ingvagabund commented Nov 15, 2024

ingvagabund commented Nov 17, 2024

ingvagabund commented Nov 17, 2024

ingvagabund commented Nov 18, 2024

ingvagabund commented Nov 19, 2024

ingvagabund commented Nov 20, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

k8s-ci-robot commented Dec 11, 2024

ingvagabund commented Oct 11, 2024 •

edited

Loading