api: add PoolName and NodeGroupStatus #1018

Merged: 3 commits into openshift-kni:main, Oct 16, 2024

Conversation

@shajmakh (Member) commented Sep 20, 2024

This PR introduces a new API option for selecting the nodes on which the RTE will run, exposing the nodes' topologies, as a preamble to supporting TAS on HCP (Hosted Control Planes), also known as HyperShift.

This PR has two main parts:
Part 1: a new PoolName field under NodeGroup.
Part 2: a new NodeGroupStatus type, appended under the operator status.
For more details on each part, please look at the corresponding commits.

Signed-off-by: Shereen Haj [email protected]
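
A minimal Go sketch of the shape the two additions could take; the field names, JSON tags, markers, and placeholder companion types below are assumptions for illustration, not the exact definitions merged in this PR:

// Sketch only; the real types live in api/numaresourcesoperator/v1.
package v1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// NamespacedName and NodeGroupConfig stand in for types that already exist in
// the operator API package; their real definitions are richer.
type NamespacedName struct {
	Namespace string `json:"namespace,omitempty"`
	Name      string `json:"name,omitempty"`
}

type NodeGroupConfig struct{} // existing tunables elided

// NodeGroup selects one pool of nodes on which the RTE DaemonSet should run.
type NodeGroup struct {
	// MachineConfigPoolSelector is the pre-existing, OCP-only pool selector.
	// +optional
	MachineConfigPoolSelector *metav1.LabelSelector `json:"machineConfigPoolSelector,omitempty"`

	// PoolName names the pool directly: an MCP name on OCP, a NodePool name on HCP.
	// Only one pool specifier may be set per node group.
	// +optional
	PoolName *string `json:"poolName,omitempty"`
}

// NodeGroupStatus reports the observed state of a single node group.
type NodeGroupStatus struct {
	// PoolName identifies the pool this status entry refers to.
	PoolName string `json:"name"`

	// DaemonSet is the RTE DaemonSet created for this pool.
	// +optional
	DaemonSet NamespacedName `json:"daemonSet,omitempty"`

	// Config is the node group configuration actually applied to this pool.
	// +optional
	Config *NodeGroupConfig `json:"config,omitempty"`
}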

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 20, 2024
@shajmakh shajmakh changed the title API: enable NodeSelector under NodeGroup WIP: API: enable NodeSelector under NodeGroup Sep 20, 2024
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 20, 2024
@ffromani (Member) left a comment

I think we can use some trajectory adjustments, but this looks like a nice preparation step and I agree with the general direction.

@ffromani (Member) left a comment

partial review

Review threads (resolved): Dockerfile.bundle, api/numaresourcesoperator/v1/helper/nodegroup/nodegroup.go
@ffromani (Member)

I like the new direction.
Something that bothers me is that on HCP we're actually breaking backward compatibility, technically speaking, because we are not populating previously-mandatory fields.

OTOH HCP is different enough that a smooth 1:1 transition from OCP is not possible (lacking MachineConfigPools, for example), so this is probably OK, and not something we can fix anyway.

@shajmakh shajmakh force-pushed the enh-p2 branch 5 times, most recently from 5afa4d3 to 43b13ff Compare September 25, 2024 11:29
@ffromani (Member)

the security lane failure is real; the others are due to a known issue @Tal-or is already working on

@shajmakh shajmakh force-pushed the enh-p2 branch 2 times, most recently from 4e7230d to 6479c28 Compare September 26, 2024 07:15
@shajmakh shajmakh changed the title WIP: API: enable NodeSelector under NodeGroup API: enable NodeSelector under NodeGroup Sep 26, 2024
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 26, 2024
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 26, 2024
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 26, 2024
@ffromani (Member) left a comment

partial review, but looks nice.

Review threads (resolved): api/numaresourcesoperator/v1/helper/nodegroup/nodegroup.go, pkg/validation/validation.go (×5), controllers/numaresourcesoperator_controller.go, pkg/objectnames/objectnames.go
@ffromani ffromani changed the title API: enable NodeSelector under NodeGroup api: add PoolName and NodeGroupStatus Sep 27, 2024
Review threads (resolved): pkg/validation/validation.go (×2)
@ffromani (Member) left a comment

I'm inclined to leave all the fields but PoolName optional in the NodeGroupStatus, even if we will always populate them. There are pending comments to address, so let's think one last time about it before we make the final decision and then merge.

Changing between pointer and value fields should be a quick change, so hopefully it won't be too much churn if we change direction.

Everything else LGTM. Once the few pending comments are addressed and once we settle the status field types conversation, we can merge.

@shajmakh (Member, Author) commented Oct 3, 2024

I'm inclined to leave all the fields but PoolName optional in the NodeGroupStatus, even if we will always populate them. There are pending comments to address, so let's think one last time about it before we make the final decision and then merge.

I agree with this direction (i.e. making all NodeGroupStatus fields optional except PoolName), mainly for the two reasons below (a small sketch follows the list):

  1. if we set a NodeGroupStatus, it has to have a PoolName, which we in fact copy from the MCP's status
  2. having a PoolName set doesn't imply that the config and the daemonset are set, because according to the reconcile loop that would require the MC to be updated as the resource of the corresponding MCP.
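
A minimal sketch of that reasoning in Go, reusing the type names from the earlier API sketch; the helper name and signature here are illustrative assumptions, not code from this PR:

// nodeGroupStatusForPool is a hypothetical helper: the returned status always
// carries the pool name (copied from the MCP status), while the optional
// fields stay unset until the corresponding MachineConfig has been applied.
func nodeGroupStatusForPool(poolName string, ds *NamespacedName, conf *NodeGroupConfig, mcApplied bool) NodeGroupStatus {
	status := NodeGroupStatus{
		PoolName: poolName, // always present
	}
	if !mcApplied {
		// The MCP has not finished rolling out the MachineConfig yet:
		// report the pool only, leaving Config and DaemonSet empty.
		return status
	}
	if ds != nil {
		status.DaemonSet = *ds
	}
	if conf != nil {
		status.Config = conf
	}
	return status
}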

@shajmakh shajmakh force-pushed the enh-p2 branch 2 times, most recently from 1b93e5d to 3f25cc6 Compare October 3, 2024 12:18
@ffromani (Member) left a comment

Can we remove WIP from the PR title?
Possibly the last comment inside, too.
Still making up my mind about the status; reviewing the recommendations.

Review thread (resolved): api/numaresourcesoperator/v1/helper/nodegroup/nodegroup.go
@shajmakh shajmakh changed the title WIP: api: add PoolName and NodeGroupStatus api: add PoolName and NodeGroupStatus Oct 4, 2024

func syncNodeGroupsStatusPerMCPWithOperatorStatus(instance *nropv1.NUMAResourcesOperator, syncConfig bool) []nropv1.NodeGroupStatus {
ngStatuses := []nropv1.NodeGroupStatus{}
for _, mcp := range instance.Status.MachineConfigPools {
A collaborator left a comment:

so this logic is only valid for OpenShift and we'll add the HyperShift logic later on?

@shajmakh (Member, Author) replied:

Correct, this will be handled here, based on our conversations, to separate the OCP vs HCP content:
#1027

@@ -222,29 +222,34 @@ func (r *NUMAResourcesOperatorReconciler) reconcileResourceMachineConfig(ctx con
// It can take a while.
mcpStatuses, allMCPsUpdated := syncMachineConfigPoolsStatuses(instance.Name, trees, r.ForwardMCPConds)
instance.Status.MachineConfigPools = mcpStatuses
instance.Status.NodeGroups = syncNodeGroupsStatusPerMCPWithOperatorStatus(instance, false)
A collaborator left a comment:

Sorry for my lack of understanding, but why do we need to call this function twice (here and in line 232), once with false and then with true?
This deserves at least a comment IMO.

@ffromani (Member) replied:

good catch. This deserves a comment and another honest attempt at simplifying the code.

@shajmakh (Member, Author) replied:

Good point. I was following the rules by which the MCP status gets updated: only if the MC is updated do we update the corresponding NodeGroupConfig in the status, hence the forwarded boolean value.
I agree the code in this area can be improved, and we should also add a way to test the config status update in the controller tests. I'll open an issue to track this work.

@shajmakh (Member, Author) added:

Glancing at the open issues, I see this: #1031, which sounds to me like it includes the subject raised here?

@ffromani (Member) replied:

#1031 partially overlaps, yes. I'll need to write some notes about how the status is meant to be updated. The proposed fix for #1031 would only help partially here though. I think the best way forward is to reorganize the code in this PR.

@shajmakh (Member, Author) replied:

Adjusted in the latest updates. I also updated the controller logic to publish NodeGroupStatus only after all DSs and the MCP are in sync; that is the only way to guarantee that a NodeGroupStatus has all of its fields. Because on OpenShift we depend on the MCP, this will need to be updated to adapt to HCP. Please let me know if this meets both your expectations. Thanks!
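
In controller terms, the ordering described here could look roughly like the sketch below; only syncNodeGroupsStatusPerMCPWithOperatorStatus and instance.Status.NodeGroups appear in this PR's diff, while the gating flags, the wrapper, and the true argument value are assumptions:

// publishNodeGroupStatuses is an illustrative wrapper: the per-node-group
// statuses are published only once every RTE DaemonSet and every MCP reports
// it is up to date, so each entry can carry PoolName, Config and DaemonSet.
// (nropv1 is the usual alias for the operator's v1 API package.)
func publishNodeGroupStatuses(instance *nropv1.NUMAResourcesOperator, allDaemonSetsUpdated, allMCPsUpdated bool) {
	if !allDaemonSetsUpdated || !allMCPsUpdated {
		// Too early: something is still rolling out.
		return
	}
	instance.Status.NodeGroups = syncNodeGroupsStatusPerMCPWithOperatorStatus(instance, true)
}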

@ffromani (Member) left a comment

I reviewed again https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md and came to a conclusion regarding the new API types.

I realized we (mostly me) made some mistakes in the previous API iteration, but that is no reason to keep making the same mistakes. Let's do better from now on.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 10, 2024
This PR introduces a new API option for selecting the nodes on which the RTE will run, exposing the nodes' topologies, as a preamble to supporting TAS on HCP (Hosted Control Planes), also known as HyperShift.

On OCP:
 The way that cluster nodes are grouped is by using Machine Config Pools (MCPs). On the operator side, the numaresourcesoperator CR defines an MCP selector that, behind the scenes, uses the node selector of the corresponding MCP and sets it under the RTE daemonset's node selector.

On HCP:
 The MCP term does not exist on HyperShift. Instead, the HyperShift platform has a node pool, which essentially groups the nodes based on a specific node-pool label added to the nodes' labels.

This PR proposes to enable additional options for selecting the node groups on which to run the RTE pods, working on both OCP and HCP platforms. The new API option, called `PoolName`, does not change or affect how the MCP selector is processed; it provides a new way for OCP to address the nodes by simply setting the name of the MCP as PoolName, which is equivalent to setting the MachineConfigPoolSelector to a label matching the desired MCP. The new field is optional; thus, this solution is backward compatible.

On HCP, PoolName represents the node pool to which the nodes belong and on which the user wants the operator (RTE daemonset) to run. The HCP platform is not yet fully supported and requires additional modifications.

Notes & restrictions:
- PoolName represents a single string defining the name of the pool, be
  it MCP name on OCP or NodePool name on HCP.
- Only one pool specifier should be set per node group; more than one will not be tolerated and will cause a degraded state (see the validation sketch below).
- On OCP, where both options can be set, the operator state will be degraded if the PoolName of one node group and the MachineConfigPoolSelector of another node group point to the same MCP.
- Apart from the aforementioned, no extra validations are applied to determine whether nodes of one node group correlate with nodes from another node group. The user takes responsibility for providing non-conflicting nodes per selector group.
- As with MCP selectors, the selected nodes are not validated for whether or not they have the correct machine configuration needed for TAS to be operational.

Signed-off-by: Shereen Haj <[email protected]>
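
A hedged Go sketch of the per-node-group rule from the notes above, reusing the NodeGroup field names from the earlier sketch; the actual checks live in pkg/validation/validation.go and may be structured differently (the cross-group MCP conflict check is omitted here):

import "fmt"

// validatePoolSpecifiers is illustrative only: each node group must set exactly
// one pool specifier, either MachineConfigPoolSelector or PoolName, never both.
func validatePoolSpecifiers(nodeGroups []NodeGroup) error {
	for idx, ng := range nodeGroups {
		hasSelector := ng.MachineConfigPoolSelector != nil
		hasPoolName := ng.PoolName != nil && *ng.PoolName != ""
		if hasSelector && hasPoolName {
			return fmt.Errorf("node group %d: only one of MachineConfigPoolSelector and PoolName may be set", idx)
		}
		if !hasSelector && !hasPoolName {
			return fmt.Errorf("node group %d: a pool specifier (MachineConfigPoolSelector or PoolName) is required", idx)
		}
	}
	return nil
}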
So far, tracking the node groups' statuses has been done via the collective operator status, which contains a list of all affected MCPs and their matching RTE daemonsets.

This part aims to populate the status per node group in a single node-group status wrapper instead of only the accumulative operator status. For that we need to link each MCP to a node group, and for that we use the MCP name. The new NodeGroupStatus consists of DaemonSet, NodeGroupConfig & PoolName. It is known that there is a daemonset per MCP, not per NodeGroup (see https://github.com/openshift-kni/numaresources-operator/pull/1020/files), so we allow tracking each matching MCP's config status in a single NodeGroupStatus, without breaking backward compatibility and while laying a base for future plans.
We keep populating the statuses in the NUMAResourcesOperatorStatus fields to retain API backward compatibility, and additionally we start reflecting the status per node pool (MCP now, or NodePool later on HCP). The relation between the current node group MCP and daemon sets and the new representation is 1:1, and there is no change in functionality.

Signed-off-by: Shereen Haj <[email protected]>
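
A Go sketch of the 1:1 mapping this commit message describes, again using the earlier sketch types and assuming the per-MCP statuses and the RTE DaemonSet references are index-aligned lists; the actual controller helper (syncNodeGroupsStatusPerMCPWithOperatorStatus) may be organized differently:

// mcpStatus is a hypothetical stand-in for the per-MCP entry already tracked
// in the operator status (name plus the config applied for that MCP, if any).
type mcpStatus struct {
	Name   string
	Config *NodeGroupConfig
}

// nodeGroupStatusesFromOperatorStatus derives one NodeGroupStatus per tracked
// MCP, pairing it with its RTE DaemonSet by index (one DaemonSet per MCP).
func nodeGroupStatusesFromOperatorStatus(mcps []mcpStatus, daemonSets []NamespacedName) []NodeGroupStatus {
	statuses := make([]NodeGroupStatus, 0, len(mcps))
	for i, mcp := range mcps {
		st := NodeGroupStatus{PoolName: mcp.Name}
		if mcp.Config != nil {
			st.Config = mcp.Config
		}
		if i < len(daemonSets) {
			st.DaemonSet = daemonSets[i]
		}
		statuses = append(statuses, st)
	}
	return statuses
}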
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 16, 2024
Extend controller tests to cover the PoolName scenarios.

Signed-off-by: Shereen Haj <[email protected]>
@shajmakh (Member, Author)

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 16, 2024
@shajmakh (Member, Author)

failed to pull upi-installer image
/retest

@ffromani (Member) left a comment

/approve
/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 16, 2024
openshift-ci bot (Contributor) commented Oct 16, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ffromani, shajmakh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@shajmakh (Member, Author)

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 16, 2024
@ffromani (Member)

/retest

the candidate fix should have been merged a few minutes ago

@openshift-merge-bot openshift-merge-bot bot merged commit 69033d1 into openshift-kni:main Oct 16, 2024
14 checks passed
@shajmakh (Member, Author)

@ffromani could you please share what fixed the CI issue?
