feat: use hive clusterdeployment for creating spoke clusters #472

Merged: 6 commits into validatedpatterns:main from add-clusterdeployments, May 30, 2024

Conversation

@TomerFi (Member) commented Apr 16, 2024

  • Added a feature for creating spoke clusters using Hive's ClusterDeployment API.
  • Modified the clustergroup schema; added clusterDeployments for a group, using clusterPools as a template.
  • Modified test cases to cover clusterDeployments both alongside clusterPools in the same managedClusterGroup and on their own.
  • The multiple commits in this PR should make reviewing easier; they can be squashed together if required.
  • If preferred, we can move the tests to a different PR to make this one smaller.

Example configuration:

  managedClusterGroups:
    - name: group-with-pool
      clusterPools:
        spokesPool:
          name: spokes-pool
          openshiftVersion: 4.14.12
          baseDomain: domain.example.com
          platform:
            aws:
              region: ap-southeast-2
          clusters:
            - spoke1
            - spoke2

    - name: group-with-deployments
      clusterDeployments:
        myFirstCustomSpoke:
          name: my-first-custom-spoke
          openshiftVersion: 4.14.12
          baseDomain: domain.example.com
          platform:
            azure:
              baseDomainResourceGroupName: dojo-dns-zones
              region: eastus
        mySecondCustomSpoke:
          name: my-second-custom-spoke
          openshiftVersion: 4.14.12
          baseDomain: domain.example.ca
          platform:
            azure:
              baseDomainResourceGroupName: dojo-dns-zones
              region: westca

    # All clusters in the following group, whether claims or deployments, are considered
    # part of the same group: same labeling mechanism, shared managed cluster set, etc.
    - name: group-with-both
      clusterPools:
        anotherSpokesPool:
          name: another-spokes-pool
          openshiftVersion: 4.14.12
          baseDomain: domain.example.com
          platform:
            aws:
              region: ap-southwest-1
          clusters:
            - another-spoke1
      clusterDeployments:
        anotherCustomSpoke:
          name: another-custom-spoke
          openshiftVersion: 4.14.12
          baseDomain: domain.example.com
          platform:
            aws:
              region: ap-southwest-3

@TomerFi TomerFi force-pushed the add-clusterdeployments branch 5 times, most recently from eda03ab to 11fb95a Compare April 16, 2024 22:51
TomerFi and others added 2 commits April 16, 2024 19:16
Co-authored-by: Alejandro Villegas <[email protected]>
Signed-off-by: Tomer Figenblat <[email protected]>
Co-authored-by: Alejandro Villegas <[email protected]>
Signed-off-by: Tomer Figenblat <[email protected]>
Co-authored-by: Alejandro Villegas <[email protected]>
Signed-off-by: Tomer Figenblat <[email protected]>
@TomerFi TomerFi force-pushed the add-clusterdeployments branch from 277860f to 5b4e903 Compare April 17, 2024 15:23
@TomerFi TomerFi marked this pull request as ready for review April 17, 2024 15:56
@r2dedios (Contributor) left a comment:

LGTM ;)

@claudiol claudiol added the enhancement New feature or request label Apr 18, 2024
@mbaldessari mbaldessari self-assigned this Apr 19, 2024
@mbaldessari (Contributor) left a comment:

Thanks @TomerFi and @r2dedios, this is great work! I left a couple of comments on smaller things we need to fix; nothing structural, really.

  name: {{ $deploymentName }}
  namespace: {{ $deploymentName }}
  labels:
    vendor: OpenShift
@mbaldessari (Contributor):

We should add the following here:

annotations:
  argocd.argoproj.io/sync-options: SkipDryRunOnMissingResource=true

because if you do a deployment from scratch (i.e. one where ACM gets installed at the same time), the ACM app might get stuck: the multiclusterhub (which provides these CRDs) has not been installed yet, and Argo will report:

The Kubernetes API could not find hive.openshift.io/ClusterDeployment for requested resource first-spoke-deployments/first-spoke-deployments. Make sure the "ClusterDeployment" CRD is installed on the destination cluster.
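For illustration, the metadata block with the suggested annotation in place might look like this (a sketch based on the snippet above; the rest of the template is unchanged):

  metadata:
    name: {{ $deploymentName }}
    namespace: {{ $deploymentName }}
    annotations:
      argocd.argoproj.io/sync-options: SkipDryRunOnMissingResource=true
    labels:
      vendor: OpenShift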

@TomerFi (Member, Author):

good to know - thanks!

@TomerFi (Member, Author), Apr 23, 2024:

Resolved in a073b8c.

---
apiVersion: cluster.open-cluster-management.io/v1
kind: ManagedCluster
metadata:
@mbaldessari (Contributor):

Same as above with ClusterDeployment. We should probably add the annotation:

annotations:
  argocd.argoproj.io/sync-options: SkipDryRunOnMissingResource=true

otherwise during an install from scratch we get:

The Kubernetes API could not find cluster.open-cluster-management.io/ManagedCluster for requested resource open-cluster-management/first-spoke-deployments. Make sure the "ManagedCluster" CRD is installed on the destination cluster.
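As a sketch (metadata fields other than the annotation omitted), the ManagedCluster template would then carry the same annotation:

  ---
  apiVersion: cluster.open-cluster-management.io/v1
  kind: ManagedCluster
  metadata:
    annotations:
      argocd.argoproj.io/sync-options: SkipDryRunOnMissingResource=true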

@TomerFi (Member, Author), Apr 23, 2024:

Resolved in a073b8c.

{{- $group := . }}

{{- range $group.clusterDeployments}}
{{ $cluster := . }}
@mbaldessari (Contributor):

One thing that we could add here (because I am an idiot and I got bitten by it) is the following:

--- a/common/acm/templates/provision/clusterdeployment.yaml
+++ b/common/acm/templates/provision/clusterdeployment.yaml
@@ -3,6 +3,12 @@

 {{- range $group.clusterDeployments}}
 {{ $cluster := . }}
+{{- if (eq $cluster.name nil) }}
+{{- fail (printf "managedClusterGroup clusterDeployment cluster name is empty: %s" $cluster) }}
+{{- end }}
+{{- if (eq $group.name nil) }}
+{{- fail (printf "managedClusterGroup clusterDeployment group name is empty: %s" $cluster) }}
+{{- end }}
 {{- $deploymentName := print $cluster.name "-" $group.name }}

 {{- $cloud := "None" }}

The reason was that I did not add the name attribute under the group, and when running make preview-acm I saw a bunch of <nil> in the object names, which surprised me a bit ;) (see the sketch below)

We can also do it on top later, just writing it here so I don't forget
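For context, a sketch of the kind of values that triggered the <nil> names; the group entry below is missing its name key, and the cluster identifiers are hypothetical:

  managedClusterGroups:
    - clusterDeployments:   # note: no "name" key on the group
        brokenSpoke:
          name: broken-spoke
          openshiftVersion: 4.14.12
          baseDomain: domain.example.com

With the guard above, helm templating fails fast with the printf message instead of rendering objects named "<nil>".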

@TomerFi (Member, Author):

ohh good catch! we'll add it here.

@TomerFi (Member, Author), Apr 23, 2024:

Resolved in a073b8c.

Should we do the same for clusterpool.yaml?

@mbaldessari (Contributor):

Yeah, let's do it later in another PR maybe.


  name: {{ .name }}
spec:
  clusterSelector:
    selectorType: LegacyClusterSetLabel
@mbaldessari (Contributor):

Now my deployment is stuck on the following error:

ManagedClusterSet.cluster.open-cluster-management.io "deployments" is invalid: spec.clusterSelector.selectorType: Unsupported value: "LegacyClusterSetLabel": supported values: "ExclusiveClusterSetLabel", "LabelSelector"

I suspect this selectorType is only valid when we use clusterPools and not clusterDeployments?

@TomerFi (Member, Author), Apr 19, 2024:

I couldn't find any documentation on that one. I suspect it was deprecated for a while and removed in v1beta2. I think we accidentally modified the version from v1beta1 but didn't modify the spec. I'll look around for some docs.

@TomerFi (Member, Author), Apr 23, 2024:

@mbaldessari I did some investigation.

This issue describes v1beta2 deprecating LegacyClusterSetLabel (the previous default) and adding ExclusiveClusterSetLabel (the current default). Later on, v1beta1 was removed, which explains our issue.

Currently available options for SelectorType:

  • ExclusiveClusterSetLabel, the default value, means that ManagedCluster resources will be selected based on the label cluster.open-cluster-management.io/clusterset: <ManagedClusterSet name>.
  • LabelSelector requires specifying a labelSelector spec key with a custom label selector (see the sketch below).
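As an illustrative sketch of the two variants (the set names here are hypothetical; the label key is taken from the template above):

  apiVersion: cluster.open-cluster-management.io/v1beta2
  kind: ManagedClusterSet
  metadata:
    name: example-exclusive-set
  spec:
    clusterSelector:
      selectorType: ExclusiveClusterSetLabel
  ---
  apiVersion: cluster.open-cluster-management.io/v1beta2
  kind: ManagedClusterSet
  metadata:
    name: example-label-set
  spec:
    clusterSelector:
      selectorType: LabelSelector
      labelSelector:
        matchLabels:
          vendor: OpenShift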

Considering both LegacyClusterSetLabel and ExclusiveClusterSetLabel are the default for their respective versions, we should delete this spec for backward compatibility.

What do you think?

@mbaldessari (Contributor):

Agreed; it seems we should just drop this part for the time being.
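For illustration, dropping the selector would leave roughly the following (a sketch; ExclusiveClusterSetLabel then applies as the default):

  apiVersion: cluster.open-cluster-management.io/v1beta2
  kind: ManagedClusterSet
  metadata:
    name: {{ .name }}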

@mbaldessari (Contributor):

And thanks for the excellent investigation on this btw ;)

@TomerFi (Member, Author):

My pleasure!

@TomerFi (Member, Author):

Removed in c6ffd0e.

@TomerFi TomerFi force-pushed the add-clusterdeployments branch 4 times, most recently from 186a25e to 656e213 Compare April 23, 2024 17:13
…for clusterdeployments

Signed-off-by: Tomer Figenblat <[email protected]>
@TomerFi TomerFi force-pushed the add-clusterdeployments branch from 656e213 to a073b8c Compare April 23, 2024 21:08
@TomerFi (Member, Author) commented Apr 23, 2024:

@mbaldessari, can you please take a look at the failed workflow?

Tests are passing on my end. Based on the failure message, I tried aligning the valueFiles array identification, but the result was the same.

@mbaldessari (Contributor) commented:

> @mbaldessari, can you please take a look at the failed workflow?
>
> Tests are passing on my end. Based on the failure message, I tried aligning the valueFiles array identification, but the result was the same.

Aye, we can worry about / fix this one up later on. We fixed a bug in there in the meantime; I suspect that is why the test is failing.

I did try this PR + the drop of legacyClusterSelector and for some reason the additional cluster could not be spun up by ACM. I might have done something wrong. I'll poke you next week about it if I can't figure it out

@TomerFi TomerFi force-pushed the add-clusterdeployments branch from 8fd42b7 to c6ffd0e Compare April 26, 2024 15:06
@TomerFi (Member, Author) commented Apr 26, 2024:

> I did try this PR + the drop of legacyClusterSelector and for some reason the additional cluster could not be spun up by ACM. I might have done something wrong. I'll poke you next week about it if I can't figure it out

Have a great weekend! :-)

@TomerFi TomerFi force-pushed the add-clusterdeployments branch from 0471ed5 to 2ef395d Compare May 28, 2024 17:37
…er-namespace

Co-authored-by: Michele Baldessari <[email protected]>
Co-authored-by: Alejandro Villegas <[email protected]>
Signed-off-by: Tomer Figenblat <[email protected]>
@TomerFi TomerFi force-pushed the add-clusterdeployments branch from 2ef395d to 6cd4e85 Compare May 28, 2024 17:41
@TomerFi (Member, Author) commented May 28, 2024:

@mbaldessari @r2dedios
Following our session today, namespaces were added to the clusterdeployment-related secrets in 6cd4e85 (see the sketch below).
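For illustration only (the secret name and data are hypothetical, and the namespace value is an assumption based on the template above, not the actual change):

  apiVersion: v1
  kind: Secret
  metadata:
    name: example-clusterdeployment-secret   # hypothetical name
    namespace: {{ $deploymentName }}         # namespace now set explicitly
  type: Opaque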

@mbaldessari mbaldessari merged commit 7ba9db5 into validatedpatterns:main May 30, 2024
4 of 5 checks passed
@mbaldessari (Contributor):

Thanks!

@r2dedios (Contributor):

Thanks @mbaldessari

@TomerFi (Member, Author) commented May 31, 2024:

Thank you @mbaldessari and @r2dedios !
