Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Require EDPMServiceType #1176

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

bshephar
Copy link
Contributor

@bshephar bshephar commented Nov 8, 2024

This change switches EDPMServiceType from optional to required. Since the EDPMServiceType of custom services needs to match that of an existing service, we also remove the defaulting mechanism in the webhook.

jira: https://issues.redhat.com/browse/OSPRH-11333

Copy link
Contributor

openshift-ci bot commented Nov 8, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bshephar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Nov 8, 2024
Copy link

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/278066ea4b734db9a37b3e10a5a15889

openstack-k8s-operators-content-provider FAILURE in 5m 10s
⚠️ podified-multinode-edpm-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cifmw-crc-podified-edpm-baremetal SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ openstack-operator-tempest-multinode SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
✔️ openstack-operator-docs-preview SUCCESS in 2m 05s

@bshephar bshephar force-pushed the EdpmServiceType-webhook branch from 44af400 to 0cd0898 Compare November 8, 2024 03:59
Copy link

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/d06be16b184c4f58ade975c692a654f9

✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 46m 22s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 18m 17s
cifmw-crc-podified-edpm-baremetal RETRY_LIMIT in 20m 43s
✔️ openstack-operator-tempest-multinode SUCCESS in 1h 33m 05s
✔️ openstack-operator-docs-preview SUCCESS in 1m 59s

@bshephar bshephar force-pushed the EdpmServiceType-webhook branch from 0cd0898 to a88cadf Compare November 10, 2024 22:21
@@ -112,7 +112,7 @@ spec:
deployOnAllNodeSets: true
----

. Optional: Specify the `edpmServiceType` field for the service. Different custom services may use the same ansible content to manage the same EDPM service (such as `ovn` or `nova`). The `DataSources`, TLS certificates, and CA certificates need to be mounted at the same locations so they can be found by the ansible content even when using a custom service. `edpmServiceType` is used to create this association. The value is the name of the default service that uses the same ansible content as the custom service. If there are multiple services with the same `edpmServiceType` listed in a nodeset or deployment spec, latter ones would be ignored.
. Required: Specify the `edpmServiceType` field for the service. Different custom services may use the same ansible content to manage the same EDPM service (such as `ovn` or `nova`). The `DataSources`, TLS certificates, and CA certificates need to be mounted at the same locations so they can be found by the ansible content even when using a custom service. `edpmServiceType` is used to create this association. The value is the name of the default service that uses the same ansible content as the custom service. If there are multiple services with the same `edpmServiceType` listed in a nodeset or deployment spec, latter ones would be ignored.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not actually required that edpmServiceType matches that of an existing service. It's really just if you are re-using existing ansible content from epdm-ansible, and need to re-use the same paths for cert mounts. So, I think that needs to be clarified. The field can still be required though, so that we can force the user to make that distinction when they create the service.

Copy link
Contributor Author

@bshephar bshephar Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still kind of on the fence as to whether we should do this or not. Maybe it can just be improved in documentation. But I'll note my thoughts on the topic below since we have a thread here related to it.

The only thing that concerns me with the current implementation is that it doesn't work if users try to create a new Nova service for example, but fail to specify edpmServiceType: nova. I think it probably improves the user experience by requiring the field. But then at the same time, it also puts more of a mental burden on them to now go and understand what this edpmServiceType means.

I think it's probably easier for the user if we block the CR creation by requiring this field, then they have to go and read about the parameters. Whereas the alternative is having to debug why the deployment failed, which is technically more challenging / a worse user experience.

I agree with this comment, I'm just dropping my thoughts on the thread. I'll do some rewriting of this section to include that clarification.

@bshephar bshephar force-pushed the EdpmServiceType-webhook branch 2 times, most recently from 2559460 to 516d2c1 Compare November 12, 2024 05:16
@bshephar bshephar force-pushed the EdpmServiceType-webhook branch 3 times, most recently from 5a47593 to 9559d05 Compare December 9, 2024 10:46
@bshephar bshephar force-pushed the EdpmServiceType-webhook branch from 9559d05 to 76505cb Compare December 16, 2024 05:25
@bshephar
Copy link
Contributor Author

/test functional

@bshephar
Copy link
Contributor Author

Functional tests pass locally, rechecking it:

 SUCCESS! 1m11.00949s
coverage: 50.5% of statements
[1734330348] Webhook Suite - 0/0 specs - 4 procs  SUCCESS! 2.333ms
coverage: 1.0% of statements
[1734330348] Controller v1beta1 Suite - 3/3 specs - 4 procs ••• SUCCESS! 2.88ms
coverage: 0.6% of statements
composite coverage: 62.6% of statements

Ginkgo ran 4 suites in 1m41.194835792s
Test Suite Passed

@bshephar
Copy link
Contributor Author

So, the remaining problem we have here is that in making this field mandatory, we risk the update process. Since CR's defined without the field will be un-upgradable until the field is defined.

I don't think this should be an issue, since we were adding the field with a validation webhook, so all existing CR's in the cluster should have the field. But we should consider it as a potential upgrade issue and test for it accordingly.

@bshephar
Copy link
Contributor Author

/test openstack-operator-build-deploy-kuttl

@bshephar
Copy link
Contributor Author

All Kuttl tests seem to be having problems with Quay:

4d9a29bd7a80877d8e
Copying blob sha256:8afba38a614e150d5cec9d32c41bedfe2fe3cc1a11fbd33c712e10abb9e65de9
time="2024-12-18T03:43:50Z" level=warning msg="Failed, retrying in 4s ... (3/3). Error: trying to reuse blob sha256:8afba38a614e150d5cec9d32c41bedfe2fe3cc1a11fbd33c712e10abb9e65de9 at destination: pinging container registry quay.rdoproject.org: Get \"https://quay.rdoproject.org/v2/\": dial tcp 38.129.56.158:443: i/o timeout"
Getting image source signatures
Copying blob sha256:3d7039b254a490f48fac42b072ca3e2249ff9b4b6bb65cd1930b4ff94600a86e
Copying blob sha256:8afba38a614e150d5cec9d32c41bedfe2fe3cc1a11fbd33c712e10abb9e65de9
Copying blob sha256:34db82eb2cd0f20a0d8e43721c3dc6e7c2900ee73351f94d9a29bd7a80877d8e
Warning: Push failed, retrying in 5s ...
Registry server Address: 
Registry server User Name: openstack-k8s-operators+cirobot
Registry server Email: 
Registry server Password: <<non-empty>>
error: build error: Failed to push image: trying to reuse blob sha256:8afba38a614e150d5cec9d32c41bedfe2fe3cc1a11fbd33c712e10abb9e65de9 at destination: pinging container registry quay.rdoproject.org: Get "https://quay.rdoproject.org/v2/": dial tcp 38.129.56.158:443: i/o timeout

@bshephar bshephar force-pushed the EdpmServiceType-webhook branch 2 times, most recently from 4e2c8e4 to 0a362dc Compare January 5, 2025 23:19
Copy link

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/342dcb950bca453a9f2540248c0c6fc3

✔️ openstack-k8s-operators-content-provider SUCCESS in 3h 13m 01s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 22m 29s
cifmw-crc-podified-edpm-baremetal FAILURE in 57m 07s
✔️ adoption-standalone-to-crc-ceph-provider SUCCESS in 3h 00m 34s
✔️ openstack-operator-tempest-multinode SUCCESS in 1h 37m 03s
✔️ openstack-operator-docs-preview SUCCESS in 1m 53s

@bshephar bshephar force-pushed the EdpmServiceType-webhook branch from 0a362dc to 25f0653 Compare January 6, 2025 23:42
@slagle
Copy link
Contributor

slagle commented Jan 8, 2025

So, the remaining problem we have here is that in making this field mandatory, we risk the update process. Since CR's defined without the field will be un-upgradable until the field is defined.

I don't think this should be an issue, since we were adding the field with a validation webhook, so all existing CR's in the cluster should have the field. But we should consider it as a potential upgrade issue and test for it accordingly.

Given the webhook was in GA, I think we're good in this regard. Any other concerns before we merge this one?

@bshephar bshephar force-pushed the EdpmServiceType-webhook branch from 25f0653 to b6c2b10 Compare January 8, 2025 23:55
@bshephar
Copy link
Contributor Author

bshephar commented Jan 8, 2025

The kuttl test failure looks legit after rebase. Let me check that:

 Container test exited with code 2, reason Error
---
name: nova-combined-ca-bundle
        -        - mountPath: /var/lib/openstack/cacerts/custom-global-service
        +        - mountPath: /var/lib/openstack/cacerts
                   name: custom-global-service-combined-ca-bundle 

This change switches EDPMServiceType from optional to required.
Since the EDPMServiceType of custom services needs to match that of an existing service,
we also remove the defaulting mechanism in the webhook.

Signed-off-by: Brendan Shephard <[email protected]>
Copy link
Contributor

openshift-ci bot commented Jan 9, 2025

@bshephar: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/precommit-check b6c2b10 link true /test precommit-check
ci/prow/openstack-operator-build-deploy-kuttl b6c2b10 link true /test openstack-operator-build-deploy-kuttl

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants