Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[tlse] tls for NovaAPI, NovaMetadata and NovNoVNCProxy #646

Merged

Conversation

stuggi
Copy link
Contributor

@stuggi stuggi commented Jan 11, 2024

Public/Internal service cert secrets and the CA bundle secret can be passed to configure httpd virtual hosts for tls termination. The CA cert get direct mounted as the environment bundle to /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem .

The service certificates like config files and copied via kolla to /etc/pki/tls/certs/%s.crt|/etc/pki/tls/private/%s.key .

Job deployments for bootstrap/cron get the CA bundle added if configured.

Also indexes the named input resources for CA bundle, and tls secrets to be able to watch them for a change and reconcile.

Depends-On: openstack-k8s-operators/lib-common#428

Jira: OSPRH-3294

stuggi added a commit to stuggi/openstack-operator that referenced this pull request Jan 11, 2024
Creates TLS certs via cert-manager for NovaAPI, NovaMetadata and
NovaNoVNCProxy when spec.tls.endpoint.internal.enabled: true

Depends-On: openstack-k8s-operators/lib-common#428
Depends-On: openstack-k8s-operators#620
Depends-On: openstack-k8s-operators/nova-operator#646

Jira: OSPRH-3294
@stuggi stuggi requested review from olliewalsh and Deydra71 January 11, 2024 15:09
@stuggi stuggi force-pushed the tlse_secret_update branch from d3b8816 to 493bd4b Compare January 12, 2024 15:37
stuggi added a commit to stuggi/openstack-operator that referenced this pull request Jan 12, 2024
Creates TLS certs via cert-manager for NovaAPI, NovaMetadata and
NovaNoVNCProxy when spec.tls.endpoint.internal.enabled: true

Depends-On: openstack-k8s-operators/lib-common#428
Depends-On: openstack-k8s-operators#620
Depends-On: openstack-k8s-operators/nova-operator#646

Jira: OSPRH-3294
Copy link

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/e5ed9cb737784ea798f204e04f43220f

✔️ nova-operator-content-provider SUCCESS in 3h 05m 43s
✔️ nova-operator-kuttl SUCCESS in 37m 21s
nova-operator-tempest-multinode FAILURE in 2h 41m 39s

Copy link
Contributor

@gibizer gibizer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only looked at the TLS data flow between CRs but I have questions inline about how we pass this new input around.

Comment on lines 850 to 862
Watches(&source.Kind{Type: &corev1.Secret{}},
handler.EnqueueRequestsFromMapFunc(r.GetSecretMapperFor(&novav1.NovaAPIList{}, context.TODO()))).
Watches(
&source.Kind{Type: &corev1.Secret{}},
handler.EnqueueRequestsFromMapFunc(r.findObjectsForSrc),
builder.WithPredicates(predicate.ResourceVersionChangedPredicate{}),
).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stuggi what do you think, can we merge the logic of the two secret watch into a single solution? If so then would you do it? If you don't have time for it then just fill an github issue for it and we will get to it eventually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean for Spec.Secret, right ? if so I can do that in this PR if you are ok doing it here, or you prefer to do it in a follow up?

Copy link
Contributor

@gibizer gibizer Jan 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I would like to watch Spec.Secret the same way as the cert secrets. It is up to you if you do it here or in a follow up. Thanks for doing it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a commit to this PR to do the change.

api/v1beta1/novacell_types.go Show resolved Hide resolved
@@ -213,6 +219,7 @@ func NewNovaComputeSpec(
ServiceUser: novaCell.ServiceUser,
ServiceAccount: novaCell.ServiceAccount,
ComputeDriver: computeTemplate.ComputeDriver,
TLS: novaCell.TLS,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK so each ironic compute in the same cell will use the same set of certs therefore we only need the TLS field in the computeSpec not in the computeTemplate and here we initialize the computeSpec.TLS from cellSpec.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, only need the CA bundle

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. So if the ironic compute only gets the CA bundle it means the compute can validate the rabbit server's certificate but the compute will not provide a client cert. I guess mTLS is out of the scope.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes client certs is something we can/will add at a later point. not for this initial work

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that woudl be nice to use instead of user and password
use user and cert for login if it supports it.

then instead of having to implement password rotation for rabbit we could just to client cert rotation and achieve both

api/v1beta1/novaconductor_types.go Show resolved Hide resolved
api/v1beta1/novametadata_types.go Show resolved Hide resolved
api/v1beta1/novanovncproxy_types.go Show resolved Hide resolved
@@ -845,6 +845,7 @@ func (r *NovaReconciler) ensureCell(
ServiceUser: instance.Spec.ServiceUser,
KeystoneAuthURL: keystoneAuthURL,
ServiceAccount: instance.RbacResourceName(),
TLS: instance.Spec.APIServiceTemplate.TLS.Ca,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, so each cell will use the same cert that the NovaAPI uses. Then I'm wondering if it would be cleaner to introduce only a single NovaSpec.TLS paramater and distribute that to each cell and service Spec?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the CA bundle the bundle from api gets set on the deployments where no service is deployed. for api, medatadata and vncproxy which also runs a service they need the cert secret should be used for the deployment. I was thinking about to have a top level CA bundle parameter for nova and just the certs for the services in needs, but if there is a configuration where a service should have a different bundle from the top level one its beneficial to be able have this together in the component template.

Copy link
Contributor

@gibizer gibizer Jan 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry my fault. We only pass forward the CA to the cell and that is assumed to be the same for the whole deployment.

I was thinking about to have a top level CA bundle parameter for nova and just the certs for the services in needs, but if there is a configuration where a service should have a different bundle from the top level one its beneficial to be able have this together in the component template.

The current solution you propose assumes that the CA bundle for ironic compute in the cell and the conductor in the cell always the same as the nova-api bundle. I think it is reasonable assumption. But I would probably document the fact here in the code that this CA gets used by the ironic compute and the conductor but VNC and Metadata (if in the cell) uses a CA passed in via the respective service template.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have added a comment where the api ca is reused

// +kubebuilder:validation:Optional
// +operator-sdk:csv:customresourcedefinitions:type=spec
// TLS - Parameters related to the TLS
TLS tls.SimpleService `json:"tls,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what the diffent between tls.API and tls.SimpleService?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh looking at the generateed files
tls.API has a public and internal cert ahwere as tls.SimpleService only has one cert

presumable the internal cert since its not exposed externally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, api has the two endpoints while other services which don't use the SimpleService, just service cert/key + CA bundle

# secret holding the tls-ca-bundle.pem to be used as a deploymend env CA bundle
caBundleSecretName: combined-ca-bundle
# secret holding tls.crt and tls.key for the novncproxy k8s service
secretName: cert-nova-novncproxy-cell1-public-svc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok that kind of makes sesne.

we may want to automate generating the correct name based on a convention later but that looks pretty clean to me

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the name the service is being created is basically the k8s service name the nova-operator creates with a cert prefix and svc appendix as there is also a cert for the route which gets -route

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the convention we have is to have the crname as the prefix so this should be something like nova-novncproxy-cell1-public-svc-cert

# secret holding the tls-ca-bundle.pem to be used as a deploymend env CA bundle
caBundleSecretName: combined-ca-bundle
# secret holding tls.crt and tls.key for the metadata k8s service
secretName: cert-nova-metadata-internal-svc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack you also have an example of global metadata +1

stuggi added a commit to stuggi/openstack-operator that referenced this pull request Jan 17, 2024
Creates TLS certs via cert-manager for NovaAPI, NovaMetadata and
NovaNoVNCProxy when spec.tls.endpoint.internal.enabled: true

Depends-On: openstack-k8s-operators/lib-common#428
Depends-On: openstack-k8s-operators#620
Depends-On: openstack-k8s-operators/nova-operator#646

Jira: OSPRH-3294
@gibizer
Copy link
Contributor

gibizer commented Jan 19, 2024

I only looked at the TLS data flow between CRs but I have questions inline about how we pass this new input around.

My fault. I realize that there is not a single input type but two different types. One for API and one for SimpleService. I will read it again with this extra knowledge

Copy link
Contributor

@gibizer gibizer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will look at the test code separately. I have some questions inline but the implementation overall looks good.

pkg/nova/cellmapping.go Show resolved Hide resolved
templates/nova.conf Show resolved Hide resolved
Public/Internal service cert secrets and the CA bundle secret
can be passed to configure httpd virtual hosts for tls termination.
The CA cert get direct mounted as the environment bundle to
/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem .

The service certificates like config files and copied via kolla
to /etc/pki/tls/certs/%s.crt|/etc/pki/tls/private/%s.key .

Job deployments for bootstrap/cron get the CA bundle added if
configured.

Also indexes the named input resources for CA bundle, and tls secrets
to be able to watch them for a change and reconcile.

Depends-On: openstack-k8s-operators/lib-common#428

Jira: OSPRH-3294
Updates all controllers to use the FieldIndexer to reconcile
when changes happen to the secret defined in .Spec.Secret.
Copy link
Contributor

@gibizer gibizer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the envtest coverage I have one additional envtest test case request (reconfiguration).

Do you plan to add basic kuttl coverage as well?

test/functional/base_test.go Outdated Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the envtest coverage, I appreciate it. It is possible to cover the secret watching and pod reconfiguration functionality with a reasonable small test. You can see my placement example here gibizer/placement-operator@f8ae8ee .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a new commit to add those reconfig tests

@stuggi
Copy link
Contributor Author

stuggi commented Jan 23, 2024

Thanks for the envtest coverage I have one additional envtest test case request (reconfiguration).

Do you plan to add basic kuttl coverage as well?

yes, I planned to add kuttl tests, but did not manage to get to it before being out for traveling this week. Also wanted to reach out to you on how the kuttl tests are structured in the nova operator.

@gibizer
Copy link
Contributor

gibizer commented Jan 24, 2024

Thanks for the envtest coverage I have one additional envtest test case request (reconfiguration).
Do you plan to add basic kuttl coverage as well?

yes, I planned to add kuttl tests, but did not manage to get to it before being out for traveling this week. Also wanted to reach out to you on how the kuttl tests are structured in the nova operator.

I don't have hard opinion about kuttl structure in nova-operator. (maybe @jamepark4 @SeanMooney has). Just general guidance:

  • try not to repeat too much input data but instead create some common input yamls
  • try to only match for the relevant fields of the output to avoid brittle tests.

@SeanMooney
Copy link
Contributor

i would be fine with working on the kuttle test in a follow-up pr.
i have structured them a little differently then in other repos
intentionally so the can be run form the repo without install yamls in there own namespace so you can run them on the same crc/openshift as a real deployment.

the basic structure is as follows

i have provided 4 make targets

kuttl-test-prep kuttl-test-run and kuttl-test-cleanup do all the real work and
kuttl-test just does prep and run in one step.

we probably should have a kuttl-tests that loops over all the kuttl test suites and run
pre run and clean up for each suite.

that braing use to what a test suite is
today we have only one called default.

each test suite is an independent set of kuttl tests that run in there own namespace.

there are three ways to extend the kuttl test coverage.
first you can add new tests to an existing test directory
we currently only have one test dir which is the scale tests.

https://github.com/openstack-k8s-operators/nova-operator/tree/main/test/kuttl/test-suites/default/scale-tests

each test dir should be loosely testing a single theme i.e. deploying and scaling nova in this case.

you could jsutadd a new test file to the scale test with tls but that is kind of breaking that convention that each test dir in a suite does one thing.

the second approach is you could add a second tls dir in the default suite and then add your tests there. note that tests in the same suite share the same namespace as we only deploy that once per test suite. that means each test dir need to remove all resources it created during the test at the end to not break other tests. since it not simple to clean up the database in this way you have to be carful of test interactions if you have many tests in the same test suite.

my approach to tls testing would be to copy the default suite
https://github.com/openstack-k8s-operators/nova-operator/tree/main/test/kuttl/test-suites
and add a tls suite beside it in the same folder

the structure of a test suite is each suite has a set of folder and a config.yaml

the deps directory container a kustomize definition that uses the openstack control plane to
create all of the depencies for nova such as keyston/placment MySQL rabbit and the secrets need for the tests

https://github.com/openstack-k8s-operators/nova-operator/blob/main/test/kuttl/test-suites/default/deps/kustomization.yaml

in your deps kustomization file i would use the secrete generorat to pre-create static secrets with fake tls data.

the output dir should be empty bar a .keep file and the test suite should be configure using the config.yaml file to place the out put of kuttle there.

the config.yaml defines where kuttl will place its output and look for tests to execute

it also defines the namespace to use for the tests
you can basically use https://github.com/openstack-k8s-operators/nova-operator/blob/main/test/kuttl/test-suites/default/config.yaml and sed 's/default/tls/'

finally the last set of directors in a test suite are the test dirs containing the kuttl tests
there you can just write test as normal.

so it would look something like this

test/kuttl/test-suites/tls/deps/
test/kuttl/test-suites/tls/output/
test/kuttl/test-suites/tls/tls-tests/
test/kuttl/test-suites/tls/config.yaml

then you could run the test by doing

KUTTL_SUITE=tls make kuttl-test

that would invoke
KUTTL_SUITE=tls make kuttl-test-prep which is an
oc apply -k $(KUTTL_SUITE_DIR)/deps/ --timeout=120s
oc wait -n $(KUTTL_NAMESPACE) openstackcontrolplane openstack --for condition=Ready --timeout=300s

then

KUTTL_SUITE=tls make kuttl-test-run
which is

oc kuttl test --v 1 --start-kind=false --config $(KUTTL_SUITE_DIR)/config.yaml

@stuggi
Copy link
Contributor Author

stuggi commented Jan 24, 2024

@SeanMooney thanks for the details. I'd prefer to work on the kuttl tests as a follow up then.

Copy link

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/08e0e51306af4eb684eda99e3cbc546b

✔️ nova-operator-content-provider SUCCESS in 1h 45m 00s
nova-operator-kuttl RETRY_LIMIT in 21m 39s
nova-operator-tempest-multinode FAILURE in 1h 25m 43s

@stuggi
Copy link
Contributor Author

stuggi commented Jan 24, 2024

recheck

Copy link
Contributor

@gibizer gibizer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All my comments were addressed. The PR looks good to me. I'm OK to take the kuttl work as a separate PR. I assume it was tested together with openstack-k8s-operators/openstack-operator#626 and works.

@openshift-ci openshift-ci bot added the lgtm label Jan 25, 2024
Copy link
Contributor

openshift-ci bot commented Jan 25, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gibizer, stuggi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit 5de7de4 into openstack-k8s-operators:main Jan 25, 2024
7 checks passed
stuggi added a commit to stuggi/openstack-operator that referenced this pull request Jan 29, 2024
Creates TLS certs via cert-manager for NovaAPI, NovaMetadata and
NovaNoVNCProxy when spec.tls.endpoint.internal.enabled: true

Depends-On: openstack-k8s-operators/lib-common#428
Depends-On: openstack-k8s-operators#620
Depends-On: openstack-k8s-operators/nova-operator#646

Jira: OSPRH-3294
stuggi added a commit to stuggi/openstack-operator that referenced this pull request Jan 30, 2024
Creates TLS certs via cert-manager for NovaAPI, NovaMetadata and
NovaNoVNCProxy when spec.tls.endpoint.internal.enabled: true

Depends-On: openstack-k8s-operators/lib-common#428
Depends-On: openstack-k8s-operators#620
Depends-On: openstack-k8s-operators/nova-operator#646

Jira: OSPRH-3294
stuggi added a commit to stuggi/openstack-operator that referenced this pull request Feb 13, 2024
Creates TLS certs via cert-manager for NovaAPI, NovaMetadata and
NovaNoVNCProxy when spec.tls.endpoint.internal.enabled: true

Depends-On: openstack-k8s-operators/lib-common#428
Depends-On: openstack-k8s-operators#620
Depends-On: openstack-k8s-operators/nova-operator#646

Jira: OSPRH-3294
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants