Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NET-6394 Create/update/delete ServiceAccount on MeshGateway reconcile #3244

Merged
merged 14 commits into from
Nov 28, 2023

Conversation

nathancoleman
Copy link
Member

@nathancoleman nathancoleman commented Nov 21, 2023

Changes proposed in this PR:
When reconciling a MeshGateway resource in the v2 API, create/update/delete a corresponding ServiceAccount.
Notably, we need to ensure we aren't updating or deleting resources that our controller didn't create simply because they have the expected name.

How I've tested this PR:
Added unit test coverage

How I expect reviewers to test this PR:
See above

Checklist:

@nathancoleman nathancoleman added pr/no-changelog PR does not need a corresponding .changelog entry pr/no-backport signals that a PR will not contain a backport label labels Nov 21, 2023
@nathancoleman nathancoleman force-pushed the mesh-gateway-service-account branch from ebd8623 to d78494f Compare November 21, 2023 20:44
@nathancoleman nathancoleman requested a review from a team November 21, 2023 22:06
@nathancoleman nathancoleman marked this pull request as ready for review November 21, 2023 22:06
The MeshGateway is partition-scoped which means that it should never include a namespace in its resource ID
This enables any future merge operations than need to be done before writing the new object, such as setting a value that needs to carry forward from the existing object onto the new object.
Copy link
Contributor

@andrewstucki andrewstucki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the only thing I really see missing here is triggering reconciliation on owned object changes. For that I think you'll probably need to extend

func setupWithManager(mgr ctrl.Manager, resource client.Object, reconciler reconcile.Reconciler) error {

and add some ability to specify additional Owns references (maybe just make a closure for doing additional arbitrary resource watch configuration before calling Complete?), similar to how we do in our current impl:

For(&gwv1beta1.Gateway{}).
Owns(&appsv1.Deployment{}).
Owns(&corev1.Service{}).
Owns(&corev1.Pod{}).

otherwise any changes to the managed ServiceAccount won't re-trigger reconciliation and you'll never hit the ifExists branch and will get configuration drift.

return errors.New("onDelete not implemented")
builder := gateways.NewMeshGatewayBuilder(resource)

deleteOp := func(ctx context.Context, _, object client.Object) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking we can just let the garbage collector clean these up? By setting the owner reference field, if the owner is deleted, these will actually get cleaned up automatically. If we're cool with just letting the GC do its work, then the only thing we have to worry about on deletion is cleaning up any side-effects that we may have done in our controller upsert path.

Copy link
Member Author

@nathancoleman nathancoleman Nov 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My assumption was that there's a reason we do explicit cleanup in our v1 implementation here, but I don't know what that might have been. Any ideas?

Does the GC handle dependency ordering well? For example, the ServiceAccount winds up being referenced by the created Deployment, so we clean them up in reverse order in our v1 implementation.

// Delete removes the resources for handling routing of network traffic.
// This is done in the reverse order of Upsert due to dependencies between resources.
func (g *Gatekeeper) Delete(ctx context.Context, gatewayName types.NamespacedName) error {
g.Log.V(1).Info(fmt.Sprintf("Delete Gateway Deployment %s/%s", gatewayName.Namespace, gatewayName.Name))
if err := g.deleteDeployment(ctx, gatewayName); err != nil {
return err
}
if err := g.deleteService(ctx, gatewayName); err != nil {
return err
}
if err := g.deleteRoleBinding(ctx, gatewayName); err != nil {
return err
}
if err := g.deleteServiceAccount(ctx, gatewayName); err != nil {
return err
}
if err := g.deleteRole(ctx, gatewayName); err != nil {
return err
}
return nil
}

Copy link
Member Author

@nathancoleman nathancoleman Nov 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've dropped the explicit deletes in favor of allowing the k8s garbage collector to do its thing.
If we discover a reason we need to delete explicitly like we do in the gatekeeper package today, then we can add the explicit deletion back at a later date.

@nathancoleman
Copy link
Member Author

The enterprise control-plane tests are failing because our self-hosted macos runner doesn't have yarn installed. Working to resolve this outside of this PR.

@nathancoleman nathancoleman merged commit a1761c4 into main Nov 28, 2023
3 checks passed
@nathancoleman nathancoleman deleted the mesh-gateway-service-account branch November 28, 2023 19:58
jm96441n pushed a commit that referenced this pull request Nov 29, 2023
…#3244)

* Create/update/delete ServiceAccount on MeshGateway reconcile

* Implement owner ref checking for existing objects before create/update

* Use builder struct for creating k8s resources

* Extend upsert function to handle delete as well

* Add unit tests asserting ServiceAccount CRUD w/ ownership enforcement

* Update list of TODOs in reconcile handlers

* Uncomment call to MeshConfigController.ReconcileEntry

* Check error returned by SetControllerReference

* Add unit test coverage for MeshGatewayBuilder.serviceAccount()

* Omit namespace on MeshGateway's resource ID tenancy

The MeshGateway is partition-scoped which means that it should never include a namespace in its resource ID

* Provide existing object to operation

This enables any future merge operations than need to be done before writing the new object, such as setting a value that needs to carry forward from the existing object onto the new object.

* Declare ownership of ServiceAccount when setting up controller

* Fix typo

* Rely on garbage collector to delete resource instead of deleting explicitly
sarahalsmiller pushed a commit that referenced this pull request Jan 5, 2024
…#3244)

* Create/update/delete ServiceAccount on MeshGateway reconcile

* Implement owner ref checking for existing objects before create/update

* Use builder struct for creating k8s resources

* Extend upsert function to handle delete as well

* Add unit tests asserting ServiceAccount CRUD w/ ownership enforcement

* Update list of TODOs in reconcile handlers

* Uncomment call to MeshConfigController.ReconcileEntry

* Check error returned by SetControllerReference

* Add unit test coverage for MeshGatewayBuilder.serviceAccount()

* Omit namespace on MeshGateway's resource ID tenancy

The MeshGateway is partition-scoped which means that it should never include a namespace in its resource ID

* Provide existing object to operation

This enables any future merge operations than need to be done before writing the new object, such as setting a value that needs to carry forward from the existing object onto the new object.

* Declare ownership of ServiceAccount when setting up controller

* Fix typo

* Rely on garbage collector to delete resource instead of deleting explicitly
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr/no-backport signals that a PR will not contain a backport label pr/no-changelog PR does not need a corresponding .changelog entry theme/mesh-gw
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants