Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: auto restart deployment when secret files change #43

Merged
merged 1 commit into from
Jul 15, 2024

Conversation

uvegla
Copy link
Contributor

@uvegla uvegla commented Jul 10, 2024

What type of PR is this?

feature

What does this PR do / Why do we need it:

This PR ensure when the secretFiles changes and mountSecret is true, zot deployment will be automatically restarted.

This will eliminate the manual zot restart step after secrets change.

This is done per helm's official best practice guide.

This is very similar to what #36 achieved just for secrets.

Additionally I fixed the comment for the .configFiles Helm value as the reload was fixed in the above PR.

Testing done on this change:

Unit test added.

Automation added to e2e:

N/A

Will this break upgrades or downgrades?

No.

Does this PR introduce any user-facing change?:

Changing secret files will now automatically restart Zot's deployment.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@uvegla
Copy link
Contributor Author

uvegla commented Jul 11, 2024

Just pushed a small commit to correct the Helm value comment for .configFiles as it actually got implemented in #36 to reload the pod on change.

@andaaron
Copy link
Contributor

Hi @uvegla, you also need to increment the chart version

@uvegla uvegla force-pushed the secret-checksum branch from c0506a7 to e989a5b Compare July 11, 2024 11:43
@uvegla
Copy link
Contributor Author

uvegla commented Jul 11, 2024

Fixed commit message and added sign off.

Copy link
Contributor

@andaaron andaaron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rchincha
Copy link
Contributor

Is this restart async, meaning it can interrupt a zot possibly in the middle of things (we do protect against this quite a bit though)

@nvtkaszpir
Copy link

nvtkaszpir commented Jul 12, 2024

AFAIR zot auto reloads config on the change using ionotify, and the helm chart does not use subpaths so you do not need to explicitly roll out restart on change, when secret changes then in about 1 minute k8s will update mounted secret.

https://github.com/project-zot/zot/blob/e5eacaa082c9fa9f85fb4548841e8dc9f1918590/pkg/cli/server/config_reloader.go#L18

@uvegla
Copy link
Contributor Author

uvegla commented Jul 12, 2024

@nvtkaszpir I gave that a try, but I think it is not working, at least not for secrets.

Here is what I did to test that. Relevant parts of my configuration:

mountSecret: true
secretFiles:
  privateRegistries: |-
    {
       "my-private-registry.example.org": {
         "username": "test",
         "password": "my-token"
      }      
    }
configFiles:
 config.json: |-
   # ...
     "log":
     {
        "level": "debug"
     },
     "extensions": {
       "sync": {
         "enable": true,
         "credentialsFile": "/secret/privateRegistries",
         "registries": [
           {
             "urls": [
               "https://my-private-registry.example.org"
             ],
             "onDemand": true,
             "tlsVerify": true,
             "maxRetries": 3,
             "retryDelay": "5m"
           }
         ]
       },
     }
   #...      

In my test I also exposed it via an authenticated ingress - config for that I left our for simplicity -, lets say my-cache.example.org with read user as test and password as password with basic HTTP auth on Zot.

The my-org/my-image is an image in my-private-registry.example.org with let's say tags: v1 and v2.

Then this works:

docker login -u test -p password my-cache.example.org
docker pull my-cache.example.org/my-org/my-image:v1

Then I invalidate the token for my-private-registry.example.org, generate a new one and update the Helm release with the new value under .secretFiles.privateRegistries JSON. The deployment is fine, the secret is updated, the pod is not rolled.

Even after minutes, this fails:

docker pull my-cache.example.org/my-org/my-image:v2

Even tho runing with debug log level, I dont see this message in the logs: https://github.com/project-zot/zot/blob/v2.0.1/pkg/cli/server/config_reloader.go#L75 and I think the secret file is not reloaded, cos the pull fails and the error log lines about failed authentication are present in the logs.

If I kill the pod, the new one will have it working just fine and the above pull works, so the updated credentials are used towards my-private-registry.example.org.

It is just a hunch, but I experienced previously in Kubernetes with other apps, that file system watchers are not working correctly with volume mounts. Not super sure it is related or similar (tho fsnotify is the watcher in Zot as well), but it reminded me of this issue: fsnotify/fsnotify#9. (Specifically: fsnotify/fsnotify#9 (comment)).

Also related, but they lead back to that above linked issue:

@nvtkaszpir
Copy link

OK so, this change is indeed good if some volume mounts do not support updates (especially subpaths are not supporting it).

One more thing - zot generates a log message if the config is updated, worth to check in the logs if that happened.

@uvegla
Copy link
Contributor Author

uvegla commented Jul 12, 2024

@nvtkaszpir Do you mean this: https://github.com/project-zot/zot/blob/v2.0.1/pkg/api/controller.go#L378-L379? I redid the above scenario, but I don't see that or anything resembling that the config was reloaded. So I assume it is not that it is reloaded just does not take effect for further upstream registry calls.

However when I try to pull before the pod is manually restarted, logs state that failed to sync image and invalid username/password: unauthorized.

Examples:

{
  "level": "error",
  "error": "unable to retrieve auth token: invalid username/password: unauthorized: authentication required, visit https://aka.ms/acr/authorization for more information.",
  "repository": "my-org/my-image",
  "reference": "v2",
  "goroutine": 284,
  "caller": "zotregistry.dev/zot/pkg/api/routes.go:1917",
  "time": "2024-07-12T14:58:21.292619015Z",
  "message":"failed to sync image"
}

@nvtkaszpir
Copy link

@uvegla I must say I did not try zot in container (yet), but directly running binary on the host with systemd picks up config changes as I described.

@nvtkaszpir
Copy link

also please sign your commit because it is required to accept the PR :)

@rchincha rchincha merged commit 076a029 into project-zot:main Jul 15, 2024
5 checks passed
@rchincha rchincha mentioned this pull request Jul 15, 2024
@uvegla uvegla deleted the secret-checksum branch July 16, 2024 07:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants