-
Notifications
You must be signed in to change notification settings - Fork 307
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sysroot auto cleanup #2511
base: main
Are you sure you want to change the base?
Sysroot auto cleanup #2511
Conversation
This is a wrapper around `ostree_sysroot_cleanup` that only performs the cleanup when the `.cleanup` file exists. When the cleanup has completed, the file is deleted. The purpose of this API is to allow other routines to indicate that cleanup is needed when they're not in a good position to initiate the cleanup themselves. This API can then be run at a more opportune time without actually initiating the expensive cleanup operations unless they were needed.
This provides a convenient method to run `ostree_sysroot_auto_cleanup` rather than `ostree_sysroot_cleanup`.
Currently finalizing a staged deployment skips the pruning that typically happens when writing out a deployment since it would block the shutdown path. In order to leave an indication that a cleanup should be run at a more opportune time, touch the `.cleanup` file after the staged deployment is written out. The `ostree_sysroot_auto_cleanup` API can then be used to perform the pruning at a better time without fear that it will initiate an unnecessary prune.
This unit makes use of the `--auto` option of `ostree admin cleanup` to cleanup the sysroot when the `/sysroot/.cleanup` file is left behind when finalizing a staged deployment. The purpose of this service is to restore pruning of deleted deployments when a staged deployment is written out during shutdown of the previous boot. The unit is optional as the cleanup can be run at another time or not at all. Cleaning the sysroot will prune the repo, and this can be a slow and IO intensive operation. To keep the system from blocking, the default `Before` dependency on `multi-user.target` has been removed. Since the service will then run in the background, the IO scheduling class has been lowered to `idle` to keep the system's primary tasks more responsive.
ProtectSystem=strict | ||
ReadWritePaths=/sysroot /boot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, I like this. But don't we also have to touch /var
and/or /etc
for the .updated
files? IIRC I hit that last time trying to restrict things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, that happens when finalizing a deployment, not when cleaning up. See the comments in ostree-finalize-staged.service
. I tested this in a downstream way and it worked fine.
Over in endlessm/eos-updater#298 @wjt pointed out that having /sysroot
writable means essentially the whole system is writable. One thing that could be nicer would be to limit this whole exercise to just pruning since finalizing also does ostree_sysroot_prepare_cleanup
. Then you could limit the writable paths to just /sysroot/repo
.
@@ -20,12 +20,17 @@ | |||
- uncomment the include in Makefile-libostree.am | |||
*/ | |||
|
|||
LIBOSTREE_2022.2 { | |||
global: | |||
ostree_sysroot_auto_cleanup; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New public API feels overkill for this if the expectation is that it will be invoked by a separate systemd unit. Related below:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this was with an eye toward having the systemd unit disabled and letting the management daemon handle it itself. But I did do this in endlessm/eos-updater#298 without needing the API. I do think it's cleaner to have the semantics contained within libostree, though.
GError **error) | ||
{ | ||
struct stat stbuf; | ||
if (!glnx_fstatat_allow_noent (self->sysroot_fd, _OSTREE_SYSROOT_AUTOCLEANUP, &stbuf, AT_SYMLINK_NOFOLLOW, error)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the systemd unit is already doing this, do we need a new API at all versus simply ExecStart=ostree admin cleanup
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe not. I think you could do this all in the CLI or via systemd unit settings. It felt cleaner to encapsulate it in libostree, but it's not necessary. I could go either way.
@dbnicholson: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
[Unit] | ||
Description=OSTree Automatic Sysroot Cleanup | ||
Documentation=man:ostree-admin-cleanup(1) | ||
ConditionPathExists=/sysroot/.cleanup |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, we do something similar, but with the condition inverted:
# This is deleted by deploy.sh:
ConditionPathExists=!/sysroot/.ostree-cleaned
This made migration easier. If you're upgrading ostree with the deploy then the old version of ostree won't know to create this file, so when booting into the new deployment cleaning won't run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a nice property. However, doesn't that mean you're running a prune on every boot, even if you haven't made a new deployment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we delete .ostree-cleaned only just before ostree admin deploy.
@dbnicholson: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
This is my proof of concept to initiate cleanup on the next boot of a staged deployment. In #2510 we discussed whether it was a good idea to add another systemd unit when the cleanup may be handled better by a management daemon like
rpm-ostree
.I think this is a nice general purpose way to address the issue, but I definitely concede it's not the only way to handle it. I think even without the systemd unit the rest of the commits are useful. It would be possible to stash some state indicating cleanup from outside ostree, but really ostree is by far best positioned since it handles writing out the staged deployment via
ostree admin finalize-staged
.Fixes: #2510