
feat: add zfs-scrub systemd units #146

Merged
4 commits merged into ublue-os:main on Nov 19, 2024

Conversation

@Eelviny (Contributor) commented Apr 15, 2024

Add zfs scrub systemd units to ucore to allow for easy configuration of periodic scrubbing of zpools.

I don't see an easy way to make these unit files available only on the -zfs images; however, the main [email protected] has a condition check in it to ensure it can't run if ZFS is not installed.
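
For illustration, a minimal sketch of the kind of in-unit condition described above, assuming the standard /usr/sbin/zpool path (the PR's actual directive may differ):

[Unit]
Description=zpool scrub on %i
Documentation=man:zpool-scrub(8)
# Skip quietly when the ZFS userland tools are not present on this image.
ConditionPathExists=/usr/sbin/zpool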

@bsherman (Collaborator)

Add zfs scrub systemd units to ucore to allow for easy configuration of periodic scrubbing of zpools.

This is great! Thank you! Somehow I completely missed your PR submissions until last night when I was too tired to respond.

I don't see an easy way to make these unit files available only on the -zfs images; however, the main [email protected] has a condition check in it to ensure it can't run if ZFS is not installed.

I think the unit files look great with excellent use of Conditions.

Regarding the best way to conditionally add the unit files... while in-unit Conditions are good, I agree it would be best to not include them unless on a ZFS image.

I think the "best" way would be to start building a ublue-os-ucore-zfs RPM like the ublue-os-ucore-nvidia RPM built in the ucore-kmods repo. That RPM would only be installed with the other zfs RPMs as you can see in install-ucore.sh.

Thoughts on this approach?
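
As a rough, hypothetical sketch of that gating in an install script (the variable name and RPM path below are illustrative, not taken from the real install-ucore.sh):

#!/usr/bin/env bash
set -euo pipefail

# Hypothetical: only pull in the ZFS unit-file RPM when building a -zfs image.
# IMAGE_FLAVOR and the RPM path are placeholders, not the real script's names.
if [[ "${IMAGE_FLAVOR:-}" == *zfs* ]]; then
    rpm-ostree install /tmp/rpms/ublue-os-ucore-zfs-*.rpm
fi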

@dylanmtaylor (Contributor)

I like the use of the monotonic timer units
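
For context, monotonic timers fire relative to boot or unit activation rather than wall-clock time. A hedged sketch of such a weekly timer template, not necessarily the exact unit in this PR:

[Unit]
Description=Weekly zpool scrub timer for %i
Documentation=man:zpool-scrub(8)

[Timer]
# Monotonic triggers: first run 15 minutes after boot, then once per week of uptime.
OnBootSec=15min
OnUnitActiveSec=1w
# Spread instances out so multiple pools don't all scrub at the same moment.
RandomizedDelaySec=1h

[Install]
WantedBy=timers.target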

@Eelviny (Contributor, Author) commented Apr 30, 2024

I think the "best" way would be to start building a ublue-os-ucore-zfs RPM like the ublue-os-ucore-nvidia RPM built in the ucore-kmods repo. That RPM would only be installed with the other zfs RPMs as you can see in install-ucore.sh.

Thoughts on this approach?

I think the approach is great! It's what I was looking for but couldn't find it, hence this solution. But I don't personally have enough time/expertise to build this. If the base is built, then I'm happy to put a PR onto it with these timers.

@bsherman (Collaborator)

If the base is built, then I'm happy to put a PR onto it with these timers.

I'm working on a base and I'll let you know when it's ready for you to PR the timers.

@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Nov 17, 2024
@bsherman (Collaborator)

@Eelviny I want to apologize for completely dropping the ball on getting a framework setup for packaging this PR in an RPM.

Things have shifted a bit at this point. I'd like to just get this added now, and if we can move to an RPM at some point, that'll be a nice-to-have.

All said, I've been doing some local testing by manually copying in the timer and service units to one of my systems.

I do think there's a problem.

I have a zpool called zfundata.

root@glamdring:~# zpool list zfundata
NAME       SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
zfundata  9.09T   732K  9.09T        -         -     0%     0%  1.00x    ONLINE  -

I enabled a weekly scrub timer for it:

root@glamdring:~# systemctl enable --now [email protected]
Created symlink '/etc/systemd/system/timers.target.wants/[email protected]' → '/etc/systemd/system/[email protected]'.

We can check the timer status to see which service unit it will trigger:

root@glamdring:~# systemctl status [email protected]
● [email protected] - Weekly zpool scrub timer for zfundata
     Loaded: loaded (/etc/systemd/system/[email protected]; enabled; preset: disabled)
     Active: active (waiting) since Sun 2024-11-17 15:50:17 CST; 1min 2s ago
 Invocation: b46dad94299f4a809d80ac732f1c8a02
    Trigger: Mon 2024-11-18 00:34:44 CST; 8h left
   Triggers: ● [email protected]
       Docs: man:zpool-scrub(8)

Nov 17 15:50:17 glamdring systemd[1]: Started [email protected] - Weekly zpool scrub timer for zfundata.

And I'll manually run the service as a test:

root@glamdring:~# systemctl start [email protected]
root@glamdring:~# systemctl status [email protected]
× [email protected] - zpool scrub on zfundata
     Loaded: loaded (/etc/systemd/system/[email protected]; static)
    Drop-In: /usr/lib/systemd/system/service.d
             └─10-timeout-abort.conf
     Active: failed (Result: exit-code) since Sun 2024-11-17 15:51:33 CST; 1s ago
   Duration: 22ms
 Invocation: 8b941f5ae35f44819bb20f0ad1740694
TriggeredBy: ● [email protected]
       Docs: man:zpool-scrub(8)
    Process: 3486799 ExecStart=/bin/sh -c  if @sbindir@/zpool status zfundata | grep -q "scrub in progress"; then exec @sbindir@/zpool wait>
   Main PID: 3486799 (code=exited, status=127)
   Mem peak: 1.2M
        CPU: 5ms

Nov 17 15:51:33 glamdring systemd[1]: Started [email protected] - zpool scrub on zfundata.
Nov 17 15:51:33 glamdring sh[3486800]: /bin/sh: line 1: @sbindir@/zpool: No such file or directory
Nov 17 15:51:33 glamdring sh[3486799]: /bin/sh: line 1: /@sbindir@/zpool: No such file or directory
Nov 17 15:51:33 glamdring systemd[1]: [email protected]: Main process exited, code=exited, status=127/n/a
Nov 17 15:51:33 glamdring systemd[1]: [email protected]: Failed with result 'exit-code'

So it looks like @sbindir@ isn't getting expanded properly. I would guess it's due to the use of that build-time placeholder within a wrapped shell script:

EnvironmentFile=-@initconfdir@/zfs
ExecStart=/bin/sh -c '\
if @sbindir@/zpool status %i | grep -q "scrub in progress"; then\
exec @sbindir@/zpool wait -t scrub %i;\
else exec @sbindir@/zpool scrub -w %i; fi'
ExecStop=-/bin/sh -c '@sbindir@/zpool scrub -p %i 2>/dev/null || true'

I'm curious whether you've moved forward using this on your own, and whether you have an improvement to add here.

Thanks!

@Eelviny (Contributor, Author) commented Nov 19, 2024

Ah, good spot. Yes, I had already found this and applied a fix on my own machine but forgot to bring it back into the PR; apologies for that.

I'm not god-tier experienced with systemd so I wasn't sure how to fix that exact issue, but I figured that since we're in full control of the system here and where the binaries are located, hardcoding the binary location won't be as much of an issue. Let me know your opinion on that. All I can say is, the above fix works on my machine.
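
For reference, a sketch of that hardcoded form, assuming the standard Fedora path /usr/sbin/zpool (the merged unit may differ slightly):

# Hardcoded paths replace the unexpanded @sbindir@ build-time placeholder.
ExecStart=/bin/sh -c '\
if /usr/sbin/zpool status %i | grep -q "scrub in progress"; then\
exec /usr/sbin/zpool wait -t scrub %i;\
else exec /usr/sbin/zpool scrub -w %i; fi'
ExecStop=-/bin/sh -c '/usr/sbin/zpool scrub -p %i 2>/dev/null || true'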

@bsherman (Collaborator)

I figured that since we're in full control of the system here and where the binaries are located, hardcoding the binary location won't be as much of an issue.

I'm good with this. Thank you!

@bsherman (Collaborator) left a review comment

This looks good. Thank you.

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Nov 19, 2024
@bsherman bsherman requested review from bketelsen, p5 and m2Giles November 19, 2024 17:11
@bsherman bsherman enabled auto-merge November 19, 2024 17:11
@bsherman bsherman requested a review from johnmmcgee November 19, 2024 17:11
@bsherman bsherman added this pull request to the merge queue Nov 19, 2024
Merged via the queue into ublue-os:main with commit 95783da Nov 19, 2024
38 checks passed