Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dracut: remove dependency on plymouth #3337

Merged
merged 3 commits into from
Jun 26, 2023
Merged

dracut: remove dependency on plymouth #3337

merged 3 commits into from
Jun 26, 2023

Conversation

mberndt123
Copy link
Contributor

I'm adding stratis root fs support on the NixOS distribution, and they asked me to make it work without plymouth, so I've replaced that with systemd-ask-password.

@jbaublitz
Copy link
Member

Hey @mberndt123! I remember originally trying this when I was developing this work, but couldn't quite remember what went wrong and why I opted for plymouth. I've set up an environment to test this and on Fedora, the problem I'm seeing is that systemd-ask-passphrase returns an error (errno 6, No such device or address) three times and then the boot fails to start. Because this change would break Fedora, I'd say we're definitely open to accepting this change if you can find a way to make it work on Fedora too.

@mberndt123
Copy link
Contributor Author

mberndt123 commented May 9, 2023

Hey @jbaublitz,

I'm using Fedora myself and that exact script works fine on my machine. Maybe some other dracut module on my system is pulling in functionality that is required for systemd-ask-password to work, and that module is not present on your system.
Dracut comes with a 01systemd-ask-password module, so I've added a dependency on that in module-setup.sh (and also removed some plymouth leftovers that shouldn't be required any longer). Can you try it again and see if it works now?

@jbaublitz
Copy link
Member

That doesn't resolve the issue for me. Just so you're aware, systemd-ask-passphrase defaults to using plymouth for passphrase prompts. Are you doing something different on your system?

@mberndt123
Copy link
Contributor Author

mberndt123 commented May 9, 2023

I can't think of anything I'm doing differently. I've generated initramfs images both with and without the plymouth module and either works fine. Without plymouth it just shows a password prompt in the console, with plymouth it shows the typical plymouth password prompt.

Do you have any suggestions how I might be able to reproduce this?

@mberndt123
Copy link
Contributor Author

I've tried to disable as many dracut modules as reasonably possible…

$ lsinitrd initrd 
Image: initrd: 23M
========================================================================
Version: dracut-059-3.fc38

Arguments:  --omit 'plymouth crypt dbus-broker dbus ifcfg network network-manager nss-softokn i18n kernel-modules-extra rootfs-block terminfo usrmount fs-lib memstrack' -v

dracut modules:
systemd
systemd-ask-password
systemd-initrd
dm
kernel-modules
stratis
udev-rules
dracut-systemd
base
shutdown
========================================================================

… but the generated initramfs can still boot my system just fine.

Can you provide a virtual machine to reproduce this with?

@jbaublitz
Copy link
Member

@mberndt123 Please see https://stratis-storage.github.io/stratify/ for an overview of how I'm setting up the VMs that I'm using. After that, I check out your git repo, run make build-all, and then sudo make install. Can you try that process and see if it's reproducible for you?

@mberndt123
Copy link
Contributor Author

Thanks @jbaublitz. It's a bit late here to try that now (00:27), but I hope I'll get around to it tomorrow.

@mberndt123
Copy link
Contributor Author

Did you use Fedora 37 or 38 for that? I'm on Fedora 38.

@jbaublitz
Copy link
Member

I'm using Fedora 38 too.

@jbaublitz
Copy link
Member

Hi @mberndt123! I've done a little bit more debugging and this may just be triggering a bug in our code. Part of the reason I believe this is the case is that when I pull your commits in before a commit that reworked the crypt module rather heavily, booting is successful. Futhermore, after this commit, I'm seeing some really bizarre behavior in the recovery console. Let me see if I can track down exactly what's going on before you invest any more time in this.

@jbaublitz
Copy link
Member

I've confirmed that the rpassword update is what breaks systemd-ask-passphrase! I will get more information on what we can do once I talk with @mulkieran.

@mberndt123
Copy link
Contributor Author

mberndt123 commented May 13, 2023

Wow, good news! I was testing this change by simply copying those shell scripts to /usr/lib/dracut/modules.d/90stratis and regenerating the initramfs, whereas you did things properly and installed the complete branch, including the broken rpassword. So it makes sense we got different results.

Maybe a solution is to have systemd-ask-password store the password in the kernel keyring and have Stratis read it from there, then you don't need rpassword at all.

This was referenced May 17, 2023
@jbaublitz
Copy link
Member

I've been working on #3347 and once that's merged, I'll probably ask for some PR revisions around error handling in your PR, but other than that, I think I've got this working!

@jbaublitz jbaublitz self-requested a review May 17, 2023 14:53
@jbaublitz
Copy link
Member

Okay, I've tested #3347 and fixed some bugs. I'm going to leave a review now.

dracut/90stratis/stratis-rootfs-setup Outdated Show resolved Hide resolved
@jbaublitz
Copy link
Member

@mberndt123 Okay, so we have tracked down all of the problems we've seen with this code and one of them appears to be stratis-storage/devicemapper-rs#856. For some reason, your code seems to trigger this bug in our code, so we'll likely want to fix this first, but as stated above, I'll approve this PR after you drop the most recent commit, and then we'll probably wait on fixing that problem with our semaphore handling.

@jbaublitz jbaublitz requested a review from mulkieran May 18, 2023 13:56
@mberndt123
Copy link
Contributor Author

Wow, that's crazy that a change like this can uncover that sort of bug!

@mulkieran
Copy link
Member

@mberndt123 I'm afraid the CI doesn't like your fomatting. Plz run make fmt-shell in the top-level directory and fix up that which needs to be fixed up. Thanks for the PR!

@mberndt123
Copy link
Contributor Author

Thanks @mulkieran, formatting should be OK now.

@mulkieran
Copy link
Member

@jbaublitz Have we merged all the pre-requisiites and is it fine to merge this now?

@jbaublitz
Copy link
Member

@mulkieran I think we just need to merge one last PR from @bmr-cymru to handle the faulty state that udevd gets into only in Fedora 38. Udevd doesn't get shut down according to systemd in emergency mode, but no longer has a component active that allows udev sync to complete. This is what's causing the hanging. Once we add that PR, devicemapper will no longer attempt to synchronize if that component is detected to be down, avoiding the hang in Fedora 38.

@mberndt123
Copy link
Contributor Author

Actually while I was reading this script, I thought about whether polling is really the optimal solution here. Instead of calling stratis-min pool is-stopped "$STRATIS_ROOTFS_UUID" repeatedly until the pool shows up, we could have a new command like stratis-min pool wait "$STRATIS_ROOTFS_UUID" that will block until that pool shows up or a timeout expires. I could implement that if you think it's a good idea, obviously in a PR separate from this one.

Which brings me to another question: where can I get in touch with the Stratis team to discuss this kind of proposal? Because on stratis-storage.github.io I can only see links to GitHub and Twitter. Should I just raise an issue on GitHub to discuss that?

@mulkieran
Copy link
Member

Actually while I was reading this script, I thought about whether polling is really the optimal solution here. Instead of calling stratis-min pool is-stopped "$STRATIS_ROOTFS_UUID" repeatedly until the pool shows up, we could have a new command like stratis-min pool wait "$STRATIS_ROOTFS_UUID" that will block until that pool shows up or a timeout expires. I could implement that if you think it's a good idea, obviously in a PR separate from this one.

Which brings me to another question: where can I get in touch with the Stratis team to discuss this kind of proposal? Because on stratis-storage.github.io I can only see links to GitHub and Twitter. Should I just raise an issue on GitHub to discuss that?

Raising an issue on GitHub is really the best for us.

@mberndt123
Copy link
Contributor Author

Any updates on this?

@jbaublitz
Copy link
Member

Once stratis-storage/devicemapper-rs#859 gets merged and propagated to stratisd, we can merge this. @mulkieran Any reason not to merge the devicemapper PR now?

@jbaublitz
Copy link
Member

@mberndt123 We're just waiting on #3367.

@mulkieran
Copy link
Member

@mberndt123 Plz rebase at your convenience and we should be able to merge. Thanks!

@mberndt123
Copy link
Contributor Author

@mulkieran ok I've rebased it

@mulkieran mulkieran merged commit 43b0f42 into stratis-storage:master Jun 26, 2023
@mulkieran
Copy link
Member

thanks!

@mberndt123
Copy link
Contributor Author

mberndt123 commented Jun 26, 2023

@mulkieran @jbaublitz
thank you!

@mberndt123 mberndt123 deleted the mberndt123/initrd-without-plymouth branch June 27, 2023 16:44
mulkieran added a commit to mulkieran/stratisd that referenced this pull request Jul 31, 2023
It is no longer used to collect a password anymore.

See: stratis-storage#3337.

Signed-off-by: mulhern <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants