Bad initrd generation for non-default snapshot when using systemd-boot and dracut modules: mdraid, dracut-sshd #136
Comments
I found some time to look at the sources of this project, related Still, I'd greatly appreciate some pointers where to look. |
Hi @prawilny, sorry for the delay. These dates are always a bit more complicated. You are right, the one that generates the Something that we can check is to disable the service and call Do you know if sshd and mdraid are usually included, or do you have a specific configuration in dracut.d to add them? |
@aplanas, thank you for your response. Of course I know these days are free for many - that is the very reason why I found some time to tinker. To answer your questions:
I have some questions of my own:
Also, I'll try to reproduce the problem on the next kernel upgrade (it should trigger initrd regeneration, right?) since I'm a bit afraid of rolling back to a snapshot older than today, assuming that I might've forgotten something I did in the period in between. |
None that I can think of. But there is a difference of
Yes, this will work too. But to better control the situation, I think it is better to disable the service and do the update manually. After the update we will be in a situation where we need to reboot to activate the new snapshot. This is the same situation as when the service is running and a new initrd is created from the old snapshot. We can try to simulate this too, by booting from an old snapshot while keeping the default one as such, and trying to create the initrd for the new one from the old one. |
What you said makes a lot of sense, so I went ahead and tried to reproduce using the new
So I think the solution to the mystery of the different results is simple: I didn't realize I was running the commands from different snapshots. Still, a problem remains: why is a wrong initrd image generated in the first place? Can you give me some pointers on how to debug it? I think the most important hint would be to point me to the place in the code that generates the initrd for the new snapshot (I still feel that the In the short term, do you have any idea for a workaround? I think a way to regenerate the initrd for a new snapshot from the old one would suffice for now. PS I also took a look at the initrds from previous snapshots and it looks like some of them do have sshd and mdraid and some don't; it looks random to me. |
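Since the confusion above came from running commands in different snapshots, it may help to print the snapshot number explicitly before comparing results. The following is only a sketch: `snapshot_number` is a made-up helper, and it assumes the usual snapper layout in which a snapshot lives under a path of the form `.snapshots/<N>/snapshot`.

```shell
# Hypothetical helper: extract the snapper snapshot number from a string
# containing a path of the form ".snapshots/<N>/snapshot".
# Prints nothing if no such path is present.
snapshot_number() {
    printf '%s\n' "$1" | sed -n 's|.*\.snapshots/\([0-9][0-9]*\)/snapshot.*|\1|p'
}
```

One could feed it the output of `btrfs subvolume get-default /` (the default snapshot) and of `findmnt -no FSROOT /` (the snapshot currently mounted as root). The exact output format of both commands is assumed here, so it is worth checking on the actual system first.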
@prawilny I had more time to dig into this issue:
Can you confirm that sshd is missing in other initrds? As commented in the attached logs the module is present in both. I am trying to reproduce this issue with dracut-sshd but so far I have not been able to. I understand that you have a RAID configuration? |
@aplanas, responding to you one point at a time:
I ran

    # part of /lib/dracut/modules.d/90mdraid/module-setup.sh
    check() {
        local dev holder
        # No mdadm? No mdraid support.
        require_binaries mdadm expr || return 1

        [[ $hostonly ]] || [[ $mount_needs ]] && {
            for dev in "${!host_fs_types[@]}"; do
                [[ ${host_fs_types[$dev]} != *_raid_member ]] && continue
                DEVPATH=$(get_devpath_block "$dev")
                for holder in "$DEVPATH"/holders/*; do
                    [[ -e $holder ]] || continue
                    [[ -e "$holder/md" ]] && return 0
                    break
                done
            done
            return 255
        }

        return 0
    }

So it seems that it's broken by chrooting to the snapshot. As to
Also, only when writing this message did I check the version and it turns out that the version in tumbleweed of the package is oldish (0.6.1, about 5 years old in comparison to 6 months old latest 0.6.7) and doesn't support putting keys in The log output makes sense since the allowed locations for the keys are as per the documentation:
and since I configured the plugin before the whole debugging started, I ended up putting them under I'm going to live with I also read some dracut code and played a bit with preparing chroot the way it is done by Once again, thank you for your help, @aplanas. |
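The inner loop of the quoted check() only inspects sysfs paths, so one way to see what dracut would conclude is to run the same test by hand, once on the running system and once inside the chroot. The helper below is a sketch under assumptions: `has_md_holder` is a made-up name (it is not part of dracut), and the caller is assumed to pass a block device's sysfs directory, for example `/sys/class/block/sda2` (the device name is only an illustration).

```shell
# Hypothetical helper (not part of dracut): succeed if the first existing
# entry under <devpath>/holders has an "md" subdirectory, like the inner
# loop of the check() function quoted above.
has_md_holder() {
    for holder in "$1"/holders/*; do
        [ -e "$holder" ] || continue
        [ -e "$holder/md" ] && return 0
        break
    done
    return 1
}
```

If the helper succeeds on the host but fails inside the chroot for the same device, that would be consistent with the observation that the mdraid check is affected by chrooting into the snapshot.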
Update: I realized that when playing with If you want me to try some fixes or answer some questions, just ask. Out of curiosity, do you have any plans to prevent others in the future from stepping into this trap of different behavior for default and non-default snapshots? |
Oh ... seems to me that you may have a good clue here. sdbootutil is not doing the bind mount of /boot (only root and etc for the correct overlay). I can create a package for you with a version of sdbootutil that does this mount to see if it addresses the problem. I will post the address here in case you want to test it. |
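To make the idea of the missing bind mount concrete, here is a dry-run sketch that only prints the `mount --bind` commands one might issue before running dracut in a chroot. Everything in it is an assumption for illustration: `print_bind_mounts` is a made-up name, and the directory list passed to it is not what sdbootutil or transactional-update actually mount.

```shell
# Hypothetical sketch: print (not run) the bind mounts that would make
# extra host directories visible inside a chroot rooted at "$1".
# The remaining arguments are absolute host directories to expose.
print_bind_mounts() {
    root=$1
    shift
    for dir in "$@"; do
        printf 'mount --bind %s %s\n' "$dir" "$root$dir"
    done
}
```

For example, `print_bind_mounts /mnt/snap /boot /boot/efi` prints one command per directory, which can be reviewed before anything is executed as root.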
I believe that But for |
If you want to help me, I'll ask you for your work only if my workaround with adding If you want me to test the change you're going to push upstream, I'll gladly help and test it. To be precise about mounts, in my case sdbootutil mounts all |
Yes, [1] https://github.com/openSUSE/sdbootutil/blob/main/sdbootutil#L608 |
I'm also not sure what the correct behavior is. Both for transactional-update and sdbootutil. Just wanted to document the use case. Could you please link me the sdbootutil documentation you mentioned? Or do I just need to read the scripts? |
https://github.com/openSUSE/sdbootutil/blob/main/sdbootutil#L113-L115 |
Ah, I thought you said that what's mounted is documented somewhere. Also, I'll start work soon, so I'll respond only in the evening if there's something to reply to. |
Fix openSUSE/transactional-update#136 Signed-off-by: Alberto Planas <[email protected]>
@prawilny thanks for your patience. I changed how the chroot is created in sdbootutil in openSUSE/sdbootutil#183, and I packaged it in this repo: https://download.opensuse.org/repositories/home:/aplanas:/branches:/devel:/microos:/images/openSUSE_Tumbleweed/ Do you want to test it? The change will allow dracut to access directories like /root and /boot/efi, but I did not test the mdraid case. |
Hello,

I seem to have encountered a peculiar problem: a system upgrade caused by the timer-triggered transactional-update.service generated an initrd that was missing some dracut modules (in particular: sshd and mdraid).

After the problematic update, I booted the system, entered the password using the console rather than remotely, found the currently used initrd, and dumped the output of lsinitrd when specifying it as an argument.

The (bad) result: lsinitrd.bad.txt

Then I just ran transactional-update initrd and it generated an initrd that contained the missing modules:

The (good) result: lsinitrd.good.txt

Note that both initrds seem to have been generated with the same dracut command (Arguments: --quiet --reproducible --force --tmpdir '/var/tmp' in the log files).

After a reboot, the module was present and I managed to successfully use SSH to decrypt the drive.
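The comparison of the good and bad initrd described above can be reduced to a diff of two module lists. The helper below is a minimal sketch, assuming each list has already been saved to a file with one module name per line; `missing_modules` is a made-up name.

```shell
# Hypothetical helper: print the lines that appear in the first list
# but not in the second. Both arguments are files with one module name
# per line; they are sorted into temporary files before comparing.
missing_modules() {
    good_sorted=$(mktemp) || return 1
    bad_sorted=$(mktemp) || return 1
    sort -u "$1" > "$good_sorted"
    sort -u "$2" > "$bad_sorted"
    comm -23 "$good_sorted" "$bad_sorted"
    rm -f "$good_sorted" "$bad_sorted"
}
```

The lists themselves could probably be produced with something like `lsinitrd -m <image>`, which, as far as I know, lists the dracut modules of an image. Its output may contain header lines that would need to be stripped first, so treat that part as an assumption.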

My setup:
- systemd-boot as the bootloader
- /: BTRFS RAID1 setup on two LUKS-encrypted partitions of two SSDs
- /boot: ext4 mdadm-RAID1 setup on the same (but unencrypted) drives
- /boot/efi: single drive vfat on one of the drives
- systemd-networkd networking as described in the project's README

Is the whole issue caused by some misconfiguration I did? How can I check it?

I already checked that when using transactional-update shell, bash sees /usr/lib/dracut/modules.d, which is where sshd is stored within the system.

Where should I look for the documentation that could help me puzzle it out? Mainly I'd appreciate pointing me to the component that is likely to be the one calling dracut and/or ideally some documentation/explanation what parts of the filesystem that caller should see.

Please point me to a better place for such a request for help if here isn't an appropriate one.

Of course, I can provide some more logs if they are needed.

edit: I messed up attaching logs, fixed it now.