Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bootc: Keep /usr read-only after transient transactions ("re-locking") #2195

Open
wants to merge 3 commits into
base: bootc
Choose a base branch
from

Conversation

evan-goode
Copy link
Member

To keep /usr read-only after DNF is finished with a transient transaction, we can call ostree admin unlock --transient to mount the /usr overlay as read-only by default. Then, we create a private mount namespace for DNF and its child processes and remount the /usr overlayfs as read/write in the private mountns. See also this upstream OSTree test: https://github.com/ostreedev/ostree/blob/main/tests/kolainst/destructive/unlock-transient.sh.

Some ugly things/remaining points of uncertainty in this PR:

  • os.unshare is unfortunately only available in Python >= 3.12, so we have to call libc.unshare via Python ctypes here and hardcode the CLONE_NEWNS flag that we need to pass.
  • Using libostree adds a new runtime dependency on python3-gobject and ostree-libs. I added these to dnf.spec behind a %bcond_with ostree conditional.
  • Lazy import of python3-gobject (gi) and OSTree libraries in _BootcSystem.__init__
  • We still call ostree admin unlock --transient via subprocess instead of doing the unlock with libostree. This keeps some complexity and implementation details out of DNF. On the other hand, the OSTree CLI may be less stable than libostree, and we can't customize or translate its output.

Using libostree gives us more detail about the current state of the
deployment than only checking whether /usr is writable.
To keep /usr read-only after DNF is finished with a transient
transaction, we call `ostree admin unlock --transient` to mount the /usr
overlay as read-only by default. Then, we create a private mount
namespace for DNF and its child processes and remount the /usr overlayfs
as read/write in the private mountns.

os.unshare is unfortunately only available in Python >= 3.12, so we have
to call libc.unshare via Python ctypes here and hardcode the CLONE_NEWNS
flag that we need to pass.
Add a disabled-by-default build conditional for ostree dependencies
@pep8speaks
Copy link

Hello @evan-goode! Thanks for opening this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 648:29: E261 at least two spaces before inline comment
Line 732:121: E501 line too long (131 > 120 characters)

@evan-goode evan-goode requested a review from dcantrell January 16, 2025 20:22
Copy link
Contributor

@dcantrell dcantrell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. My only real issue is with the libc_so constant, but I'm not going to die on that hill. I'm just in the mood for some good yak shaving.

@@ -31,6 +31,7 @@ INSTALLONLYPKGS=['kernel', 'kernel-PAE',
'installonlypkg(kernel-module)',
'installonlypkg(vm)',
'multiversion(kernel)']
LIBC_SONAME = "libc.so.6"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only thing constant is change. Though we have had the DT_SONAME libc.so.6 for 25+ years, I bet it will change in the future. Probably for a dumb reason too. At least musl uses libc.so.1 right now and someone out there is likely to try and make this work on musl.

We have a Python module called pyelftools, available in Fedora as python3-pyelftools. You could do something like this to get the libc SONAME dynamically:

import sys
from elftools.elf.elffile import ELFFile

....... (various other code)

libc = None
f = open(sys.executable, 'rb')
ef = ELFFile(f)

dyn = ef.get_section_by_name('.dynamic')

for tag in dyn.iter_tags():
    if tag.entry.d_tag != 'DT_NEEDED':
        continue

    libc = tag.needed

f.close()
print("libc SONAME is [probably]: %s" % libc)

Read the DT_NEEDED entries on the executable running the code, which would be the Python executable. This is likely over-engineering, but constants like this always come back up later to present a problem.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, maybe we can use ctypes.CDLL(None):

https://bugs.python.org/issue34592

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well there you go looking for the easy and shorter non-yak shaving way to do it!


def _bootc_unlock():
"""Set up a writeable overlay on bootc systems."""
CLONE_NEWNS = 0x00020000 # defined in linux/include/uapi/linux/sched.h
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sigh, but yeah.

@cgwalters
Copy link
Contributor

On the other hand, the OSTree CLI may be less stable than libostree

The ostree CLI is stable, we will never break it.

and we can't customize or translate its output.

I'd just redirect its output to /dev/null and keep the text here instead.

@cgwalters
Copy link
Contributor

Thanks for working on this! We need to expose this feature in bootc too actually.

That said see also ostreedev/ostree#3177 - it's temping to enable that by default in the base image.

@cgwalters
Copy link
Contributor

Unless we get to the point of deeper integration, I'd vote for just forking the CLI over linking to libostree.

An advantage of forking is that we get "loose coupling" - I think we probably shouldn't pull in libostree into every system using dnf in rhel9 and 10 today.

@evan-goode
Copy link
Member Author

Thanks for taking a look!

The ostree CLI is stable, we will never break it.
I'd just redirect its output to /dev/null and keep the text here instead.

Roger.

Unless we get to the point of deeper integration, I'd vote for just forking the CLI over linking to libostree.

For reading the unlocked state, I suppose we could grep ostree admin status for Unlocked: (unlocked state nick) instead of using libostree.

An advantage of forking is that we get "loose coupling" - I think we probably shouldn't pull in libostree into every system using dnf in rhel9 and 10 today.

Since this is Python, we can achieve sort of the same loose coupling by lazily importing the libostree GObject API like this. The spec file changes here/splitting the DNF builds isn't strictly necessary since we can check at runtime whether libostree is available. But thinking ahead to DNF 5, there we'll probably want to stick to forking the CLI, so we might as well here too.

ostreedev/ostree#3177

I think this would still work fine if the transient RO usroverlay were there by default, as long as ostree_deployment_get_unlocked still correctly reports transient.

@cgwalters
Copy link
Contributor

cgwalters commented Jan 17, 2025

For reading the unlocked state, I suppose we could grep ostree admin status for Unlocked: (unlocked state nick) instead of using libostree.

Oh sorry yes...I hadn't thought about the state reading part. Yeah, ostree really needs json output, but this leads to the next thing that we likely want this state in the bootc status.

EDIT: filed containers/bootc#1044

@cgwalters
Copy link
Contributor

Only very tangential to this but not quite sure where to put this (another issue here? JIRA?) at some point we are going to need to fix the problem that dnf upgrade should probably error out if an attempt is made to change nontrivial things outside of /usr and probably /etc. The key ones here are kernel (needs deeper integration, ref coreos/rpm-ostree#5135 ) and selinux-policy (ref https://bugzilla.redhat.com/show_bug.cgi?id=1290659 ) and shim/grub (ref https://github.com/coreos/bootupd/ )

Ideally we error before we even start an install; if we download filelists this would be pretty easy; without filelists maybe we should have a Provides that flags these special cases? Or we could inject into the base image something that denylists them in the dnf config?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants