-
Notifications
You must be signed in to change notification settings - Fork 792
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
buildah 1.38.1 build "FROM oci-archive:" failed #5952
Comments
Detected in the Image Mode pipelines which cannot use v1.38.1 for now. https://gitlab.com/redhat/rhel/bifrost/compose-bootc/-/merge_requests/208/diffs |
@henrywang do we have an easy reproducer? |
I think this is a regression from 25a3b38 This feature is really key as a way to generate chunked/reproducible containers and we've been using it for quite some time for rpm-ostree, and we have plans in the immediate future to promote this more for custom base images (ref https://gitlab.com/fedora/bootc/tracker/-/issues/32 and coreos/rpm-ostree#5221 specifically) Way back in coreos/rpm-ostree#4688 (comment) there was a soft commitment to support this. My understanding is that there are security implications (not mentioned in the commit message) that motivated 25a3b38 Can we elaborate on those now that the commit is public (I don't see a corresponding PR with discussion)? It seems to me that what we always wanted here is a way to write not to the build context, but just any internal filesystem (a tmpfs or a tempdir bind mount) whose lifetime is scoped to the entire build, right? |
The relevant CVE is GHSA-5vpc-35f4-r8w6 A core part of this was the ability to write to the host filesystem from mounts, which is why it was disabled - so what you desire is a large part of a high-severity CVE. I don't see us turning this back on. Adding an option with the understanding that enabling it is explicitly insecure and exposes the host to potential escape by malicious Dockerfile is a potential option, but that also feels very bad. |
Here's a trivial example of a "build" that actually just copies from a registry into a local oci: directory via skopeo - but the point is that an "oci" directory is something that any tool running in the container build can synthesize however it wants - without e.g. being subject to injected floating-timestamp content from buildah itself etc. but also more generally having total power over structuring the layout of the target container image.
|
Yes, agree adding an option isn't what we want here. Let's instead dig more into the final paragraph of my previous comment:
To flesh this out slightly...maybe we could use
Note if we take this step and formalize things more it actually gets a lot cleaner as the builder can know the ordering here and we can declare that the cached OCI directory is automatically consumed (since why would you leave it around)? |
Caches are not scoped to the lifetime of a given build. |
Yes I know - its lifetime is longer - which isn't needed here, but should still work. In my proposal you'd just end up with an empty directory/image as a cache; which actually we could detect that's the case and just discard. Do you have any alternative proposals? |
Maybe another possible fix would be to expand the lifetime of the overlayfs created from a I guess though a downside of this might be that it's different from what Docker is doing today, which could confuse people. But at the cost of getting more hacky, we could do that scope expansion only if we detect there's a |
I should emphasize here though again: our production builds use this feature today. Having a working version of this (and really ideally compatible with the existing Containerfile, although that wouldn't be a requirement) as relatively soon as possible (e.g. < 1 month) would be quite helpful. I think in the short term we can just hold off on applying the buildah update, but that's obviously not sustainable. Among other things, a big part of the entire point of this is that we can tell people the base image build is just "podman build" today, and after updating that no longer works. |
It looks like I guess the more I think about the more I'd say:
|
I would have (and have previously) suggested solving this at the pipeline level. Anyway, the reason the FROM fails is that the OCI archive is no longer being written to the actual build context directory by the previous stage, so it's literally not where the FROM instruction says it is. Unless it's a remote build being done via When we teach the |
It looks like that flag is there because the set of accepted flags for |
Can you elaborate on that? Previously in this comment you said:
In what we're doing here we aren't using buildah in any way because we want fine-grained control over the bit for bit output of the OCI archive. But your point more generally here I would phrase as "don't try to do it via Containerfile" - which definitely makes sense as a starting point, except I'd just reiterate that going from "podman build" to "run this script/Makefile" is a large leap. Actually today e.g. on MacOS (or more generally with podman-machine) this trick doesn't work; I didn't dig in but I think it has to do with us always copying the build context with podman-remote. But in theory, it could work on podman-remote/podman-machine cases, which is quite important because there's lots of infrastructure built on top of "podman-build". If we have a custom script/pipeline, then especially in podman-remote cases we'd have to do some careful juggling to make sure our build works there. Hmm...I guess in theory we could implement our build process as "spawn a container image with write access to containers-storage:" (or if one wants to isolate it more, "spawn a container imgae with write access to an empty tempdir, which at the end of the build should contain an oci directory with one image"). But we just went through a large effort to switch our Konflux builds over to the "stock" buildah task they expose. Reworking everywhere we do a build, from the documented local developer entrypoint to Konflux would not be a small thing for us. It's big enough that if I had to I'd rather spend that engineering time on implementing a more sustainable way to do this in buildah myself. |
Hmm...but here we're generating an entire OCI layout inside a build stage, we can't provide it as a build context at the start of a build. |
I'm pretty sure that at that point I was referring to "FROM oci-archive:", and if you're feeling charitable, depending on it.
Agreed, it's not as simple a story to tell.
I have personally tripped over the overlay driver's mount_program option having to be specified, or not specified, more than I'd like to have, to the point that I tend to want to avoid sharing the storage space. That's before you start worrying about different storage drivers. The |
Can you indicate level of support for the "short+medium term options" listed in this comment you have ranging from say "yes I'd implement" to "would take a patch" to "no"? Most especially the "short term" one of lifetime-extending only if we detect Actually let me just repeat it here, and expand/elaborate: Option lifetime-extend:
Advantages:
Option
|
Having just done the changes to make this an overlay, I can say that the number of places in the source code that this impacts would be high. I also don't relish creating more differences in behavior that a user has to remember and keep track of, so I would lean toward not doing this.
|
How would the syntax be incompatible? The leading |
It's more than a story, it's about the ecosystem. While there certainly are multiple build systems to build container images, Dockerfile is very, very popular as I think everyone here knows. And that means that actual infrastructure has been built on top of it. At Red Hat for example with Konflux there's been an investment in a few things like "hermetic" builds where the builds can't fetch arbitrary things from the network, only a restricted set of inputs. Also, there's policies around which container images can be used in tasks. Both of these things are today applied on top of Dockerfile/Containerfile (though there's also lockfiles for rpms and other source content needed). As soon as "build this project" becomes more freeform than that, it needs custom logic to match these types of policies. Another related thing is there are some parts of Dockerfile that function perfectly fine and don't need reinventing, such as e.g. The fact that we were able to fit custom container images into the "shape" of a Dockerfile and again continue to reuse existing things like |
There is no leading
The ideal for me here is an approach that doesn't depend on implementing custom syntax, and that doesn't break when we fix compliance bugs. |
Right, it would clearly make sense to try to get this into Docker at some point, though last I looked at that the ecosystem there is around buildx, which basically a whole lot of people who want to build containers in a more intelligent way are using. But maybe they'd be amenable to the One thing I realized in a conversation just now with the team is that actually because of this writable mount to the build context and some other reasons, we'd required the build to be run with So here's yet another, hopefully more targeted short term proposal: Change buildah to detect when the input build is privileged in this way (
or so |
quay.io/buildah/stable:v1.38.1
failed to build image withFROM oci-archive:./out.ociarchive
and got error:quay.io/buildah/stable:v1.38.0
does not have this issue.The text was updated successfully, but these errors were encountered: