cmd/create: Support passing --device option to podman-create #1407
Conversation
Build succeeded. ✔️ unit-test SUCCESS in 8m 11s
There was a related issue with this. UPDATE: The fix was released in v1.15.0, everything works out of the box now!
@debarshiray can I attract your attention and get some opinions on this? 😁
Sorry for the delay, @Jmennius. I finally got myself some NVIDIA hardware to play with this.
I see that the Container Device Interface requires installing the NVIDIA Container Toolkit. However, as far as I can make out, the nvidia-container-toolkit or nvidia-container-toolkit-base packages are only available from NVIDIA's own repositories right now. For example, I am on Fedora 39, and neither the RPMFusion free nor the non-free repositories have them, but they do have NVIDIA's proprietary driver.
Is there anything else other than NVIDIA that uses the Container Device Interface?
I would like to understand the situation a bit better. Ultimately I want to make it as smooth as possible for the user to enable the NVIDIA proprietary driver. That becomes a problem if one needs to enable multiple different unofficial repositories, at least on Fedora.
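For context, the CDI flow being discussed here looks roughly like this once the NVIDIA Container Toolkit and the driver are installed (a sketch following the toolkit's usual workflow, not anything specific to this PR):

# generate the CDI specification for the installed GPUs (needs root)
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml

# list the device names that the generated specification provides
nvidia-ctk cdi list

# use one of those names with a CDI-aware runtime such as Podman
podman run --rm --device nvidia.com/gpu=all fedora nvidia-smi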
Yeah, you are right - it's only available in NVIDIA's repos for Fedora. It would be nice if it was repackaged somehow on RPMFusion... I saw that some distros package it.
I am not aware of other CDI implementations :(
I guess this is the way to go for the best experience.
Hi! This is great news!! Is it expected to be merged soon or should I grab the patch?
This allows using the CDI infrastructure, which often does more than just mapping devices in /dev - for NVIDIA this will additionally map a set of libraries into the container (which are essential to use the device without hassle). Signed-off-by: Ievgen Popovych <[email protected]>
I'd go for a patch for sure 😉

P.S. You can do this in your configuration file:

[general]
devices = ["nvidia.com/gpu=all"]
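If the PR lands with the interface proposed here, the one-off command-line equivalent would presumably look something like this (illustrative, not verified against the final code):

# hypothetical usage of the --device flag added by this PR
toolbox create --device nvidia.com/gpu=all nvidia-box
toolbox enter nvidia-box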
For Silverblue, something that I've come up with to handle regenerating the CDI spec after an upgrade:

[Unit]
Description=Update Nvidia CDI configuration
DefaultDependencies=no
Before=systemd-update-done.service
ConditionNeedsUpdate=/etc

[Service]
Type=oneshot
ExecStart=/usr/bin/nvidia-ctk cdi generate --output /etc/cdi/nvidia.yaml

[Install]
WantedBy=multi-user.target
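For that unit to take effect it also needs to be installed and enabled; assuming it is saved as /etc/systemd/system/nvidia-cdi-update.service (the file name is just illustrative):

# reload unit files and enable the new unit (unit name is illustrative)
sudo systemctl daemon-reload
sudo systemctl enable nvidia-cdi-update.service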
Build succeeded. ✔️ unit-test SUCCESS in 6m 37s
That's a really neat hack, indeed. :)
I see commits from Intel in github.com/cncf-tags/container-device-interface, which is great.
Thanks to @Jmennius and @owtaylor I changed my mind about how to enable the proprietary NVIDIA driver in Toolbx containers. Since Intel, NVIDIA and several container tools, including Podman, have embraced the Container Device Interface, it's a better path to take than the unmanaged Flatpak extension approach that I had mentioned before.
However, I want to be a bit careful when using the CDI. The way it's widely advertised requires root privileges, because podman run --device nvidia.com/gpu... expects the CDI file to be present in either /etc/cdi or /var/run/cdi. It's not possible to create the file with nvidia-ctk cdi generate and put it in those locations without root access. It would be good if we could make it work entirely rootless.
One option is to use the Go packages from tags.cncf.io/container-device-interface and github.com/NVIDIA/nvidia-container-toolkit to create the Container Device Interface file ourselves during enter and run, make it available to init-container, and let it parse and apply it when the container starts. The CDI file is ultimately a bunch of environment variables, bind mounts and hooks to call ldconfig(8), so it shouldn't be that hard. Since Toolbx already makes the entire /dev from the host available to the container, we don't need to worry about the devices.

This avoids the need for root privileges, and has the extra benefit of enabling the driver in existing Toolbx containers.
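To make the "environment variables, bind mounts and hooks" point concrete, an abridged CDI spec for an NVIDIA GPU looks roughly like this (heavily trimmed and illustrative; the file generated by nvidia-ctk cdi generate is much longer and the exact paths depend on the system):

# illustrative excerpt of /etc/cdi/nvidia.yaml, not a real generated file
cdiVersion: "0.5.0"
kind: nvidia.com/gpu
devices:
  - name: all
    containerEdits:
      deviceNodes:
        - path: /dev/nvidia0
containerEdits:
  env:
    - NVIDIA_VISIBLE_DEVICES=void
  mounts:
    - hostPath: /usr/lib64/libcuda.so.1
      containerPath: /usr/lib64/libcuda.so.1
      options: ["ro", "nosuid", "nodev", "bind"]
  hooks:
    - hookName: createContainer
      path: /usr/bin/nvidia-ctk
      args: ["nvidia-ctk", "hook", "update-ldcache", "--folder", "/usr/lib64"]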
I have a working proof-of-concept using this approach in #1497 that seems to work with the NVIDIA Quadro P600 GPU on my ThinkPad P72 laptop.
The NVIDIA Container Toolkit code seems to be entirely free software. I wonder if we can get it into Fedora proper, instead of RPMFusion.
This allows using the CDI infrastructure, which often does more than just mapping devices in /dev - for NVIDIA this will additionally map a set of libraries into the container (which are essential to use the device without hassle).
Since this is only a pass-through, maybe instead of having a --device-specific option we should have the ability to pass arbitrary options to podman-create? Like after toolbox create -c my-container -- --device foo:bar --other-podman-option?

Fixes: #116 (although it is possible to use NVIDIA devices inside toolboxes, this change improves the usability significantly when using NVIDIA CTK+CDI with toolbox)
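For reference, what would ultimately reach Podman with this change is presumably along these lines (a sketch; the many other flags that toolbox create normally passes to podman create are omitted):

# illustrative; toolbox adds many more flags when creating a container
podman create --name my-container --device nvidia.com/gpu=all registry.fedoraproject.org/fedora-toolbox:39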