Set LIB_PATH to libnvidia-ml.so.1 instead of libnvidia-ml.so on Linux #63

Open: wants to merge 1 commit into main

dmitryduev

The official Go bindings for NVML load libnvidia-ml.so.1 (https://github.com/NVIDIA/go-nvml/blob/0e815c71ca6e8184387d8b502b2ef2d2722165b9/pkg/nvml/lib.go#L30), and I believe the same is true for pynvml.

@scaronni

Yes please. In driver 560 and above, as shipped in the CUDA repository, we've also removed the symlink to the unversioned library.

Loading libnvidia-ml.so.1 is the correct approach. The unversioned library should be used only for linking when building against it.

In previous driver versions there was in fact an nvidia-driver-devel subpackage that contained the unversioned libnvidia-ml.so library. That was a mistake and a leftover: the package could not really be used, because the unversioned libraries it contained came without any headers to compile against.

The few remaining unversioned libraries that are required in the driver have been moved to the main library packages.

Regarding NVML, the NVML stub and the headers are in the cuda-nvml-devel package, so if you need to link to it that's what should be installed.

Sample output for the RPM (deb is similar):

$ rpm -qpl cuda-nvml-devel-12-6-12.6.77-1.x86_64.rpm | grep targets
/usr/local/cuda-12.6/targets
/usr/local/cuda-12.6/targets/x86_64-linux
/usr/local/cuda-12.6/targets/x86_64-linux/include
/usr/local/cuda-12.6/targets/x86_64-linux/include/nvml.h
/usr/local/cuda-12.6/targets/x86_64-linux/lib
/usr/local/cuda-12.6/targets/x86_64-linux/lib/stubs
/usr/local/cuda-12.6/targets/x86_64-linux/lib/stubs/libnvidia-ml.a
/usr/local/cuda-12.6/targets/x86_64-linux/lib/stubs/libnvidia-ml.so
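
For a Rust crate that links NVML at build time rather than loading it at runtime, a build script could point the linker at the stub directory listed above. This is only a sketch: the function name `nvml_stub_link_directives` and its two parameters are hypothetical, the path layout is taken from the RPM listing, and the `cargo:rustc-link-search` and `cargo:rustc-link-lib` lines are standard Cargo build-script directives.

```rust
/// Build the Cargo directives needed to link against the NVML stub.
///
/// `cuda_root` (e.g. "/usr/local/cuda-12.6") and `target`
/// (e.g. "x86_64-linux") are supplied by the caller; both names are
/// assumptions made for this illustration.
pub fn nvml_stub_link_directives(cuda_root: &str, target: &str) -> Vec<String> {
    // Tolerate a trailing slash on the CUDA root.
    let root = cuda_root.trim_end_matches('/');
    // Layout as shown in the cuda-nvml-devel package listing.
    let stub_dir = format!("{root}/targets/{target}/lib/stubs");
    vec![
        // Tell rustc where to search for native libraries.
        format!("cargo:rustc-link-search=native={stub_dir}"),
        // Link against libnvidia-ml (the unversioned stub, link time only).
        "cargo:rustc-link-lib=dylib=nvidia-ml".to_string(),
    ]
}
```

A `build.rs` would print each returned line with `println!`. Code that loads the library at runtime, as nvml-wrapper does, does not need any of this and should open libnvidia-ml.so.1 instead.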

Again, please stick to loading libnvidia-ml.so.1, which is the correct approach. Thanks!

@dmitryduev
Author

@Cldfire can I please get a stamp?

@scaronni

This is actually #47 again.

@Cldfire
Owner

Cldfire commented Dec 13, 2024

Hi folks. My apologies for the delay here, and thank you for the PR and the information :)

I've recently started a job at Apple which makes it difficult for me to continue maintaining this library. I am in the process of finding new ownership for this repository, and I've also reached out to a contact at NVIDIA to see if there's any interest on their side in making this crate more official.

I'll provide an update in the coming weeks. In the meantime please continue to use NvmlBuilder to load libnvidia-ml.so.1.

al42and added a commit to al42and/bottom that referenced this pull request Jan 4, 2025
Recently, NVIDIA CUDA repository packages started shipping only
`libnvidia-ml.so.1` file, without `libnvidia-ml.so`. The upstream
`nvml-wrapper` package has a fix proposed
(Cldfire/nvml-wrapper#63), yet the package is
in search of a maintainer at the moment.

To allow `bottom` to correctly detect NVIDIA GPUs on Ubuntu with
official NVIDIA packages, add a wrapper around `Nvml::init` to be more
persistent in its search for the NVML library.
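
The "more persistent" search described in this commit message can be sketched roughly as follows. This is not the code from the referenced commit; it is a minimal illustration that uses only the standard library, and the names `NVML_CANDIDATES` and `load_first` are hypothetical. The actual loading step is passed in as a closure, so the fallback order can be shown without depending on any crate.

```rust
/// Candidate NVML library names, most preferred first.
/// The versioned soname is tried first; the unversioned name is kept
/// only as a fallback for setups that still ship it.
pub const NVML_CANDIDATES: [&str; 2] = ["libnvidia-ml.so.1", "libnvidia-ml.so"];

/// Try each candidate with `load` and return the first success together
/// with the name that worked. If every attempt fails, return all errors
/// paired with the names that produced them.
pub fn load_first<'a, T, E, F>(
    candidates: &[&'a str],
    mut load: F,
) -> Result<(&'a str, T), Vec<(&'a str, E)>>
where
    F: FnMut(&str) -> Result<T, E>,
{
    let mut errors = Vec::new();
    for &name in candidates {
        match load(name) {
            Ok(lib) => return Ok((name, lib)),
            Err(e) => errors.push((name, e)),
        }
    }
    Err(errors)
}

// Assumption, not exercised here: with nvml-wrapper the closure could be
// something like
//     |name| Nvml::builder().lib_path(std::ffi::OsStr::new(name)).init()
// Check the crate documentation for the exact builder API.
```

Collecting every error rather than only the last one keeps the failure report useful when neither name can be opened.
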
ClementTsang pushed a commit to ClementTsang/bottom that referenced this pull request Jan 7, 2025