
Add vllm #28931

Draft · wants to merge 112 commits into main
Conversation

@maresb (Contributor) commented Jan 25, 2025

Very rough draft. I will almost certainly require help.

Opened on the advice of @h-vetinari in conda-forge/xformers-feedstock#42

Direct and transitive dependencies:

Checklist

  • Title of this PR is meaningful: e.g. "Adding my_nifty_package", not "updated meta.yaml".
  • License file is packaged (see here for an example).
  • Source is from official source.
  • Package does not vendor other packages. (If a package uses the source of another package, they should be separate packages, or the licenses of all bundled packages must be included.)
  • If static libraries are linked in, the license of the static library is packaged.
  • Package does not ship static libraries. If static libraries are needed, follow CFEP-18.
  • Build number is 0.
  • A tarball (url) rather than a repo (e.g. git_url) is used in your recipe (see here for more details).
  • GitHub users listed in the maintainer section have posted a comment confirming they are willing to be listed there.
  • When in trouble, please check our knowledge base documentation before pinging a team.

github-actions bot (Contributor)

Hi! This is the staged-recipes linter and your PR looks excellent! 🚀

@conda-forge-admin (Contributor)

Hi! This is the friendly automated conda-forge-linting service.

I wanted to let you know that I linted all conda-recipes in your PR (recipes/vllm/recipe.yaml) and found some lint.

Here's what I've got...

For recipes/vllm/recipe.yaml:

  • ❌ license_file entry is missing, but is required.
  • ❌ Non noarch packages should have python requirement without any version constraints.
  • ❌ Non noarch packages should have python requirement without any version constraints.

For recipes/vllm/recipe.yaml:

  • ℹ️ Please depend on pytorch directly. If your package definitely requires the CUDA version, please depend on pytorch =*=cuda*.
  • ℹ️ Use importlib-metadata instead of importlib_metadata
  • ℹ️ PyPI default URL is now pypi.org, and not pypi.io. You may want to update the default source url.

This message was generated by GitHub Actions workflow run https://github.com/conda-forge/conda-forge-webservices/actions/runs/12962027449. Examine the logs at this URL for more detail.

@conda-forge-admin (Contributor)

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipes/vllm/recipe.yaml) and found it was in an excellent condition.

@maresb (Author) commented Jan 25, 2025

Interesting, we're getting different results between CUDA 11.8 and 12.0.

Both fail in the following command:

['cmake', '$SRC_DIR', '-G', 'Ninja', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', 
'-DVLLM_TARGET_DEVICE=cuda', '-DVLLM_PYTHON_EXECUTABLE=$PREFIX/bin/python', 
'-DVLLM_PYTHON_PATH=$PREFIX/lib/python3.9/site-packages/pip/_vendor/pyproject_hooks/_in_process:$PREFIX/lib/python39.zip:$PREFIX/lib/python3.9:$PREFIX/lib/python3.9/lib-dynload:$PREFIX/lib/python3.9/site-packages:$PREFIX/lib/python3.9/site-packages/setuptools/_vendor', 
'-DFETCHCONTENT_BASE_DIR=$SRC_DIR/.deps', '-DNVCC_THREADS=1',
'-DCMAKE_JOB_POOL_COMPILE:STRING=compile', '-DCMAKE_JOB_POOLS:STRING=compile=2']

12.0 fails earlier at CUDA detection:

 │ │       -- Caffe2: Found protobuf with new-style protobuf targets.
 │ │       -- Caffe2: Protobuf version 28.2.0
 │ │       -- Could NOT find CUDA (missing: CUDA_INCLUDE_DIRS) (found version "12.0")
 │ │       CMake Warning at $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:31 (message):
 │ │         Caffe2: CUDA cannot be found.  Depending on whether you are building Caffe2
 │ │         or a Caffe2 dependent library, the next warning / error will give you more
 │ │         info.
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
 │ │         CMakeLists.txt:84 (find_package)
 │ │       
 │ │       
 │ │       CMake Error at $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:90 (message):
 │ │         Your installed Caffe2 version uses CUDA but I cannot find the CUDA
 │ │         libraries.  Please set the proper CUDA prefixes and / or install CUDA.
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
 │ │         CMakeLists.txt:84 (find_package)

11.8 gets further:

 │ │       -- Caffe2: Found protobuf with new-style protobuf targets.
 │ │       -- Caffe2: Protobuf version 28.2.0
 │ │       -- Found CUDA: /usr/local/cuda (found version "11.8")
 │ │       -- The CUDA compiler identification is NVIDIA 11.8.89 with host compiler GNU 11.4.0
 │ │       -- Detecting CUDA compiler ABI info
 │ │       -- Detecting CUDA compiler ABI info - done
 │ │       -- Check for working CUDA compiler: $PREFIX/bin/nvcc - skipped
 │ │       -- Detecting CUDA compile features
 │ │       -- Detecting CUDA compile features - done
 │ │       -- Found CUDAToolkit: /usr/local/cuda/include (found version "11.8.89")
 │ │       -- Caffe2: CUDA detected: 11.8
 │ │       -- Caffe2: CUDA nvcc is: /usr/local/cuda/bin/nvcc
 │ │       -- Caffe2: CUDA toolkit directory: /usr/local/cuda
 │ │       -- Caffe2: Header version is: 11.8
 │ │       -- Found Python: $PREFIX/bin/python (found version "3.9.21") found components: Interpreter
 │ │       CMake Warning at $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:140 (message):
 │ │         Failed to compute shorthash for libnvrtc.so
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
 │ │         CMakeLists.txt:84 (find_package)
 │ │       
 │ │       
 │ │       CMake Warning (dev) at $PREFIX/share/cmake-3.31/Modules/FindPackageHandleStandardArgs.cmake:441 (message):
 │ │         The package name passed to `find_package_handle_standard_args` (nvtx3) does
 │ │         not match the name of the calling package (Caffe2).  This can lead to
 │ │         problems in calling code that expects `find_package` result variables
 │ │         (e.g., `_FOUND`) to follow a certain pattern.
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:174 (find_package_handle_standard_args)
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
 │ │         CMakeLists.txt:84 (find_package)
 │ │       This warning is for project developers.  Use -Wno-dev to suppress it.
 │ │       
 │ │       -- Could NOT find nvtx3 (missing: nvtx3_dir)
 │ │       CMake Warning at $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:180 (message):
 │ │         Cannot find NVTX3, find old NVTX instead
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
 │ │         CMakeLists.txt:84 (find_package)
 │ │       
 │ │       
 │ │       -- USE_CUDNN is set to 0. Compiling without cuDNN support
 │ │       -- USE_CUSPARSELT is set to 0. Compiling without cuSPARSELt support
 │ │       -- USE_CUDSS is set to 0. Compiling without cuDSS support
 │ │       -- USE_CUFILE is set to 0. Compiling without cuFile support
 │ │       -- Automatic GPU detection failed. Building for common architectures.
 │ │       -- Autodetected CUDA architecture(s): 3.5;5.0;8.0;8.6;8.9;9.0;8.9+PTX;9.0+PTX
 │ │       -- Added CUDA NVCC flags for: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_89,code=sm_89;-gencode;arch=compute_90,code=sm_90;-gencode;arch=compute_89,code=compute_89;-gencode;arch=compute_90,code=compute_90
 │ │       -- Found Torch: $PREFIX/lib/libtorch.so
 │ │       -- CUDA target architectures: 3.5;5.0;8.0;8.6;8.9;9.0
 │ │       -- CUDA supported target architectures: 8.0;8.6;8.9;9.0
 │ │       -- FetchContent base directory: $SRC_DIR/.deps
 │ │       CMake Error at $PREFIX/share/cmake-3.31/Modules/ExternalProject/shared_internal_commands.cmake:943 (message):
 │ │         error: could not find git for clone of cutlass-populate
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/share/cmake-3.31/Modules/ExternalProject.cmake:3041 (_ep_add_download_command)
 │ │         CMakeLists.txt:29 (ExternalProject_Add)
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
 │ │         CMakeLists.txt:84 (find_package)
 │ │       
 │ │       
 │ │       -- USE_CUDNN is set to 0. Compiling without cuDNN support
 │ │       -- USE_CUSPARSELT is set to 0. Compiling without cuSPARSELt support
 │ │       -- USE_CUDSS is set to 0. Compiling without cuDSS support
 │ │       -- USE_CUFILE is set to 0. Compiling without cuFile support
 │ │       -- Automatic GPU detection failed. Building for common architectures.
 │ │       -- Autodetected CUDA architecture(s): 3.5;5.0;8.0;8.6;8.9;9.0;8.9+PTX;9.0+PTX
 │ │       -- Added CUDA NVCC flags for: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_89,code=sm_89;-gencode;arch=compute_90,code=sm_90;-gencode;arch=compute_89,code=compute_89;-gencode;arch=compute_90,code=compute_90
 │ │       -- Found Torch: $PREFIX/lib/libtorch.so
 │ │       -- CUDA target architectures: 3.5;5.0;8.0;8.6;8.9;9.0
 │ │       -- CUDA supported target architectures: 8.0;8.6;8.9;9.0
 │ │       -- FetchContent base directory: $SRC_DIR/.deps
 │ │       CMake Error at $PREFIX/share/cmake-3.31/Modules/ExternalProject/shared_internal_commands.cmake:943 (message):
 │ │         error: could not find git for clone of cutlass-populate
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/share/cmake-3.31/Modules/ExternalProject.cmake:3041 (_ep_add_download_command)
 │ │         CMakeLists.txt:29 (ExternalProject_Add)
 │ │       
 │ │       
 │ │       -- Configuring incomplete, errors occurred!

@conda-forge-admin (Contributor)

Hi! This is the friendly automated conda-forge-linting service.

I wanted to let you know that I linted all conda-recipes in your PR (recipes/vllm/recipe.yaml) and found some lint.

Here's what I've got...

For recipes/vllm/recipe.yaml:

  • ❌ Selectors in comment form no longer work in v1 recipes. Instead, if / then / else maps must be used. See lines [39, 41, 46, 48, 49].

This message was generated by GitHub Actions workflow run https://github.com/conda-forge/conda-forge-webservices/actions/runs/12967644561. Examine the logs at this URL for more detail.

@conda-forge-admin (Contributor)

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipes/vllm/recipe.yaml) and found it was in an excellent condition.

@h-vetinari (Member)

Thanks @maresb! I had forgotten that there's already #24710, perhaps @mediocretech would be interested in collaborating?

W.r.t. CUDA, we need to move on from 12.0 here, which isn't used anywhere else in conda-forge anymore - it's just that staged-recipes seems to have been forgotten in the context of conda-forge/conda-forge-pinning-feedstock#6630.

@maresb (Author) commented Jan 26, 2025

Oh, I didn't notice that effort, thanks @h-vetinari! Although that's old it looks like @rongou is eager to help! 🚀

Do you think that CUDA 12.0 is actually causing a problem here? I was thinking (i.e. wildly guessing) that we need to patch CMakeLists.txt, but I've never used cmake. 😞

@h-vetinari (Member)

Mainly I want to avoid redundant work. As soon as #28938 is in and we have merged main here, I'll be happy to take a look at what's going on.

@h-vetinari (Member)

In any case, you'll have to address

 │ │       CMake Error at $PREFIX/share/cmake-3.31/Modules/ExternalProject/shared_internal_commands.cmake:943 (message):
 │ │         error: could not find git for clone of cutlass-populate

@maresb (Author) commented Jan 27, 2025

Woah, after adding git as a host dependency it's compiling on CUDA 11.8 until it runs out of memory and crashes. Maybe I can add some swap. CUDA 12.0 is still not being discovered.

...
 │ │ Building wheels for collected packages: vllm
 │ │   Building wheel for vllm (pyproject.toml): started
 │ │   Building wheel for vllm (pyproject.toml): still running...
...
 │ │   Building wheel for vllm (pyproject.toml): still running...
##[warning]Free memory is lower than 5%; Currently used: 95.80%
##[warning]Free memory is lower than 5%; Currently used: 95.80%
##[warning]Free memory is lower than 5%; Currently used: 95.80%
##[warning]Free memory is lower than 5%; Currently used: 95.80%
 │ │   Building wheel for vllm (pyproject.toml): still running...
 │ │   Building wheel for vllm (pyproject.toml): still running...
 │ │   Building wheel for vllm (pyproject.toml): still running...

github-actions bot commented Jan 27, 2025

Hi! This is the staged-recipes linter and I found some lint.

It looks like some changes were made outside the recipes/ directory. To ensure everything runs smoothly, please make sure that recipes are only added to the recipes/ directory and no other files are changed.

If these changes are intentional (and you aren't submitting a recipe), please add a maintenance label to the PR.

File-specific lints and/or hints: each of the following files triggered the lint "Do not edit files outside of the recipes/ directory."

  • .azure-pipelines/azure-pipelines-osx.yml
  • conda-forge.yml
  • .ci_support/linux64.yaml
  • .azure-pipelines/azure-pipelines-linux.yml
  • .ci_support/linux64_cuda118.yaml
  • .ci_support/linux_aarch64.yaml
  • .ci_support/linux64_cuda126.yaml
  • .scripts/run_docker_build.sh
  • .scripts/new_run_docker_build.sh
  • .scripts/debug_osx_arch.sh
  • .ci_support/osx64.yaml
  • .ci_support/osx_arm64.yaml
  • .scripts/new_run_osx_build.sh
  • .scripts/run_osx_build.sh

@maresb (Author) commented Jan 27, 2025

Ah, hmm, I just added swap to conda-forge.yml. Not sure how that's supposed to work here on staged-recipes. 🤔

EDIT: Oh good, the linter is complaining, so that will help us to remember to revert it before merging.

EDIT2: Hmm, it seems that the swap setting works on linux_64 but fails on linux_64_cuda_*:

[screenshot of the CI checks: swap applied on the linux_64 build but failing on linux_64_cuda_*]

@maresb (Author) commented Jan 29, 2025

Hi @h-vetinari!

As soon as #28938 is in and we have merged main here, I'll be happy to take a look at what's going on.

As a brief summary of the above, I merged main into this branch after #28938 was merged into main. It didn't seem to change anything with respect to the errors.

On CUDA 12.x I'm hitting the error:

Your installed Caffe2 version uses CUDA but I cannot find the CUDA
 │ │         libraries.  Please set the proper CUDA prefixes and / or install CUDA

On 11.8, after adding git as a host dependency, compilation starts but it runs out of memory. I tried to add swap by editing conda-forge.yml, but it didn't apply to the CUDA builds.

I'd be grateful for any advice you could provide. Thanks!

@h-vetinari (Member)

On CUDA 12.x I'm hitting the error:

We're (now) aware of the CUDA-angle of conda-forge/pytorch-cpu-feedstock#333

On 11.8, after adding git as a host dependency, compilation starts but it runs out of memory. I tried to add swap by editing conda-forge.yml, but it didn't apply to the CUDA builds.

#28979

@maresb (Author) commented Feb 1, 2025

I would have hoped to get more out of setting VERBOSE=1. The only logs I get are:

 │ │   Building wheel for vllm (pyproject.toml): still running...

VERBOSE=1 is supposed to add the flag -DCMAKE_VERBOSE_MAKEFILE=ON. Not sure what exactly that does.

Here's the corresponding Python code that turns the envvar into the flag:

https://github.com/vllm-project/vllm/blob/a1fc18c030e4d0466f2b23cb7dd4d11ce4b85603/vllm/envs.py#L138-L140

https://github.com/vllm-project/vllm/blob/a1fc18c030e4d0466f2b23cb7dd4d11ce4b85603/setup.py#L132-L134
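For reference, a minimal sketch of the pattern those two links show (names and structure simplified; the linked files are authoritative):

    import os

    # envs.py: VERBOSE is read from the environment and coerced to a bool.
    VERBOSE = bool(int(os.getenv("VERBOSE", "0")))

    # setup.py: when VERBOSE is truthy, the extra define is appended to the
    # cmake arguments for the configure step.
    cmake_args = ["-DCMAKE_BUILD_TYPE=RelWithDebInfo"]
    if VERBOSE:
        cmake_args.append("-DCMAKE_VERBOSE_MAKEFILE=ON")
    print(cmake_args)

One caveat: CMake documents CMAKE_VERBOSE_MAKEFILE for Makefile generators, while the invocation above uses -G Ninja, which may explain why the output stays quiet.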

@maresb closed this Feb 4, 2025
@maresb reopened this Feb 4, 2025
@shermansiu mentioned this pull request Feb 12, 2025
@shermansiu

Hmm, it still appears broken after conda-forge/pytorch-cpu-feedstock#339.

Could NOT find CUDA (missing: CUDA_INCLUDE_DIRS) (found version "12.6")

Is CUDA_INCLUDE_DIRS properly set?

@shermansiu commented Feb 12, 2025

The CUDA 11.8 build probably fails because it's out of disk space and/or RAM, but that's just speculation:

##[warning]Free disk space on / is lower than 5%; Currently used: 95.08% (x5)

##[warning]Free memory is lower than 5%; Currently used: 96.11% (x5)

@maresb (Author) commented Feb 12, 2025

Hey @shermansiu, great to have you around!!!

I'm a bit lost since I'm not very familiar with CUDA.

I was just now having some trouble getting the CI to rerun the CUDA builds, but rebasing seems to have fixed it.

Also, post-rebase, things seem to be proceeding slightly further for 12.6:

 │ │       -- Caffe2: Found protobuf with new-style protobuf targets.
 │ │       -- Caffe2: Protobuf version 28.3.0
 │ │       -- Unable to find cublas_v2.h in either "$PREFIX/targets/x86_64-linux/include" or "$PREFIX/math_libs/include"
 │ │       -- Found CUDAToolkit: $PREFIX/targets/x86_64-linux/include (found version "12.6.85")
 │ │       -- Check for working CUDA compiler: $PREFIX/bin/nvcc - skipped
 │ │       -- Detecting CUDA compile features
 │ │       -- Detecting CUDA compile features - done
 │ │       -- Unable to find cublas_v2.h in either "$PREFIX/targets/x86_64-linux/include" or "$PREFIX/math_libs/include"
 │ │       -- Caffe2: CUDA detected: 12.6.85
 │ │       -- Caffe2: CUDA nvcc is: $PREFIX/bin/nvcc
 │ │       -- Caffe2: CUDA toolkit directory:
 │ │       -- Caffe2: Header version is: 12.6
 │ │       CMake Error at $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:107 (get_target_property):
 │ │         get_target_property() called with non-existent target "CUDA::nvrtc".
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
 │ │         CMakeLists.txt:81 (find_package)
 │ │       
 │ │       
 │ │       -- Found Python: $PREFIX/bin/python (found version "3.9.21") found components: Interpreter
 │ │       CMake Warning at $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:116 (message):
 │ │         Failed to compute shorthash for libnvrtc.so
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
 │ │         CMakeLists.txt:81 (find_package)
 │ │       
 │ │       
 │ │       CMake Warning (dev) at $PREFIX/share/cmake-3.31/Modules/FindPackageHandleStandardArgs.cmake:441 (message):
 │ │         The package name passed to `find_package_handle_standard_args` (nvtx3) does
 │ │         not match the name of the calling package (Caffe2).  This can lead to
 │ │         problems in calling code that expects `find_package` result variables
 │ │         (e.g., `_FOUND`) to follow a certain pattern.
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:154 (find_package_handle_standard_args)
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
 │ │         CMakeLists.txt:81 (find_package)
 │ │       This warning is for project developers.  Use -Wno-dev to suppress it.
 │ │       
 │ │       -- Could NOT find nvtx3 (missing: nvtx3_dir)
 │ │       CMake Warning at $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:160 (message):
 │ │         Cannot find NVTX3, find old NVTX instead
 │ │ Failed to build vllm
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
 │ │         CMakeLists.txt:81 (find_package)
 │ │       
 │ │       
 │ │       -- USE_CUDNN is set to 0. Compiling without cuDNN support
 │ │       -- USE_CUSPARSELT is set to 0. Compiling without cuSPARSELt support
 │ │       -- USE_CUDSS is set to 0. Compiling without cuDSS support
 │ │       -- USE_CUFILE is set to 0. Compiling without cuFile support
 │ │       -- Added CUDA NVCC flags for:
 │ │       CMake Warning at $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
 │ │         static library kineto_LIBRARY-NOTFOUND not found.
 │ │       Call Stack (most recent call first):
 │ │         $PREFIX/lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:120 (append_torchlib_if_found)
 │ │         CMakeLists.txt:81 (find_package)
 │ │       
 │ │       
 │ │       -- Found Torch: $PREFIX/lib/libtorch.so
 │ │       CMake Error at CMakeLists.txt:122 (message):
 │ │         Can't find CUDA or HIP installation.

I'm not too sure what this means or how to fix it. I'd be very grateful for any suggestions.

@shermansiu
Copy link

Hmm, I'd like to build the recipe locally to diagnose this further, but at a glance, the following line looks a bit concerning:

 -- Unable to find cublas_v2.h in either "$PREFIX/targets/x86_64-linux/include" or "$PREFIX/math_libs/include"
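For anyone else debugging this, here's a hypothetical diagnostic (not part of the recipe, and assuming the $PREFIX layout from the logs) that searches the build prefix for cublas_v2.h; in conda-forge that header should come from libcublas-dev, per the review comment below:

    import os
    import pathlib

    # PREFIX is set inside the build environment; fall back to CONDA_PREFIX
    # when poking around an activated environment interactively.
    prefix = pathlib.Path(
        os.environ.get("PREFIX") or os.environ.get("CONDA_PREFIX", "/opt/conda")
    )

    # Search the whole prefix rather than only the two paths CMake probes.
    hits = sorted(prefix.rglob("cublas_v2.h"))
    if hits:
        for hit in hits:
            print("found:", hit)
    else:
        print(f"cublas_v2.h not found under {prefix}")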

@h-vetinari (Member) left a comment

Well, you need more than just {{ compiler("cuda") }} to get all the CUDA components you need.

Looks like you need at minimum

    - cuda-version =={{ cuda_compiler_version }}
    - cuda-cudart-dev
    - cuda-nvrtc-dev
    - libcublas-dev

in the host environment. Also note that we're still figuring out an issue with nvtx, see conda-forge/pytorch-cpu-feedstock#357

Comment on lines 33 to 38
- cmake
- git
- ${{ stdlib('c') }}
- ${{ compiler('c') }}
- ${{ compiler('cxx') }}
- ${{ compiler('cuda') }}
@h-vetinari (Member) left a comment

All this (+ninja) should move to the build environment.

@shermansiu left a comment

This seems to resolve the nvtx issue, but then it complains about not being able to find kineto.

Using USE_KINETO=0 doesn't seem to work because the existing PyTorch .cmake files in the environment already have kineto enabled.

From lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:

if(ON)
  append_torchlib_if_found(kineto)
endif()

See:
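A purely hypothetical workaround sketch, in case patching the host environment from the build script turned out to be acceptable (the path assumes the python3.9 layout from the logs; nothing in this thread settled on this approach):

    import os
    import pathlib

    # Hypothetical: neutralize the hardcoded `if(ON)` kineto branch in the
    # installed TorchConfig.cmake before the CMake configure step, since the
    # static kineto library is not found in this environment anyway (see the
    # kineto_LIBRARY-NOTFOUND warning in the logs above).
    cfg = pathlib.Path(os.environ["PREFIX"]) / (
        "lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake"
    )
    text = cfg.read_text()
    text = text.replace(
        "append_torchlib_if_found(kineto)",
        "# append_torchlib_if_found(kineto)  # static kineto not present",
    )
    cfg.write_text(text)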

@maresb (Author) commented Feb 17, 2025

@shermansiu, thanks for continuing to push this through. Would you add yourself as a recipe maintainer?

@shermansiu

Sounds good!

@shermansiu

I was able to cross-compile the wheel for macOS (arm64). All that's left is to ensure that opencv-python-headless ships with an arm64 build.

@shermansiu

I don't think the current scripts in staged-recipes were tested to support cross-compilation.

The aarch64 one fails because it seems to depend on setup from the newer conda-smithy script (i.e., defining RECIPE_ROOT), and the macOS arm64 one fails because cmake doesn't support Python 3.13? Something doesn't feel right...

@shermansiu

I was able to fix the macOS compilation script, but the recipe_root definition I added is hacky because most other parts of the conda-forge ecosystem expect a single recipe directory for the feedstock.

https://github.com/conda-forge/conda-forge-ci-setup-feedstock/blob/d8b11fd85622f4d332cd099d91e8fa12562c9be6/recipe/cross_compile_support.sh#L36C20-L36C31

@shermansiu

Summary:

  • macOS (arm64): Awaiting opencv-python-headless. In the CI, it is able to build the wheel and gets stuck at this step.
  • linux-64 (CUDA 11.8): It seems to work, but it runs out of disk space and crashes the image. When running it locally, it also keeps crashing my computer. I'll need to get access to better compute to ensure that this runs.

TODO:

  • Fix cross-compilation in staged-recipes. I'll need to add the fixes I made to the macOS run script and find a more permanent fix for the RECIPE_ROOT environment variable during cross-compilation (Linux AArch64). I'll also need to include instructions to set the TARGET_ARCH environment variable when cross-compiling from osx-64 to osx-arm64.
  • Get opencv-python-headless to work for Mac (Arm64).

@maresb (Author) commented Feb 18, 2025

I'll need to get access to better compute to ensure that this runs.

I can run this on a GCP instance. (Unfortunately I can't give you direct access.) But just let me know which builds you want. Just linux-64 CUDA 11.8 for now?

@shermansiu commented Feb 18, 2025

Yep! That's the only one left, thanks!

@maresb (Author) commented Feb 18, 2025

Sorry, didn't manage it today. Will try tomorrow.

@maresb (Author) commented Feb 19, 2025

I'm trying to build using the build-locally.py script. (Last time I was using raw rattler-build.)

I'm having some really weird issues with the conditional evaluation. I did quite a few experiments but haven't yet been able to figure out what's going on.

In the meantime, I've temporarily hard-coded stuff. Note that vllm-cpu-utils.patch seems necessary even with CUDA, so I recommend adding the first hunk to this branch:

diff --git a/recipes/vllm/recipe.yaml b/recipes/vllm/recipe.yaml
index a7c71a9f13..ed07dce39e 100644
--- a/recipes/vllm/recipe.yaml
+++ b/recipes/vllm/recipe.yaml
@@ -15,7 +15,7 @@ source:
   sha256: bdeeda5624182e6a93895cbb7e20b6e88b04d22b8272d8a255741b28b36ae941
   patches:
   - patches/vllm-cmakefiles.patch
-  - if: linux and use_cuda == "false"
+  - if: linux
     then:
     - patches/vllm-cpu-utils.patch
   - if: is_cross_compiling == "true"
@@ -54,7 +54,7 @@ requirements:
   - ${{ stdlib('c') }}
   - ${{ compiler('c') }}
   - ${{ compiler('cxx') }}
-  - if: use_cuda == "true"
+  - if: true
     then:
     - ${{ compiler('cuda') }}
   - if: is_cross_compiling == "true"
@@ -62,7 +62,7 @@ requirements:
     - python
     - cross-python_${{ target_platform }}
     - pytorch ==${{ pytorch_version }}
-    - if: use_cuda == "true"
+    - if: true
       then:
       - pytorch-gpu
       else:
@@ -79,7 +79,7 @@ requirements:
   - if: linux
     then:
     - libnuma
-  - if: use_cuda == "true"
+  - if: true
     then:
     - pytorch-gpu
     - nvtx-c
@@ -144,7 +144,7 @@ requirements:
   - pytorch ==${{ pytorch_version }}
   - torchaudio ==${{ pytorch_version }}
   - torchvision ==0.20.1
-  - if: use_cuda == "true"
+  - if: true
     then:
     - pytorch-gpu
     else:

It seems to be running now. I'll let you know how it goes.

@shermansiu

Hmm, interesting! I've always been using build-locally.py.

@maresb (Author) commented Feb 19, 2025

The pip check failed due to conda-forge/xgrammar-feedstock#5. Adding it in by hand and rerunning...
