Add vllm #28931
base: main
Conversation
Hi! This is the staged-recipes linter and your PR looks excellent! 🚀
Hi! This is the friendly automated conda-forge-linting service. I wanted to let you know that I linted all conda-recipes in your PR. Here's what I've got...

For recipes/vllm/recipe.yaml:
This message was generated by GitHub Actions workflow run https://github.com/conda-forge/conda-forge-webservices/actions/runs/12962027449. Examine the logs at this URL for more detail.
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR.
Interesting, we're getting different results between CUDA 11.8 and 12.0. Both fail in the following command:

```
['cmake', '$SRC_DIR', '-G', 'Ninja', '-DCMAKE_BUILD_TYPE=RelWithDebInfo',
 '-DVLLM_TARGET_DEVICE=cuda', '-DVLLM_PYTHON_EXECUTABLE=$PREFIX/bin/python',
 '-DVLLM_PYTHON_PATH=$PREFIX/lib/python3.9/site-packages/pip/_vendor/pyproject_hooks/_in_process:$PREFIX/lib/python39.zip:$PREFIX/lib/python3.9:$PREFIX/lib/python3.9/lib-dynload:$PREFIX/lib/python3.9/site-packages:$PREFIX/lib/python3.9/site-packages/setuptools/_vendor',
 '-DFETCHCONTENT_BASE_DIR=$SRC_DIR/.deps', '-DNVCC_THREADS=1',
 '-DCMAKE_JOB_POOL_COMPILE:STRING=compile', '-DCMAKE_JOB_POOLS:STRING=compile=2']
```

12.0 fails earlier, at CUDA detection. 11.8 gets further.
Hi! This is the friendly automated conda-forge-linting service. I wanted to let you know that I linted all conda-recipes in your PR. Here's what I've got...

For recipes/vllm/recipe.yaml:
This message was generated by GitHub Actions workflow run https://github.com/conda-forge/conda-forge-webservices/actions/runs/12967644561. Examine the logs at this URL for more detail.
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR.
Thanks @maresb! I had forgotten that there's already #24710; perhaps @mediocretech would be interested in collaborating? W.r.t. CUDA, we need to move on from 12.0 here, which isn't used anywhere else in conda-forge anymore; it's just that staged-recipes seems to have been forgotten in the context of conda-forge/conda-forge-pinning-feedstock#6630.
Oh, I didn't notice that effort, thanks @h-vetinari! Although that's old, it looks like @rongou is eager to help! 🚀 Do you think that CUDA 12.0 is actually causing a problem here? I was thinking (i.e. wildly guessing) that we need to patch […].
Mainly I want to avoid redundant work. As soon as #28938 is in and we have merged main here, I'll be happy to take a look at what's going on.
In any case, you'll have to address […].
Woah, after adding […]:
Hi! This is the staged-recipes linter and I found some lint. It looks like some changes were made outside the recipes/ directory. If these changes are intentional (and you aren't submitting a recipe), please add a […] label.

File-specific lints and/or hints: […]
Ah, hmm, I just added swap to conda-forge.yml. Not sure how that's supposed to work here on staged-recipes. 🤔

EDIT: Oh good, the linter is complaining, so that will help us remember to revert it before merging.

EDIT 2: Hmm, it seems that the swap setting works on […]
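For reference, the swap knob lives under the Azure section of conda-forge.yml in conda-smithy's schema, as I understand it; a minimal sketch (the size value here is an arbitrary assumption):

```yaml
# conda-forge.yml -- sketch only; the 10GiB figure is an assumption
azure:
  settings_linux:
    swapfile_size: "10GiB"
```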
Hi @h-vetinari!

As a brief summary of the above, I merged […]. On CUDA 12.x I'm hitting the error: […]

On 11.8, after adding […]. I'd be grateful for any advice you could provide. Thanks!
We're (now) aware of the CUDA angle of conda-forge/pytorch-cpu-feedstock#333.
I would have hoped to get more out of setting […].
Here's the corresponding Python code to go from the envvar to the flag: […]
Hmm, it still appears broken after conda-forge/pytorch-cpu-feedstock#339.
Is […]?
The CUDA 11.8 build probably fails because it's out of disk space and/or RAM, but that's just speculation: […]
Hey @shermansiu, great to have you around!!! I'm a bit lost since I'm not very familiar with CUDA. I was just now having some trouble getting the CI to rerun the CUDA builds, but rebasing seems to have fixed it. Also, post-rebase things seem to be proceeding slightly further for 12.6: […]
I'm not too sure what this means or how to fix it. I'd be very grateful for any suggestions.
Hmm, I'd like to build the recipe locally to diagnose this further, but at a glance, the following line looks a bit concerning: […]
Well, you need more than just {{ compiler("cuda") }} to get all the CUDA components you need. Looks like you need at minimum

- cuda-version =={{ cuda_compiler_version }}
- cuda-cudart-dev
- cuda-nvrtc-dev
- libcublas-dev

in the host environment. Also note that we're still figuring out an issue with nvtx, see conda-forge/pytorch-cpu-feedstock#357.
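A hedged sketch of how that minimum might look in this recipe's rattler-build syntax (the package list is taken from the comment above; the `${{ }}` pin style mirrors the rest of recipe.yaml):

```yaml
requirements:
  host:
    # minimum CUDA components suggested in the review comment above
    - cuda-version ==${{ cuda_compiler_version }}
    - cuda-cudart-dev
    - cuda-nvrtc-dev
    - libcublas-dev
```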
recipes/vllm/recipe.yaml (outdated):
```yaml
- cmake
- git
- ${{ stdlib('c') }}
- ${{ compiler('c') }}
- ${{ compiler('cxx') }}
- ${{ compiler('cuda') }}
```
All this (+ninja) should move to the build environment.
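As a sketch (not the final recipe), the quoted lines would then live under the build section, with ninja added per this comment:

```yaml
requirements:
  build:
    # toolchain and build tools belong in the build environment
    - cmake
    - ninja
    - git
    - ${{ stdlib('c') }}
    - ${{ compiler('c') }}
    - ${{ compiler('cxx') }}
    - ${{ compiler('cuda') }}
```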
This seems to resolve the nvtx issue, but then it complains about not being able to find kineto. Using USE_KINETO=0 doesn't seem to work because the existing PyTorch .cmake files in the environment already have kineto enabled.

From lib/python3.9/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:

```cmake
# USE_KINETO was substituted in as a literal ON when the pytorch package
# was built, so it cannot be switched off from the vllm build environment:
if(ON)
  append_torchlib_if_found(kineto)
endif()
```

See: […]
@shermansiu, thanks for continuing to push this through. Would you add yourself as a recipe maintainer?
Sounds good!
I was able to cross-compile the wheel for macOS (arm64). All that's left is to ensure that […].
I don't think the current scripts in staged-recipes were tested to support cross-compilation. The aarch64 one fails because it seems to depend on the setup from the newer conda-smithy script (i.e. defining RECIPE_ROOT), and the macOS arm64 one fails because cmake doesn't support Python 3.13? Something doesn't feel right...
I was able to fix the macOS compilation script, but the recipe_root definition I added is hacky, because most other parts of the conda-forge ecosystem expect a single […].
Summary: […]

TODO: […]
I can run this on a GCP instance. (Unfortunately I can't give you direct access.) But just let me know which builds you want. Just linux-64 CUDA 11.8 for now?
Yep! That's the only one left, thanks!
Sorry, didn't manage it today. Will try tomorrow.
I'm trying to build using the […] script. I'm having some really weird issues with the conditional evaluation. I did quite a few experiments but haven't yet been able to understand what's going on. In the meantime, I've temporarily hard-coded stuff. Note that […]:

```diff
diff --git a/recipes/vllm/recipe.yaml b/recipes/vllm/recipe.yaml
index a7c71a9f13..ed07dce39e 100644
--- a/recipes/vllm/recipe.yaml
+++ b/recipes/vllm/recipe.yaml
@@ -15,7 +15,7 @@ source:
sha256: bdeeda5624182e6a93895cbb7e20b6e88b04d22b8272d8a255741b28b36ae941
patches:
- patches/vllm-cmakefiles.patch
- - if: linux and use_cuda == "false"
+ - if: linux
then:
- patches/vllm-cpu-utils.patch
- if: is_cross_compiling == "true"
@@ -54,7 +54,7 @@ requirements:
- ${{ stdlib('c') }}
- ${{ compiler('c') }}
- ${{ compiler('cxx') }}
- - if: use_cuda == "true"
+ - if: true
then:
- ${{ compiler('cuda') }}
- if: is_cross_compiling == "true"
@@ -62,7 +62,7 @@ requirements:
- python
- cross-python_${{ target_platform }}
- pytorch ==${{ pytorch_version }}
- - if: use_cuda == "true"
+ - if: true
then:
- pytorch-gpu
else:
@@ -79,7 +79,7 @@ requirements:
- if: linux
then:
- libnuma
- - if: use_cuda == "true"
+ - if: true
then:
- pytorch-gpu
- nvtx-c
@@ -144,7 +144,7 @@ requirements:
- pytorch ==${{ pytorch_version }}
- torchaudio ==${{ pytorch_version }}
- torchvision ==0.20.1
- - if: use_cuda == "true"
+ - if: true
then:
- pytorch-gpu
else:
```

It seems to be running now. I'll let you know how it goes.
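For context, a minimal sketch of the pattern the hard-coding above works around; the use_cuda derivation shown here is an assumption for illustration, not the recipe's actual definition:

```yaml
context:
  # assumption: one way use_cuda might be derived from the CUDA variant;
  # rendered context values are strings, which is why the recipe compares
  # against the string "true" rather than a boolean
  use_cuda: ${{ "true" if cuda_compiler_version != "None" else "false" }}

requirements:
  build:
    - if: use_cuda == "true"
      then:
        - ${{ compiler('cuda') }}
```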
Hmm, interesting! I've always been using […].
The pip check failed due to conda-forge/xgrammar-feedstock#5. Adding it in by hand and rerunning...
Very rough draft. I will almost certainly require help.
Opened on the advice of @h-vetinari in conda-forge/xformers-feedstock#42
Direct and transitive dependencies:
Checklist

- A tarball (url) rather than a repo (e.g. git_url) is used in your recipe (see here for more details).