The vllm inline inference adapter works in both the `conda` and `docker` stack types, but some features fail in the `docker` case because the base image does not include all necessary dependencies (some CUDA libraries, in particular). The specific case that failed for me was trying to use `tensor_parallel_size` greater than `1`: NCCL fails to initialize because the nccl library isn't present.

I started working on this, but haven't gotten it fully working yet.
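As a quick sanity check inside the docker image, you can probe for the NCCL shared library that vLLM's tensor-parallel path depends on. This is a minimal diagnostic sketch, not part of the fix itself; it only checks whether `libnccl` is discoverable on the system library path:

```python
# Check whether the NCCL shared library is visible to the dynamic linker.
# If this prints None inside the docker image, multi-GPU tensor parallelism
# in vLLM will fail at NCCL initialization, as described above.
import ctypes.util

nccl_path = ctypes.util.find_library("nccl")
print("libnccl found:", nccl_path)
```

Running this in the `conda` environment (where it works) versus the `docker` image should show the missing dependency directly.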
This is a follow-up issue for #181.
My WIP is here: russellb@3a61246