Using vLLM data parallelism together with ChatHaruhi raises RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method #83

545771889a · 2024-11-01T07:17:43Z

My code:

import torch
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams
# Merely importing ChatHaruhi is enough to trigger "Cannot re-initialize CUDA in
# forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method"
from chatharuhi import ChatHaruhi

def load_model_(model_name, peft_model, quantization=None, use_fast_kernels=True, seed=42, **kwargs):
    # Load the model, tokenizer, and RAG
    # This only works when tensor_parallel_size is set to 1
    llm = LLM(model=model_name, max_model_len=40452, tensor_parallel_size=2)
    torch.cuda.manual_seed(seed)
    torch.manual_seed(seed)

    # MODEL_NAME is not defined in this snippet; presumably the same value as model_name
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    tokenizer.pad_token = tokenizer.eos_token

    # RAG
    chatbot = ChatHaruhi(role_name='Sheldon', max_len_story=1000)
    return llm, tokenizer, chatbot


LC1332 · 2024-11-03T09:41:29Z

There may be a conflict when the RAG vector (embedding) model is started internally -o-
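The error message itself points at the multiprocessing start method, so one thing worth trying is forcing "spawn" before anything touches CUDA. The sketch below is an untested suggestion, not a fix confirmed in this thread. It assumes that importing chatharuhi initializes CUDA in the parent process and that vLLM then forks its tensor-parallel workers. `VLLM_WORKER_MULTIPROC_METHOD` is a vLLM environment variable; the helper name `get_spawn_context` is made up for illustration.

```python
import multiprocessing as mp
import os

# Assumption: vLLM reads this variable when it starts worker processes,
# so it has to be set before `vllm` (and `chatharuhi`) are imported.
os.environ["VLLM_WORKER_MULTIPROC_METHOD"] = "spawn"


def get_spawn_context():
    """Return a multiprocessing context that starts children with 'spawn'.

    Hypothetical helper: use it for any worker processes you create yourself,
    so they do not inherit an already-initialized CUDA state via fork.
    """
    return mp.get_context("spawn")


ctx = get_spawn_context()

# The heavy imports would follow here, after the start method is configured:
# from vllm import LLM, SamplingParams
# from chatharuhi import ChatHaruhi
```

With "spawn", each worker starts a fresh interpreter that re-imports the main script, so the code that builds `LLM(...)` should sit under an `if __name__ == "__main__":` guard.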
