[Bug]: Using Xinference with vLLM to serve qwen2.5-32b-instruct, all inference output is exclamation marks #1038
Comments
Not following the issue template. Your vLLM version is too old. Also, try disabling custom all-reduce if it is enabled and you are using PCIe cards.
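For context on the suggestion above: vLLM's custom all-reduce kernel assumes fast GPU peer-to-peer access, and on PCIe-only multi-GPU setups it can misbehave and produce garbage tokens. A minimal sketch of disabling it is below; the exact flag spelling and the assumption that Xinference forwards extra engine kwargs to vLLM should be verified against the docs for your installed versions.

```shell
# Launching vLLM's OpenAI-compatible server directly,
# with the custom all-reduce kernel disabled:
python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen2.5-32B-Instruct \
    --tensor-parallel-size 2 \
    --disable-custom-all-reduce

# Via Xinference (assumes extra keyword arguments are
# forwarded to the underlying vLLM engine):
xinference launch \
    --model-name qwen2.5-instruct \
    --size-in-billions 32 \
    --model-engine vllm \
    --disable_custom_all_reduce True
```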
The vLLM version has been upgraded to 0.5.1, but the issue still persists.
Can you please follow the issue template? What is your driver version? What card are you using? Did you use multiple cards? How did you start vLLM? And so on. Also, why does Xinference show custom-qwen25-32-instruct? How can we actually reproduce this?
This issue has been automatically marked as inactive due to lack of recent activity. If you believe it remains unresolved and warrants attention, kindly leave a comment on this thread.
Model Series
Qwen2.5
What are the models used?
Qwen2.5-32B-Instruct
What is the scenario where the problem happened?
Xinference
Is this a known issue?
Information about environment
System Info
Running Xinference with Docker?
Version info
The command used to start Xinference
Reproduction
Expected behavior
Normal inference results.
Log output