ERROR when running real_time_interactive_demo.py #59

Open
CR400AF-A opened this issue Dec 11, 2024 · 2 comments

Comments

@CR400AF-A

Hi, thanks for your great work! I have tried gradio_demo and it is perfect.

When I try real_time_interactive_demo, it fails. It seems that BOTH models are loaded onto GPUs 0 and 1 (tensor_parallel=2), even though I specified different GPUs for each (0,1 for the first model and 2,3 for the second), which causes an OOM error.

It seems to be an error due to vLLM. I tried to fix it, but without success. Have you ever encountered this problem?
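For reference, one common way to keep two models on disjoint GPUs is to start each model in its own process and set the `CUDA_VISIBLE_DEVICES` environment variable before CUDA is initialized. Each process then sees only its own two GPUs (renumbered from 0 inside that process), so tensor_parallel=2 should map onto them. The sketch below only illustrates that idea and is not necessarily how this repository handles it. `env_for_gpus`, `launch_worker`, and `WORKER` are hypothetical names, and the worker just prints the variable rather than loading a model.

```python
import os
import subprocess
import sys


def env_for_gpus(gpu_ids, base_env=None):
    """Return a copy of the environment restricted to the given physical GPU ids."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


def launch_worker(code, gpu_ids):
    """Run `code` in a fresh Python process that can only see `gpu_ids`."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env_for_gpus(gpu_ids),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Hypothetical stand-in for "load one model with tensor_parallel=2":
# the child process only reports which GPUs it is allowed to see.
WORKER = "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"

first = launch_worker(WORKER, [0, 1])   # process for the first model
second = launch_worker(WORKER, [2, 3])  # process for the second model
print(first, second)
```

In a real setup the worker would be the script that builds the model, and the variable has to be set before anything in that process touches CUDA.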

Looking forward to your advice.

@longzw1997
Collaborator

Thank you for your attention. We have updated the code. Please download the latest version and try it again.

@Anonymous4CV

Can you run the model with tensor_parallel=2 now?
