rm_alloc returned 81: Out of memory #658
Comments
> What device are you running on?

Arch Linux
I think I have the same issue.
Sounds like exo is trying to use only VRAM and do inference on the GPU, instead of doing hybrid inference or falling back to CPU-only inference. I ran into the same issue today.
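A minimal sketch of the kind of fallback being described here, not exo's actual device-selection code: it checks free VRAM on GPU 0 (via pynvml, which is an assumed dependency) and falls back to CPU if the model would not fit.

```python
# Hypothetical fallback sketch (not exo's real logic): use the GPU only if the
# model's weights fit in currently free VRAM, otherwise run on the CPU.
import pynvml

def pick_device(model_bytes: int) -> str:
    try:
        pynvml.nvmlInit()
        handle = pynvml.nvmlDeviceGetHandleByIndex(0)
        free_vram = pynvml.nvmlDeviceGetMemoryInfo(handle).free
        pynvml.nvmlShutdown()
        if model_bytes <= free_vram:
            return "gpu"   # whole model fits in VRAM
    except pynvml.NVMLError:
        pass               # no NVIDIA GPU or driver problem -> use CPU
    return "cpu"           # fall back instead of raising an out-of-memory error

# Example: roughly 16 GB of fp16 weights
print(pick_device(16 * 1024**3))
```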
Mind saying how you were able to do it? Did you isolate the GPU from the rest of the system, making it look like the system had only an integrated GPU? I assume OOM may not be an issue on systems with integrated graphics/unified memory, because the GPU and CPU share the same memory.
I thought about implementing a Docker workaround too. It's reliable and effective, but I believe it is not ideal, since it implies that a network is also emulated within the system for communication between the CPU and GPU, and that imposes an overhead which can be significant at high speeds like 10 Gbps. An ideal solution, I believe, would be to support hybrid inference on individual nodes.
exo seems to be OOMing despite having lots of free RAM.
If I read the README.md correctly, I should only need 16GB of RAM to run this model, so with more than 100GB free I should not be OOMing.
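For what it's worth, free system RAM may not be the relevant pool if the allocation is happening on the GPU. A quick sketch to compare the two (assuming psutil and pynvml are installed; this is not part of exo):

```python
# Compare free system RAM with free VRAM on GPU 0.
import psutil
import pynvml

ram_free = psutil.virtual_memory().available

pynvml.nvmlInit()
vram_free = pynvml.nvmlDeviceGetMemoryInfo(
    pynvml.nvmlDeviceGetHandleByIndex(0)).free
pynvml.nvmlShutdown()

print(f"free system RAM : {ram_free / 1024**3:.1f} GiB")
print(f"free VRAM (GPU 0): {vram_free / 1024**3:.1f} GiB")
# If the model needs ~16GB and free VRAM is below that, an allocator that
# only uses VRAM will OOM no matter how much system RAM is free.
```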