Hi,
It seems we can load a model on one node and then distribute it across nodes for training or inference, but:
Imagine we have 2 nodes, each with 2 GPUs and 24 GB of VRAM per GPU, and we want to load a model like Gemma 2 27B. One node cannot hold it, so the load needs to be distributed across nodes from the very start, without ever materializing the complete model on a single node. (Imagine that no node can offload enough to hold the complete model either, as would be the case with Llama 3.1 405B.)
Is there a way to do this?
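To make the idea concrete, here is a rough sketch of what I'm imagining (untested; it assumes PyTorch FSDP with meta-device initialization and a `torchrun` launch of one process per GPU across both nodes, none of which I've confirmed works for this case):

```python
# Rough sketch, not a confirmed recipe: build the model on the "meta"
# device so nothing is materialized, then let FSDP shard the parameters
# across all 4 ranks before any real memory is allocated.
import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from transformers import AutoConfig, AutoModelForCausalLM

dist.init_process_group("nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

# Construct the model on the meta device: no CPU RAM or VRAM is used.
config = AutoConfig.from_pretrained("google/gemma-2-27b")
with torch.device("meta"):
    model = AutoModelForCausalLM.from_config(config)

# FSDP shards the (still empty) parameters, so each rank only ever
# allocates its own shard, roughly 1/4 of the model here.
model = FSDP(
    model,
    device_id=local_rank,
    param_init_fn=lambda m: m.to_empty(device="cuda", recurse=False),
)

# The real weights would then have to arrive shard-by-shard, e.g. via
# torch.distributed.checkpoint, so no rank ever holds the full model.
```

The open question for me is the last step: loading the pretrained weights into those shards without any single node (or its CPU) ever holding the full checkpoint.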