
Multi-GPU Support: Originally limited to single-GPU setups, this code… #48

Open
wants to merge 2 commits into
base: main

Conversation

StevenChen16

  1. Multi-GPU Support: Originally limited to single-GPU setups, this code now leverages the Accelerate library with init_empty_weights and infer_auto_device_map for multi-GPU deployment, maximizing memory utilization across available GPUs.

  2. Efficient Weight Management: By using load_checkpoint_and_dispatch, model weights are dynamically allocated across GPUs and offloaded to disk as needed, enhancing memory efficiency for larger models (a sketch of this loading path follows this list).

  3. Nested Event Loop Support: The addition of nest_asyncio enables nested event loops, improving compatibility when running FastAPI within Jupyter or similar environments (a second sketch below illustrates this).

  4. Code Simplification: Streamlined model and tokenizer loading eliminates manual device allocation, making the code more readable and efficient.
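
A minimal sketch of the loading path described in points 1 and 2, assuming a local GLM-4-style checkpoint directory and assuming GLMBlock is the transformer-block class that should not be split across devices (both names are placeholders, not taken from this PR):

```python
import torch
from accelerate import init_empty_weights, infer_auto_device_map, load_checkpoint_and_dispatch
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = "./glm-4-9b-chat"  # hypothetical local checkpoint directory

config = AutoConfig.from_pretrained(MODEL_PATH, trust_remote_code=True)

# Build the model skeleton on the meta device, without allocating real weight tensors.
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config, trust_remote_code=True)

# Compute a per-module placement across the visible GPUs (spilling to CPU/disk if needed).
device_map = infer_auto_device_map(
    model,
    no_split_module_classes=["GLMBlock"],  # assumption: keep each transformer block on one device
    dtype=torch.bfloat16,
)

# Load the checkpoint shards, dispatch each weight to its assigned device,
# and offload anything that does not fit to disk.
model = load_checkpoint_and_dispatch(
    model,
    checkpoint=MODEL_PATH,
    device_map=device_map,
    offload_folder="offload",
    dtype=torch.bfloat16,
).eval()

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
```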

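And a minimal sketch of point 3, showing how nest_asyncio lets uvicorn.run() start inside an environment that already has a running event loop (the /health route is just a placeholder):

```python
import nest_asyncio
import uvicorn
from fastapi import FastAPI

# Patch asyncio so a new event loop can run inside an already-running one
# (e.g. when launching the FastAPI server from a Jupyter notebook).
nest_asyncio.apply()

app = FastAPI()

@app.get("/health")
def health():
    return {"status": "ok"}

uvicorn.run(app, host="0.0.0.0", port=8000)
```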
@zRzRzRzRzRzRzR
Member

Why not directly use the auto solution provided by transformers? It can automatically allocate the model across different GPUs when a single GPU does not have enough memory.
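
For reference, the transformers-only path referred to here would look roughly like the following sketch (the model id is a placeholder):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = "THUDM/glm-4-9b-chat"  # placeholder model id

# device_map="auto" asks transformers/accelerate to split the model
# across all visible GPUs (and CPU) automatically.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    trust_remote_code=True,
    device_map="auto",
    torch_dtype="auto",
).eval()
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
```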

@StevenChen16
Author

Why not directly use the auto solution provided by transformers? It can automatically allocate the model across different GPUs when a single GPU does not have enough memory.

I chose a custom approach over the transformers auto allocation because it offers finer control over GPU memory management. Specifically, by using Accelerate with init_empty_weights and infer_auto_device_map, I can define exact memory constraints per GPU, ensuring stable distribution even when memory is limited or varies across devices. This method also leverages offloading to disk for parts of the model that exceed GPU capacity, reducing the risk of memory issues during runtime.
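
A minimal sketch of what that finer control looks like, assuming two GPUs with placeholder memory budgets (the GLMBlock class name is the same assumption as in the loading sketch above):

```python
import torch
from accelerate import init_empty_weights, infer_auto_device_map
from transformers import AutoConfig, AutoModelForCausalLM

MODEL_PATH = "./glm-4-9b-chat"  # hypothetical local checkpoint directory

config = AutoConfig.from_pretrained(MODEL_PATH, trust_remote_code=True)
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config, trust_remote_code=True)

# Explicit per-device budgets: cap each GPU below its physical capacity and
# allow CPU RAM as a spillover target (values here are placeholders).
max_memory = {0: "20GiB", 1: "20GiB", "cpu": "48GiB"}

device_map = infer_auto_device_map(
    model,
    max_memory=max_memory,
    no_split_module_classes=["GLMBlock"],  # assumption: keep each transformer block whole
    dtype=torch.bfloat16,
)
# Anything that still does not fit is mapped to "cpu" or "disk" and is then
# offloaded by load_checkpoint_and_dispatch(..., offload_folder=...).
```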
