Support Mixture of Experts (MoE) Models #32
Comments
I looked at this yesterday. It would be great if exo could support DeepSeek V2; sharding the model at the DeepseekV2DecoderLayer level should be very similar to the existing llama sharding. But it may also be worth trying model parallelism -> ml-explore/mlx-examples#890
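A rough sketch of what layer-range sharding could look like for a DeepSeek-V2-style decoder stack, following the same idea as exo's llama sharding (this is not exo's actual API; `Shard`, `partition_layers`, and the 60-layer / 3-node numbers are only for illustration):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Shard:
    """Hypothetical shard descriptor: a contiguous range of decoder layers
    (e.g. DeepseekV2DecoderLayer blocks) owned by one node."""
    model_id: str
    start_layer: int
    end_layer: int  # inclusive
    n_layers: int


def partition_layers(model_id: str, n_layers: int, memory_weights: List[float]) -> List[Shard]:
    """Split the decoder stack across nodes proportionally to each node's
    share of cluster memory, the same idea as exo's llama layer sharding."""
    total = sum(memory_weights)
    shards, start, cumulative = [], 0, 0.0
    for w in memory_weights:
        cumulative += w / total
        end = round(cumulative * n_layers) - 1  # inclusive end index for this node
        shards.append(Shard(model_id, start, end, n_layers))
        start = end + 1
    return shards


if __name__ == "__main__":
    # Example: a 60-layer decoder stack split across 3 nodes holding
    # 50% / 25% / 25% of the cluster's memory.
    for shard in partition_layers("deepseek-v2", 60, [0.5, 0.25, 0.25]):
        print(shard)
```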
Would like to work on this :)
@345ishaan that would be great - go for it
Indeed, MoE is the most suitable application scenario for exo and should be prioritized for implementation.
Looking forward to support for MoE DeepSeek V2 (total: 236B, active: 21B).
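Back-of-the-envelope numbers for why distributing this across nodes matters: all 236B parameters have to live somewhere, even though only ~21B are touched per token. A quick sketch (weights only, ignoring KV cache and activations):

```python
# Rough memory footprint for a 236B-total / 21B-active MoE model
# at different weight precisions.
TOTAL_PARAMS = 236e9
ACTIVE_PARAMS = 21e9

for name, bytes_per_param in [("fp16", 2), ("8-bit", 1), ("4-bit", 0.5)]:
    total_gb = TOTAL_PARAMS * bytes_per_param / 1e9
    active_gb = ACTIVE_PARAMS * bytes_per_param / 1e9
    print(f"{name}: ~{total_gb:.0f} GB to store all experts, "
          f"~{active_gb:.0f} GB touched per token")
```

So even at 4-bit the full model is on the order of 120 GB, far beyond a single consumer GPU, while the per-token active slice is comparatively small.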
Yeah, I was planning to experiment with the setup using https://github.com/deepseek-ai/DeepSeek-Coder-V2. Will be looking into it this weekend.
Is it possible to have the active parameters favor an Nvidia CUDA GPU and let the other nodes store the inactive parameters?
Yes! We can definitely do something like this.
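To illustrate the idea (not exo's API; the expert count, top-k value, and expert-to-node map are made up for the example): the router's top-k choice tells you which experts each token needs, so tokens can be grouped by the node that hosts those experts, while the dense "active" path of the layer stays on the CUDA node.

```python
import math
import random
from collections import defaultdict

# Hypothetical placement: the hot path (attention, router, shared weights)
# runs on the CUDA node; routed experts are spread across the other nodes.
EXPERT_TO_NODE = {expert: f"node-{expert % 3}" for expert in range(8)}  # 8 experts, 3 nodes


def top_k_gate(logits, k=2):
    """Pick the k highest-scoring experts and renormalize their softmax weights."""
    top = sorted(range(len(logits)), key=lambda e: logits[e], reverse=True)[:k]
    exps = [math.exp(logits[e]) for e in top]
    total = sum(exps)
    return [(e, x / total) for e, x in zip(top, exps)]


def route_batch(router_logits):
    """Group (token, expert, weight) work items by the node hosting each expert,
    so a node only has to keep and run the experts it is actually assigned."""
    work = defaultdict(list)  # node -> [(token_idx, expert_id, gate_weight)]
    for token_idx, logits in enumerate(router_logits):
        for expert, weight in top_k_gate(logits):
            work[EXPERT_TO_NODE[expert]].append((token_idx, expert, weight))
    return work


if __name__ == "__main__":
    random.seed(0)
    fake_router_logits = [[random.gauss(0, 1) for _ in range(8)] for _ in range(4)]
    for node, jobs in route_batch(fake_router_logits).items():
        print(node, jobs)
```

The real DeepSeek-V2 router uses many more experts (plus shared experts), but the grouping logic would be the same: only the nodes whose experts are selected do work for a given token, and only the gate weights and expert outputs need to cross the network.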