ScatterMoE feature #104
Comments
We'd love community PRs for this! Happy to help review and design. It's not currently on our roadmap, but we are evaluating it.
Eric, do you know that Scatter MoE is beneficial for your use case or are you interested based on the results from the paper? If the former, it would be very helpful if you could share! I have some scripts from Shawn and it is on my list to benchmark and see if we could get some wins from their kernels. I am a bit buried though, so I am not sure when I'll get to it 😓
We know that training with Scatter MoE is much more efficient, and we would like to benefit from the cost savings.
Thanks, Eric. Can you share more about your use case so that we can include it in our analysis? Scripts to reproduce would be excellent, if possible :)
This is a feature request, not a bug that could be reproduced.
I would like to request a ScatterMoE feature in Megablocks:
https://arxiv.org/abs/2403.08245
https://github.com/shawntan/scattermoe