ScatterMoE feature #104

Open
ehartford opened this issue Apr 5, 2024 · 5 comments

Comments

@ehartford

I would like to request a ScatterMoE feature in MegaBlocks.

https://arxiv.org/abs/2403.08245

https://github.com/shawntan/scattermoe

@mvpatel2000
Contributor

We'd love community PRs for this! Happy to help review and design. It's not currently on our roadmap, but we are evaluating it.

@tgale96
Contributor

tgale96 commented Apr 9, 2024

Eric, do you know that Scatter MoE is beneficial for your use case or are you interested based on the results from the paper? If the former, it would be very helpful if you could share!

I have some scripts from Shawn and it is on my list to benchmark and see if we could get some wins from their kernels. I am a bit buried though, so I am not sure when I'll get to it 😓

@ehartford
Author

We know that training with Scatter MoE is much more efficient, and we would like to benefit from the cost savings.

@tgale96
Contributor

tgale96 commented Apr 16, 2024

Thanks, Eric. Can you share more about your use case so that we can include it in our analysis? Scripts to reproduce would be excellent, if possible :)

@ehartford
Author

This is a feature request, not a bug that could be reproduced.
The academic paper describing the feature I am requesting is linked above.

3 participants