Special remat for Neuron #898
Conversation
Thanks for the reviews. Most comments are resolved and the PR looks clean; let me know if more changes are needed.
A suggestion to address the concern from @apghml. WDYT?
Thanks for the guidance. I addressed all comments; let me know if more changes are needed.
High-level comment: we probably should not add more activation checkpoints in the attention transformer layer class unless we must. Otherwise the change looks good to me. Will approve once that part is reverted.
Please address @kelvin-zou's comments. Otherwise LGTM.
I removed the extra remat save points as discussed. Please let me know if more changes are needed. Thank you!
Hello @ruomingp, it looks like the merge is blocked on your requested changes. Could you please mark them as resolved so this PR can be merged? Thank you!
Updated the PR to avoid merge conflicts. Let's merge this soon, thanks everyone!
@kelvin-zou thanks for the review, I resolved your comments.
Addressed all comments. Thanks for the review @kelvin-zou @hanzhi713. Let's try to merge this soon. |
Thanks. In the meantime, could you address comments in #884?
@apoorvtintin Can you fix the pre-commit errors?
@hanzhi713 Fixed the pre-commit errors, thank you!
I see test
The test itself is poorly written @markblee. It didn't wait for the async future to finish. I will send a PR to fix it.
This PR adds a special remat configuration for TRN2 and Fuji-70B. This is done by adding a new remat policy that matches remat names against multiple regex patterns, which allows more flexible remat configurations across different backends and device types.
Misc: Enable remat for StackedTransformer
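The core idea of such a policy can be sketched in plain Python. This is a minimal illustration, not the PR's actual implementation: `make_regex_save_policy` and the example remat names are hypothetical, standing in for the names a model would tag via its checkpointing machinery.

```python
import re


def make_regex_save_policy(patterns):
    """Build a remat policy from regex patterns (hypothetical sketch).

    The returned predicate answers, for a given remat name, whether the
    corresponding tensor should be saved (True) or rematerialized during
    the backward pass (False).
    """
    compiled = [re.compile(p) for p in patterns]

    def policy(name):
        # Save tensors whose remat name matches any configured pattern;
        # recompute everything else.
        return any(r.search(name) for r in compiled)

    return policy


# Hypothetical patterns and remat names for illustration only.
policy = make_regex_save_policy([r"\.attention\.", r"ffn_output"])
print(policy("decoder.layer3.attention.q_proj"))  # matches first pattern
print(policy("decoder.layer3.mlp.hidden"))        # matches nothing
```

Configuring the policy as a list of regexes (rather than a fixed set of names) is what makes it easy to tune per backend: a TRN2 config and a GPU config can each ship their own pattern list without touching the layer code.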