
Use bf16 parameters in bf16 mixed prec #283

Merged: 1 commit into meta-llama:main from jph00:patch-1 on Nov 1, 2023

Conversation

@jph00 (Contributor) commented on Nov 1, 2023

What does this PR do?

bfSixteen_mixed is a poor default choice for mixed-precision training because it does not use tensor cores; instead, it does all computation in fp32! I've tested this, and on an A6000 it makes training a 34B model 2.5x slower.
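
For reference, here is a minimal sketch of the two FSDP `MixedPrecision` policies at issue, written against `torch.distributed.fsdp`; the field values reflect my reading of this discussion rather than being copied from the repo:

```python
import torch
from torch.distributed.fsdp import MixedPrecision

# Roughly what the fp32-param default (bfSixteen_mixed) does: parameters stay
# in fp32, so the forward/backward matmuls run in fp32 and miss the bf16
# tensor-core path, while only gradient reduction and buffers use bf16.
fp32_param_policy = MixedPrecision(
    param_dtype=torch.float32,
    reduce_dtype=torch.bfloat16,
    buffer_dtype=torch.bfloat16,
)

# What this PR switches to: parameters are cast to bf16 for compute, which is
# the usual meaning of bf16 mixed precision.
bf16_param_policy = MixedPrecision(
    param_dtype=torch.bfloat16,
    reduce_dtype=torch.bfloat16,
    buffer_dtype=torch.bfloat16,
)
```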

Feature/Issue validation/testing

I tried running the fine-tuning script with this change on both 7B and 34B models, and it ran 2.5x faster each time.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Thanks for contributing 🎉!

@jph00 (Contributor, Author) commented on Nov 1, 2023

This PR was previously discussed with @lessw2020.

@HamidShojanazeri (Contributor) left a comment

Thanks @jph00 for the PR.

@lessw2020 (Contributor) commented

The core issue here is that, with the default policy, the params are being set to torch.float32 instead of the expected torch.bfloat16. This was creating the memory and slowness issues that Jeremy flagged. With the current default, we don't get the expected mixed precision, where local computations run in bf16 and only the master copies are kept in fp32. Moving the params to bf16 resolves this, hence this PR.
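
To make this concrete, below is a minimal single-process sketch of passing such a bf16-param policy to FSDP at wrap time. The model, process-group backend, and sizes are placeholders for illustration only; the actual fine-tuning script launches with torchrun and wraps the Llama model.

```python
import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, MixedPrecision

# Single-process CPU process group so the sketch runs on its own; the real
# script initializes a multi-GPU NCCL group instead.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group(backend="gloo", rank=0, world_size=1)

bf16_policy = MixedPrecision(
    param_dtype=torch.bfloat16,   # gathered params are cast to bf16 for compute
    reduce_dtype=torch.bfloat16,  # gradient reduction happens in bf16
    buffer_dtype=torch.bfloat16,
)

model = torch.nn.Linear(4096, 4096)  # placeholder for the actual model
# FSDP keeps the sharded "master" parameters in the module's original fp32
# dtype and only casts the unsharded copies to param_dtype for forward/backward.
model = FSDP(model, mixed_precision=bf16_policy)

dist.destroy_process_group()
```

This corresponds to the "local computations in bf16, master copies in fp32" behavior described above.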

@HamidShojanazeri merged commit acce2d8 into meta-llama:main on Nov 1, 2023
3 checks passed
@jph00 deleted the patch-1 branch on November 1, 2023 at 19:44