Add schedule-free adamw submission in JAX #809
Comments
I can help debug any issues here. Do you have any code you can share? If there are issues with the optax JAX implementation, I want to get them fixed ASAP.
There are many small differences between the behavior of the schedule-free JAX wrapper and the original AlgoPerf submission. Some differences I'm aware of:
So overall I expect the JAX wrapper version to give equally good results on all problems (maybe slightly slower on fastMRI), so if there is a difference, it would come from some sort of bug.
Hi Aaron! Thanks for weighing in on this. I seem to have missed your messages on this thread. We have a slightly modified version based on the optax code here: https://github.com/priyakasimbeg/algorithmic-efficiency/blob/compare_schedule_free/tests/test_algorithms/schedule_free_adamw/jax/schedule_free_optax.py. I'm working on a test that compares the PyTorch and JAX implementations side by side on the AlgoPerf GitHub code, but the test is still in progress. I can perhaps run a full training run on some of the workloads.
OK, I'll take a look and see if I spot any differences.
It looks like the z buffer may be initialized with zeros. Suggestion: you might want to set z on the first call to the main optimizer update; that's what we do in the PyTorch version.
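To illustrate why the initialization matters, here is a minimal JAX sketch. It assumes the usual schedule-free interpolation, where gradients are evaluated at y = beta1 * x + (1 - beta1) * z. The helper names (`init_z_zeros`, `init_z_from_params`, `maybe_init_z`, `interpolate_y`) and the `beta1` default are hypothetical and are not taken from optax or from the linked file. If z starts at zero instead of at the initial weights, the first gradient evaluation point is pulled toward the origin.

```python
import jax
import jax.numpy as jnp


def init_z_zeros(params):
    # Hypothetical problematic init: z starts at the origin.
    return jax.tree_util.tree_map(jnp.zeros_like, params)


def init_z_from_params(params):
    # z starts as a copy of the initial parameters.
    return jax.tree_util.tree_map(jnp.array, params)


def maybe_init_z(z, params, count):
    # jit-friendly "set z on the first update": when count == 0,
    # overwrite z with the current parameters, otherwise keep z.
    is_first = jnp.asarray(count) == 0
    return jax.tree_util.tree_map(
        lambda zi, pi: jnp.where(is_first, pi, zi), z, params
    )


def interpolate_y(x, z, beta1=0.9):
    # Gradient evaluation point: y = beta1 * x + (1 - beta1) * z.
    return jax.tree_util.tree_map(
        lambda xi, zi: beta1 * xi + (1.0 - beta1) * zi, x, z
    )


params = {"w": jnp.array([1.0, -2.0])}
z_bad = init_z_zeros(params)
z_good = init_z_from_params(params)

# With z copied from params, y coincides with the initial weights.
y_good = interpolate_y(params, z_good)
# With z = 0, y is the initial weights shrunk by a factor of beta1.
y_bad = interpolate_y(params, z_bad)
# Lazy fix: repair a zero-initialized z on the first update call.
z_fixed = maybe_init_z(z_bad, params, count=0)
```

The `maybe_init_z` variant mirrors the suggestion of setting z on the first update call; copying the parameters inside the optimizer's init function would have the same effect whenever init receives the actual starting weights.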
@priyakasimbeg Let me know if that initialization issue was the problem.
Hi Aaron, thanks for spotting that!
Description
So far we have been unable to reproduce the schedule-free AdamW results with JAX.
There seem to be differences between the optax implementation of schedule-free AdamW and the PyTorch submission.