This repository has been archived by the owner on Nov 1, 2024. It is now read-only.
I couldn't find detailed documentation (or a step-by-step guide) on pre-training OPT-125M with exactly the same corpus and model architecture that you used in the paper. In short, I would like to reproduce the results of your smallest model from scratch.
Could you point me to the relevant guide, or share anything else that would help? Many thanks.