Load best model + evaluate best model on devset #11

OrianeN · 2025-01-22T10:47:15Z

I've noticed that the model loaded at the end of training differs depending on whether EarlyStoppingException was raised:

if EarlyStoppingException was raised: the best model is loaded
otherwise: training continues with the last model

I'd like to suggest always loading the best model at the end of the training phase, as well as performing the final evaluation successively on the devset and on the testset (with the best model).
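
To illustrate the proposed control flow, here is a minimal, self-contained sketch. It is not the actual PaPie code: the `EarlyStopping` class, the `train` function, the dict-based "model" and the `patience` parameter are all hypothetical stand-ins, and the dev losses are passed in rather than computed.

```python
import copy


class EarlyStopping(Exception):
    """Hypothetical stand-in for the trainer's early-stopping exception."""


def train(model, dev_losses, patience=2):
    """Toy loop: `model` is a dict, `dev_losses` simulates one dev loss per epoch."""
    best_loss, best_state, fails = float("inf"), None, 0
    try:
        for epoch, loss in enumerate(dev_losses, start=1):
            model["epoch"] = epoch  # pretend the weights were updated
            if loss < best_loss:
                # New best dev loss: remember a copy of the current state
                best_loss, fails = loss, 0
                best_state = copy.deepcopy(model)
            else:
                fails += 1
                if fails >= patience:
                    raise EarlyStopping
    except EarlyStopping:
        pass  # old behaviour: the best state was restored only on this path

    # Proposed behaviour: restore the best state whether or not training stopped early
    if best_state is not None:
        model.clear()
        model.update(best_state)
    return model, best_loss
```

With dev losses `[0.5, 0.3, 0.4]` and `patience=2`, training runs to the end without early stopping, yet the returned model is the one from epoch 2 rather than epoch 3.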

Logs of 2 runs to show you the changes in practice:

without loading the best model: papie_run_noloadbest.log (the POS loss of the last model is higher than that of the second-to-last one)
with loading the best model: papie_run_loadbest.log

Note: I've never tried training without a dev set, so I'm not sure whether the comment at the end of the Trainer.train_epochs() function still applies:
https://github.com/OrianeN/PaPie/blob/d23f4a5e95638e74dad640cf7f1d44b896c2fb0e/pie/trainer.py#L429-L430

Previously, the best model was loaded only when early stopping occurred; it is now loaded at the end of every training run. The final evaluation is also performed with the best model, first on the dev set and then on the test set.

OrianeN · 2025-01-22T12:38:11Z

As discussed, I made the training script save both the best and the last model, as well as print the index of the best epoch (starting from 1).

Here are the updated example logs: papie_run_loadbest_saveboth.log
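
The idea of keeping both checkpoints and reporting a 1-based best epoch can be sketched as follows. This is only an illustration under assumptions: `save_best_and_last`, the JSON serialization and the file names `model_best.json` / `model_last.json` are hypothetical and do not reflect how PaPie actually stores models, and a higher dev score is assumed to be better.

```python
import json
from pathlib import Path


def save_best_and_last(states, dev_scores, out_dir):
    """Save the best and the last state; states[i] and dev_scores[i] belong to epoch i + 1."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Index of the highest dev score (the first one wins in case of a tie)
    best_idx = max(range(len(dev_scores)), key=lambda i: dev_scores[i])

    # Hypothetical file names, chosen for this sketch only
    (out_dir / "model_best.json").write_text(json.dumps(states[best_idx]))
    (out_dir / "model_last.json").write_text(json.dumps(states[-1]))

    best_epoch = best_idx + 1  # report epochs starting from 1
    print(f"Best epoch: {best_epoch}")
    return best_epoch
```

Keeping the last model alongside the best one makes it possible to compare the two afterwards or to resume training from the final state.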

Load best model + evaluate best model on devset

d23f4a5


OrianeN marked this pull request as draft January 22, 2025 12:17

Save both best and last models + print index of the best epoch

8a169da

OrianeN marked this pull request as ready for review January 22, 2025 12:38
