Fine tuning & catastrophic forgetting from GNNFF frameworks? #539
-
I assume we all know what catastrophic forgetting in neural networks is in general. I haven't had a chance to test this with MACE, but I was wondering: if I fine-tune the MP pre-trained model on specific structures and surfaces of specific alloys or oxides, would the resulting model "forget" the original description of the same or similar materials that the MP pre-trained model had? For example, if I fine-tune a model on amorphous silica and amorphous silica surfaces, how much of the accuracy on SiO2 crystals and pure silicon crystal properties would remain intact from the original model, whether I fine-tune from the MP pre-trained model or from other custom models for Si-O crystals? And what would happen to the original behaviour after multiple rounds of fine-tuning?

As far as I know, GNNs cannot escape the forgetting problem, so I assume GNNFF MD frameworks like MACE (and even non-GNN frameworks like DeePMD) can run into this issue. I don't know whether MACE has ever been tested for this, or how to prevent or minimize the forgetting. Any ideas or suggestions?

I have been trying something like rehearsal, including some data from the previous training set in the new training set when I fine-tune, just in case (sketched below). But this makes the training set bigger and bigger as I go through more rounds of fine-tuning, so I was wondering whether there are better ways to prevent or minimize forgetting.
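For reference, the rehearsal/replay approach I mean is just mixing a random subset of the old training frames into the new fine-tuning set before retraining. A minimal sketch using ASE, assuming extxyz files (file names and the replay fraction are placeholders):

```python
# Rehearsal sketch: mix a random subset of old training frames into the new
# fine-tuning set. File names and the replay fraction are placeholders.
import random
from ase.io import read, write

old_frames = read("previous_train.xyz", index=":")    # full previous training set
new_frames = read("amorphous_silica.xyz", index=":")  # new fine-tuning data

replay_fraction = 0.2  # keep only part of the old data so the set stays manageable
random.seed(0)
replay = random.sample(old_frames, k=int(replay_fraction * len(old_frames)))

write("finetune_train.xyz", new_frames + replay)
```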
-
We address this with multi-head fine-tuning, where some of the old data is retained during fine-tuning on a separate head from your new data. It seems to work well. We have a multi-head-interface branch where this is implemented automatically; you just have to provide the new data. Don't forget to provide new isolated atom energies, and to use spin-polarised data (as the foundation model does).
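For what it's worth, one common way to supply the new isolated atom energies is to prepend single-atom reference frames to the fine-tuning file (alternatively they can be passed via the training script's E0s option). A minimal sketch with ASE; the energies, file names, and the exact `config_type`/energy-key conventions shown here are assumptions, so check the multi-head-interface branch documentation for what it actually expects:

```python
# Sketch: append isolated-atom reference energies to the fine-tuning file.
# Energies below are placeholders; use your own spin-polarised DFT values.
from ase import Atoms
from ase.io import read, write

e0s = {"Si": -0.123, "O": -0.456}  # placeholder isolated-atom energies (eV)

frames = read("finetune_train.xyz", index=":")
for symbol, e0 in e0s.items():
    atom = Atoms(symbol)                       # single atom at the origin
    atom.info["config_type"] = "IsolatedAtom"  # assumed tag for isolated-atom frames
    atom.info["REF_energy"] = e0               # use whatever energy key your run expects
    frames.insert(0, atom)

write("finetune_train_with_e0s.xyz", frames)
```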