
multi-gpu training & maml baseline #42

Open
DanqingZ opened this issue Dec 27, 2020 · 13 comments

@DanqingZ commented Dec 27, 2020

Hi, thank you so much for the codebase! I am looking for a multi-GPU PyTorch MAML implementation, and I am wondering if I can use your codebase for this.

For multi-GPU training, can I simply use DataParallel to parallelize the model? Will the existing data loader work with the DataParallel model?

self.model = torch.nn.DataParallel(self.model) 
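
For context, here is roughly how I imagine the wrapping (a toy backbone, not your actual model class), assuming the episodic tensors have a leading batch dimension that DataParallel can split:

import torch
import torch.nn as nn

# Toy stand-in backbone; the actual model class in this repo will differ.
model = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(64, 5),
)

if torch.cuda.device_count() > 1:
    # DataParallel replicates the module on every visible GPU and splits
    # the first (batch) dimension of the input across the replicas.
    model = nn.DataParallel(model)
model = model.cuda()

# Dummy 5-way query batch: 75 images of size 3x84x84.
images = torch.randn(75, 3, 84, 84).cuda()
logits = model(images)  # outputs are gathered back onto the default device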

Also, I am wondering: if I skip the pre-training step and run meta-learning directly (i.e., make some changes so that the pre-trained model is not loaded), is that MAML?
Many thanks, and I look forward to your reply!

@yaoyao-liu (Owner)

Hi Danqing,

Thanks for your interest in our project.
For (1): I have never tried running this project on multiple GPUs, but you are welcome to try it and report your results here.
For (2): It is different from the original MAML. In our method, during base-learning we only update the FC classifier weights, and during meta-learning we update the scaling and shifting (SS) weights. In MAML, all network parameters are updated during both base-learning and meta-learning.
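
To make the distinction concrete, here is a simplified sketch (a toy layer, not our actual code) of which parameter groups each stage would optimize:

import torch
import torch.nn as nn

class SSConv2d(nn.Module):
    """Toy conv layer whose pre-trained weights stay frozen; only the
    per-channel scaling and shifting (SS) parameters are meta-learned."""
    def __init__(self, conv):
        super().__init__()
        self.conv = conv
        for p in self.conv.parameters():
            p.requires_grad = False  # frozen pre-trained weights
        out_ch = conv.out_channels
        self.scale = nn.Parameter(torch.ones(out_ch, 1, 1))
        self.shift = nn.Parameter(torch.zeros(out_ch, 1, 1))

    def forward(self, x):
        return self.conv(x) * self.scale + self.shift

backbone = SSConv2d(nn.Conv2d(3, 64, 3, padding=1))  # meta-learned via SS
classifier = nn.Linear(64, 5)                         # updated in base-learning

# Base-learning (inner loop): only the FC classifier is updated.
inner_opt = torch.optim.SGD(classifier.parameters(), lr=0.01)
# Meta-learning (outer loop): only the SS parameters are updated.
outer_opt = torch.optim.Adam([backbone.scale, backbone.shift], lr=0.001)

# In MAML, by contrast, both loops would update all parameters of the
# backbone and the classifier.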

If you have any further questions, feel free to leave additional comments.

Best,
Yaoyao

@DanqingZ (Author)

Hi Yaoyao, thanks for the reply! I see; I can report the numbers here when I finish the experiments.

For (2), so what you mentioned are the FT and SS meta-training operations in your paper. I actually have one question about Table 2 of your paper. For the row "MAML deep, HT", did you combine the pre-training step with the MAML algorithm? Do you have the experiment "MAML deep, HT" without the fine-tuning? Then we could see how much performance improvement the fine-tuning contributes.
The differences between your proposed MTL algorithm and the MAML-ResNet algorithm are: 1) fine-tuning; 2) HT; and 3) the FT->SS meta-training operations. I am actually curious how much performance improvement each component contributes. Thanks!

@yaoyao-liu (Owner)

Hi Danqing,

For "MAML deep, HT" in Table 2, we used the pre-trained model (ResNet-12 (pre)).
For different ablative fine-tuning settings, you may see the results in Table 1.
As the model is pre-trained on 64 classes (miniImageNet), we are not able to directly apply it to 5-class tasks without any fine-tuning steps. At least, we need to fine-tune the FC classifiers.
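
In other words, the pre-trained 64-way head cannot score 5 novel classes, so at minimum a fresh 5-way FC layer has to be fitted on the support set. A simplified sketch (toy tensors and dimensions, not our actual code):

import torch
import torch.nn as nn

feature_dim, num_base_classes, num_way = 640, 64, 5

# Pre-trained model: encoder plus a 64-way head trained on the base classes.
encoder = nn.Linear(3 * 84 * 84, feature_dim)          # stand-in for ResNet-12
base_head = nn.Linear(feature_dim, num_base_classes)   # unusable for novel classes

# For a 5-way episode, keep the encoder but fit a new 5-way FC head
# on the support set (the minimal fine-tuning step mentioned above).
episode_head = nn.Linear(feature_dim, num_way)
opt = torch.optim.SGD(episode_head.parameters(), lr=0.01)

support_x = torch.randn(5, 3 * 84 * 84)  # 5-way 1-shot support set (flattened)
support_y = torch.arange(num_way)
for _ in range(100):
    logits = episode_head(encoder(support_x).detach())
    loss = nn.functional.cross_entropy(logits, support_y)
    opt.zero_grad()
    loss.backward()
    opt.step()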

Best,
Yaoyao

@DanqingZ (Author) commented Jan 2, 2021

@yaoyao-liu, then for "SS[Θ;θ], HT meta-batch" in Table 2, is that also the pre-trained model without the first fine-tuning step? I mean, which experiments in Table 2 include the "(a) large-scale DNN training" step?

@DanqingZ (Author) commented Jan 2, 2021

The differences between your proposed MTL algorithm and the MAML-ResNet algorithm are: 1) fine-tuning; 2) HT; and 3) the FT->SS meta-training operations.
If we want to claim that the "SS meta-training operations" work, then we need to make sure the comparison experiments also have 1) fine-tuning and 2) HT.
I am trying to understand your work better; please correct me if I am wrong. Thanks.

@yaoyao-liu (Owner)

I am not sure what you mean by the "first fine-tuning" step.

In Table 2, if the feature extractor is labeled with "(pre)" (e.g., ResNet-12 (pre)), then the pre-trained model is applied. The model is pre-trained on all base class samples.

The results in Table 1 show that the "SS meta-training operation" works. Comparing the 3rd block with the 1st and the 2nd blocks, you can observe that our "SS" performs better than "FT" and "update". "HT meta-batch" is not applied in Table 1.

@DanqingZ (Author) commented Jan 2, 2021

Oh, I see. I thought "ResNet-12 (pre)" meant ResNet-12 without any fine-tuning.

By "first fine-tuning" step, I mean the "(a) large-scale DNN training" step.

@DanqingZ (Author) commented Jan 2, 2021

For Table 1, did you first conduct the "(a) large-scale DNN training" step?

@yaoyao-liu (Owner)

For Table 1, did you first conduct the "(a) large-scale DNN training" step?

Yes. In the caption, you can see "ResNet-12 (pre)" is applied.

@DanqingZ (Author) commented Jan 2, 2021

Yeah, I understand: by loading the pre-trained model, we have to drop the classifier parameters and only use the encoder parameters. This is like a domain fine-tuning step, adapting the pre-trained model weights to the new domain.
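
Roughly what I have in mind (toy module and key names; the actual checkpoint layout in your repo may differ):

import torch
import torch.nn as nn

# Toy stand-in for the pre-trained model: encoder + 64-way classifier.
pretrained = nn.ModuleDict({
    "encoder": nn.Linear(3 * 84 * 84, 640),
    "classifier": nn.Linear(640, 64),
})
state = pretrained.state_dict()  # in practice: torch.load(<checkpoint path>)

# Keep only the encoder parameters and drop the 64-way classifier head.
encoder_state = {k.replace("encoder.", "", 1): v
                 for k, v in state.items() if k.startswith("encoder.")}

new_encoder = nn.Linear(3 * 84 * 84, 640)
new_encoder.load_state_dict(encoder_state)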

@DanqingZ (Author) commented Jan 2, 2021

For Table 1, did you first conduct the "(a) large-scale DNN training" step?

Yes. In the caption, you can see "ResNet-12 (pre)" is applied.

I see, thanks for the clarification! I misunderstood "ResNet-12 (pre)".

@yaoyao-liu (Owner)

You're welcome.

DanqingZ closed this as completed Jan 2, 2021
DanqingZ reopened this Jan 2, 2021
@DanqingZ (Author) commented Jan 2, 2021

Hi @yaoyao-liu, I have an additional question: if we don't run the large-scale DNN training step and just run the experiment with "SS[Θ;θ], HT meta-batch", will the performance be better than "MAML, HT meta-batch"?
