The original Meta-Weight-Net was only evaluated on relatively small models such as ResNet. In this example, we scale the model up from ResNet to BERT using Betty's systems support: mixed-precision training, multi-GPU training, and the ZeRO optimizer.
- Model: (pre-trained) BERT-base from Hugging Face
- Dataset: SST-2 benchmark. We artificially injected class imbalance via `args.imbalance_factor`
- No meta-learning (baseline): `python main.py --baseline`
- Meta-learning (Single GPU): `python main.py`
- Meta-learning (Single GPU + mixed-precision): `python main.py --precision fp16`
- Meta-learning (Multi GPU): `torchrun --standalone --nnodes=1 --nproc_per_node=2 main.py --precision fp16 --strategy distributed`
- Meta-learning (Multi GPU + ZeRO optimizer): `torchrun --standalone --nnodes=1 --nproc_per_node=2 main.py --precision fp16 --strategy zero`
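
To show how the flags in the commands above fit together, here is a minimal sketch of an argument parser that would accept them. The flag names `--baseline`, `--precision`, `--strategy`, and `--imbalance_factor` come from this README. The defaults, types, and help strings are assumptions made for illustration, so check `main.py` for the actual definitions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Meta-Weight-Net with BERT on SST-2")
    # Flag name from the README; store_true behaviour is inferred from `--baseline`
    # being passed without a value.
    parser.add_argument("--baseline", action="store_true",
                        help="train without meta-learning")
    # "fp16" appears in the README; the "fp32" default is an assumption.
    parser.add_argument("--precision", type=str, default="fp32",
                        help="numeric precision, e.g. fp16")
    # "distributed" and "zero" appear in the README; the "default" value is an assumption.
    parser.add_argument("--strategy", type=str, default="default",
                        help="training strategy, e.g. distributed or zero")
    # The README only names args.imbalance_factor; type and default are placeholders.
    parser.add_argument("--imbalance_factor", type=float, default=10.0,
                        help="degree of injected class imbalance")
    return parser


# Parse the flags from the multi-GPU + ZeRO command as an example.
args = build_parser().parse_args(["--precision", "fp16", "--strategy", "zero"])
print(args.baseline, args.precision, args.strategy, args.imbalance_factor)
```

With a parser of this shape, running `python main.py` with no flags would select meta-learning on a single GPU with the assumed default precision.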
We modified the data loading code from https://github.com/YJiangcm/SST-2-sentiment-analysis.
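
As an illustration of the class imbalance mentioned in the dataset description, the sketch below subsamples one class of a binary dataset by a given factor. This README does not describe the exact scheme that `main.py` applies for `args.imbalance_factor`, so the function `inject_class_imbalance`, its parameters, and the subsampling rule are all assumptions rather than the project's implementation.

```python
import random


def inject_class_imbalance(examples, imbalance_factor, minority_label=1, seed=0):
    """Subsample one class so that roughly 1/imbalance_factor of it remains.

    `examples` is a list of (text, label) pairs. This helper is hypothetical and
    only illustrates one plausible way to create imbalance in a binary dataset.
    """
    if imbalance_factor < 1:
        raise ValueError("imbalance_factor must be >= 1")

    rng = random.Random(seed)  # fixed seed keeps the subsample reproducible
    minority = [ex for ex in examples if ex[1] == minority_label]
    others = [ex for ex in examples if ex[1] != minority_label]

    # Keep about 1/imbalance_factor of the chosen class (at least one example).
    n_keep = max(1, int(len(minority) / imbalance_factor)) if minority else 0
    kept = rng.sample(minority, n_keep)

    result = others + kept
    rng.shuffle(result)  # avoid grouping the examples by label
    return result


# Toy usage: 100 positive and 100 negative sentences, factor of 10.
toy = [("good movie", 1)] * 100 + [("bad movie", 0)] * 100
skewed = inject_class_imbalance(toy, imbalance_factor=10)
n_pos = sum(1 for _, label in skewed if label == 1)
n_neg = sum(1 for _, label in skewed if label == 0)
print(n_pos, n_neg)
```

Imbalance of this kind would normally be applied to the training split only, leaving a clean held-out set for the meta-learning objective, which is the setting Meta-Weight-Net is designed for.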