See the README from our ICLR 2021 work for details on setting up the basic training pipeline.
- Download and normalize datasets
make download-datasets
make normalize-datasets
- Create the transformed datasets
make apply-transforms-sri-py150
make apply-transforms-csn-python
make extract-transformed-tokens
- Train a normal seq2seq model for 10 epochs on sri/py150
bash experiments/normal_seq2seq_train.sh
- Run adversarial training and testing on sri/py150 for 5 epochs
bash experiments/normal_adv_train.sh
- Generate the augmented sri/py150 datasets with random and adversarial views
bash scripts/augment.sh
- Pretrain a seq2seq encoder on an augmented sri/py150 dataset, finetune the encoder on sri/py150, and test the final model on normal and adversarial datasets
bash experiments/finetune_and_test_0.sh
- Pretrain a seq2seq encoder on sri/py150 and run adversarial training starting from the pretrained model
bash experiments/pretrain_adv_train.sh
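
The steps above can be chained into one driver script. The following is a minimal sketch, not part of the repository: it assumes the commands are meant to run in the listed order from the repository root, and `DRY_RUN` and `run_pipeline` are hypothetical names introduced here for illustration. By default it only prints the commands, since the training steps are long-running.

```shell
#!/usr/bin/env bash
# Sketch: run the documented steps in order, stopping at the first failure.
# Assumption: invoked from the repository root, where the Makefile,
# experiments/ and scripts/ live.
set -eu

# DRY_RUN is a hypothetical switch for this sketch (not a repo option):
#   1 = only print each command (default), 0 = actually execute it.
DRY_RUN="${DRY_RUN:-1}"

run_pipeline() {
  # Read one command per line from the here-document below.
  while IFS= read -r step; do
    echo "+ ${step}"
    if [ "${DRY_RUN}" != "1" ]; then
      # Unquoted on purpose so the line splits into command and arguments.
      # stdin is redirected so the command cannot consume the step list.
      ${step} </dev/null
    fi
  done <<'EOF'
make download-datasets
make normalize-datasets
make apply-transforms-sri-py150
make apply-transforms-csn-python
make extract-transformed-tokens
bash experiments/normal_seq2seq_train.sh
bash experiments/normal_adv_train.sh
bash scripts/augment.sh
bash experiments/finetune_and_test_0.sh
bash experiments/pretrain_adv_train.sh
EOF
}

run_pipeline
```

Saved under a name of your choice (for example `run_all.sh`, also hypothetical), it can be previewed with `bash run_all.sh` and executed for real with `DRY_RUN=0 bash run_all.sh`. Because of `set -e`, a failing step aborts the run before later steps start.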