This repository has been archived by the owner on Jan 15, 2024. It is now read-only.

[Bug Fix] trainer.update(1) should be used after loss.mean() is called #1000
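The fix follows the usual Gluon pattern the title describes: once the per-sample loss has been averaged with `loss.mean()`, the gradient is already normalized over the batch, so the optimizer step should use a batch size of 1 rather than the actual batch size. A minimal sketch of that pattern, using a toy network and placeholder data (none of this code is taken from the PR itself):

```python
# Minimal sketch of "call trainer.update(1) after loss.mean()".
# The network, optimizer settings, and data below are placeholders.
import mxnet as mx
from mxnet import autograd, gluon

net = gluon.nn.Dense(2)
net.initialize()
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
# update_on_kvstore=False so that trainer.update() can be called manually.
trainer = gluon.Trainer(net.collect_params(), 'adam',
                        {'learning_rate': 1e-3}, update_on_kvstore=False)

x = mx.nd.random.uniform(shape=(8, 4))
y = mx.nd.array([0, 1, 0, 1, 0, 1, 0, 1])

with autograd.record():
    loss = loss_fn(net(x), y).mean()   # average the loss over the batch first
loss.backward()
trainer.allreduce_grads()
trainer.update(1)                      # gradient is already averaged, so step with batch_size=1
```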

Open

wants to merge 49 commits into base: v0.x

Changes from 1 commit

Commits (49)
de7b23d
clean slate for 1.x
szha Mar 18, 2020
01122db
[Numpy] Numpy version of GluonNLP (#1225)
sxjscience Jun 10, 2020
982a416
Fix bert cfg (#1245)
zheyuye Jun 11, 2020
789e2b9
fix download
sxjscience Jun 11, 2020
b714eac
[Numpy] Try to fix the CI (#1248)
sxjscience Jun 11, 2020
85b6f09
[Numpy] Add "match_tokens_with_char_spans" + Enable downloading from …
sxjscience Jun 16, 2020
ee1f0e3
[Numpy] Update QA Dataset and revise run_squad (#1250)
zheyuye Jun 18, 2020
e06ff01
Pin mxnet version range on CI (#1257)
leezu Jul 7, 2020
689eba9
[CI] AWS batch job tool for GluonNLP (Part I) (#1251)
szha Jul 7, 2020
cd48efd
Update codecov action to handle different OS and Python versions (#1254)
leezu Jul 8, 2020
83e1f13
Use Amazon S3 Transfer Acceleration (#1260)
leezu Jul 10, 2020
a646c34
[FEATURE] update backtranslation and add multinomial sampler (#1259)
hutao965 Jul 11, 2020
ea9152b
Fixes to make the CI more stable (#1265)
sxjscience Jul 16, 2020
70a1887
Update for Block API (#1261)
leezu Jul 17, 2020
9d83fe6
Fix parameter share regex (#1267)
leezu Jul 17, 2020
4743afc
Add fp16 support for Bert QA inference (#1264)
MoisesHer Jul 17, 2020
e78a24e
[CI] update batch to gluonnlp-dev (#1268)
szha Jul 18, 2020
3a0ed9f
[Numpy] Refactor Roberta (#1269)
zheyuye Jul 21, 2020
f407b8e
[CI] Batch cpu version (#1275)
szha Jul 22, 2020
57eb411
[Numpy] Fix conversion toolkits (#1274)
zheyuye Jul 23, 2020
74bd2ce
[Feature] Add FP16 inference support to NMT + Add BoundedBudgetSample…
hutao965 Jul 23, 2020
d76897b
Add embedding related methods in numpy version (#1263)
acphile Jul 28, 2020
4d43f82
add subversion/wget to docker, add readme (#1279)
szha Jul 28, 2020
3c87457
Add layout + compute_layout support: TransformerNMT, BERT, ALBERT, EL…
sxjscience Jul 29, 2020
033214e
[Numpy] Fix SQuAD + Fix GLUE downloading (#1280)
sxjscience Jul 29, 2020
2294421
[Numpy Refactor] BART (#1282)
zheyuye Jul 30, 2020
1f9ad44
Horovod support for pretraining and fine-tuning squad (#1276)
zheyuye Aug 1, 2020
7e1f9d0
[DOC] Add the basic documentation for the embedding API (#1281)
acphile Aug 4, 2020
20af58f
Fix gelu (#1287)
zheyuye Aug 5, 2020
ded0f99
fix prepare_openwebtext (#1289)
ZiyueHuang Aug 6, 2020
c33e62e
[FEATURE]Horovod support for training transformer + add mirror data f…
hutao965 Aug 7, 2020
9e268c0
Fix electra (#1291)
zheyuye Aug 8, 2020
32e87d4
[Numpy] Benchmark the backbone models + Some fixes + Always use pytho…
sxjscience Aug 14, 2020
6ae558e
[FEATURE]Horovod support for training transformer (PART 2) (#1301)
hutao965 Aug 20, 2020
d8b68c6
[Numpy] Fix AWS Batch + Add Docker Support (#1302)
sxjscience Aug 20, 2020
d17ec4c
minor fix for run_electra.py & remove hybridization in the constructi…
ZiyueHuang Aug 22, 2020
99b35d8
Add Intro for batch + upload squad training command (#1305)
zheyuye Aug 22, 2020
d93356f
[MODEL] make beam search a hybrid block (#1310)
szha Aug 23, 2020
210dd0c
[Numpy] [Fix] Update README.md (#1306)
sxjscience Aug 23, 2020
b324ee6
[CI] Add GPU pytest + Append AWS Batch job submission to current pipe…
barry-jin Aug 24, 2020
3b14d69
[CI] Update unittests-gpu (#1313)
barry-jin Aug 24, 2020
dca17ee
automatically generate date suffix for dev versions (#1314)
szha Aug 25, 2020
39ec921
fix typo (#1317)
liuzh47 Aug 26, 2020
970318d
fix typo (#1318)
liuzh47 Aug 26, 2020
bba8697
[CI] Update GPU Test Workflow + Update Some Tests and README (#1316)
barry-jin Aug 28, 2020
66e5e05
fix https://github.com/dmlc/gluon-nlp/issues/1315 (#1319)
ZiyueHuang Aug 28, 2020
ff95fb4
[CI] Fix Source Reference Issues (#1332)
barry-jin Sep 1, 2020
1bd85b6
[BUGFIX] fix valid candidates issue (#1323)
liuzh47 Sep 1, 2020
189bbdc
[MODEL] convert gpt2 model (#1328)
hutao965 Sep 1, 2020
[Numpy Refactor] BART (#1282)
* init

* fix convert roberta

* rename TransformerNMTModel as TransformerModel

* update bart

* fix

* fix

* update init

* add layernorm_embedding for transformer

* convert script

* encoder

* fix

* fix vocab

* fix roberta

* fix

* fix electra

* add conversion bash for roberta and xlmr

* ELECTRA SETUP

* convert bart decoder

* fix

* update

* testing output

* remove arange_like for embeddings

* fix

* update

* use_pooler for bart

* fix

* upload params for bart

* add test_models_bart

* fix cfg

* test bart

* update

* fix transformer

* Squashed commit of the following:

commit 9e1ffde
Author: ZheyuYe <[email protected]>
Date:   Thu Jul 30 11:42:01 2020 +0800

    todo

commit 9a7c343
Author: ZheyuYe <[email protected]>
Date:   Thu Jul 30 10:53:15 2020 +0800

    revert gelu

commit 0425346
Author: ZheyuYe <[email protected]>
Date:   Thu Jul 30 10:49:52 2020 +0800

    re-upload bart

commit 516ae84
Author: ZheyuYe <[email protected]>
Date:   Thu Jul 30 03:32:35 2020 +0800

    use_qkv_bias for transformer

commit 9d60cda
Author: ZheyuYe <[email protected]>
Date:   Thu Jul 30 03:17:28 2020 +0800

    classifier_activation

commit 510d991
Author: ZheyuYe <[email protected]>
Date:   Thu Jul 30 02:33:22 2020 +0800

    test

commit 1b5fa7b
Author: ZheyuYe <[email protected]>
Date:   Thu Jul 30 01:48:01 2020 +0800

    fix comment1

commit 6533601
Author: ZheyuYe <[email protected]>
Date:   Thu Jul 30 01:27:44 2020 +0800

    fix comment

commit a8853f9
Author: ZheyuYe <[email protected]>
Date:   Thu Jul 30 01:10:06 2020 +0800

    Squashed commit of the following:

    commit 232e0b6
    Author: ZheyuYe <[email protected]>
    Date:   Thu Jul 30 01:05:17 2020 +0800

        update

    commit 995e5d7
    Author: ZheyuYe <[email protected]>
    Date:   Thu Jul 30 01:01:56 2020 +0800

        fix

    commit 9623240
    Author: ZheyuYe <[email protected]>
    Date:   Thu Jul 30 00:52:17 2020 +0800

        fix

    commit d9c4140
    Author: ZheyuYe <[email protected]>
    Date:   Wed Jul 29 23:07:10 2020 +0800

        fix transformer

    commit e49fbe1
    Author: ZheyuYe <[email protected]>
    Date:   Wed Jul 29 22:18:12 2020 +0800

        update

    commit 1f75b26
    Author: ZheyuYe <[email protected]>
    Date:   Wed Jul 29 22:04:08 2020 +0800

        test bart

    commit 5bab516
    Author: ZheyuYe <[email protected]>
    Date:   Wed Jul 29 21:34:47 2020 +0800

        fix cfg

    commit 6c62a29
    Merge: 3366cf3 033214e
    Author: ZheyuYe <[email protected]>
    Date:   Wed Jul 29 21:33:10 2020 +0800

        Merge remote-tracking branch 'upstream/numpy' into bart

    commit 033214e
    Author: Xingjian Shi <[email protected]>
    Date:   Wed Jul 29 00:36:57 2020 -0700

        [Numpy] Fix SQuAD + Fix GLUE downloading (#1280)

        * Update run_squad.py

        * Update run_squad.py

        * Update prepare_glue.py

    commit 3c87457
    Author: Xingjian Shi <[email protected]>
    Date:   Tue Jul 28 18:03:21 2020 -0700

        Add layout + compute_layout support: TransformerNMT, BERT, ALBERT, ELECTRA, MobileBERT, RoBERTA, XLMR (#1258)

        * Add layout support

        * fix test

        * Update transformer.py

        * Update transformer.py

        * Update README.md

        * try to add set_layout

        * update test case

        * fix

        * update

        * update

        * update

        * Update bert.py

        * fix bug

        * update

        * Update test_models_bert.py

        * Update tokenizers.py

        * add compute layout

        * Update xlmr.py

        * Update test_models_bert.py

        * revise test cases

        * Update layers.py

        * move jieba to try import

        * fix

        * Update transformer.py

        * fix

        * Update bert.py

        * Update setup.py

        * Update test_models_bert.py

        * Update test_models_bert.py

        * fix

        * update

        * Revise

        * Update electra.py

        * Update electra.py

        * Update test_models_electra.py

        * fix

        * fix bug

        * Update test_models_albert.py

        * add more testcases

        * fix

        * Update albert.py

        * Update albert.py

        * fix bug

        * fix testcase

        * Update test_models_electra.py

        * Update bert.py

        * update

        * Update test_models_electra.py

        * Update mobilebert.py

        * Update mobilebert.py

        * update mobilebert

        * Update test_models_mobilebert.py

        * Update mobilebert.py

        * fix bug

        * Update roberta.py

        * fix roberta

        * update

        * update

        * fix import

        * fix bug

        * update

        * reduce test workloads

        * address comment

        * address comment

    commit 4d43f82
    Author: Sheng Zha <[email protected]>
    Date:   Mon Jul 27 20:21:00 2020 -0700

        add subversion/wget to docker, add readme (#1279)

    commit d76897b
    Author: phile <[email protected]>
    Date:   Tue Jul 28 10:10:13 2020 +0800

        Add embedding related methods in numpy version (#1263)

        * A draft for embedding

        * fix embed_loader

        * add hyperbolic space and some updates

        * revise evaluation

        * fix

        * simple fixes

        * move l2norm to op.py

        * new features

        * fix

        * update

        * add tests, update

        * newline

* fix comment

* use xavier for embedding initializer
zheyuye authored Jul 30, 2020

commit 2294421b990ce92fedfb5876aa3ee4dd119d83b4
55 changes: 32 additions & 23 deletions scripts/conversion_toolkits/README.md
@@ -12,6 +12,8 @@ The testing step mentioned above are controlled by the flag `--test`, in which t
tolerance of 1e-3 between gluon model with converted weights and original tensorflow model.
In addition, we can use GPU in all converting scripts by adding `--gpu 0`.

For the RoBERTa, XLM-R and BART models, please install the [fairseq](https://github.com/pytorch/fairseq#requirements-and-installation) package locally as `pip install git+https://github.com/pytorch/fairseq.git@master`.

## BERT
Convert model from [BERT LIST](https://tfhub.dev/google/collections/bert/1).

@@ -37,25 +39,42 @@ do
done
```

## RoBERTa
## ELECTRA
A TF Hub module is not currently available for the ELECTRA model.
Thus, you will need to clone the [electra repository](https://github.com/ZheyuYe/electra)
and download the checkpoint. The parameters are converted from the local checkpoints.
By running the following command, you can convert and verify the ELECTRA model with both the discriminator and the generator.

Notice: please set `--electra_path` to the cloned path ~~or get this electra repository packaged by `pip install -e .`.~~

```bash
# Need to use TF 1.13.2 to use contrib layer
pip uninstall tensorflow
pip install tensorflow==1.13.2

# Actual conversion
bash convert_electra.sh
```

## Mobile Bert
```bash
pip install fairseq==0.9.0
bash convert_mobilebert.sh
```

## RoBERTa
```bash
for model in base large
do
mkdir roberta_${model}
wget "https://dl.fbaipublicfiles.com/fairseq/models/roberta.${model}.tar.gz"
tar zxf roberta.${model}.tar.gz --directory roberta_${model}
python convert_fairseq_roberta.py --fairseq_model_path roberta_${model}/roberta.${model} --model_size ${model} --test
python convert_fairseq_roberta.py --fairseq_model_path roberta_${model}/roberta.${model} --test
done
```

## XLM-R

```bash
pip install fairseq==0.9.0

for model in base large
do
mkdir xlmr_${model}
@@ -65,23 +84,13 @@ do
done
```

## ELECTRA
The TF Hub is not available for ELECTRA model currently.
Thus, you will need to clone the [electra repository](https://github.com/ZheyuYe/electra)
and download the checkpoint. The parameters are converted from local checkpoints.
By running the following command, you can convert + verify the ELECTRA model with both the discriminator and the generator.

Notice: pleas set up the `--electra_path` with the cloned path or get this electra repository packaged by `pip install -e .`.

## BART
```bash
# Need to use TF 1.13.2 to use contrib layer
pip install tensorflow==1.13.2 --upgrade --force-reinstall

# Actual conversion
bash convert_electra.sh
```

## Mobile Bert
```bash
bash convert_mobilebert.sh
for model in base large
do
mkdir bart_${model}
wget "https://dl.fbaipublicfiles.com/fairseq/models/bart.${model}.tar.gz"
tar zxf bart.${model}.tar.gz --directory bart_${model}
python convert_fairseq_bart.py --fairseq_model_path bart_${model}/bart.${model} --test
done
```
7 changes: 7 additions & 0 deletions scripts/conversion_toolkits/convert_bart.sh
@@ -0,0 +1,7 @@
for model in base large
do
mkdir bart_${model}
wget "https://dl.fbaipublicfiles.com/fairseq/models/bart.${model}.tar.gz"
tar zxf bart.${model}.tar.gz --directory bart_${model}
python convert_fairseq_bart.py --fairseq_model_path bart_${model}/bart.${model} --test
done
17 changes: 10 additions & 7 deletions scripts/conversion_toolkits/convert_electra.py
@@ -53,7 +53,9 @@ def read_tf_checkpoint(path):
return tensors


def get_dict_config(model_size, electra_dir):
def get_dict_config(model_size, electra_path):
sys.path.append(electra_path)
electra_dir = os.path.abspath(os.path.join(os.path.dirname(electra_path), os.path.pardir))
sys.path.append(electra_dir)
from electra.util.training_utils import get_bert_config
from electra.configure_pretraining import PretrainingConfig
@@ -100,7 +102,7 @@ def convert_tf_config(config_dict, vocab_size):
return cfg


def convert_tf_assets(tf_assets_dir, model_size, electra_dir):
def convert_tf_assets(tf_assets_dir, model_size, electra_path):
"""Convert the assets file including config, vocab and tokenizer model"""
file_names = os.listdir(tf_assets_dir)
vocab_path = None
@@ -113,7 +115,7 @@ def convert_tf_assets(tf_assets_dir, model_size, electra_dir):
if vocab_path:
vocab_path = os.path.join(tf_assets_dir, vocab_path)
vocab_size = len(open(vocab_path, 'rU').readlines())
config_dict = get_dict_config(model_size, electra_dir)
config_dict = get_dict_config(model_size, electra_path)
cfg = convert_tf_config(config_dict, vocab_size)
return cfg, vocab_path

@@ -190,12 +192,12 @@ def get_name_map(tf_names, convert_type='backbone'):
return name_map


def convert_tf_model(model_dir, save_dir, test_conversion, model_size, gpu, electra_dir):
def convert_tf_model(model_dir, save_dir, test_conversion, model_size, gpu, electra_path):
ctx = mx.gpu(gpu) if gpu is not None else mx.cpu()
if not os.path.exists(save_dir):
os.makedirs(save_dir)

cfg, vocab_path = convert_tf_assets(model_dir, model_size, electra_dir)
cfg, vocab_path = convert_tf_assets(model_dir, model_size, electra_path)
with open(os.path.join(save_dir, 'model.yml'), 'w') as of:
of.write(cfg.dump())
new_vocab = HuggingFaceWordPieceTokenizer(
@@ -234,6 +236,8 @@ def convert_tf_model(model_dir, save_dir, test_conversion, model_size, gpu, elec
tf_names = list(tf_names)

# reload the electra module for this local scope
sys.path.append(electra_path)
electra_dir = os.path.abspath(os.path.join(os.path.dirname(electra_path), os.path.pardir))
sys.path.append(electra_dir)
from electra.util.training_utils import get_bert_config
from electra.configure_pretraining import PretrainingConfig
@@ -426,11 +430,10 @@ def convert_qkv_weights(tf_prefix, mx_prefix):
logging_config()
save_dir = args.save_dir if args.save_dir is not None else os.path.basename(
args.tf_model_path) + '_gluon'
electra_dir = os.path.abspath(os.path.join(os.path.dirname(args.electra_path), os.path.pardir))
convert_tf_model(
args.tf_model_path,
save_dir,
args.test,
args.model_size,
args.gpu,
electra_dir)
args.electra_path)
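For reference, the `electra_path` handling added above boils down to making the cloned electra repository importable before pulling in its config helpers. A standalone sketch that mirrors the added lines (the local path is a placeholder and must point at a real clone for the imports to resolve):

```python
import os
import sys

def make_electra_importable(electra_path):
    # Mirror of the lines added in convert_electra.py: put the cloned repo path
    # and a directory above it on sys.path so the `electra.*` imports resolve.
    sys.path.append(electra_path)
    electra_dir = os.path.abspath(os.path.join(os.path.dirname(electra_path), os.path.pardir))
    sys.path.append(electra_dir)

# Placeholder path to a local clone of https://github.com/ZheyuYe/electra
make_electra_importable('/path/to/electra')
from electra.util.training_utils import get_bert_config        # noqa: E402
from electra.configure_pretraining import PretrainingConfig    # noqa: E402
```

With this in place, `--electra_path` only needs to point at the cloned repository; the script derives the extra `sys.path` entry itself instead of requiring a precomputed `electra_dir`.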