This repository has been archived by the owner on Nov 1, 2024. It is now read-only.

Update URLs to consolidated OPT checkpoints (#701)
Co-authored-by: Binh Tang <[email protected]>
Binh Tang and tangbinh authored Apr 5, 2023
1 parent 99e95f1 commit eca010e
Showing 2 changed files with 12 additions and 13 deletions.
4 changes: 2 additions & 2 deletions docs/faster-transformer.md
@@ -17,7 +17,7 @@ CKPT_DIR="${HOME}/checkpoints"
mkdir -p "${CKPT_DIR}/opt-125m"
wget https://github.com/facebookresearch/metaseq/raw/main/projects/OPT/assets/gpt2-merges.txt -P "${CKPT_DIR}"
wget https://github.com/facebookresearch/metaseq/raw/main/projects/OPT/assets/gpt2-vocab.json -P "${CKPT_DIR}"
-for i in {0..1}; do wget "https://dl.fbaipublicfiles.com/opt/v1_20220502/125m/reshard-model_part-${i}.pt" -P "${CKPT_DIR}/opt-125m"; done
+wget "https://dl.fbaipublicfiles.com/opt/v1_20230405/125m/reshard-model_part-0.pt" -P "${CKPT_DIR}/opt-125m"

# Install FasterTransformer
nvidia-docker run -tid --rm --shm-size 5g --name ft \
@@ -39,7 +39,7 @@ python "${SRC_DIR}/metaseq/scripts/convert_metaseq_ft.py" \

# Run interactive script
FT_PATH="lib/libth_transformer.so"
-mpirun -n 2 --allow-run-as-root python "${SRC_DIR}/metaseq/cli/interactive_ft.py" \
+mpirun -n 1 --allow-run-as-root python "${SRC_DIR}/metaseq/cli/interactive_ft.py" \
--num-layers 12 --num-heads 12 --embed-size 768 --vocab-size 50272 \
--vocab-file "${CKPT_DIR}/gpt2-vocab.json" --merges-file "${CKPT_DIR}/gpt2-merges.txt" \
--weight-path "${CKPT_DIR}/opt-125m-ft-mp2" --dtype fp16 \
21 changes: 10 additions & 11 deletions projects/OPT/README.md
@@ -4,17 +4,16 @@
For notes regarding the development of all these models, please refer to our [chronicles](./chronicles/README.md).

## Pretrained Model Weights
| Model | Parameters | Pretrained weights |
|----------|:----------:|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
| OPT-125M | 125M | [part0](https://dl.fbaipublicfiles.com/opt/v1_20220502/125m/reshard-model_part-0.pt), [part1](https://dl.fbaipublicfiles.com/opt/v1_20220502/125m/reshard-model_part-1.pt) |
| OPT-350M | 350M | [part0](https://dl.fbaipublicfiles.com/opt/v1_20220502/350m/reshard.pt) |
| OPT-1.3B | 1.3B | [part0](https://dl.fbaipublicfiles.com/opt/v1_20220502/1.3b/reshard-model_part-0.pt), [part1](https://dl.fbaipublicfiles.com/opt/v1_20220502/1.3b/reshard-model_part-1.pt) |
| OPT-2.7B | 2.7B | [part0](https://dl.fbaipublicfiles.com/opt/v1_20220502/2.7b/reshard-model_part-0.pt), [part1](https://dl.fbaipublicfiles.com/opt/v1_20220502/2.7b/reshard-model_part-1.pt), [part2](https://dl.fbaipublicfiles.com/opt/v1_20220502/2.7b/reshard-model_part-2.pt), [part3](https://dl.fbaipublicfiles.com/opt/v1_20220502/2.7b/reshard-model_part-3.pt) |
| OPT-6.7B | 6.7B | [part0](https://dl.fbaipublicfiles.com/opt/v1_20220502/6.7b/reshard-model_part-0.pt), [part1](https://dl.fbaipublicfiles.com/opt/v1_20220502/6.7b/reshard-model_part-1.pt) |
| OPT-13B | 13B | [part0](https://dl.fbaipublicfiles.com/opt/v1_20220502/13b/reshard-model_part-0.pt), [part1](https://dl.fbaipublicfiles.com/opt/v1_20220502/13b/reshard-model_part-1.pt) |
| OPT-30B | 30B | [part0](https://dl.fbaipublicfiles.com/opt/v1_20220502/30b/reshard-model_part-0.pt), [part1](https://dl.fbaipublicfiles.com/opt/v1_20220502/30b/reshard-model_part-1.pt) |
| OPT-66B | 66B | [part0](https://dl.fbaipublicfiles.com/opt/v1_20221026/66b/reshard-model_part-0.pt), [part1](https://dl.fbaipublicfiles.com/opt/v1_20221026/66b/reshard-model_part-1.pt), [part2](https://dl.fbaipublicfiles.com/opt/v1_20221026/66b/reshard-model_part-2.pt), [part3](https://dl.fbaipublicfiles.com/opt/v1_20221026/66b/reshard-model_part-3.pt), [part4](https://dl.fbaipublicfiles.com/opt/v1_20221026/66b/reshard-model_part-4.pt), [part5](https://dl.fbaipublicfiles.com/opt/v1_20221026/66b/reshard-model_part-5.pt), [part6](https://dl.fbaipublicfiles.com/opt/v1_20221026/66b/reshard-model_part-6.pt), [part7](https://dl.fbaipublicfiles.com/opt/v1_20221026/66b/reshard-model_part-7.pt) |
| OPT-175B | 175B | [request access here](https://forms.gle/BDB2i44QwCr2mCJN6) |
+| Model | Parameters | Pretrained weights |
+|-|:-:|:-:|
+| OPT-125M | 125M | [part0](https://dl.fbaipublicfiles.com/opt/v1_20230405/125m/reshard-model_part-0.pt) |
+| OPT-350M | 350M| [part0](https://dl.fbaipublicfiles.com/opt/v1_20220502/350m/reshard.pt) |
+| OPT-1.3B | 1.3B | [part0](https://dl.fbaipublicfiles.com/opt/v1_20230405/1.3b/reshard-model_part-0.pt) |
+| OPT-2.7B | 2.7B | [part0](https://dl.fbaipublicfiles.com/opt/v1_20230405/2.7b/reshard-model_part-0.pt) |
+| OPT-13B | 13B | [part0](https://dl.fbaipublicfiles.com/opt/v1_20230405/13b/reshard-model_part-0.pt), [part1](https://dl.fbaipublicfiles.com/opt/v1_20230405/13b/reshard-model_part-1.pt) |
+| OPT-30B | 30B | [part0](https://dl.fbaipublicfiles.com/opt/v1_20230405/30b/reshard-model_part-0.pt), [part1](https://dl.fbaipublicfiles.com/opt/v1_20230405/30b/reshard-model_part-1.pt), [part2](https://dl.fbaipublicfiles.com/opt/v1_20230405/30b/reshard-model_part-2.pt), [part3](https://dl.fbaipublicfiles.com/opt/v1_20230405/30b/reshard-model_part-3.pt) |
+| OPT-66B | 66B | [part0](https://dl.fbaipublicfiles.com/opt/v1_20230405/66b/reshard-model_part-0.pt), [part1](https://dl.fbaipublicfiles.com/opt/v1_20230405/66b/reshard-model_part-1.pt), [part2](https://dl.fbaipublicfiles.com/opt/v1_20230405/66b/reshard-model_part-2.pt), [part3](https://dl.fbaipublicfiles.com/opt/v1_20230405/66b/reshard-model_part-3.pt), [part4](https://dl.fbaipublicfiles.com/opt/v1_20230405/66b/reshard-model_part-4.pt), [part5](https://dl.fbaipublicfiles.com/opt/v1_20230405/66b/reshard-model_part-5.pt), [part6](https://dl.fbaipublicfiles.com/opt/v1_20230405/66b/reshard-model_part-6.pt), [part7](https://dl.fbaipublicfiles.com/opt/v1_20230405/66b/reshard-model_part-7.pt) |
+| OPT-175B | 175B |[request access here](https://forms.gle/BDB2i44QwCr2mCJN6) |

For the 2.7B, 6.7B, and 13B, we also release intermediate checkpoints taken at every 10k steps. The full file list for all of these may be found [here](https://dl.fbaipublicfiles.com/OPT/filelist.txt).
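
The consolidated URLs in the table above follow a regular pattern, so a small helper can generate them instead of hard-coding each link. The following is a minimal sketch, not part of the commit: the function names `opt_num_parts` and `opt_part_urls` are hypothetical, and the part counts are simply copied from the `v1_20230405` rows of the table. OPT-350M (which keeps its older `reshard.pt` URL) and any size not listed under `v1_20230405` are deliberately rejected.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical; part counts are copied from
# the v1_20230405 rows of the table in this commit.

OPT_BASE_URL="https://dl.fbaipublicfiles.com/opt/v1_20230405"

# Print the number of checkpoint parts listed in the table for a model size.
opt_num_parts() {
  case "$1" in
    125m|1.3b|2.7b) echo 1 ;;
    13b)            echo 2 ;;
    30b)            echo 4 ;;
    66b)            echo 8 ;;
    *) echo "size not listed under v1_20230405: $1" >&2; return 1 ;;
  esac
}

# Print one URL per line for every part of the given model size.
# The body runs in a subshell so its variables do not leak to the caller.
opt_part_urls() (
  size="$1"
  n="$(opt_num_parts "${size}")" || exit 1
  i=0
  while [ "${i}" -lt "${n}" ]; do
    echo "${OPT_BASE_URL}/${size}/reshard-model_part-${i}.pt"
    i=$((i + 1))
  done
)

# Example: list the URLs for OPT-13B (nothing is downloaded here).
opt_part_urls 13b
```

To actually fetch the files, the output could be piped to `wget`, for example `opt_part_urls 13b | wget -i - -P "${CKPT_DIR}/opt-13b"`, assuming `CKPT_DIR` is set as in `docs/faster-transformer.md`.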

