Update README.md #275

Merged 3 commits on Dec 8, 2023
6 changes: 3 additions & 3 deletions README.md
@@ -76,15 +76,15 @@ Optional dependencies can also be combined with [option1,option2].

# Where to find the models?

-You can find llama v2 models on HuggingFace hub [here](https://huggingface.co/meta-llama), where models with `hf` in the name are already converted to HuggingFace checkpoints so no further conversion is needed. The conversion step below is only for original model weights from Meta that are hosted on HuggingFace model hub as well.
+You can find llama v2 models on the Hugging Face hub [here](https://huggingface.co/meta-llama); models with `hf` in the name are already converted to Hugging Face checkpoints, so no further conversion is needed. The conversion step below is only needed for the original Meta model weights, which are also hosted on the Hugging Face model hub.
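
Once a converted `*-hf` checkpoint is available, loading it is straightforward with transformers. A minimal sketch (the model id is one example; gated repos require accepted access on Hugging Face):

```python
# Minimal sketch: load a pre-converted Llama 2 checkpoint from the Hugging Face hub.
# Assumes transformers >= 4.31.0 and access to the gated meta-llama repo.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # example id; any `*-hf` variant works the same way
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```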

# Model conversion to Hugging Face
The recipes and notebooks in this folder use the Llama 2 model definition provided by Hugging Face's transformers library.

Given that the original checkpoint resides under models/7B, you can install all requirements and convert the checkpoint with:

```bash
-## Install HuggingFace Transformers from source
+## Install Hugging Face Transformers from source
pip freeze | grep transformers ## verify it is version 4.31.0 or higher

git clone [email protected]:huggingface/transformers.git
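## The rest of this block is collapsed in the diff. A sketch of the conversion
## step it contains (paths are placeholders; assumes the converter script that
## ships with transformers):
cd transformers
pip install protobuf
python src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /path/to/downloaded/llama/weights --model_size 7B --output_dir /output/path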
```
@@ -141,7 +141,7 @@ Here we use FSDP as discussed in the next section which can be used along with P

## Flash Attention and Xformer Memory Efficient Kernels

-Setting `use_fast_kernels` will enable using of Flash Attention or Xformer memory-efficient kernels based on the hardware being used. This would speed up the fine-tuning job. This has been enabled in `optimum` library from HuggingFace as a one-liner API, please read more [here](https://pytorch.org/blog/out-of-the-box-acceleration/).
+Setting `use_fast_kernels` will enable the use of Flash Attention or Xformer memory-efficient kernels, based on the hardware being used. This speeds up the fine-tuning job. It is exposed as a one-liner API in the `optimum` library from Hugging Face; read more [here](https://pytorch.org/blog/out-of-the-box-acceleration/).

```bash
torchrun --nnodes 1 --nproc_per_node 4 examples/finetuning.py --enable_fsdp --use_peft --peft_method lora --model_name /path_of_model_folder/7B --fsdp_config.pure_bf16 --output_dir Path/to/save/PEFT/model --use_fast_kernels
```
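
For context, the `optimum` one-liner mentioned above looks roughly like the following. A sketch, assuming optimum's BetterTransformer API; the model id is a placeholder:

```python
# Sketch of the one-liner kernel swap via Hugging Face optimum
# (assumes `optimum` and `transformers` are installed; model id is a placeholder).
from optimum.bettertransformer import BetterTransformer
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
model = BetterTransformer.transform(model)  # swaps in fused/memory-efficient attention kernels
```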
6 changes: 1 addition & 5 deletions scripts/spellcheck_conf/wordlist.txt
@@ -72,7 +72,6 @@ AWS
Benchmarking
Captum
Grafana
-HuggingFace
JMeter
KMS
Kubeflow
@@ -444,7 +443,6 @@ tokenizer
vidhya
vocabs
AutoConfig
-Huggingface's
ScriptFunction
transfomers
BBM
@@ -521,7 +519,6 @@ config
http
mnist
resnet
-Huggingface
PyTorch
benchmarking
bert
@@ -577,7 +574,6 @@ mtail
scarpe
NVidia
WaveGlow
-huggingface
torchServe
CProfile
KSERVE
@@ -1143,7 +1139,7 @@ dataclass
datafiles
davinci
GPU's
-HuggingFace's
+Face's
LoRA
bitsandbytes
CLA