2x faster Gemma 2
Gemma 2 support
We now support Gemma 2! It's 2x faster and uses 63% less VRAM than Hugging Face + Flash Attention 2 (HF + FA2)!
We have a Gemma 2 9b notebook here: https://colab.research.google.com/drive/1vIrqH5uYDQwsJ4-OO3DErvuv4pBgVwk4?usp=sharing
To use Gemma 2, please update Unsloth:
pip uninstall unsloth -y
pip install --upgrade --force-reinstall --no-cache-dir git+https://github.com/unslothai/unsloth.git
Head over to our blog post: https://unsloth.ai/blog/gemma2 for more details.
We uploaded 4-bit quants, which download 4x faster, to:
https://huggingface.co/unsloth/gemma-2-9b-bnb-4bit
https://huggingface.co/unsloth/gemma-2-27b-bnb-4bit
https://huggingface.co/unsloth/gemma-2-9b-it-bnb-4bit
https://huggingface.co/unsloth/gemma-2-27b-it-bnb-4bit
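As a rough sketch of how one of these pre-quantized checkpoints might be loaded: the helper name `load_gemma2` and the `max_seq_length = 2048` default are illustrative choices, not part of this release. The actual load needs a CUDA GPU and an updated Unsloth install, so the import is deferred until the function is called.

```python
# Repo names taken from the 4-bit uploads listed above.
GEMMA2_4BIT_REPOS = [
    "unsloth/gemma-2-9b-bnb-4bit",
    "unsloth/gemma-2-27b-bnb-4bit",
    "unsloth/gemma-2-9b-it-bnb-4bit",
    "unsloth/gemma-2-27b-it-bnb-4bit",
]


def load_gemma2(model_name=GEMMA2_4BIT_REPOS[0], max_seq_length=2048):
    """Hypothetical helper: load one of the pre-quantized Gemma 2 checkpoints."""
    if model_name not in GEMMA2_4BIT_REPOS:
        raise ValueError(f"Unknown Gemma 2 4-bit repo: {model_name}")

    # Imported lazily so this file can be read or tested without a GPU.
    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=model_name,
        max_seq_length=max_seq_length,  # illustrative value, adjust to your data
        dtype=None,                     # None lets Unsloth pick the dtype
        load_in_4bit=True,              # the checkpoints are already 4-bit
    )
    return model, tokenizer
```

Calling `load_gemma2()` with no arguments would fetch the 9b base model; pass one of the `-it-` names for the instruction-tuned variants.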
Continued pretraining
You can now do continued pretraining with Unsloth. See https://unsloth.ai/blog/contpretraining for more details!
Continued pretraining is 2x faster and uses 50% less VRAM than HF + FA2 QLoRA. We offload embed_tokens and lm_head to disk to save VRAM!
You can now simply add both to the target modules, like below:
model = FastLanguageModel.get_peft_model(
    model,
    r = 128, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",
                      "embed_tokens", "lm_head",], # Add for continual pretraining
    lora_alpha = 32,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = True,   # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
)
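To see why use_rslora matters at a rank as high as r = 128, here is a small worked calculation of the adapter scaling factor. It assumes the usual definitions: standard LoRA scales the low-rank update by lora_alpha / r, while rank stabilized LoRA scales it by lora_alpha / sqrt(r). The function name `lora_scaling` is illustrative only.

```python
import math


def lora_scaling(lora_alpha: float, r: int, use_rslora: bool) -> float:
    """Scaling factor applied to the low-rank update (illustrative helper)."""
    if r <= 0:
        raise ValueError("r must be a positive integer")
    if use_rslora:
        return lora_alpha / math.sqrt(r)  # rank stabilized LoRA
    return lora_alpha / r                 # standard LoRA


# Values from the snippet above: r = 128, lora_alpha = 32.
standard = lora_scaling(32, 128, use_rslora=False)
stabilized = lora_scaling(32, 128, use_rslora=True)

print(f"standard LoRA scaling: {standard}")        # 0.25
print(f"rsLoRA scaling:        {stabilized:.4f}")  # 2.8284
```

With the standard rule the update shrinks as r grows, whereas the rank stabilized rule keeps it at a larger scale, which is one reason the two settings are often paired at high ranks.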
We also allow 2 learning rates - one for the embedding matrices and another for the LoRA adapters:
from unsloth import is_bfloat16_supported
from unsloth import UnslothTrainer, UnslothTrainingArguments

trainer = UnslothTrainer(
    args = UnslothTrainingArguments(
        ....
        learning_rate = 5e-5,
        embedding_learning_rate = 5e-6,
    ),
)
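The two learning rates can be pictured as two optimizer parameter groups. The sketch below is not Unsloth's internal code: the name-matching rule and the helper `build_param_groups` are assumptions made for illustration, and plain strings stand in for tensors so it runs without a GPU. The returned list has the shape that PyTorch optimizers accept for per-group options.

```python
# Assumed rule: parameters whose names mention these modules count as embeddings.
EMBEDDING_KEYS = ("embed_tokens", "lm_head")


def build_param_groups(named_params, lr, embedding_lr):
    """Split (name, param) pairs into a LoRA group and an embedding group."""
    embeddings, others = [], []
    for name, param in named_params:
        if any(key in name for key in EMBEDDING_KEYS):
            embeddings.append(param)
        else:
            others.append(param)
    return [
        {"params": others, "lr": lr},                  # LoRA adapters
        {"params": embeddings, "lr": embedding_lr},    # embedding matrices
    ]


# Stand-in parameters; in real training these would be tensors.
demo_params = [
    ("model.embed_tokens.weight", "E"),
    ("model.layers.0.self_attn.q_proj.lora_A.weight", "A"),
    ("lm_head.weight", "H"),
]
groups = build_param_groups(demo_params, lr=5e-5, embedding_lr=5e-6)

for group in groups:
    print(group["lr"], group["params"])
```

Keeping the embedding rate lower than the adapter rate, as in the values above, means the embedding matrices move more slowly than the adapters.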
We also share a free Colab to finetune Mistral v3 to learn Korean (you can select any language you like) using Wikipedia and the Aya Dataset: https://colab.research.google.com/drive/1tEd1FrOXWMnCU9UIvdYhs61tkxdMuKZu?usp=sharing
And we're sharing our free Colab notebook for continued pretraining for text completion: https://colab.research.google.com/drive/1ef-tab5bhkvWmBOObepl1WgJvfvSzn5Q?usp=sharing
What's Changed
- Ollama Chat Templates by @danielhanchen in #582
- Fix case where GGUF saving fails when model_dtype is torch.float16 ("f16") by @chrehall68 in #630
- Support revision parameter in FastLanguageModel.from_pretrained by @chrehall68 in #629
- clears any selected_adapters before calling internal_model.save_pretr… by @neph1 in #609
- Check for incompatible modules before importing unsloth by @xyangk in #602
- Fix #603 handling of formatting_func in tokenizer_utils for assitant/chat/completion training by @Oseltamivir in #604
- Add GGML saving option to Unsloth for easier Ollama model creation and testing. by @mahiatlinux in #345
- Add Documentation for LoraConfig Parameters by @sebdg in #619
- llama.cpp failing by @bet0x in #371
- fix libcuda_dirs import for triton 3.0 by @t-vi in #227
- Nightly by @danielhanchen in #632
- README: Fix minor typo. by @shaper in #559
- Qwen bug fixes by @danielhanchen in #639
- Fix segfaults by @danielhanchen in #641
- Nightly by @danielhanchen in #646
- Nightly by @danielhanchen in #648
- Nightly by @danielhanchen in #649
- Fix breaking bug in save.py with interpreting quantization_method as a string when saving to gguf by @ArcadaLabs-Jason in #651
- Revert "Fix breaking bug in save.py with interpreting quantization_method as a string when saving to gguf" by @danielhanchen in #652
- Revert "Revert "Fix breaking bug in save.py with interpreting quantization_method as a string when saving to gguf"" by @danielhanchen in #653
- Fix GGUF by @danielhanchen in #654
- Fix continuing LoRA finetuning by @danielhanchen in #656
New Contributors
- @chrehall68 made their first contribution in #630
- @neph1 made their first contribution in #609
- @xyangk made their first contribution in #602
- @Oseltamivir made their first contribution in #604
- @mahiatlinux made their first contribution in #345
- @sebdg made their first contribution in #619
- @bet0x made their first contribution in #371
- @t-vi made their first contribution in #227
- @shaper made their first contribution in #559
- @ArcadaLabs-Jason made their first contribution in #651
Full Changelog: https://github.com/unslothai/unsloth/commits/June-2024