[docs] Update index and quicktour #1191

Merged · 6 commits · Dec 4, 2023
28 changes: 16 additions & 12 deletions docs/source/_toctree.yml
@@ -9,24 +9,26 @@

- title: Task guides
sections:
- local: task_guides/image_classification_lora
title: Image classification using LoRA
- local: task_guides/seq2seq-prefix-tuning
title: Prefix tuning for conditional generation
- local: task_guides/clm-prompt-tuning
title: Prompt tuning for causal language modeling
- local: task_guides/semantic_segmentation_lora
title: Semantic segmentation using LoRA
- local: task_guides/ptuning-seq-classification
title: P-tuning for sequence classification
- local: task_guides/dreambooth_lora
title: Dreambooth fine-tuning with LoRA
- local: task_guides/token-classification-lora
title: LoRA for token classification
- local: task_guides/int8-asr
title: int8 training for automatic speech recognition
- local: task_guides/semantic-similarity-lora
title: Semantic similarity with LoRA
- title: LoRA
sections:
- local: task_guides/image_classification_lora
title: Image classification
- local: task_guides/semantic_segmentation_lora
title: Semantic segmentation
- local: task_guides/token-classification-lora
title: Token classification
- local: task_guides/semantic-similarity-lora
title: Semantic similarity
- local: task_guides/int8-asr
title: int8 training for automatic speech recognition
- local: task_guides/dreambooth_lora
title: DreamBooth

- title: Developer guides
sections:
@@ -57,6 +59,8 @@

- title: Reference
sections:
- local: package_reference/auto_class
title: AutoPeftModel
- local: package_reference/peft_model
title: PEFT model
- local: package_reference/config
109 changes: 8 additions & 101 deletions docs/source/index.md
@@ -16,11 +16,9 @@ rendered properly in your Markdown viewer.

# PEFT

🤗 PEFT, or Parameter-Efficient Fine-Tuning (PEFT), is a library for efficiently adapting pre-trained language models (PLMs) to various downstream applications without fine-tuning all the model's parameters.
PEFT methods only fine-tune a small number of (extra) model parameters, significantly decreasing computational and storage costs because fine-tuning large-scale PLMs is prohibitively costly.
Recent state-of-the-art PEFT techniques achieve performance comparable to that of full fine-tuning.
🤗 PEFT (Parameter-Efficient Fine-Tuning) is a library for efficiently adapting large pretrained models to various downstream applications without fine-tuning all of a model's parameters, since full fine-tuning is prohibitively costly. PEFT methods only fine-tune a small number of (extra) model parameters - significantly decreasing computational and storage costs - while yielding performance comparable to a fully fine-tuned model. This makes it more accessible to train and store large language models (LLMs) on consumer hardware.

PEFT is seamlessly integrated with 🤗 Accelerate for large-scale models leveraging DeepSpeed and [Big Model Inference](https://huggingface.co/docs/accelerate/usage_guides/big_modeling).
PEFT is integrated with the Transformers, Diffusers, and Accelerate libraries to provide a faster and easier way to load, train, and use large models for inference.
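
For illustration only (this snippet is not part of the files changed in this PR), here is a minimal sketch of the workflow the new introduction describes: wrap a pretrained Transformers model with a LoRA adapter so that only a small number of injected parameters are trained. The base model id and hyperparameter values are placeholder choices.

```python
# Minimal illustrative sketch: add a LoRA adapter to a pretrained model and
# train only the injected low-rank parameters. Model id and hyperparameters
# are placeholders, not recommendations.
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, TaskType, get_peft_model

base_model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/mt0-large")

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,  # task type selects the right PEFT wrapper
    r=8,                              # rank of the low-rank update matrices
    lora_alpha=32,
    lora_dropout=0.1,
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # reports the small fraction of trainable params
```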

<div class="mt-10">
<div class="w-full flex flex-col space-y-4 md:space-y-0 md:grid md:grid-cols-2 md:gap-y-4 md:gap-x-5">
@@ -43,100 +41,9 @@ PEFT is seamlessly integrated with 🤗 Accelerate for large-scale models levera
</div>
</div>

## Supported methods

1. LoRA: [LORA: LOW-RANK ADAPTATION OF LARGE LANGUAGE MODELS](https://arxiv.org/pdf/2106.09685.pdf)
2. Prefix Tuning: [Prefix-Tuning: Optimizing Continuous Prompts for Generation](https://aclanthology.org/2021.acl-long.353/), [P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks](https://arxiv.org/pdf/2110.07602.pdf)
3. P-Tuning: [GPT Understands, Too](https://arxiv.org/pdf/2103.10385.pdf)
4. Prompt Tuning: [The Power of Scale for Parameter-Efficient Prompt Tuning](https://arxiv.org/pdf/2104.08691.pdf)
5. AdaLoRA: [Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning](https://arxiv.org/abs/2303.10512)
6. [LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention](https://github.com/ZrrSkywalker/LLaMA-Adapter)
7. IA3: [Infused Adapter by Inhibiting and Amplifying Inner Activations](https://arxiv.org/abs/2205.05638)

## Supported models

The tables provided below list the PEFT methods and models supported for each task. To apply a particular PEFT method for
a task, please refer to the corresponding Task guides.

### Causal Language Modeling

| Model | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning | IA3 |
|--------------| ---- | ---- | ---- | ---- | ---- |
| GPT-2 | ✅ | ✅ | ✅ | ✅ | ✅ |
| Bloom | ✅ | ✅ | ✅ | ✅ | ✅ |
| OPT | ✅ | ✅ | ✅ | ✅ | ✅ |
| GPT-Neo | ✅ | ✅ | ✅ | ✅ | ✅ |
| GPT-J | ✅ | ✅ | ✅ | ✅ | ✅ |
| GPT-NeoX-20B | ✅ | ✅ | ✅ | ✅ | ✅ |
| LLaMA | ✅ | ✅ | ✅ | ✅ | ✅ |
| ChatGLM | ✅ | ✅ | ✅ | ✅ | ✅ |

### Conditional Generation

| Model | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning | IA3 |
| --------- | ---- | ---- | ---- | ---- | ---- |
| T5 | ✅ | ✅ | ✅ | ✅ | ✅ |
| BART | ✅ | ✅ | ✅ | ✅ | ✅ |

### Sequence Classification

| Model | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning | IA3 |
| --------- | ---- | ---- | ---- | ---- | ---- |
| BERT | ✅ | ✅ | ✅ | ✅ | ✅ |
| RoBERTa | ✅ | ✅ | ✅ | ✅ | ✅ |
| GPT-2 | ✅ | ✅ | ✅ | ✅ | |
| Bloom | ✅ | ✅ | ✅ | ✅ | |
| OPT | ✅ | ✅ | ✅ | ✅ | |
| GPT-Neo | ✅ | ✅ | ✅ | ✅ | |
| GPT-J | ✅ | ✅ | ✅ | ✅ | |
| Deberta | ✅ | | ✅ | ✅ | |
| Deberta-v2 | ✅ | | ✅ | ✅ | |

### Token Classification

| Model | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning | IA3 |
| --------- | ---- | ---- | ---- | ---- | --- |
| BERT | ✅ | ✅ | | | |
| RoBERTa | ✅ | ✅ | | | |
| GPT-2 | ✅ | ✅ | | | |
| Bloom | ✅ | ✅ | | | |
| OPT | ✅ | ✅ | | | |
| GPT-Neo | ✅ | ✅ | | | |
| GPT-J | ✅ | ✅ | | | |
| Deberta | ✅ | | | | |
| Deberta-v2 | ✅ | | | | |

### Text-to-Image Generation

| Model | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning | IA3 |
| --------- | ---- | ---- | ---- | ---- | ---- |
| Stable Diffusion | ✅ | | | | |


### Image Classification

| Model | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning | IA3 |
| --------- | ---- | ---- | ---- | ---- | ---- |
| ViT | ✅ | | | | |
| Swin | ✅ | | | | |

### Image to text (Multi-modal models)

We have tested LoRA for [ViT](https://huggingface.co/docs/transformers/model_doc/vit) and [Swin](https://huggingface.co/docs/transformers/model_doc/swin) for fine-tuning on image classification.
However, it should be possible to use LoRA for any [ViT-based model](https://huggingface.co/models?pipeline_tag=image-classification&sort=downloads&search=vit) from 🤗 Transformers.
Check out the [Image classification](/task_guides/image_classification_lora) task guide to learn more. If you run into problems, please open an issue.

| Model | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning | IA3 |
| --------- | ---- | ---- | ---- | ---- | ---- |
| Blip-2 | ✅ | | | | |


### Semantic Segmentation

As with image-to-text models, you should be able to apply LoRA to any of the [segmentation models](https://huggingface.co/models?pipeline_tag=image-segmentation&sort=downloads).
It's worth noting that we haven't tested this with every architecture yet. Therefore, if you come across any issues, kindly create an issue report.

| Model | LoRA | Prefix Tuning | P-Tuning | Prompt Tuning | IA3 |
| --------- | ---- | ---- | ---- | ---- | ---- |
| SegFormer | ✅ | | | | |

<iframe
src="https://stevhliu-peft-methods.hf.space"
Member:
Nice space. Is it somehow updated automatically or would it need to be updated each time we add a new model?

Also, I'd reword the description a little bit:

> Discover what models and PEFT methods are supported for a given task

This could give the impression that other models don't work at all. However, almost all models work; if a model isn't listed as supported, it's usually only a matter of setting the config correctly. I think this is important to highlight, as some other similar libraries actually only work with specific model architectures.

Member Author:
Currently, the underlying dataset would need to be updated each time a new model is added or if a new method is supported. Let me know if you have any suggestions for automating it though, that'd be nice!

Member:
Technically, it would be possible. For example, we could set up a cron job or GitHub CI to parse the README (assuming that should be the single source of truth) with pandoc, extract the table, and push an update to the dataset. But that seems like overkill for now. Probably we, the maintainers, just need to remember to update the dataset each time a new model is supported.

frameborder="0"
width="850"
height="620"
></iframe>
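
Purely as an illustration of the approach sketched in the review thread above (a cron job or CI step that treats the README as the source of truth), here is a minimal parsing sketch. It assumes the tables are plain GitHub-flavored Markdown and uses only the standard library instead of pandoc; the file paths and JSON output format are placeholders, not part of this PR.

```python
# Purely illustrative sketch of the idea discussed above: extract the Markdown
# tables from the README so a Hub dataset could be refreshed from them.
# File paths and the output format are assumptions for this example.
import json
from pathlib import Path


def parse_markdown_tables(text: str) -> list[list[list[str]]]:
    """Return every Markdown table as a list of rows of cell strings."""
    tables, current = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("|"):
            cells = [c.strip() for c in stripped.strip("|").split("|")]
            # Skip the |---|---| separator row between header and body.
            if not all(set(c) <= set("-: ") for c in cells):
                current.append(cells)
        elif current:
            tables.append(current)
            current = []
    if current:
        tables.append(current)
    return tables


readme = Path("README.md").read_text(encoding="utf-8")
tables = parse_markdown_tables(readme)
Path("supported_models.json").write_text(
    json.dumps(tables, indent=2, ensure_ascii=False), encoding="utf-8"
)
```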
48 changes: 48 additions & 0 deletions docs/source/package_reference/auto_class.md
@@ -0,0 +1,48 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.

⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.

-->

# AutoPeftModels

The `AutoPeftModel` classes load the appropriate PEFT model for the task type by automatically inferring it from the configuration file. They are designed to quickly and easily load a PEFT model in a single line of code, without having to worry about which exact model class you need or manually loading a [`PeftConfig`].
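
For example (an illustrative sketch rather than text from the new file), loading a trained adapter together with its base model takes a single call; the checkpoint id below is only an example of a PEFT adapter repository on the Hub.

```python
# Illustrative sketch: any Hub repository that contains an adapter_config.json
# (the id below is just an example) can be loaded this way.
from peft import AutoPeftModelForCausalLM

# Reads the adapter's config, loads the matching base model, then attaches
# and loads the adapter weights on top of it.
model = AutoPeftModelForCausalLM.from_pretrained("ybelkada/opt-350m-lora")
```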

## AutoPeftModel

[[autodoc]] auto.AutoPeftModel
- from_pretrained

## AutoPeftModelForCausalLM

[[autodoc]] auto.AutoPeftModelForCausalLM

## AutoPeftModelForSeq2SeqLM

[[autodoc]] auto.AutoPeftModelForSeq2SeqLM

## AutoPeftModelForSequenceClassification

[[autodoc]] auto.AutoPeftModelForSequenceClassification

## AutoPeftModelForTokenClassification

[[autodoc]] auto.AutoPeftModelForTokenClassification

## AutoPeftModelForQuestionAnswering

[[autodoc]] auto.AutoPeftModelForQuestionAnswering

## AutoPeftModelForFeatureExtraction

[[autodoc]] auto.AutoPeftModelForFeatureExtraction