[executorch] Update phi-3-mini lora export code and readme #5327

Closed
wants to merge 1 commit into from
examples/models/phi-3-mini-lora/README.md (4 changes: 3 additions & 1 deletion)
@@ -1,5 +1,7 @@
 ## Summary
-In this example, we export to ExecuTorch a model ([phi-3-mini](https://github.com/pytorch/executorch/tree/main/examples/models/phi-3-mini)) appended with attention and mlp LoRA layers. The model is exported to ExecuTorch for both inference and training. Note: the exported training model can only train at the moment.
+In this example, we showcase how to export a model ([phi-3-mini](https://github.com/pytorch/executorch/tree/main/examples/models/phi-3-mini)) appended with LoRA layers to ExecuTorch. The model is exported to ExecuTorch for both inference and training.
 
+To see how you can use the model exported for training in a full finetuning loop, please see our example on [LLM PTE Finetuning](https://github.com/pytorch/executorch/tree/main/examples/llm_pte_finetuning).
+
 ## Instructions
 ### Step 1: [Optional] Install ExecuTorch dependencies
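The README change above describes a two-stage story: the same LoRA-augmented model is lowered once for inference and once for training. Below is a minimal sketch (not the PR's exact code) of the inference-side lowering it refers to, assuming `model` is the LoRA-augmented phi-3-mini module built elsewhere in the example; the function name, dummy input shape, and output path are illustrative only.

```python
# Minimal sketch of the export-to-ExecuTorch flow the README summarizes.
# Assumption: `model` is the LoRA-augmented phi-3-mini nn.Module; the function
# name, dummy input shape, and output path are illustrative only.
import torch
from torch.export import export
from executorch.exir import to_edge

def export_for_inference(model: torch.nn.Module, out_path: str = "phi3_mini_lora.pte") -> None:
    model.eval()
    example_args = (torch.randint(0, 100, (1, 10), dtype=torch.long),)
    with torch.no_grad():
        aten_program = export(model, example_args)  # 1. ATen dialect via torch.export
    edge_program = to_edge(aten_program)            # 2. Edge dialect
    et_program = edge_program.to_executorch()       # 3. ExecuTorch program
    with open(out_path, "wb") as f:                 # 4. Serialize to a .pte file
        f.write(et_program.buffer)
```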
examples/models/phi-3-mini-lora/export_model.py (23 changes: 17 additions & 6 deletions)
@@ -28,11 +28,13 @@ def __init__(self, model, loss):
         self.model = model
         self.loss = loss
 
-    def forward(self, input):
+    def forward(self, input: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
         # Output is of the shape (seq_len, vocab_size).
-        output = self.model(input)
-        target = zeros((1, vocab_size), dtype=long)
-        return self.loss(output, target)
+        logits = self.model(input)
+        logits = logits[..., :-1, :].contiguous()
+        labels = labels[..., 1:].contiguous()
+        logits = logits.transpose(1, 2)
+        return self.loss(logits, labels)
 
 
 @no_grad()
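The new `forward` above folds next-token cross-entropy into the exported graph. The shift-and-transpose pattern can be puzzling at first, so here is a small standalone illustration (not part of the PR), assuming the `loss` module is `nn.CrossEntropyLoss`, which expects logits shaped `(batch, vocab, seq)` and integer targets shaped `(batch, seq)`:

```python
# Standalone illustration of the shift/transpose in the new forward, assuming
# the loss module is nn.CrossEntropyLoss (an assumption, not quoted from the PR).
import torch
from torch import nn

batch_size, seq_len, vocab_size = 1, 10, 100
logits = torch.randn(batch_size, seq_len, vocab_size)         # model output
labels = torch.randint(0, vocab_size, (batch_size, seq_len))  # here labels == input tokens

shift_logits = logits[..., :-1, :].contiguous()  # predictions at positions 0..seq_len-2
shift_labels = labels[..., 1:].contiguous()      # targets are the next tokens, 1..seq_len-1
loss = nn.CrossEntropyLoss()(shift_logits.transpose(1, 2), shift_labels)
print(loss)  # scalar loss tensor
```

Dropping the last logit and the first label keeps the two tensors aligned, so the prediction at position t is scored against the token at position t+1.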
@@ -47,7 +49,11 @@ def export_phi3_mini_lora(model) -> None:
     model.eval()
     # 1. torch.export: Defines the program with the ATen operator set.
     print("Exporting to aten dialect")
-    example_args = (randint(0, 100, (1, 100), dtype=long),)
+    batch_size = 1
+    vocab_size = 100
+    seq_len = 10
+    tokens = randint(0, vocab_size, (batch_size, seq_len), dtype=long)
+    example_args = (tokens,)
     with sdpa_kernel([SDPBackend.MATH]):
         aten_dialect: ExportedProgram = export(model, example_args)

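For context on where `model` comes from before these example inputs are traced, a hedged sketch of constructing a phi-3-mini with LoRA on the attention and MLP layers via torchtune is shown below; the builder name `lora_phi3_mini` and its keyword arguments are assumptions about torchtune's LoRA builders, not code shown in this diff:

```python
# Hedged sketch: build a LoRA-augmented phi-3-mini with torchtune and feed it to
# the inference export helper above. Builder name and arguments are assumptions.
from torchtune.models.phi3 import lora_phi3_mini

model = lora_phi3_mini(
    lora_attn_modules=["q_proj", "v_proj"],  # attention projections that receive LoRA
    apply_lora_to_mlp=True,                  # also adapt the MLP layers
    lora_rank=8,
    lora_alpha=16,
)
export_phi3_mini_lora(model)  # produces the inference program described above
```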
@@ -80,7 +86,12 @@ def export_phi3_mini_lora_training(model) -> None:
print("Exporting phi3-mini with LoRA for training")
# 1. torch.export: Defines the program with the ATen operator set.
print("Exporting to aten dialect")
example_args = (randint(0, 100, (1, 100), dtype=long),)
batch_size = 1
vocab_size = 100
seq_len = 10
tokens = randint(0, vocab_size, (batch_size, seq_len), dtype=long)
labels = tokens
example_args = (tokens, labels)
with sdpa_kernel([SDPBackend.MATH]):
exported_graph: ExportedProgram = export(model, example_args)
print("Creating a joint forward-backwards graph for training")