train_on_responses_only #1514

Open
Zuozhuo opened this issue Jan 7, 2025 · 1 comment

Zuozhuo commented Jan 7, 2025

I saw the following code snippet in your qwen2.5 fine-tuning tutorial:

trainer = train_on_responses_only(
    trainer,
    instruction_part = "<|im_start|>user\n",
    response_part = "<|im_start|>assistant\n",
)

Here, trainer is an instance of SFTTrainer.

My question: when I call trainer.predict directly on the instantiated SFTTrainer, the predictions in the result contain normal logits. However, after wrapping trainer with train_on_responses_only and then calling trainer.predict, the predictions in the result are an empty tuple.

Why does this happen? How can I make it return logits as expected?

@danielhanchen
Contributor

Sorry for the delay. Try setting this at the very beginning, before importing unsloth:

import os
os.environ["UNSLOTH_RETURN_LOGITS"] = "1"
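
A minimal sketch of that ordering, assuming unsloth reads `UNSLOTH_RETURN_LOGITS` when it is first imported (which would explain why the variable has to be set beforehand). The helper name `enable_unsloth_logits` is hypothetical and not part of unsloth. It only sets the variable and reports whether unsloth was already loaded in the current process:

```python
import os
import sys


def enable_unsloth_logits() -> bool:
    """Hypothetical helper, not an unsloth API.

    Sets UNSLOTH_RETURN_LOGITS=1 and returns True if unsloth had not yet
    been imported in this process, False if it was already loaded.
    """
    imported_already = "unsloth" in sys.modules
    os.environ["UNSLOTH_RETURN_LOGITS"] = "1"
    return not imported_already


# Run this at the top of the script or notebook, before `import unsloth`.
if not enable_unsloth_logits():
    print("unsloth was imported before the variable was set; "
          "restart the process or kernel and set it first")
```

After this runs, import unsloth, build the SFTTrainer, apply train_on_responses_only, and call trainer.predict as before. In a notebook, a kernel restart may be needed if unsloth was imported in an earlier cell.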
