train_on_responses_only #1514

Zuozhuo · 2025-01-07T08:34:47Z

I saw the following code snippet in your qwen2.5 fine-tuning tutorial:

trainer = train_on_responses_only(
    trainer,
    instruction_part = "<|im_start|>user\n",
    response_part = "<|im_start|>assistant\n",
)

Here, trainer is an instance of SFTTrainer.

My question is, when I directly use the instantiated SFTTrainer to execute trainer.predict, the predictions in the result contains normal logits. However, after processing trainer with train_on_responses_only and then executing trainer.predict, I was surprised to find that the predictions in the result is an empty tuple.

Why does this happen? How can I make it return logits as expected?

The text was updated successfully, but these errors were encountered:

danielhanchen · 2025-01-10T12:29:35Z

Sorry on the delay - try doing at the very beginning before importing unsloth:

import os
os.environ["UNSLOTH_RETURN_LOGITS"] = "1"

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

train_on_responses_only #1514

train_on_responses_only #1514

Zuozhuo commented Jan 7, 2025

danielhanchen commented Jan 10, 2025

train_on_responses_only #1514

train_on_responses_only #1514

Comments

Zuozhuo commented Jan 7, 2025

danielhanchen commented Jan 10, 2025