File "/root/miniconda3/lib/python3.10/site-packages/trl/trainer/sft_trainer.py", line 360, in train
output = super().train(*args, **kwargs)
File "/root/miniconda3/lib/python3.10/site-packages/transformers/trainer.py", line 1780, in train
return inner_training_loop(
File "/root/miniconda3/lib/python3.10/site-packages/transformers/trainer.py", line 2118, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/root/miniconda3/lib/python3.10/site-packages/transformers/trainer.py", line 3045, in training_step
self.accelerator.backward(loss)
File "/root/miniconda3/lib/python3.10/site-packages/accelerate/accelerator.py", line 2001, in backward
loss.backward(**kwargs)
File "/root/miniconda3/lib/python3.10/site-packages/torch/_tensor.py", line 492, in backward
torch.autograd.backward(
File "/root/miniconda3/lib/python3.10/site-packages/torch/autograd/__init__.py", line 251, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
File "/root/miniconda3/lib/python3.10/site-packages/torch/autograd/function.py", line 288, in apply
return user_fn(self, *args)
File "/root/miniconda3/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 288, in backward
torch.autograd.backward(outputs_with_grad, args_with_grad)
File "/root/miniconda3/lib/python3.10/site-packages/torch/autograd/__init__.py", line 251, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.
Thanks.
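For context, this RuntimeError is a known failure mode when reentrant gradient checkpointing is combined with a mostly frozen base model plus trainable adapters: the backward pass revisits a checkpointed subgraph whose saved tensors have already been freed. A minimal sketch of the usual mitigation is below; it assumes a transformers release new enough to accept gradient_checkpointing_kwargs, the model id is a placeholder, and the thread does not confirm that this is the cause in this particular X-LoRA setup.

```python
from transformers import AutoModelForCausalLM, TrainingArguments

# Placeholder model id; the base model under discussion is Llama-2 7B.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

# When only adapter weights are trainable, make the embedding output require
# grad so gradients still flow through the checkpointed (frozen) blocks.
model.enable_input_require_grads()

# Prefer the non-reentrant checkpointing implementation; the reentrant one is a
# common trigger for "Trying to backward through the graph a second time".
args = TrainingArguments(
    output_dir="out",
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs={"use_reentrant": False},
)
```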
@crossxxd can you share your training code for the Mistral 7B base model? I was able to get the Llama model training, but it is very slow and the training loss does not decrease. Your Mistral 7B code might help.
I've trained X-LoRA with the Mistral 7B base model and it works fine. However, when I switch the base model to Llama-2 7B, I get an error.
This is my code for training.
The error is the traceback shown above.
Thanks.
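The poster's actual training script is not included in the thread. Purely as an illustration of the kind of SFTTrainer run being described, a minimal sketch is shown below; the model id, dataset, and LoRA settings are hypothetical, plain LoRA stands in for the X-LoRA-specific wiring, and the exact keyword arguments depend on the installed TRL version.

```python
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

# Hypothetical base model and dataset, not the poster's actual setup.
model_id = "mistralai/Mistral-7B-v0.1"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
dataset = load_dataset("imdb", split="train[:1%]")

# Plain LoRA as a stand-in for the adapter configuration.
peft_config = LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    peft_config=peft_config,
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=1),
)
trainer.train()
```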