
request for the training code #30

Open
leonardxie opened this issue Jun 11, 2024 · 2 comments

@leonardxie

Hi, thank you for your excellent work.
Do you have any plans to share the training code?
I tried to reproduce the training, but it raises the following error:

RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.
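
For anyone hitting the same error: it usually means a tensor that requires grad is built once and then reused across training steps, so the second `backward()` traverses a graph that was already freed. A minimal, self-contained PyTorch sketch (not this repo's actual training code) that reproduces the failure and the usual fix:

```python
import torch

w = torch.randn(4, requires_grad=True)

# Common cause: a graph built once outside the loop is reused in every
# loss, so the second backward() walks an already-freed graph.
shared = (w * 2).sum()
try:
    for step in range(2):
        loss = shared * (step + 1)
        loss.backward()  # raises RuntimeError on the second iteration
except RuntimeError as err:
    print("reproduced:", err)

# Usual fix: rebuild the graph every step (or .detach() cached
# activations) instead of reaching for retain_graph=True, which often
# just hides the leak.
for step in range(2):
    loss = (w * 2).sum() * (step + 1)
    loss.backward()
```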

@XiaoYiWeio

me too

@marcio-afr
Copy link

I also tried to run a training script using the Trainer class from Hugging Face, and I ran into several issues and errors, including:

  • Diverging tensor devices (LoRA weights created on CPU while the model is on GPU)
  • Mismatched tensor dtypes (multiplication between float32 and bfloat16 when the model is loaded in bfloat16)
  • Missing PEFT configs in the xLoRAConfig class

The first two can be worked around by unifying device and dtype up front; see the sketch after this list.
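
A minimal sketch of that workaround, assuming a standard PyTorch module stack; `base` and `adapter` below are illustrative stand-ins, not classes from this repo:

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-ins for the base model and a lazily created LoRA adapter, which
# by default would come up on CPU in float32.
base = nn.Linear(8, 8)
adapter = nn.Linear(8, 8)

# Unify device and dtype in one place so later matmuls never mix
# float32 with bfloat16, or CPU tensors with GPU tensors.
for module in (base, adapter):
    module.to(device=device, dtype=torch.bfloat16)

x = torch.randn(2, 8, device=device, dtype=torch.bfloat16)
y = adapter(base(x))  # without the .to() above this raises a dtype/device error
print(y.dtype, y.device)
```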
