# What Works

| Loader         | Loading 1 LoRA | Loading 2 or more LoRAs | Training LoRAs | Multimodal extension | Perplexity evaluation |
|----------------|----------------|-------------------------|----------------|----------------------|-----------------------|
| Transformers   | ✅              | ✅\*\*\*                 | ✅\*            | ✅                    | ✅                     |
| ExLlama_HF     | ✅              | ❌                       | ❌              | ❌                    | ✅                     |
| ExLlamav2_HF   | ✅              | ✅                       | ❌              | ❌                    | ✅                     |
| ExLlama        | ✅              | ❌                       | ❌              | ❌                    | use ExLlama_HF        |
| ExLlamav2      | ✅              | ✅                       | ❌              | ❌                    | use ExLlamav2_HF      |
| AutoGPTQ       | ✅              | ❌                       | ❌              | ✅                    | ✅                     |
| GPTQ-for-LLaMa | ✅\*\*          | ✅\*\*\*                 | ✅              | ✅                    | ✅                     |
| llama.cpp      | ❌              | ❌                       | ❌              | ❌                    | use llamacpp_HF       |
| llamacpp_HF    | ❌              | ❌                       | ❌              | ❌                    | ✅                     |
| ctransformers  | ❌              | ❌                       | ❌              | ❌                    | ❌                     |
| AutoAWQ        | ?               | ❌                       | ?              | ?                    | ✅                     |

❌ = not implemented

✅ = implemented

\* Training LoRAs with GPTQ models also works with the Transformers loader. Make sure to check "auto-devices" and "disable_exllama" before loading the model (see the example at the bottom of this page).

\*\* Requires the monkey-patch. The instructions can be found here.

\*\*\* Multi-LoRA in PEFT is tricky and the current implementation does not work reliably in all cases.
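
To illustrate footnote \*: outside the web UI, the "auto-devices" and "disable_exllama" checkboxes roughly correspond to `device_map="auto"` and `GPTQConfig(disable_exllama=True)` when a GPTQ model is loaded through Transformers. The snippet below is a minimal sketch of that setup plus a PEFT LoRA attached for training; the model name is a placeholder and the exact keyword arguments assume recent `transformers`/`peft` versions, not the web UI's own code.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "TheBloke/Llama-2-7B-GPTQ"  # placeholder GPTQ checkpoint

# "disable_exllama" -> turn off the ExLlama kernel (it cannot be used for training);
# "auto-devices"    -> device_map="auto".
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=GPTQConfig(bits=4, disable_exllama=True),
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Attach a LoRA adapter: the quantized base stays frozen and only the
# adapter weights are trained.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```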