I reckon I must have been raised in a barn, because I was unaware that, like Fooocus, Diffusers can transfer latents between two checkpoints at a set point in the generation.
Max Woolf demonstrates this in passing while describing his SDXL negative LoRA, calling the technique a "Mixture of Experts":
```python
high_noise_frac = 0.8  # equivalent to the "Refiner Switch At" in Fooocus

image = base(
    prompt=prompt,
    negative_prompt=negative_prompt,
    denoising_end=high_noise_frac,
    output_type="latent",
).images
image = refiner(
    prompt=prompt,
    negative_prompt=negative_prompt,
    denoising_start=high_noise_frac,
    image=image,
).images[0]
```
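To make the handoff concrete, the split can be thought of as the base pipeline handling roughly the first `denoising_end` fraction of the inference steps and the refiner handling the remainder. The helper below is a hypothetical illustration of that arithmetic, not the actual diffusers implementation (which filters scheduler timesteps against a cutoff):

```python
def split_steps(num_inference_steps: int, switch_at: float) -> tuple[int, int]:
    """Hypothetical helper: how many steps the base vs. refiner run
    when handing off at the `switch_at` fraction (e.g. 0.8)."""
    base_steps = int(round(num_inference_steps * switch_at))
    refiner_steps = num_inference_steps - base_steps
    return base_steps, refiner_steps

# With 50 steps and a 0.8 switch point, the base denoises for 40 steps
# and the refiner finishes the last 10.
print(split_steps(50, 0.8))
```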
Notes:
Whether this implementation "can reuse the base model's momentum" (per lllyasviel's comment here) is not immediately clear.
Doubling the model memory required will pose some VRAM issues, but if I understand correctly both Fooocus and SD.Next handle low-VRAM setups by keeping everything on the CPU and moving only the portions of the model needed at each generation step onto the GPU. That might be overkill here: loading both checkpoints onto the CPU and moving each one to the GPU in turn could be close enough. That would correspond to diffusers' `enable_model_cpu_offload()` option, rather than `enable_sequential_cpu_offload()`, which minimizes memory as aggressively as possible at the expense of speed.
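The cost difference between the two offload strategies comes down to how often weights cross the CPU/GPU boundary. The toy simulation below (not diffusers internals; the class and function names are invented for illustration) counts device transfers to show why model-level offload, which moves a whole component once per invocation, is much cheaper than sequential offload, which moves weights around every step:

```python
# Toy sketch contrasting the two offload strategies by counting transfers.
class FakeModule:
    """Stand-in for a pipeline component (e.g. a UNet) that tracks moves."""
    def __init__(self):
        self.device = "cpu"
        self.transfers = 0

    def to(self, device):
        if self.device != device:
            self.device = device
            self.transfers += 1

def run_model_offload(unet, steps):
    # Whole component moved to GPU once, all steps run while resident.
    unet.to("cuda")
    for _ in range(steps):
        pass  # denoising steps execute with weights already on GPU
    unet.to("cpu")

def run_sequential_offload(unet, steps):
    # Weights shuttled every step (a stand-in for per-layer moves).
    for _ in range(steps):
        unet.to("cuda")
        unet.to("cpu")

a, b = FakeModule(), FakeModule()
run_model_offload(a, steps=30)        # 2 transfers total
run_sequential_offload(b, steps=30)   # 60 transfers total
```

In practice this is the trade-off between `pipe.enable_model_cpu_offload()` and `pipe.enable_sequential_cpu_offload()` in diffusers: the latter saves more VRAM but pays a transfer cost on every forward pass.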
Unlike Fooocus, this would not allow latents to be handed off between SDXL (primary) and an SD 1.5 checkpoint (refiner/secondary), although I reckon it would allow an SD 1.5 inpainting checkpoint to be used as a refiner. I believe a multi-architecture feature would need the Latent Interposer code and checkpoints, which would be a much larger effort outside the scope of this feature request.
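To show why a learned interposer is needed rather than simple arithmetic: the naive approach would be to rescale by the two VAEs' published `scaling_factor` values (0.13025 for the SDXL VAE, 0.18215 for SD 1.5, per each model's config). That only matches magnitudes; the two latent distributions are otherwise incompatible, which is what the Latent Interposer's trained translation network addresses:

```python
# Naive magnitude rescale between latent spaces, assuming the published
# VAE scaling factors. This matches scale only -- the latent channels
# encode different things across architectures, so a real SDXL -> SD 1.5
# handoff needs a learned interposer, not this.
SDXL_SCALE = 0.13025
SD15_SCALE = 0.18215

def naive_rescale(latent_value: float) -> float:
    # Unscale out of SDXL's latent space, rescale into SD 1.5's.
    return latent_value / SDXL_SCALE * SD15_SCALE
```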