
[Feature Request] Incorporate HiDiffusion (monofy-org) #11

Open
iwr-redmond opened this issue Dec 5, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@iwr-redmond

iwr-redmond commented Dec 5, 2024

HiDiffusion allows higher-resolution image generation with SD15 and SDXL models.

The original package is broken (the maintainer went commercial and abandoned it), but it has been updated by John Street in this PR to fix the erroneous code.

This branch seems suitable to be incorporated into stablepy as another extension. It could then be called upon by adding the apply_hidiffusion and remove_hidiffusion activations in model.py at around L1826, in a manner similar to the existing FreeU code.

if HiDiffusion and self.class_name != FLUX:
    logger.info("HiDiffusion active")
    apply_hidiffusion(pipe)
    self.HiDiffusion = True
elif self.HiDiffusion:
    remove_hidiffusion(pipe)
    self.HiDiffusion = False

Note that if an SD15 model does not have a built-in (baked) VAE, it would be highly advantageous to discover this and apply the default SD15 VAE. There is code in the diffusers conversion pipeline (lines 636-643) to detect baked VAEs, and so I suggest trialing the following process for SD15/SDXL to replace your Default VAE SDXL code in model.py lines 602-622:

  1. Probe the single-file model for the VAE key per the Diffusers code:

vae_key = "first_stage_model." if any(k.startswith("first_stage_model.") for k in keys) else ""

     Detecting a VAE in diffusers format can be done using the path (per this gist):

vae_path = osp.join(args.model_path, "vae", "diffusion_pytorch_model.safetensors")

  2. If there is a baked VAE, load it (SD15 + SDXL)
  3. If SDXL & no baked VAE, load the vae-fp16-fix
  4. If SD15 & HiDiffusion & no baked VAE, load the SD15 default VAE (stabilityai/sd-vae-ft-mse-original)
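The steps above could be sketched roughly as follows. This is a hedged illustration, not stablepy's actual API: the helper names (has_baked_vae, select_vae) are hypothetical, and the hub repo ids used for the fallbacks (madebyollin/sdxl-vae-fp16-fix, and the diffusers-format stabilityai/sd-vae-ft-mse rather than the single-file -original repo) are assumptions.

```python
import os.path as osp


def has_baked_vae(keys) -> bool:
    """Per the Diffusers conversion code: a baked VAE stores its weights
    under the 'first_stage_model.' prefix in the checkpoint state dict."""
    return any(k.startswith("first_stage_model.") for k in keys)


def diffusers_vae_path(model_path: str) -> str:
    """Location of the VAE weights in a diffusers-format folder (per the gist)."""
    return osp.join(model_path, "vae", "diffusion_pytorch_model.safetensors")


def select_vae(baked: bool, is_sdxl: bool, hidiffusion: bool):
    """Return None to keep the baked VAE, else a hub repo id to load instead."""
    if baked:
        return None                             # step 2: use the baked VAE
    if is_sdxl:
        return "madebyollin/sdxl-vae-fp16-fix"  # step 3: SDXL fallback
    if hidiffusion:
        return "stabilityai/sd-vae-ft-mse"      # step 4: SD15 default VAE
    return None
```

The probe operates on the state-dict keys (as in the quoted Diffusers snippet), so it works the same way whether the keys come from safetensors or a pickled checkpoint.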
@R3gm R3gm added the enhancement New feature or request label Dec 6, 2024
@iwr-redmond
Author

iwr-redmond commented Dec 18, 2024

With the addition of your latest memory management commit, this code should be good enough for government work:

# HiDiffusion
if HiDiffusion and self.class_name != FLUX:
    # enable HiDiffusion
    logger.info("HiDiffusion active")
    self.pipe.enable_vae_tiling()
    apply_hidiffusion(self.pipe)
    self.HiDiffusion = True
    # warn about incompatible options
    if adetailer_A or adetailer_B or upscaler_model_path or hires_steps > 0 or face_restoration_model:
        logger.warning("Disabling incompatible generation options!")
    # always disable incompatible options just in case
    adetailer_A = False
    adetailer_A_params = None
    adetailer_B = False
    adetailer_B_params = None
    upscaler_model_path = None
    hires_steps = 0 # merely disabling the model path is insufficient
    face_restoration_model = None
elif self.HiDiffusion:
    remove_hidiffusion(self.pipe)
    self.HiDiffusion = False

Borrowing a prompt from Civitai:

checkpoint = "realisticVisionV60B1_v60B1VAE.safetensors"
positive_prompt = "photo of old village, evening, clouds"
negative_prompt = "(deformed iris, deformed pupils, semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime), text, cropped, out of frame, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck"

image, info_image = model(
    prompt=positive_prompt, 
    negative_prompt=negative_prompt, 
    num_steps=30, sampler="DPM++ SDE", 
    schedule_type="Karras", 
    img_width=2048, 
    img_height=2048, 
    FreeU=True, 
    HiDiffusion=True, # newly added parameter
)

(attached sample image: 00005_realisticVisionV60B1_v60B1VAE_386572327)

The prompt needs work! But the technology appears fine, with good memory efficiency (~6GB for a 4.4GB checkpoint) and acceptable speed (119s for 30 steps, shown as "4.00s/it"). Not necessarily great for generation, but I would wager it is great for high-resolution inpainting.

@John6666cat

I think it would be interesting if HiDiffusion could be used with stablepy.

But prioritizing Diffusers' baked VAE with that code might not be very reliable; the current implementation is more reliable.
Diffusers' from_single_file and save_pretrained do not check the contents of the actual VAE model, only whether or not it exists. Even if the model author has not explicitly baked a VAE using A1111 WebUI or something similar, a VAE folder will still be output during conversion.
And even if the VAE was broken during merging, as long as the tensor keys are still there, the file will still exist... Well, that's correct behavior for a library.
Since the official HF conversion space uses exactly the same implementation,😅 there should be quite a few model repos like this.
As with the Civitai models, it's probably best to assume that we can't trust the VAE in Diffusers models on HF either. If a repo has been adjusted to work with HF's Serverless Inference API, it's usually fine, but there's no reliable way to tell.

@iwr-redmond
Author

I reckon that HiDiffusion would have to be tested with specific models for compatibility, just as models and schedulers need to be tested together. As it doesn't work with Flux, HiDiffusion isn't a viable replacement for the standard Hires Fix and upscaling pipelines - even if that were a good idea, which I doubt. That means it will only ever be an optional extra. And if the integration were hard, I wouldn't have made this feature request (my code is only ever good enough for government work, so when it does work it's tinfoil hat time, because anything easier would be a setup). But it is straightforward, which means that if it fits within @R3gm's vision for the project it won't take a moment to add.

FWIW, HiDiffusion has been integrated into SD.Next (also diffusers-based) since June and doesn't appear to have caused much trouble, although it was sidelined in the UI recently and therefore may have been underutilized. For stablepy, if there are specific configurations that are likely to put a big hole in the fence, these could be warned about and/or blocked. The sample code above already disables other high-resolution options to prevent doubling up, and more of this type of scaffolding could be added. For example, stablepy could fall back to a known working VAE when HiDiffusion=True and require the user to override that default. If I understand correctly, that was the default SDXL behavior in general until 44942d7.
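A minimal sketch of that overridable default, assuming hypothetical parameter names (vae_model, hidiffusion) rather than stablepy's real signature:

```python
# Hypothetical default: the repo id below is the diffusers-format SD15 VAE
# referenced earlier in this thread, not a value stablepy actually ships.
DEFAULT_SD15_VAE = "stabilityai/sd-vae-ft-mse"


def resolve_vae(vae_model, hidiffusion: bool):
    """Keep an explicit user choice; otherwise default when HiDiffusion is on."""
    if vae_model is not None:
        return vae_model           # user override always wins
    if hidiffusion:
        return DEFAULT_SD15_VAE    # known-working default for HiDiffusion
    return None                    # leave the pipeline's own VAE in place
```

The point is only the precedence order: an explicit VAE beats the HiDiffusion default, which beats doing nothing.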
