
[Feature Request] Incorporate HiDiffusion (monofy-org) #11

Open
iwr-redmond opened this issue Dec 5, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@iwr-redmond

iwr-redmond commented Dec 5, 2024

HiDiffusion allows higher-resolution image generation with SD15 and SDXL models.

The original package is broken (the maintainer went commercial and abandoned it), but it has been updated by John Street in this PR to fix the erroneous code.

This branch seems suitable to be incorporated into stablepy as another extension. It could then be called upon by adding the apply_hidiffusion and remove_hidiffusion activations in model.py at around L1826, in a manner similar to the existing FreeU code.

if HiDiffusion and self.class_name != FLUX:
    logger.info("HiDiffusion active")
    apply_hidiffusion(pipe)
    self.HiDiffusion = True
elif self.HiDiffusion:
    remove_hidiffusion(pipe)
    self.HiDiffusion = False

Note that if an SD15 model does not have a built-in (baked) VAE, it would be highly advantageous to discover this and apply the default SD15 VAE. There is code in the diffusers conversion pipeline (lines 636-643) to detect baked VAEs, and so I suggest trialing the following process for SD15/SDXL to replace your Default VAE SDXL code in model.py lines 602-622:

  1. Probe the single-file model for the VAE key per the Diffusers code:

vae_key = "first_stage_model." if any(k.startswith("first_stage_model.") for k in keys) else ""

     Detecting a VAE in diffusers format can be done using the path (per this gist):

vae_path = osp.join(args.model_path, "vae", "diffusion_pytorch_model.safetensors")

  2. If there is a baked VAE, load it (SD15 + SDXL)
  3. If SDXL & no baked VAE, load the vae-fp16-fix
  4. If SD15 & HiDiffusion & no baked VAE, load the SD15 default VAE (stabilityai/sd-vae-ft-mse-original)
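The steps above could be sketched roughly as follows. This is a hedged illustration, not stablepy's actual API: the helper names (has_baked_vae, select_vae) are hypothetical, and the hub repo ids used for the fallbacks (madebyollin/sdxl-vae-fp16-fix, and the diffusers-format stabilityai/sd-vae-ft-mse rather than the single-file -original repo) are assumptions.

```python
import os.path as osp


def has_baked_vae(keys) -> bool:
    """Per the Diffusers conversion code: a baked VAE stores its weights
    under the 'first_stage_model.' prefix in the checkpoint state dict."""
    return any(k.startswith("first_stage_model.") for k in keys)


def diffusers_vae_path(model_path: str) -> str:
    """Location of the VAE weights in a diffusers-format folder (per the gist)."""
    return osp.join(model_path, "vae", "diffusion_pytorch_model.safetensors")


def select_vae(baked: bool, is_sdxl: bool, hidiffusion: bool):
    """Return None to keep the baked VAE, else a hub repo id to load instead."""
    if baked:
        return None                             # step 2: use the baked VAE
    if is_sdxl:
        return "madebyollin/sdxl-vae-fp16-fix"  # step 3: SDXL fallback
    if hidiffusion:
        return "stabilityai/sd-vae-ft-mse"      # step 4: SD15 default VAE
    return None
```

The probe operates on the state-dict keys (as in the quoted Diffusers snippet), so it works the same way whether the keys come from safetensors or a pickled checkpoint.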
@R3gm R3gm added the enhancement New feature or request label Dec 6, 2024
@iwr-redmond
Author

iwr-redmond commented Dec 18, 2024

With the addition of your latest memory management commit, this code should be good enough for government work:

# HiDiffusion
if HiDiffusion and self.class_name != FLUX:
    # enable HiDiffusion
    logger.info("HiDiffusion active")
    self.pipe.enable_vae_tiling()
    apply_hidiffusion(self.pipe)
    self.HiDiffusion = True
    # warn about incompatible options
    if adetailer_A or adetailer_B or upscaler_model_path or hires_steps > 0 or face_restoration_model:
        logger.warning("Disabling incompatible generation options!")
    # always disable incompatible options just in case
    adetailer_A = False
    adetailer_A_params = None
    adetailer_B = False
    adetailer_B_params = None
    upscaler_model_path = None
    hires_steps = 0 # merely disabling the model path is insufficient
    face_restoration_model = None
elif self.HiDiffusion:
    remove_hidiffusion(self.pipe)
    self.HiDiffusion = False

Borrowing a prompt from Civitai:

checkpoint = "realisticVisionV60B1_v60B1VAE.safetensors"
positive_prompt = "photo of old village, evening, clouds"
negative_prompt = "(deformed iris, deformed pupils, semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime), text, cropped, out of frame, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck"

image, info_image = model(
    prompt=positive_prompt, 
    negative_prompt=negative_prompt, 
    num_steps=30, sampler="DPM++ SDE", 
    schedule_type="Karras", 
    img_width=2048, 
    img_height=2048, 
    FreeU=True, 
    HiDiffusion=True, # newly added parameter
)

(attached sample image: 00005_realisticVisionV60B1_v60B1VAE_386572327)

The prompt needs work! But the technology appears fine, with good memory efficiency (~6GB for a 4.4GB checkpoint) and acceptable speed (119s for 30 steps, shown as "4.00s/it"). Not necessarily great for generation, but I would wager it is great for high-resolution inpainting.

@John6666cat

I think it would be interesting if HiDiffusion could be used with stablepy.

But prioritizing Diffusers' baked VAE with that code might not be very reliable; the current implementation is more reliable.
Diffusers' from_single_file and save_pretrained do not check the contents of the actual VAE model, only whether or not it exists. Even if the model author has not explicitly baked a VAE using A1111 WebUI or something similar, a VAE folder will still be output during conversion.
And even if the VAE was broken during merging, as long as the tensor keys are still there, the file will still exist... Well, that's correct behavior for a library.
Since the official HF conversion space uses exactly the same implementation,😅 there should be quite a few model repos like this.
As with the Civitai models, it's probably best to assume that we can't trust the VAE in Diffusers models on HF either. If a repo has been adjusted to work with HF's Serverless Inference API, it's usually fine, but there's no reliable way to tell.

@iwr-redmond
Author

I reckon that HiDiffusion would have to be tested with specific models for compatibility, just as models and schedulers need to be tested together. As it doesn't work with Flux, HiDiffusion isn't a viable replacement for the standard Hires Fix and upscaling pipelines - even if that were a good idea, which I doubt. That means it will only ever be an optional extra. And if the integration were hard, I wouldn't have made this feature request (my code is only ever good enough for government work, so when it does work it's tinfoil hat time, because anything easier would be a setup). But it is straightforward, which means that if it fits within @R3gm's vision for the project it won't take a moment to add.

FWIW, HiDiffusion has been integrated into SD.Next (also diffusers-based) since June and doesn't appear to have caused much trouble, although it was sidelined in the UI recently and therefore may have been underutilized. For stablepy, if there are specific configurations that are likely to put a big hole in the fence, these could be warned about and/or blocked. The sample code above already disables other high-resolution options to prevent doubling up, and more of this type of scaffolding could be added. For example, stablepy could fall back to a known working VAE when HiDiffusion=True and require the user to override that default. If I understand correctly, that was the default SDXL behavior in general until 44942d7.
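A minimal sketch of that overridable default, assuming hypothetical parameter names (vae_model, hidiffusion) rather than stablepy's real signature:

```python
# Hypothetical default: the repo id below is the diffusers-format SD15 VAE
# referenced earlier in this thread, not a value stablepy actually ships.
DEFAULT_SD15_VAE = "stabilityai/sd-vae-ft-mse"


def resolve_vae(vae_model, hidiffusion: bool):
    """Keep an explicit user choice; otherwise default when HiDiffusion is on."""
    if vae_model is not None:
        return vae_model           # user override always wins
    if hidiffusion:
        return DEFAULT_SD15_VAE    # known-working default for HiDiffusion
    return None                    # leave the pipeline's own VAE in place
```

The point is only the precedence order: an explicit VAE beats the HiDiffusion default, which beats doing nothing.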
