
Comments (10)

waffletower commented on August 18, 2024

Good news -- I was able to render a 1536x1024 image via Diffusers and the IF-I-XL -> IF-II-L -> 4x scaler pipeline configuration. This was done on an RTX 3090 -- memory usage just squeaked by at 23031 MiB during the final scaling phase. I needed to make the following simple change to diffusers:

waffletower/diffusers@035b010

and provide correct dimension values (width in my case) for the first two stages:

width=96
width=384

respectively.

The pipeline configuration:

import sys
from diffusers import DiffusionPipeline, IFPipeline, IFSuperResolutionPipeline
from diffusers.utils import pt_to_pil
import torch
import numpy as np

# stage 1
stage_1 = IFPipeline.from_pretrained("DeepFloyd/IF-I-XL-v1.0", variant="fp16", torch_dtype=torch.float16)
stage_1.enable_model_cpu_offload()

# stage 2
stage_2 = IFSuperResolutionPipeline.from_pretrained("DeepFloyd/IF-II-L-v1.0", text_encoder=None, variant="fp16",
                                            torch_dtype=torch.float16)
stage_2.enable_model_cpu_offload()

# stage 3
safety_modules = {"feature_extractor": stage_1.feature_extractor,
                  "safety_checker": stage_1.safety_checker,
                  "watermarker": stage_1.watermarker}

stage_3 = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-x4-upscaler", **safety_modules,
                                            torch_dtype=torch.float16)
stage_3.enable_model_cpu_offload()

And the invocation:

prompt = 'Jennifer Aniston throwing her shoe at Tucker Carlson'

# text embeds
prompt_embeds, negative_embeds = stage_1.encode_prompt(prompt)

base_seed = np.random.randint(0, sys.maxsize)
for x in range(1):
    generator = torch.manual_seed(base_seed + x)

    image = stage_1(prompt_embeds=prompt_embeds,
                    negative_prompt_embeds=negative_embeds,
                    generator=generator,
                    output_type="pt",
                    width=96).images
    pt_to_pil(image)[0].save("./if_stage_I.png")

    image = stage_2(image=image,
                    prompt_embeds=prompt_embeds,
                    negative_prompt_embeds=negative_embeds,
                    generator=generator,
                    output_type="pt",
                    width=384).images
    pt_to_pil(image)[0].save("./if_stage_II.png")

    image = stage_3(prompt=prompt,
                    image=image,
                    generator=generator,
                    noise_level=100).images
    image[0].save(f"{base_seed + x}.png")

I think an aspect ratio argument is preferable, but that can easily be built in the calling code, which can also coordinate the dimension differences between the pipeline stages.
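
For illustration, here is a minimal sketch (not part of my run above, just one way such a helper could look) that derives the per-stage width values from an aspect ratio, assuming the default 64-pixel stage I height and the 4x upscale of stage II:

# Minimal sketch: derive per-stage widths from an aspect ratio.
# Assumes the default 64 px stage I height and a 4x upscale in stage II.
def stage_widths(aspect_w: int, aspect_h: int, base_height: int = 64):
    stage_1_width = base_height * aspect_w // aspect_h  # 3:2 -> 96
    stage_2_width = stage_1_width * 4                   # IF-II upscales 4x -> 384
    return stage_1_width, stage_2_width

print(stage_widths(3, 2))  # (96, 384), the values used above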


Bigfield77 commented on August 18, 2024

Hello,
I was able to replicate the change in aspect ratio for stage_1, but stage_2 complains about an unknown argument width:

IFSuperResolutionPipeline.__call__() got an unexpected keyword argument 'width'

I have deepfloyd if 1.0.1.

If I don't specify width for stage_2, I get an image with the correct aspect ratio from stage 1, but stage 2 squishes everything into a 256x256 image.

Edit
My bad, I just noticed the link to the modification in src/diffusers/pipelines/deepfloyd_if/pipeline_if_superresolution.py

will give this a try!

Edit 2
Works fine! :)

cheers!


tildebyte commented on August 18, 2024

This isn't available when using 🤗Diffusers pipelines; you have to run it as indicated in Run The Code Locally. Notice the args to dream() in the "I. Dream" section - you can add e.g. aspect_ratio='3:2' there.

IIUC, there's no way to specify custom resolutions per se - the resolutions are hard-coded (maybe?) - but the aspect ratio can be varied.
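
For reference, a minimal sketch of that local route, assuming aspect_ratio is accepted directly by dream() as described above (the prompt is just a placeholder and the memory requirements are untested):

from deepfloyd_if.modules import IFStageI, IFStageII, StableStageIII
from deepfloyd_if.modules.t5 import T5Embedder
from deepfloyd_if.pipelines import dream

device = 'cuda:0'
if_I = IFStageI('IF-I-XL-v1.0', device=device)
if_II = IFStageII('IF-II-L-v1.0', device=device)
if_III = StableStageIII('stable-diffusion-x4-upscaler', device=device)
t5 = T5Embedder(device='cpu')

# Assumption: aspect_ratio is a keyword argument of dream() itself.
result = dream(
    t5=t5, if_I=if_I, if_II=if_II, if_III=if_III,
    prompt=['a watercolor painting of a lighthouse at dawn'],
    seed=42,
    aspect_ratio='3:2',
)
if_III.show(result['III'], size=14)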


Gitterman69 commented on August 18, 2024

I tried to generate a custom aspect ratio with my code below... stage 3 gets me OOM even though I'm using a 3090... maybe you guys can try to run it as well? It would be great to find out how exactly to get the custom aspect ratio running.

from deepfloyd_if.modules import IFStageI, IFStageII, StableStageIII
from deepfloyd_if.modules.t5 import T5Embedder
from deepfloyd_if.pipelines import dream

device = 'cuda:0'

print("Starting IF Stage I...")
if_I = IFStageI('IF-I-XL-v1.0', device=device)
print("IF Stage I completed.")

print("Starting IF Stage II...")
if_II = IFStageII('IF-II-L-v1.0', device=device)
print("IF Stage II completed.")

print("Starting Stable Stage III...")
if_III = StableStageIII('stable-diffusion-x4-upscaler', device=device)
print("Stable Stage III completed.")

print("Initializing T5 Embedder...")
t5 = T5Embedder(device="cpu")
print("T5 Embedder initialized.")

prompt = 'ultra close-up color photo portrait of rainbow owl with deer horns in the woods'
count = 1

print("Starting dream pipeline...")
result = dream(
    t5=t5, if_I=if_I, if_II=if_II, if_III=if_III,
    prompt=[prompt]*count,
    seed=42,
    if_I_kwargs={
        "guidance_scale": 7.0,
        "sample_timestep_respacing": "smart100",
        "aspect_ratio": "3:2",
    },
    if_II_kwargs={
        "guidance_scale": 4.0,
        "sample_timestep_respacing": "smart50",
        #"aspect_ratio": "3:2",
    },
    if_III_kwargs={
        "guidance_scale": 9.0,
        "noise_level": 20,
        "sample_timestep_respacing": "75",
        #"aspect_ratio": "3:2",
    },
)
print("Dream pipeline completed.")

if_III.show(result['III'], size=14)


tildebyte commented on August 18, 2024

#66 claims that it's possible to run inference with IF using only 6G VRAM, but I have not tested it myself


waffletower commented on August 18, 2024

I haven't tried the dream pipeline yet. I have been able to provide a width argument to a stage I pipeline (using both the base DiffusionPipeline and the IFPipeline classes) successfully, but haven't succeeded for subsequent stage pipelines.


waffletower commented on August 18, 2024

I tried to generate a custom aspect ratio with my code below... stage 3 gets me OOM even though I'm using a 3090... maybe you guys can try to run it as well? It would be great to find out how exactly to get the custom aspect ratio running.

I also have a 3090 and have been testing with an identical model configuration, albeit using the DiffusionPipeline APIs. With the default resolution, the maximum GPU memory utilized during processing is over 20 GB, so it seems very possible that an aspect ratio change (particularly 3:2) would put it over the 24 GB available on the 3090. You can try 4:3, 5:4, etc. and see if that squeaks by.
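
As a rough back-of-the-envelope check (assuming peak memory scales roughly with pixel count, which is only an approximation, not a measurement):

# Relative pixel counts at a fixed height for a few aspect ratios,
# as a crude proxy for memory use.
ratios = {"1:1": (1, 1), "5:4": (5, 4), "4:3": (4, 3), "3:2": (3, 2)}
for name, (w, h) in ratios.items():
    print(f"{name}: {w / h:.2f}x the pixels of a square image at the same height")
# 3:2 -> 1.50x, 4:3 -> 1.33x, 5:4 -> 1.25x, so 4:3 or 5:4 may fit where 3:2 OOMs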


waffletower commented on August 18, 2024

#66 claims that it's possible to run inference with IF using only 6G VRAM, but I have not tested it myself

You might be able to run a 2-stage pipeline in 6 GB using the IF-I-M and IF-II-M models. The poster is using IF-I-XL and IF-II-L plus a third scaling stage instead; they could certainly try again with smaller models.
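
A minimal sketch of what such a 2-stage Diffusers configuration could look like, assuming the "M" checkpoints DeepFloyd/IF-I-M-v1.0 and DeepFloyd/IF-II-M-v1.0 (memory use untested):

import torch
from diffusers import IFPipeline, IFSuperResolutionPipeline
from diffusers.utils import pt_to_pil

stage_1 = IFPipeline.from_pretrained("DeepFloyd/IF-I-M-v1.0", variant="fp16", torch_dtype=torch.float16)
stage_1.enable_model_cpu_offload()

stage_2 = IFSuperResolutionPipeline.from_pretrained("DeepFloyd/IF-II-M-v1.0", text_encoder=None,
                                                    variant="fp16", torch_dtype=torch.float16)
stage_2.enable_model_cpu_offload()

prompt_embeds, negative_embeds = stage_1.encode_prompt("a photo of a lighthouse at dawn")
generator = torch.manual_seed(0)

image = stage_1(prompt_embeds=prompt_embeds, negative_prompt_embeds=negative_embeds,
                generator=generator, output_type="pt").images
image = stage_2(image=image, prompt_embeds=prompt_embeds, negative_prompt_embeds=negative_embeds,
                generator=generator, output_type="pt").images
pt_to_pil(image)[0].save("two_stage.png")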


Gitterman69 commented on August 18, 2024

What custom resolutions are currently supported? 1920x1024 works whereas 1920x1080 doesn't... super strange? Any ideas???


Bigfield77 commented on August 18, 2024

I only do the first 2 stages as the SD upscaler doesn't work on my install right now.

For the first stage, anything in the range of 80x80 pixels and above starts to generate strange images, so that would be something like 1280x1280 after the 4x * 4x upscaling.
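
For what it's worth, a rough way to reason about which target resolutions map back onto an integer stage I canvas, assuming each of the two upscaling stages is exactly 4x (16x overall):

# Sketch: map a target resolution back to the stage I size, assuming a
# 4x * 4x (16x total) upscale across the later stages.
def stage_1_size(target_w, target_h, total_scale=16):
    if target_w % total_scale or target_h % total_scale:
        return None  # not reachable from an integer stage I canvas
    return target_w // total_scale, target_h // total_scale

print(stage_1_size(1920, 1024))  # (120, 64) -> works
print(stage_1_size(1920, 1080))  # None, since 1080 / 16 = 67.5 -> fails
print(stage_1_size(1280, 1280))  # (80, 80) -> the edge mentioned above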

