Comments (2)

bghira commented on June 3, 2024

hmm so 86 isn't divisible by 8.

if i adjust the script like so:

from diffusers import DiffusionPipeline, IFSuperResolutionPipeline
import torch
from PIL import Image
import numpy as np

torch.manual_seed(42)

# Configuration for initial image and desired output
initial_width = 86  # One-fourth of 344 (not divisible by 8)
initial_height = 64  # One-fourth of 256

# Round initial_width up to the next multiple of 8 (86 -> 88)
initial_width = int(np.ceil(initial_width / 8) * 8)
print(f"Resolution: {initial_width}x{initial_height}")
# Pick the first available backend; torch.xpu is only present in newer PyTorch builds
if torch.cuda.is_available():
    torch_device = "cuda"
elif torch.backends.mps.is_available():
    torch_device = "mps"
elif hasattr(torch, "xpu") and torch.xpu.is_available():
    torch_device = "xpu"
else:
    torch_device = "cpu"

# Create a dummy image (88x64 after rounding the width up)
dummy_image = torch.rand((3, initial_height, initial_width), dtype=torch.float32)  # Random noise image
dummy_image = (dummy_image * 255).to(torch.uint8)  # Convert to 8-bit format
dummy_pil_image = Image.fromarray(dummy_image.numpy().transpose(1, 2, 0))  # Convert to PIL image for compatibility
dummy_pil_image.save("dummy_input.png")  # Save the initial dummy image

# Load your stage 2 pipeline
print(f"Image resolution: {dummy_pil_image.size}")
stage2_pipe = IFSuperResolutionPipeline.from_pretrained("DeepFloyd/IF-II-M-v1.0", watermarker=None, safety_checker=None, local_files_only=False).to(device=torch_device, dtype=torch.bfloat16)

# Upscale the dummy image using stage 2 of the pipeline
upscaled_image = stage2_pipe(
    prompt="A simple upscaled image", 
    image=dummy_pil_image, 
    guidance_scale=5.5, 
    num_inference_steps=20, 
    width=initial_width * 4, 
    height=initial_height * 4
).images[0]

upscaled_image.save("upscaled_dummy_output.png")

there is no crash
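
for reference, the rounding step on its own could be sketched like this. `round_up_to_multiple` is a hypothetical helper name used only for illustration, not something from diffusers, and the multiple of 8 is taken from the observation above rather than from any documented requirement:

```python
def round_up_to_multiple(value: int, multiple: int = 8) -> int:
    """Round value up to the nearest multiple (hypothetical helper, not a diffusers API)."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    # Ceiling division with integers only, so no float round-trip is involved.
    return -(-value // multiple) * multiple


initial_width = round_up_to_multiple(86)   # 86 is not divisible by 8, so this becomes 88
initial_height = round_up_to_multiple(64)  # 64 is already divisible by 8 and is unchanged
print(f"Resolution: {initial_width}x{initial_height}")
```

this gives the same result as the `np.ceil` line in the script, without needing numpy for the rounding.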

bghira commented on June 3, 2024

note: i understand deepfloyd is not often used by commercial outfits due to its restrictive license, but it apparently has research value. i ran into this while researching deepfloyd's characteristics with the T5 text encoder, which is worth exploring now that there are more models available to compare against. this PR is an effort to improve the experience for research use of these weights.
