This project is forked from fboulnois/stable-diffusion-docker.

Run the official Stable Diffusion releases in a Docker container with txt2img, img2img, upscale4x, and inpaint.

License: GNU Affero General Public License v3.0

Stable Diffusion in Docker

Run the official Stable Diffusion releases on Huggingface in a GPU-accelerated Docker container.

./build.sh run 'An impressionist painting of a parakeet eating spaghetti in the desert'

An impressionist painting of a parakeet eating spaghetti in the desert 1 An impressionist painting of a parakeet eating spaghetti in the desert 2

./build.sh run --image parakeet_eating_spaghetti.png --strength 0.6 'Abstract art'

Abstract art 1 Abstract art 2

Before you start

By default, the pipeline uses the full model and weights, which requires a CUDA-capable GPU with 8GB+ of VRAM. Creating one image should take a few seconds. On less powerful GPUs you may need to modify some of the options; see the Examples section for more details. If you lack a suitable GPU, you can use the option --device cpu instead. If you are using Docker Desktop and the container is terminated, you may need to give Docker more resources by increasing the CPU, memory, and swap limits under Settings -> Resources.

Because the pipeline uses the official model, you will need a user access token from your Huggingface account. Save the user access token in a file called token.txt and make sure the file is available when building the container.
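For example (the hf_YOUR_TOKEN_HERE value below is a placeholder, not a real token):

```shell
# Write your Hugging Face access token to token.txt
# (replace the placeholder with your real hf_ token)
printf '%s\n' 'hf_YOUR_TOKEN_HERE' > token.txt

# Restrict permissions so only you can read the token
chmod 600 token.txt
```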

Quickstart

The pipeline is managed using a single build.sh script. You must build the image before it can be run.

Build

Make sure your user access token is saved in a file called token.txt. The token content should begin with hf_...

To build:

./build.sh build  # or just ./build.sh

Run

Text-to-Image (txt2img)

To run:

./build.sh run 'Andromeda galaxy in a bottle'

Image-to-Image (img2img)

First, copy an image to the input folder. Next, to run:

./build.sh run --image image.png 'Andromeda galaxy in a bottle'

Image Upscaling (upscale4x)

First, copy an image to the input folder. Next, to run:

./build.sh run --model 'stabilityai/stable-diffusion-x4-upscaler' \
  --image image.png 'A detailed description of the image'

Diffusion Inpainting (inpaint)

First, copy an image and an image mask to the input folder. White areas of the mask will be diffused; black areas will be left untouched. Next, to run:

./build.sh run --model 'runwayml/stable-diffusion-inpainting' \
  --image image.png --mask mask.png 'Andromeda galaxy in a bottle'

Options

Some of the options from txt2img.py are implemented for compatibility:

  • --prompt [PROMPT]: the prompt to render into an image
  • --n_samples [N_SAMPLES]: number of images to create per run (default 1)
  • --n_iter [N_ITER]: number of times to run the pipeline (default 1)
  • --H [H]: image height in pixels (default 512, must be divisible by 64)
  • --W [W]: image width in pixels (default 512, must be divisible by 64)
  • --scale [SCALE]: unconditional guidance scale (default 7.5)
  • --seed [SEED]: RNG seed for repeatability (default is a random seed)
  • --ddim_steps [DDIM_STEPS]: number of sampling steps (default 50)
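As a standalone illustration of the constraint on --H and --W (this sketch is not part of build.sh), a pair of dimensions can be validated before passing them to the pipeline:

```shell
# Standalone sketch: check that proposed dimensions satisfy the
# divisible-by-64 requirement before using them with --W and --H
W=320
H=256
if [ $((W % 64)) -eq 0 ] && [ $((H % 64)) -eq 0 ]; then
  echo "ok: ${W}x${H}"
else
  echo "error: W and H must be divisible by 64" >&2
  exit 1
fi
```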

Other options:

  • --attention-slicing: use less memory at the expense of inference speed (default is no attention slicing)
  • --device [DEVICE]: the cpu or cuda device to use to render images (default cuda)
  • --half: use float16 tensors instead of float32 (default float32)
  • --image [IMAGE]: the input image to use for image-to-image diffusion (default None)
  • --mask [MASK]: the input mask to use for diffusion inpainting (default None)
  • --model [MODEL]: the model used to render images (default is CompVis/stable-diffusion-v1-4)
  • --negative-prompt [NEGATIVE_PROMPT]: the prompt to not render into an image (default None)
  • --scheduler [SCHEDULER]: override the scheduler used to denoise the image (default None)
  • --skip: skip the safety checker (default is that the safety checker is on)
  • --strength [STRENGTH]: diffusion strength to apply to the input image (default 0.75)
  • --token [TOKEN]: specify a Huggingface user access token at the command line instead of reading it from a file (default is a file)
  • --xformers-memory-efficient-attention: use less memory, but requires the xformers library (default is that xformers is not required)

Examples

These two commands are equivalent:

./build.sh run 'abstract art'
./build.sh run --prompt 'abstract art'

Set the seed to 42:

./build.sh run --seed 42 'abstract art'
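A common workflow (a sketch, not a feature of build.sh itself) is to pick a random seed yourself and record it, so that a good result can be reproduced later with --seed:

```shell
# Pick a random seed and print it so the run can be reproduced later
seed=$RANDOM
echo "seed=$seed"
# Reproduce later with: ./build.sh run --seed "$seed" 'abstract art'
```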

Options can be combined:

./build.sh run --scale 7.0 --seed 42 'abstract art'

On systems with <8GB of GPU RAM, you can try mixing and matching options:

  • Make images smaller than 512x512 using --W and --H to decrease memory use and increase image creation speed
  • Use --half to decrease memory use but slightly decrease image quality
  • Use --attention-slicing to decrease memory use but also decrease image creation speed
  • Use --xformers-memory-efficient-attention to decrease memory use if the pipeline and the hardware support the option
  • Decrease the number of samples and increase the number of iterations with --n_samples and --n_iter to decrease overall memory use
  • Skip the safety checker with --skip to run less code

For example, combining all of these options:

./build.sh run --W 256 --H 256 --half \
  --attention-slicing --xformers-memory-efficient-attention \
  --n_samples 1 --n_iter 1 --skip --prompt 'abstract art'

On Windows, if you aren't using WSL2 and instead use MSYS, MinGW, or Git Bash, prefix your commands with MSYS_NO_PATHCONV=1 (or export it beforehand):

MSYS_NO_PATHCONV=1 ./build.sh run --half --prompt 'abstract art'
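Alternatively, export the variable once per shell session instead of prefixing every command:

```shell
# Export once; subsequent ./build.sh invocations in this shell inherit it
export MSYS_NO_PATHCONV=1
```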

Outputs

Model

The model and other files are cached in a volume called huggingface. Models are stored in <volume>/diffusers/<model>/snapshots/<githash>/unet/<weights>. Checkpoint files (ckpts) are unofficial conversions of the official models and are not part of the official release.

Images

The images are saved as PNGs in the output folder using the prompt text. The build.sh script creates and mounts this folder as a volume in the container.
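As an illustration only (the exact naming scheme is defined by the project's Python code, so this mapping is an assumption), a prompt-derived filename might look like:

```shell
# Hypothetical sketch of a prompt-to-filename mapping; the real
# scheme is implemented by the project's Python code
prompt='Andromeda galaxy in a bottle'
fname="$(printf '%s' "$prompt" | tr ' ' '_').png"
echo "$fname"   # Andromeda_galaxy_in_a_bottle.png
```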
