
stable-diffusion's Introduction

Development repository. Please see CompVis/stable-diffusion for the Stable Diffusion release.


Latent Diffusion Models

arXiv | BibTeX

High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach*, Andreas Blattmann*, Dominik Lorenz, Patrick Esser, Björn Ommer
* equal contribution

News

April 2022

Requirements

A suitable conda environment named ldm can be created and activated with:

conda env create -f environment.yaml
conda activate ldm
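
Once the environment is active, a quick sanity check (not part of the repository; it simply assumes you intend to run the scripts on a GPU) is to confirm that PyTorch can see your device:

import torch

# print the installed torch version and whether a CUDA device is visible,
# before launching any of the sampling or training scripts below
print(torch.__version__, torch.cuda.is_available())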

Pretrained Models

A general list of all available checkpoints is available via our model zoo. If you use any of these models in your work, we are always happy to receive a citation.

Text-to-Image

text2img-figure

Download the pre-trained weights (5.7GB)

mkdir -p models/ldm/text2img-large/
wget -O models/ldm/text2img-large/model.ckpt https://ommer-lab.com/files/latent-diffusion/nitro/txt2img-f8-large/model.ckpt

and sample with

python scripts/txt2img.py --prompt "a virus monster is playing guitar, oil on canvas" --ddim_eta 0.0 --n_samples 4 --n_iter 4 --scale 5.0  --ddim_steps 50

This will save each sample individually as well as a grid of size n_iter x n_samples at the specified output location (default: outputs/txt2img-samples). Quality, sampling speed and diversity are best controlled via the scale, ddim_steps and ddim_eta arguments. As a rule of thumb, higher values of scale produce better samples at the cost of a reduced output diversity.
Furthermore, increasing ddim_steps generally also gives higher quality samples, but returns are diminishing for values > 250. Fast sampling (i.e. low values of ddim_steps) while retaining good quality can be achieved by using --ddim_eta 0.0.
Faster sampling (i.e. even lower values of ddim_steps) while retaining good quality can be achieved by using --ddim_eta 0.0 and --plms (see Pseudo Numerical Methods for Diffusion Models on Manifolds).
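
To compare these trade-offs systematically, a small driver script can sweep scale and ddim_steps for a fixed prompt. This is only a sketch (it is not part of the repository) and assumes the scripts/txt2img.py command line shown above, including an --outdir argument for choosing the output location:

import subprocess

# hypothetical sweep over guidance scale and DDIM steps for one prompt;
# each combination writes to its own output directory for side-by-side comparison
prompt = "a virus monster is playing guitar, oil on canvas"
for scale in (3.0, 5.0, 10.0):
    for steps in (50, 100, 250):
        subprocess.run(
            ["python", "scripts/txt2img.py",
             "--prompt", prompt,
             "--scale", str(scale),
             "--ddim_steps", str(steps),
             "--ddim_eta", "0.0",
             "--n_samples", "2",
             "--n_iter", "1",
             "--outdir", f"outputs/sweep/scale{scale}_steps{steps}"],
            check=True,
        )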

Beyond 256²

For certain inputs, simply running the model in a convolutional fashion on larger features than it was trained on can sometimes produce interesting results. To try it out, tune the H and W arguments (which will be integer-divided by 8 in order to calculate the corresponding latent size), e.g. run

python scripts/txt2img.py --prompt "a sunset behind a mountain range, vector image" --ddim_eta 1.0 --n_samples 1 --n_iter 1 --H 384 --W 1024 --scale 5.0  

to create a sample of size 384x1024. Note, however, that controllability is reduced compared to the 256x256 setting.

The example below was generated using the above command. text2img-figure-conv
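
For reference, the relationship between pixel resolution and latent resolution is just an integer division by the downsampling factor, so a tiny helper (an illustration only, not code from the repository) can be used to check that the chosen H and W are valid:

def latent_hw(H, W, f=8):
    # scripts/txt2img.py integer-divides H and W by the first-stage
    # downsampling factor (8 for this model) to get the latent size,
    # so both should be divisible by it
    assert H % f == 0 and W % f == 0, "choose H and W divisible by the downsampling factor"
    return H // f, W // f

print(latent_hw(384, 1024))  # -> (48, 128)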

Inpainting

inpainting

Download the pre-trained weights

wget -O models/ldm/inpainting_big/last.ckpt https://heibox.uni-heidelberg.de/f/4d9ac7ea40c64582b7c9/?dl=1

and sample with

python scripts/inpaint.py --indir data/inpainting_examples/ --outdir outputs/inpainting_results

indir should contain images *.png and masks <image_fname>_mask.png like the examples provided in data/inpainting_examples.
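
To inpaint your own images, it is enough to add a mask file next to each image following that naming convention. The snippet below is not part of the repository and assumes, as in the provided examples, that white pixels mark the region to be filled in:

from pathlib import Path
from PIL import Image, ImageDraw

img_path = Path("data/my_inpainting/photo.png")   # hypothetical input image (the provided examples are 512x512)
img = Image.open(img_path)

mask = Image.new("L", img.size, 0)                 # black = keep
draw = ImageDraw.Draw(mask)
draw.rectangle([100, 100, 300, 300], fill=255)     # white = region to inpaint
mask.save(img_path.with_name(img_path.stem + "_mask.png"))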

Class-Conditional ImageNet

Available via a notebook. class-conditional

Unconditional Models

We also provide a script for sampling from unconditional LDMs (e.g. LSUN, FFHQ, ...). Start it via

CUDA_VISIBLE_DEVICES=<GPU_ID> python scripts/sample_diffusion.py -r models/ldm/<model_spec>/model.ckpt -l <logdir> -n <#samples> --batch_size <batch_size> -c <#ddim steps> -e <#eta>

Train your own LDMs

Data preparation

Faces

For downloading the CelebA-HQ and FFHQ datasets, proceed as described in the taming-transformers repository.

LSUN

The LSUN datasets can be conveniently downloaded via the script available here. We performed a custom split into training and validation images, and provide the corresponding filenames at https://ommer-lab.com/files/lsun.zip. After downloading, extract them to ./data/lsun. The beds/cats/churches subsets should also be placed/symlinked at ./data/lsun/bedrooms, ./data/lsun/cats and ./data/lsun/churches, respectively.
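
If the extracted subsets live elsewhere on disk, symlinking them into the expected layout is enough. A minimal sketch, where the source paths are placeholders for wherever your extracted LSUN subsets actually are:

from pathlib import Path

links = {
    "data/lsun/bedrooms": "/path/to/extracted/lsun/bedrooms",   # hypothetical source paths
    "data/lsun/cats": "/path/to/extracted/lsun/cats",
    "data/lsun/churches": "/path/to/extracted/lsun/churches",
}
for link, target in links.items():
    link_path = Path(link)
    link_path.parent.mkdir(parents=True, exist_ok=True)
    if not link_path.exists():
        link_path.symlink_to(Path(target).resolve(), target_is_directory=True)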

ImageNet

The code will try to download (through Academic Torrents) and prepare ImageNet the first time it is used. However, since ImageNet is quite large, this requires a lot of disk space and time. If you already have ImageNet on your disk, you can speed things up by putting the data into ${XDG_CACHE}/autoencoders/data/ILSVRC2012_{split}/data/ (which defaults to ~/.cache/autoencoders/data/ILSVRC2012_{split}/data/), where {split} is one of train/validation. It should have the following structure:

${XDG_CACHE}/autoencoders/data/ILSVRC2012_{split}/data/
β”œβ”€β”€ n01440764
β”‚   β”œβ”€β”€ n01440764_10026.JPEG
β”‚   β”œβ”€β”€ n01440764_10027.JPEG
β”‚   β”œβ”€β”€ ...
β”œβ”€β”€ n01443537
β”‚   β”œβ”€β”€ n01443537_10007.JPEG
β”‚   β”œβ”€β”€ n01443537_10014.JPEG
β”‚   β”œβ”€β”€ ...
β”œβ”€β”€ ...

If you haven't extracted the data, you can also place ILSVRC2012_img_train.tar / ILSVRC2012_img_val.tar (or symlinks to them) into ${XDG_CACHE}/autoencoders/data/ILSVRC2012_train/ / ${XDG_CACHE}/autoencoders/data/ILSVRC2012_validation/, which will then be extracted into the above structure without downloading it again. Note that this will only happen if neither a folder ${XDG_CACHE}/autoencoders/data/ILSVRC2012_{split}/data/ nor a file ${XDG_CACHE}/autoencoders/data/ILSVRC2012_{split}/.ready exists. Remove them if you want to force running the dataset preparation again.
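
To check where the preparation code will look on your machine, you can resolve the cache directory yourself. The sketch below is only an illustration and assumes that ${XDG_CACHE} refers to the XDG_CACHE_HOME environment variable, with ~/.cache as the fallback:

import os
from pathlib import Path

def ilsvrc_root(split):  # split is "train" or "validation"
    cache = Path(os.environ.get("XDG_CACHE_HOME", str(Path.home() / ".cache")))
    return cache / "autoencoders" / "data" / f"ILSVRC2012_{split}"

for split in ("train", "validation"):
    root = ilsvrc_root(split)
    print(split, "| data dir present:", (root / "data").is_dir(),
          "| .ready flag:", (root / ".ready").exists())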

Model Training

Logs and checkpoints for trained models are saved to logs/<START_DATE_AND_TIME>_<config_spec>.

Training autoencoder models

Configs for training a KL-regularized autoencoder on ImageNet are provided at configs/autoencoder. Training can be started by running

CUDA_VISIBLE_DEVICES=<GPU_ID> python main.py --base configs/autoencoder/<config_spec>.yaml -t --gpus 0,    

where config_spec is one of {autoencoder_kl_8x8x64(f=32, d=64), autoencoder_kl_16x16x16(f=16, d=16), autoencoder_kl_32x32x4(f=8, d=4), autoencoder_kl_64x64x3(f=4, d=3)}.

For training VQ-regularized models, see the taming-transformers repository.

Training LDMs

In configs/latent-diffusion/ we provide configs for training LDMs on the LSUN-, CelebA-HQ, FFHQ and ImageNet datasets. Training can be started by running

CUDA_VISIBLE_DEVICES=<GPU_ID> python main.py --base configs/latent-diffusion/<config_spec>.yaml -t --gpus 0,

where <config_spec> is one of {celebahq-ldm-vq-4(f=4, VQ-reg. autoencoder, spatial size 64x64x3),ffhq-ldm-vq-4(f=4, VQ-reg. autoencoder, spatial size 64x64x3), lsun_bedrooms-ldm-vq-4(f=4, VQ-reg. autoencoder, spatial size 64x64x3), lsun_churches-ldm-vq-4(f=8, KL-reg. autoencoder, spatial size 32x32x4),cin-ldm-vq-8(f=8, VQ-reg. autoencoder, spatial size 32x32x4)}.

Model Zoo

Pretrained Autoencoding Models

rec2

All models were trained until convergence (no further substantial improvement in rFID).

Model | rFID vs val | train steps | PSNR | PSIM | Link | Comments
f=4, VQ (Z=8192, d=3) | 0.58 | 533066 | 27.43 +/- 4.26 | 0.53 +/- 0.21 | https://ommer-lab.com/files/latent-diffusion/vq-f4.zip |
f=4, VQ (Z=8192, d=3) | 1.06 | 658131 | 25.21 +/- 4.17 | 0.72 +/- 0.26 | https://heibox.uni-heidelberg.de/f/9c6681f64bb94338a069/?dl=1 | no attention
f=8, VQ (Z=16384, d=4) | 1.14 | 971043 | 23.07 +/- 3.99 | 1.17 +/- 0.36 | https://ommer-lab.com/files/latent-diffusion/vq-f8.zip |
f=8, VQ (Z=256, d=4) | 1.49 | 1608649 | 22.35 +/- 3.81 | 1.26 +/- 0.37 | https://ommer-lab.com/files/latent-diffusion/vq-f8-n256.zip |
f=16, VQ (Z=16384, d=8) | 5.15 | 1101166 | 20.83 +/- 3.61 | 1.73 +/- 0.43 | https://heibox.uni-heidelberg.de/f/0e42b04e2e904890a9b6/?dl=1 |
f=4, KL | 0.27 | 176991 | 27.53 +/- 4.54 | 0.55 +/- 0.24 | https://ommer-lab.com/files/latent-diffusion/kl-f4.zip |
f=8, KL | 0.90 | 246803 | 24.19 +/- 4.19 | 1.02 +/- 0.35 | https://ommer-lab.com/files/latent-diffusion/kl-f8.zip |
f=16, KL (d=16) | 0.87 | 442998 | 24.08 +/- 4.22 | 1.07 +/- 0.36 | https://ommer-lab.com/files/latent-diffusion/kl-f16.zip |
f=32, KL (d=64) | 2.04 | 406763 | 22.27 +/- 3.93 | 1.41 +/- 0.40 | https://ommer-lab.com/files/latent-diffusion/kl-f32.zip |

Get the models

Running the following script downloads and extracts all available pretrained autoencoding models.

bash scripts/download_first_stages.sh

The first stage models can then be found in models/first_stage_models/<model_spec>
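
As a rough illustration (not code from the repository) of how such a first-stage model can be used on its own, the sketch below loads one of the KL-regularized autoencoders and reconstructs an image through its latent space. It assumes the kl-f8 folder name produced by the download script and the AutoencoderKL interface from ldm/models/autoencoder.py:

import numpy as np
import torch
from omegaconf import OmegaConf
from PIL import Image
from ldm.util import instantiate_from_config

# load config and weights of the downloaded first-stage model (assumed location)
config = OmegaConf.load("models/first_stage_models/kl-f8/config.yaml")
model = instantiate_from_config(config.model)
sd = torch.load("models/first_stage_models/kl-f8/model.ckpt", map_location="cpu")["state_dict"]
model.load_state_dict(sd, strict=False)
model.eval()

# load an image (hypothetical path) and scale it to [-1, 1], shape (1, 3, H, W)
img = Image.open("sample.png").convert("RGB").resize((256, 256))
x = torch.from_numpy(np.array(img)).float() / 127.5 - 1.0
x = x.permute(2, 0, 1).unsqueeze(0)

with torch.no_grad():
    posterior = model.encode(x)   # KL models return a diagonal Gaussian posterior
    z = posterior.sample()
    x_rec = model.decode(z)       # reconstruction, again in [-1, 1]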

Pretrained LDMs

Dataset | Task | Model | FID | IS | Prec | Recall | Link | Comments
CelebA-HQ | Unconditional Image Synthesis | LDM-VQ-4 (200 DDIM steps, eta=0) | 5.11 (5.11) | 3.29 | 0.72 | 0.49 | https://ommer-lab.com/files/latent-diffusion/celeba.zip |
FFHQ | Unconditional Image Synthesis | LDM-VQ-4 (200 DDIM steps, eta=1) | 4.98 (4.98) | 4.50 (4.50) | 0.73 | 0.50 | https://ommer-lab.com/files/latent-diffusion/ffhq.zip |
LSUN-Churches | Unconditional Image Synthesis | LDM-KL-8 (400 DDIM steps, eta=0) | 4.02 (4.02) | 2.72 | 0.64 | 0.52 | https://ommer-lab.com/files/latent-diffusion/lsun_churches.zip |
LSUN-Bedrooms | Unconditional Image Synthesis | LDM-VQ-4 (200 DDIM steps, eta=1) | 2.95 (3.0) | 2.22 (2.23) | 0.66 | 0.48 | https://ommer-lab.com/files/latent-diffusion/lsun_bedrooms.zip |
ImageNet | Class-conditional Image Synthesis | LDM-VQ-8 (200 DDIM steps, eta=1) | 7.77 (7.76)* / 15.82** | 201.56 (209.52)* / 78.82** | 0.84* / 0.65** | 0.35* / 0.63** | https://ommer-lab.com/files/latent-diffusion/cin.zip | *: w/ guiding, classifier_scale 10; **: w/o guiding; scores in brackets calculated with the evaluation script provided by ADM
Conceptual Captions | Text-conditional Image Synthesis | LDM-VQ-f4 (100 DDIM steps, eta=0) | 16.79 | 13.89 | N/A | N/A | https://ommer-lab.com/files/latent-diffusion/text2img.zip | finetuned from LAION
OpenImages | Super-resolution | LDM-VQ-4 | N/A | N/A | N/A | N/A | https://ommer-lab.com/files/latent-diffusion/sr_bsr.zip | BSR image degradation
OpenImages | Layout-to-Image Synthesis | LDM-VQ-4 (200 DDIM steps, eta=0) | 32.02 | 15.92 | N/A | N/A | https://ommer-lab.com/files/latent-diffusion/layout2img_model.zip |
Landscapes | Semantic Image Synthesis | LDM-VQ-4 | N/A | N/A | N/A | N/A | https://ommer-lab.com/files/latent-diffusion/semantic_synthesis256.zip |
Landscapes | Semantic Image Synthesis | LDM-VQ-4 | N/A | N/A | N/A | N/A | https://ommer-lab.com/files/latent-diffusion/semantic_synthesis.zip | finetuned on resolution 512x512

Get the models

The LDMs listed above can jointly be downloaded and extracted via

bash scripts/download_models.sh

The models can then be found in models/ldm/<model_spec>.
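
For a rough idea of how one of these checkpoints can be driven directly from Python (essentially what scripts/sample_diffusion.py does for the unconditional models), the sketch below samples from the FFHQ model. It is an illustration only and assumes the models/ldm/ffhq256/ layout with a config.yaml next to model.ckpt:

import torch
from omegaconf import OmegaConf
from ldm.util import instantiate_from_config
from ldm.models.diffusion.ddim import DDIMSampler

config = OmegaConf.load("models/ldm/ffhq256/config.yaml")           # assumed location
model = instantiate_from_config(config.model)
sd = torch.load("models/ldm/ffhq256/model.ckpt", map_location="cpu")["state_dict"]
model.load_state_dict(sd, strict=False)
model = model.cuda().eval()

sampler = DDIMSampler(model)
with torch.no_grad():
    # FFHQ uses an f=4 VQ autoencoder, i.e. a 3x64x64 latent for 256x256 images (see the table above)
    samples, _ = sampler.sample(S=200, batch_size=4, shape=[3, 64, 64], eta=1.0, verbose=False)
    imgs = model.decode_first_stage(samples)                        # images in [-1, 1]
    imgs = torch.clamp((imgs + 1.0) / 2.0, min=0.0, max=1.0)        # map to [0, 1] for saving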

Coming Soon...

Comments

BibTeX

@misc{rombach2021highresolution,
      title={High-Resolution Image Synthesis with Latent Diffusion Models}, 
      author={Robin Rombach and Andreas Blattmann and Dominik Lorenz and Patrick Esser and Björn Ommer},
      year={2021},
      eprint={2112.10752},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

stable-diffusion's People

Contributors

ak391, crowsonkb, pesser, rromb


stable-diffusion's Issues

CUDA runs out of memory with lots of memory reserved

I'm trying to run the text-to-image model with the example but CUDA keeps running out of memory, despite it barely trying to allocate anything. It's trying to allocate 20MB when there's 7.3GB reserved. Is there any way to fix this? I've searched all over but I couldn't find a clear answer.

Problems when text2img sampling

I am using this pretrained model for text2img sampling:
image

and I get a result like this:
image

When I use this pretrained model:
image

the result is normal, like this:
image

Is there any suggestion?

Broken dependency(?) on a fresh installation

Trying to set the repo up and get it working but got the below error
Win 11

(ldm) D:\github\stable-diffusion>python scripts/txt2img.py --prompt "a virus monster is playing guitar, oil on canvas" --ddim_eta 0.0 --n_samples 4 --n_iter 4 --scale 5.0 --ddim_steps 50
Traceback (most recent call last):
File "scripts/txt2img.py", line 11, in <module>
from pytorch_lightning import seed_everything
File "C:\Users\nikol\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\__init__.py", line 20, in <module>
from pytorch_lightning import metrics # noqa: E402
File "C:\Users\nikol\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\metrics\__init__.py", line 15, in <module>
from pytorch_lightning.metrics.classification import ( # noqa: F401
File "C:\Users\nikol\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\metrics\classification\__init__.py", line 14, in <module>
from pytorch_lightning.metrics.classification.accuracy import Accuracy # noqa: F401
File "C:\Users\nikol\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\metrics\classification\accuracy.py", line 18, in <module>
from pytorch_lightning.metrics.utils import deprecated_metrics, void
File "C:\Users\nikol\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\metrics\utils.py", line 22, in <module>
from torchmetrics.utilities.data import get_num_classes as _get_num_classes
ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data' (C:\Users\nikol\anaconda3\envs\ldm\lib\site-packages\torchmetrics\utilities\data.py)

Stable Diffusion Master Tutorials List - Including SDXL 0.9 - 43 Tutorials - Not An Issue Thread

Hello dear Patrick Esser, I hope you let this thread stay to help newcomers. This is not an issue thread. Thank you.


Expert-Level Tutorials on Stable Diffusion: Master Advanced Techniques and Strategies

Greetings everyone. I am Dr. Furkan Gözükara. I am an Assistant Professor in the Software Engineering department of a private university (I have a PhD in Computer Engineering). My professional programming skill is unfortunately C#, not Python :)

My linkedin : https://www.linkedin.com/in/furkangozukara

Our channel address if you like to subscribe : https://www.youtube.com/@SECourses

Our discord to get more help : https://discord.com/servers/software-engineering-courses-secourses-772774097734074388

I am keeping this list up-to-date. I got upcoming new awesome video ideas. Trying to find time to do that.

I am open to any criticism you have. I am constantly trying to improve the quality of my tutorial guide videos. Please leave comments with both your suggestions and what you would like to see in future videos.

All videos have manually fixed subtitles and properly prepared video chapters. You can watch with these perfect subtitles or look for the chapters you are interested in.

Since my profession is teaching, I usually do not skip any of the important parts. Therefore, you may find my videos a little bit longer.

Playlist link on YouTube: Stable Diffusion Tutorials, Automatic1111 Web UI & Google Colab Guides, DreamBooth, Textual Inversion / Embedding, LoRA, AI Upscaling, Video to Anime

1.) Automatic1111 Web UI - PC - Free

How To Install Python, Setup Virtual Environment VENV, Set Default Python System Path & Install Git

image

2.) Automatic1111 Web UI - PC - Free

Easiest Way to Install & Run Stable Diffusion Web UI on PC by Using Open Source Automatic Installer

image

3.) Automatic1111 Web UI - PC - Free

How to use Stable Diffusion V2.1 and Different Models in the Web UI - SD 1.5 vs 2.1 vs Anything V3

image

4.) Automatic1111 Web UI - PC - Free

Zero To Hero Stable Diffusion DreamBooth Tutorial By Using Automatic1111 Web UI - Ultra Detailed

image

5.) Automatic1111 Web UI - PC - Free

DreamBooth Got Buffed - 22 January Update - Much Better Success Train Stable Diffusion Models Web UI

image

6.) Automatic1111 Web UI - PC - Free

How to Inject Your Trained Subject e.g. Your Face Into Any Custom Stable Diffusion Model By Web UI

image

7.) Automatic1111 Web UI - PC - Free

How To Do Stable Diffusion LORA Training By Using Web UI On Different Models - Tested SD 1.5, SD 2.1

image

8.) Automatic1111 Web UI - PC - Free

8 GB LoRA Training - Fix CUDA & xformers For DreamBooth and Textual Inversion in Automatic1111 SD UI

image

9.) Automatic1111 Web UI - PC - Free

How To Do Stable Diffusion Textual Inversion (TI) / Text Embeddings By Automatic1111 Web UI Tutorial

image

10.) Automatic1111 Web UI - PC - Free

How To Generate Stunning Epic Text By Stable Diffusion AI - No Photoshop - For Free - Depth-To-Image

image

11.) Python Code - Hugging Face Diffusers Script - PC - Free

How to Run and Convert Stable Diffusion Diffusers (.bin Weights) & Dreambooth Models to CKPT File

image

12.) NMKD Stable Diffusion GUI - Open Source - PC - Free

Forget Photoshop - How To Transform Images With Text Prompts using InstructPix2Pix Model in NMKD GUI

image

13.) Google Colab Free - Cloud - No PC Is Required

Transform Your Selfie into a Stunning AI Avatar with Stable Diffusion - Better than Lensa for Free

image

14.) Google Colab Free - Cloud - No PC Is Required

Stable Diffusion Google Colab, Continue, Directory, Transfer, Clone, Custom Models, CKPT SafeTensors

image

15.) Automatic1111 Web UI - PC - Free

Become A Stable Diffusion Prompt Master By Using DAAM - Attention Heatmap For Each Used Token - Word

image

16.) Python Script - Gradio Based - ControlNet - PC - Free

Transform Your Sketches into Masterpieces with Stable Diffusion ControlNet AI - How To Use Tutorial

image

17.) Automatic1111 Web UI - PC - Free

Sketches into Epic Art with 1 Click: A Guide to Stable Diffusion ControlNet in Automatic1111 Web UI

image

18.) RunPod - Automatic1111 Web UI - Cloud - Paid - No PC Is Required

Ultimate RunPod Tutorial For Stable Diffusion - Automatic1111 - Data Transfers, Extensions, CivitAI

image

19.) RunPod - Automatic1111 Web UI - Cloud - Paid - No PC Is Required

How To Install DreamBooth & Automatic1111 On RunPod & Latest Libraries - 2x Speed Up - cudDNN - CUDA

image

20.) Automatic1111 Web UI - PC - Free

Fantastic New ControlNet OpenPose Editor Extension & Image Mixing - Stable Diffusion Web UI Tutorial

image

21.) Automatic1111 Web UI - PC - Free

Automatic1111 Stable Diffusion DreamBooth Guide: Optimal Classification Images Count Comparison Test

image

22.) Automatic1111 Web UI - PC - Free

Epic Web UI DreamBooth Update - New Best Settings - 10 Stable Diffusion Training Compared on RunPods

image

23.) Automatic1111 Web UI - PC - Free

New Style Transfer Extension, ControlNet of Automatic1111 Stable Diffusion T2I-Adapter Color Control

image

24.) Automatic1111 Web UI - PC - Free

Generate Text Arts & Fantastic Logos By Using ControlNet Stable Diffusion Web UI For Free Tutorial

image

25.) Automatic1111 Web UI - PC - Free

How To Install New DREAMBOOTH & Torch 2 On Automatic1111 Web UI PC For Epic Performance Gains Guide

image

26.) Automatic1111 Web UI - PC - Free

Training Midjourney Level Style And Yourself Into The SD 1.5 Model via DreamBooth Stable Diffusion

image

27.) Automatic1111 Web UI - PC - Free

Video To Anime - Generate An EPIC Animation From Your Phone Recording By Using Stable Diffusion AI

image

28.) Python Script - Jupyter Based - PC - Free

Midjourney Level NEW Open Source Kandinsky 2.1 Beats Stable Diffusion - Installation And Usage Guide

image

29.) Automatic1111 Web UI - PC - Free

RTX 3090 vs RTX 3060 Ultimate Showdown for Stable Diffusion, ML, AI & Video Rendering Performance

image

30.) Kohya Web UI - Automatic1111 Web UI - PC - Free

Generate Studio Quality Realistic Photos By Kohya LoRA Stable Diffusion Training - Full Tutorial

image

31.) Kaggle NoteBook - Free

DeepFloyd IF By Stability AI - Is It Stable Diffusion XL or Version 3? We Review and Show How To Use

image

32.) Python Script - Automatic1111 Web UI - PC - Free

How To Find Best Stable Diffusion Generated Images By Using DeepFace AI - DreamBooth / LoRA Training

image

33.) Kohya Web UI - RunPod - Paid

How To Install And Use Kohya LoRA GUI / Web UI on RunPod IO With Stable Diffusion & Automatic1111

image

34.) PC - Google Colab - Free

Mind-Blowing Deepfake Tutorial: Turn Anyone into Your Favorite Movie Star! PC & Google Colab - roop

image

35.) Automatic1111 Web UI - PC - Free

Stable Diffusion Now Has The Photoshop Generative Fill Feature With ControlNet Extension - Tutorial

image

36.) Automatic1111 Web UI - PC - Free

Human Cropping Script & 4K+ Resolution Class / Reg Images For Stable Diffusion DreamBooth / LoRA

image

37.) Automatic1111 Web UI - PC - Free

Stable Diffusion 2 NEW Image Post Processing Scripts And Best Class / Regularization Images Datasets

image

38.) Automatic1111 Web UI - PC - Free

How To Use Roop DeepFake On RunPod Step By Step Tutorial With Custom Made Auto Installer Script

image

39.) RunPod - Automatic1111 Web UI - Cloud - Paid - No PC Is Required

How To Install DreamBooth & Automatic1111 On RunPod & Latest Libraries - 2x Speed Up - cudDNN - CUDA

image

40.) Automatic1111 Web UI - PC - Free + RunPod

Zero to Hero ControlNet Tutorial: Stable Diffusion Web UI Extension | Complete Feature Guide

image

41.) Automatic1111 Web UI - PC - Free + RunPod

The END of Photography - Use AI to Make Your Own Studio Photos, FREE Via DreamBooth Training

image

42.) Google Colab - Gradio - Free

How To Use Stable Diffusion XL (SDXL 0.9) On Google Colab For Free

image

43.) Local - PC - Free - Gradio

Stable Diffusion XL (SDXL) Locally On Your PC - 8GB VRAM - Easy Tutorial With Automatic Installer

image

KeyError: 'image' in stable-diffusion/ldm/models/diffusion/ddpm.py"

Trying to run inpainting with the downloaded inpainting_big model. I changed the checkpoint and config path in the inpainting demo,
but this error appears:
(I think ddpm.py wants a 512x512 RGBA image and streamlit gives two 512x512 RGB images: one image and one mask. But I have no clue.)

2022-08-05 11:42:22.357 Uncaught app exception
Traceback (most recent call last):
  File "/usr/local/envs/ldm/lib/python3.8/site-packages/streamlit/scriptrunner/script_runner.py", line 557, in _run_script
    exec(code, module.__dict__)
  File "/content/stable-diffusion/scripts/demo/inpainting.py", line 194, in <module>
    samples = sample(
  File "/content/stable-diffusion/scripts/demo/inpainting.py", line 38, in sample
    z, c, x, xrec, xc = self.get_input(batch, self.first_stage_key, bs=N, return_first_stage_outputs=True)
  File "/usr/local/envs/ldm/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/content/stable-diffusion/ldm/models/diffusion/ddpm.py", line 718, in get_input
    x = super().get_input(batch, k)
  File "/content/stable-diffusion/ldm/models/diffusion/ddpm.py", line 383, in get_input
    x = batch[k]
KeyError: 'image'
###
(512, 512, 3)
(512, 512, 3)
###

this is the colab notebook:
https://colab.research.google.com/drive/1iglh0P7CxYtJEf4N5K68RhNr9CJMzYa_?usp=sharing

ModuleNotFoundError: No module named 'ldm'

After following the directions, the txt2img.py script itself doesn't seem to recognize the ldm environment we created with the .yaml, though it exists in .\anaconda3\envs. Do I perhaps have the wrong version of Python or something? (I'm using 3.10.) I don't see anything specified.

txt2img.py", line 15, in <module>
    from ldm.util import instantiate_from_config
ModuleNotFoundError: No module named 'ldm'

Also, I found that manually running 'pip install ldm' installs the wrong package, which then asks for ldm.utils if I go that route. Ref: CompVis/latent-diffusion#71, but that looks like it was using an online notebook.

Missing Parenthesis?

Got this error:
Traceback (most recent call last):
File "/content/latent-diffusion/scripts/txt2img.py", line 10, in
from ldm.util import instantiate_from_config
File "/usr/local/lib/python3.7/dist-packages/ldm.py", line 20
print self.face_rec_model_path
^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print(self.face_rec_model_path)?

Dockerfile?

Hi,
just asking ... are you planning to make a Dockerfile? Looks like people having problems making the stuff run

Upgrade to Lightning 1.7

Hey! Needless to say incredible work with Stable Diffusion and latent diffusion in general.

I saw Stable Diffusion is using an old-ish version of PyTorch Lightning (1.4.2). I'm wondering if you'd like help upgrading to Lightning 1.7; happy to provide it. The idea would be to create a test, ensure there's (at least) parity on results, and upgrade.

Here's a breakdown of what was released since 1.4.x just in case:

Extra note: Lightning 1.7 supports PyTorch 1.9+

About fine-tuning for inpainting

Huge thanks for your code contribution first!

I used your config file "v1-finetune-for-inpainting-laion-iaesthe.yaml" to fine-tune the model for text-conditioned inpainting. The dataset I used is this subset of the Laion dataset.

It turns out the results eventually degenerate to naive inpainting (simply filling in the missing region) and are no longer controlled by the text conditioning as training proceeds (as shown below; the text prompt is "a cat on the bench", but no cat appears). Maybe I am missing some tricks; I wonder, did you run into the same issue when you trained the model?

image

Thank you in advance :)

When I finetune the SD model and set trainer(precision=16), an error occurs

Traceback (most recent call last):
File "main.py", line 851, in
trainer.fit(model, data)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 553, in fit
self._run(model)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 918, in _run
self._dispatch()
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 986, in _dispatch
self.accelerator.start_training(self)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 92, in start_training
self.training_type_plugin.start_training(trainer)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 161, in start_training
self._results = trainer.run_stage()
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 996, in run_stage
return self._run_train()
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1045, in _run_train
self.fit_loop.run()
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 111, in run
self.advance(*args, **kwargs)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 200, in advance
epoch_output = self.epoch_loop.run(train_dataloader)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 111, in run
self.advance(*args, **kwargs)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 130, in advance
batch_output = self.batch_loop.run(batch, self.iteration_count, self._dataloader_idx)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 101, in run
super().run(batch, batch_idx, dataloader_idx)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/loops/base.py", line 111, in run
self.advance(*args, **kwargs)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 148, in advance
result = self._run_optimization(batch_idx, split_batch, opt_idx, optimizer)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 194, in _run_optimization
closure()
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 236, in _training_step_and_backward_closure
result = self.training_step_and_backward(split_batch, batch_idx, opt_idx, optimizer, hiddens)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 549, in training_step_and_backward
self.backward(result, optimizer, opt_idx)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 590, in backward
result.closure_loss = self.trainer.accelerator.backward(result.closure_loss, optimizer, *args, **kwargs)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 276, in backward
self.precision_plugin.backward(self.lightning_module, closure_loss, *args, **kwargs)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/plugins/precision/precision_plugin.py", line 78, in backward
model.backward(closure_loss, optimizer, *args, **kwargs)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/core/lightning.py", line 1481, in backward
loss.backward(*args, **kwargs)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/torch/_tensor.py", line 363, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/torch/autograd/init.py", line 173, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/torch/autograd/function.py", line 253, in apply
return user_fn(self, *args)
File "/root/data/juicefs_hz_cv_v3/11120102/project/generative-model/pesser-stable-diffusion/ldm/modules/diffusionmodules/util.py", line 138, in backward
output_tensors = ctx.run_function(*shallow_copies)
File "/root/data/juicefs_hz_cv_v3/11120102/project/generative-model/pesser-stable-diffusion/ldm/modules/attention.py", line 215, in _forward
x = self.attn1(self.norm1(x), context=context if self.disable_self_attn else None) + x
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/normalization.py", line 189, in forward
return F.layer_norm(
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/torch/nn/functional.py", line 2486, in layer_norm
return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: expected scalar type Half but found Float

inpaint_sd.py won't work out of the box

The instructions are pretty clear, yet it doesn't work out of the box:

  1. inpaint_sd.py has been placed in the root project folder to pick up the subfolders and structure
  2. inpaint_sd.py, line 109: default="models/ldm/inpainting_big/last.ckpt",
  3. inpaint_sd.py, line 122: config="configs/stable-diffusion/inpainting/v1-finetune-for-inpainting-laion-iaesthe.yaml"
  4. stable-diffusion\configs\stable-diffusion\inpainting\v1-finetune-for-inpainting-laion-iaesthe.yaml: line 18 changed to ckpt_path: "models/ldm/inpainting_big/last.ckpt"

So it looks like everything is hooked up right, yet when I run the script with:
python inpaint_sd.py --indir data/inpainting_examples/ --outdir outputs/inpainting_results

it gives me these errors:
\stable-diffusion\inpaint_sd.py", line 124, in <module> model = instantiate_from_config(config.model)
\stable-diffusion\ldm\util.py", line 79, in instantiate_from_config return get_obj_from_str(config["target"])(**config.get("params", dict()))
\stable-diffusion\ldm\models\diffusion\ddpm.py", line 1627, in __init__ self.init_from_ckpt(ckpt_path, ignore_keys)
\stable-diffusion\ldm\models\diffusion\ddpm.py", line 1648, in init_from_ckpt new_entry[:, :self.keep_dims, ...] = sd[k]

RuntimeError: The expanded size of the tensor (4) must match the existing size (7) at non-singleton dimension 1. Target sizes: [320, 4, 3, 3]. Tensor sizes: [256, 7, 3, 3]

So, what can I do to fix the issue? And why is it even happening? 🤔

ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data'

Getting the following error following the instructions

python scripts/txt2img.py --prompt "a virus monster is playing guitar, oil on canvas" --ddim_eta 0.0 --n_samples 4 --n_iter 4 --scale 5.0 --ddim_steps 50
Traceback (most recent call last):
File "scripts/txt2img.py", line 11, in <module>
from pytorch_lightning import seed_everything
File "usr\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\__init__.py", line 20, in <module>
from pytorch_lightning import metrics # noqa: E402
File "usr\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\metrics\__init__.py", line 15, in <module>
from pytorch_lightning.metrics.classification import ( # noqa: F401
File "usr\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\metrics\classification\__init__.py", line 14, in <module>
from pytorch_lightning.metrics.classification.accuracy import Accuracy # noqa: F401
File "usr\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\metrics\classification\accuracy.py", line 18, in <module>
from pytorch_lightning.metrics.utils import deprecated_metrics, void
File "usr\anaconda3\envs\ldm\lib\site-packages\pytorch_lightning\metrics\utils.py", line 22, in <module>
from torchmetrics.utilities.data import get_num_classes as _get_num_classes
ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data' (usr\anaconda3\envs\ldm\lib\site-packages\torchmetrics\utilities\data.py)

AttributeError: partially initialized module 'torch' has no attribute 'Tensor' (most likely due to a circular import)

I tried running the test command and got this error. I wouldn't be surprised if I screwed something up. I uninstalled and reinstalled torch and tensor to no avail.

H:\stable>python scripts/txt2img.py --prompt "a virus monster is playing guitar, oil on canvas" --ddim_eta 0.0 --n_samples 4 --n_iter 4 --scale 5.0  --ddim_steps 50
Traceback (most recent call last):
  File "H:\stable\scripts\txt2img.py", line 2, in <module>
    import torch
  File "E:\anaconda3\lib\site-packages\torch\__init__.py", line 255, in <module>
    from .random import set_rng_state, get_rng_state, manual_seed, initial_seed, seed
  File "E:\anaconda3\lib\site-packages\torch\random.py", line 9, in <module>
    def set_rng_state(new_state: torch.Tensor) -> None:
AttributeError: partially initialized module 'torch' has no attribute 'Tensor' (most likely due to a circular import)

Repeated inpainting leads to saturated pixels

Repeated inpainting leads to saturated pixels. Quick and dirty example:

import subprocess
import os
import numpy as np
from PIL import Image, ImageDraw
import shutil

directory = lambda x: "./Diffusion/Diffusion_{}/".format(x)

for i in range(240):
	if i!=0:
		if os.path.exists(directory(i)):
			shutil.rmtree(directory(i))
for i in range(240):
	im = Image.new('RGB', (512, 512), (0, 0, 0))
	draw = ImageDraw.Draw(im)

	x = np.random.randint(512-128)
	y = np.random.randint(512-128)

	draw.rectangle([(x,y),(x+128,y+128)], fill=(255, 255, 255))
	im.save('{}Diffusion_mask.png'.format(directory(i)))

	os.mkdir(directory(i+1))
	subprocess.run('python scripts/inpaint.py --steps 20 --indir {} --outdir {}'.format(directory(i),directory(i+1)), shell=True)
	
	im = Image.open('{}Diffusion.png'.format(directory(i+1)))
	
	# pixels = 2
	# im = im.crop((pixels, pixels, 512-pixels, 512-pixels))
	# im = im.resize((512,512), resample=Image.BICUBIC, box=None, reducing_gap=None)
	# im.save('{}Diffusion.png'.format(directory(i+1)))
	im.save('./DiffusionOut/{0:06d}.png'.format(i))
	if i!=0:
		shutil.rmtree(directory(i))

Add folders/files:
./Diffusion/Diffusion_0/Diffusion.png
./DiffusionOut/

In scripts/inpaint.py, changing

inpainted = inpainted.cpu().numpy().transpose(0,2,3,1)[0]*255

To

inpainted = np.round(inpainted.cpu().numpy().transpose(0,2,3,1)[0]*255)

Fixes the issue I think

Not sure why I can't launch SD through the batch web UI

Hi everyone.
I recently came across this weird error:
Not sure why I can't load SD through the user batch web UI.

I am running 1.5


44
45

I can only run SD when I double-click on launch.py or the Python webui script directly, but then I get a long cmd message

66

Upscaling task

Hello,

Thanks for this great work.

I'm wondering if you could provide instructions on how to perform the Upscaling task?

Thanks!

ModuleNotFoundError: No module named 'torchtext.legacy'

Everything installed well except I get two errors: one upon install and another when I try to run it. The install error is:
ERROR: File "setup.py" or "setup.cfg" not found. Directory cannot be installed in editable mode: /content
I'm trying to run this on a Colab server in standalone mode from the command line.
Thanks for any help!
Traceback (most recent call last):
File "stable-diffusion/scripts/txt2img.py", line 11, in <module>
from pytorch_lightning import seed_everything
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/__init__.py", line 20, in <module>
from pytorch_lightning import metrics # noqa: E402
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/metrics/__init__.py", line 15, in <module>
from pytorch_lightning.metrics.classification import ( # noqa: F401
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/metrics/classification/__init__.py", line 14, in <module>
from pytorch_lightning.metrics.classification.accuracy import Accuracy # noqa: F401
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/metrics/classification/accuracy.py", line 18, in <module>
from pytorch_lightning.metrics.utils import deprecated_metrics, void
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/metrics/utils.py", line 29, in <module>
from pytorch_lightning.utilities import rank_zero_deprecation
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/utilities/__init__.py", line 18, in <module>
from pytorch_lightning.utilities.apply_func import move_data_to_device # noqa: F401
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/utilities/apply_func.py", line 31, in <module>
from torchtext.legacy.data import Batch
ModuleNotFoundError: No module named 'torchtext.legacy'

cannot import name 'autocast' from 'torch'

Got the conda env, installs, downloads and everything all working smoothly now, no error messages but upon running this pops up:

Traceback (most recent call last):
File "stable-diffusion/scripts/txt2img.py", line 12, in <module>
from torch import autocast
ImportError: cannot import name 'autocast' from 'torch' (/usr/local/envs/ldm/lib/python3.8/site-packages/torch/__init__.py)

Text Conditioning Dropout

Thank you for this repo. It has more training-related stuff, so I can try it on my own.
Can you please point me to where the 10% text conditioning dropout is happening?
I'm afraid I will drop the conditioning twice if I also do it on my own.
Thank you again. LDM is really awesome.
