
kjsman / stable-diffusion-pytorch

463 stars · 6 watchers · 57 forks · 26 KB

Yet another PyTorch implementation of Stable Diffusion (probably easy to read)

License: MIT License

Languages: Python 89.27%, Jupyter Notebook 10.73%
Topics: diffusion, image-generation, pytorch, stable-diffusion

stable-diffusion-pytorch's People

Contributors

kjsman, mspronesti


stable-diffusion-pytorch's Issues

How are the models in data.zip made?

Thank you for making this repo; it's very educational. This minimal implementation is brilliant, while the bigger SD repos are very hard to understand.

Did you use a script to convert official models, like this one: https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/v1-5-pruned-emaonly.ckpt, to the format you use in this repo?

Or are you using a model from some other source?

Are you using the SD 1.5 model?

How hard would it be to make this repo use models trained by others, Inkpunk for example? https://huggingface.co/Envvi/Inkpunk-Diffusion/blob/main/Inkpunk-Diffusion-v2.ckpt
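
For reference, official CompVis-style .ckpt files bundle every component into a single state dict whose keys are prefixed by component name, so a conversion script would first split them apart and then remap the parameter names onto this repo's modules. A rough sketch of the splitting step (the prefixes below are the standard ones in CompVis checkpoints; the output file names are hypothetical):

import torch

# Load the combined checkpoint; the weights live under the "state_dict" key.
ckpt = torch.load("v1-5-pruned-emaonly.ckpt", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)

# Standard component prefixes in CompVis-style checkpoints.
prefixes = {
    "cond_stage_model.": "clip.pt",            # CLIP text encoder
    "first_stage_model.": "autoencoder.pt",    # VAE encoder + decoder
    "model.diffusion_model.": "diffusion.pt",  # U-Net
}

for prefix, out_file in prefixes.items():
    part = {k[len(prefix):]: v for k, v in state_dict.items() if k.startswith(prefix)}
    torch.save(part, out_file)  # hypothetical output names

This repo's modules use their own parameter names, so a key-renaming map would still be needed before these files could be loaded directly.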

Image in latent space gets shifted during encoding.

I am using a simple red image as input:

[image: red.png, a solid red input image]

from stable_diffusion_pytorch import pipeline
from PIL import Image

# Image-to-image: condition generation on the red input image.
prompts = ["a photograph of an astronaut riding a horse"]
input_images = [Image.open('red.png')]
images = pipeline.generate(prompts, input_images=input_images)
images[0].save('output.png')

But the input image comes out shifted by 8 px along each axis, and the shift produces an ugly brown border:

[image: output.png, with the content shifted and a brown border]

I am pretty sure it happens during the encode pass, as the image is already shifted in latent space. Here is a custom dump of the latent space to an image:

[image: encode/decode round-trip dump of the latent space, showing the shift]

Something in the encode pass is shifting it by one pixel in latent space, and I can't figure out what.
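
A one-latent-pixel shift corresponds to 8 image pixels at the VAE's 8x downsampling factor, which often points at asymmetric padding around a strided downsampling convolution in the encoder. As a quick check, the round-trip shift can be measured directly by brute-force correlation; a self-contained sketch (file names taken from the snippet above):

import numpy as np
from PIL import Image

def measure_shift(a, b, max_offset=16):
    # Find the (dy, dx) roll of b that best matches a (smallest MSE).
    a = np.asarray(a.convert("L"), dtype=np.float32)
    b = np.asarray(b.convert("L"), dtype=np.float32)
    best, best_err = (0, 0), float("inf")
    for dy in range(-max_offset, max_offset + 1):
        for dx in range(-max_offset, max_offset + 1):
            err = float(np.mean((a - np.roll(b, (dy, dx), axis=(0, 1))) ** 2))
            if err < best_err:
                best, best_err = (dy, dx), err
    return best

print(measure_shift(Image.open("red.png"), Image.open("output.png")))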

Should I scale the input image?

Hello, @kjsman, thanks for this easily readable implementation. I have a question: am I correct that I should scale the input with a sampler before passing the image to the U-Net model during the training process?
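
For context, in DDPM-style training the latent is scaled and noised according to the sampler's schedule before it reaches the U-Net, which is then trained to predict the added noise. A minimal sketch of that step, assuming the standard Stable Diffusion "scaled linear" schedule (this mirrors the usual formulation, not necessarily this repo's exact API):

import torch

# Stable Diffusion's scaled-linear beta schedule (beta_start=0.00085, beta_end=0.012).
T = 1000
betas = torch.linspace(0.00085 ** 0.5, 0.0120 ** 0.5, T) ** 2
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

def q_sample(x0, t, noise):
    # x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise
    sqrt_ab = alphas_cumprod[t].sqrt().view(-1, 1, 1, 1)
    sqrt_om = (1.0 - alphas_cumprod[t]).sqrt().view(-1, 1, 1, 1)
    return sqrt_ab * x0 + sqrt_om * noise

# x0 is the VAE latent (in SD it is also multiplied by the constant 0.18215).
x0 = torch.randn(4, 4, 64, 64)
t = torch.randint(0, T, (4,))
noise = torch.randn_like(x0)
x_t = q_sample(x0, t, noise)  # this, plus t, is what the U-Net sees in training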

Is 4GB VRAM too small for this program?

Thanks for the implementation!
When running demo.ipynb I got:

OutOfMemoryError: CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 4.00 GiB total capacity; 3.31 GiB already allocated; 0 bytes free; 3.37 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Is there a solution?
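
Two mitigations worth trying first, following the hint in the error message itself: cap the allocator's split size to reduce fragmentation, and free cached blocks between runs. A minimal sketch (running the models in half precision would roughly halve memory too, but whether this repo's pipeline supports that out of the box is an assumption to verify):

import os

# Must be set before CUDA is initialized, i.e. before the first torch import
# in a fresh process or restarted notebook kernel.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch
torch.cuda.empty_cache()  # release blocks cached by earlier cells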

Query about the training process

I hope this message finds you well. I recently came across your repository for Stable Diffusion in PyTorch and I must say, your effort in making the codebase minimal and easy to read is commendable. I am new to generative models, and your implementation has piqued my interest.
I was wondering if you could provide some insights into the training process of your Stable Diffusion model. Specifically, I am curious about the following:

  1. Training Data: Could you please let me know on which dataset you trained your model? Understanding the dataset used would help me get a better understanding of the capabilities and limitations of the model.

  2. Training Time: I'm also interested in how long the model took to train. This information will help me gauge the computational requirements and plan accordingly for any experiments or projects involving Stable Diffusion.

Moreover, I would like to know more about your approach to writing this code. Did you primarily refer to research papers, or did you take inspiration from other implementations? For instance, you mentioned using Andrej Karpathy's minGPT. Could you share your thought process behind choosing this reference, or any other methods you considered during your implementation?
I greatly appreciate your assistance and expertise in this matter. Thank you for your time and for sharing your work with the community. I look forward to your response.

[Enhancement] automate weights download without user action

Hello @kjsman,
this is more of a feature proposal than an actual issue. Instead of requiring the user to download and unpack the tar file containing the weights and the vocabulary from your Hugging Face Hub repository, one can make the model_loader and the Tokenizer download and cache them directly.

For the first part, it only requires replacing torch.load(...) here (and in the other 3 functions in the same file) with

torch.hub.load_state_dict_from_url(weights_url, check_hash=True)

All it takes on your side is to upload the 4 .pt files to the Hugging Face Hub (not in a zipped file), and that's it.
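
For illustration, the resulting loader would look roughly like this (the function name, URL, and file name below are placeholders, not this repo's actual ones):

import torch

WEIGHTS_BASE_URL = "https://huggingface.co/<user>/<repo>/resolve/main"  # placeholder

def load_clip_weights(device):
    # Downloads once into torch.hub's cache directory, then reuses the cached file.
    return torch.hub.load_state_dict_from_url(
        f"{WEIGHTS_BASE_URL}/clip.pt",  # placeholder file name
        map_location=device,
    )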

As for the tokenizer, it just takes adding a default_bpe() method / function:

import os
from functools import lru_cache
from urllib.request import urlretrieve

@lru_cache()
def default_bpe():
    # Prefer the vocab file bundled next to this module.
    p = os.path.join(
        os.path.dirname(os.path.abspath(__file__)), "bpe_simple_vocab_16e6.txt.gz"
    )
    if os.path.exists(p):
        return p
    # Otherwise download it from the original CLIP repository.
    # urlretrieve returns (filename, HTTPMessage); only the filename is needed.
    filename, _ = urlretrieve(
        "https://github.com/openai/CLIP/blob/main/clip/bpe_simple_vocab_16e6.txt.gz?raw=true",
        "bpe_simple_vocab_16e6.txt.gz",
    )
    return filename

Another option, if you prefer to keep your vocab.json and merges.txt, is to upload them as well to the Hugging Face Hub (not in a tar file) or directly to GitHub, like the original repository does with its vocab.

If you like it, I will open a new PR; otherwise, please let me know if you have any better idea, or close this issue if you are not interested in this feature 😄
