
rbbrdckybk / ai-art-generator

For automating the creation of large batches of AI-generated artwork locally.

License: Other

Python 99.34% CSS 0.66%
clip-guided-diffusion deep-learning generative-art image-generation machine-learning stable-diffusion vqgan-clip

ai-art-generator's People

Contributors

lyfeonedge, rbbrdckybk

ai-art-generator's Issues

Error when doing the red apple test

Working with z of shape (1, 256, 16, 16) = 65536 dimensions.
loaded pretrained LPIPS loss from taming/modules/autoencoder/lpips\vgg.pth
VQLPIPSWithDiscriminator running with hinge loss.
Restored from checkpoints/vqgan_imagenet_f16_16384.ckpt
Using device: cuda:0
Optimising using: Adam
Using text prompts: ['a red apple']
Using seed: 16544599355900
i: 0, loss: 0.913828, losses: 0.913828
0it [00:02, ?it/s]
Traceback (most recent call last):
  File "C:\Users\mikeb\ai-art-generator\vqgan.py", line 867, in <module>
    train(i)
  File "C:\Users\mikeb\ai-art-generator\vqgan.py", line 751, in train
    checkin(i, lossAll)
  File "C:\Users\mikeb\anaconda3\envs\ai-art\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\mikeb\ai-art-generator\vqgan.py", line 721, in checkin
    TF.to_pil_image(out[0].cpu()).save(args.output, pnginfo=info)
  File "C:\Users\mikeb\anaconda3\envs\ai-art\lib\site-packages\PIL\Image.py", line 2317, in save
    fp = builtins.open(filename, "w+b")
FileNotFoundError: [Errno 2] No such file or directory: 'output/output.png'
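
A likely cause, given the error: PIL cannot create 'output/output.png' because the output directory does not exist in the working directory. Creating it before running should fix this (a minimal sketch, assuming vqgan.py is launched from the repo root):

  # create the missing output directory (run once, or add near the top of vqgan.py)
  import os
  os.makedirs('output', exist_ok=True)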

AttributeError: 'Image' object has no attribute 'getexif'

Hello,
I get this error when trying to run make_art.py.

python make_art.py test5.txt
Queued 1 work items from test5.txt.

Worker starting job #1:
Command: python vqgan.py -s 384 384 -i 500 -cuts 32 -p "some random prompt artistic, masterpiece | cartoonish" -lr 0.1 -m ViT-B/32 -cd "cuda:0" -sd 3831269064 -o output/2023-05-21-test5/some-random-prompt-cartoonish.png
/root/miniconda3/lib/python3.7/site-packages/pytorch_lightning/utilities/distributed.py:259: LightningDeprecationWarning: pytorch_lightning.utilities.distributed.rank_zero_only has been deprecated in v1.8.1 and will be removed in v2.0.0. You can import it from pytorch_lightning.utilities instead.
"pytorch_lightning.utilities.distributed.rank_zero_only has been deprecated in v1.8.1 and will"
Working with z of shape (1, 256, 16, 16) = 65536 dimensions.
loaded pretrained LPIPS loss from taming/modules/autoencoder/lpips/vgg.pth
VQLPIPSWithDiscriminator running with hinge loss.
Restored from checkpoints/vqgan_imagenet_f16_16384.ckpt
Using device: cuda:0
Optimising using: Adam
Using text prompts: ['some random prompt', 'cartoonish']
Using seed: 3831269064
i: 0, loss: 1.84399, losses: 0.912632, 0.931359
i: 50, loss: 1.66447, losses: 0.776792, 0.887676
i: 100, loss: 1.52617, losses: 0.664702, 0.861466
i: 150, loss: 1.54887, losses: 0.686982, 0.861885
i: 200, loss: 1.58915, losses: 0.718493, 0.870659
i: 250, loss: 1.48152, losses: 0.622581, 0.858936
i: 300, loss: 1.47938, losses: 0.622537, 0.856846
i: 350, loss: 1.47611, losses: 0.62103, 0.85508
i: 400, loss: 1.58315, losses: 0.718842, 0.86431
i: 450, loss: 1.4626, losses: 0.60904, 0.853565
i: 500, loss: 1.4565, losses: 0.603882, 0.852619
500it [02:30, 3.32it/s]
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "make_art.py", line 226, in run
    exif = im.getexif()
AttributeError: 'Image' object has no attribute 'getexif'

What can I do to fix this?
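
Image.getexif() was added in Pillow 6.0, so this usually means the environment (here, the base miniconda Python 3.7) has an older Pillow. Upgrading Pillow inside that environment should resolve it; as a stopgap, the call could be guarded (a hypothetical patch, not the author's fix):

  # hypothetical guard for make_art.py line 226 on pre-6.0 Pillow versions
  exif = im.getexif() if hasattr(im, 'getexif') else None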

No GPU Found?

i7 - 9700k
GTX 1660 Super
I ran all the commands and the first trained models downloaded, but I get:
"No GPU Found, using CPU"
Desktop PC, 16gb ram
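
A GTX 1660 Super is supported hardware, so the usual culprit is a CPU-only PyTorch build in the conda environment rather than the GPU itself. A quick check, run inside the activated ai-art environment:

  import torch
  print(torch.__version__)          # a '+cpu' suffix indicates a CPU-only build
  print(torch.cuda.is_available())  # must print True for the GPU to be used

If this prints False, reinstalling PyTorch with the CUDA variant from the project's setup steps is the usual fix.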

Can't run step #12 test (CLIP-guided diffusion)

Traceback (most recent call last):
  File "F:\Projects\ai-art-generator\diffusion.py", line 2211, in <module>
    model.load_state_dict(torch.load(f'{model_path}/{diffusion_model}.pt', map_location='cpu'))
  File "E:\anaconda\envs\ai-art\lib\site-packages\torch\serialization.py", line 815, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "E:\anaconda\envs\ai-art\lib\site-packages\torch\serialization.py", line 1033, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '<'.
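
An UnpicklingError with load key '<' almost always means the downloaded checkpoint is actually an HTML page (for example, an error or redirect page saved by a failed download) rather than a model file. A quick sanity check (the filename is illustrative; use the actual model path from the command):

  # a real PyTorch checkpoint never starts with b'<'; HTML does
  with open('checkpoints/model.pt', 'rb') as f:
      print(f.read(64))

If the bytes look like HTML, deleting the file and re-downloading it per the setup instructions should fix the load.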

RuntimeError: CUDA out of memory (diffusion)

RuntimeError: CUDA out of memory. Tried to allocate 36.00 MiB (GPU 0; 4.00 GiB total capacity; 3.45 GiB already allocated; 0 bytes free; 3.47 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
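
With 4 GiB of VRAM, the default diffusion settings are likely simply too large for the card. Beyond reducing the output size, the error message's own suggestion can be tried (a sketch; the value is a starting point rather than a tuned setting, and it must take effect before CUDA is initialized):

  import os
  # equivalently, export this in the shell before launching the script
  os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'max_split_size_mb:128'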

Error when running + Where do I get the images?

RuntimeError: CUDA out of memory. Tried to allocate 72.00 MiB (GPU 0; 4.00 GiB total capacity; 3.35 GiB already allocated; 0 bytes free; 3.41 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Also, the images aren't in the output folder.
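
The out-of-memory mitigations from the previous issue apply here as well. As for the missing images: if the run crashes before the first checkpoint image is saved, nothing is ever written, which would explain the empty output folder.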

AMD Graphics Thread

Just letting you know. I have tried and failed to run this configuration on WSL today. I have concluded that it is not possible on Windows 10.

Problem: AMD GPU on Windows 10. Dual booting Ubuntu is an option, but it comes with its own headaches, namely the time it takes to install the OS and to make the two play nicely together so I can choose the right one at startup. Windows 10 is actually worse at this than previous versions, too. It's been a while since I've done it, but I remember getting everything set up nicely, and then a few weeks later Windows 10 would just delete the other boot record. In short, AI art should be fun, and dual booting is not fun.

Proposed and Attempted Solution: Install WSL2 with the Ubuntu 20.04 image. Install the dreadful AMD 2020 WSL preview driver, which sadly overwrites the newer driver. Then follow the steps for setup, but with the AMD version of PyTorch.

I made it over multiple hurdles, but when I finally got to the prompt step, it simply gave me the same error I get on Windows: "Warning: No GPU found! Using the CPU instead. The iterations will be slow."

I kept looking for things to try, but I was forced to admit defeat after reading the article below and seeing that the feature that would even allow me to use graphics in WSL2 is Windows 11 only. (The article is geared toward GUI apps, but it also shows how to set up graphics in WSL2, so I believe these are one and the same issue. If I'm wrong about that, maybe there is still a way, but I'm at a dead end with the information I've found, so if anybody knows anything, feel free to correct me.)
https://learn.microsoft.com/en-us/windows/wsl/tutorials/gui-apps

Here are the takeaways.

You could note in the documentation that WSL2 on Windows 10 isn't supported and save everyone the time. I can't rule out AMD graphics on Windows 11 with WSL2, and more power to somebody with a newer system who can try that combination.

Here are the notes for how I got as far as I did in case somebody wants to try it on Windows 11 using WSL2:

  1. Enable WSL and Virtual Machine Platform via Control Panel
  2. Install Ubuntu 22.04 from Windows Store.
  3. Enable Virtualization in your bios.
  4. Read here for more info on enabling graphics in WSL2: https://learn.microsoft.com/en-us/windows/wsl/tutorials/gui-apps
    • You should install the driver specified here.
    • Update WSL to WSL2 (probably should do this at the beginning so your Ubuntu image is on WSL2, although it's an easy fix if you don't).
  5. Install PyTorch for AMD with: pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/rocm5.1.1 (a quick way to verify the ROCm build is sketched after this list).
  6. Everything else is just following the same steps in this project's readme.
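
For anyone retrying this on Windows 11, a quick way to verify the ROCm build of PyTorch is actually active (assuming the rocm5.1.1 wheel above installed cleanly):

  import torch
  print(torch.version.hip)          # non-None only on a ROCm build
  print(torch.cuda.is_available())  # ROCm devices are exposed through the cuda API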

I'll open another issue if I get it working on Ubuntu and let everybody know any changes to the steps I took to do so.

Can't run python diffusion.py

Traceback (most recent call last):
  File "F:\Projects\ai-art-generator\diffusion.py", line 1438, in <module>
    sr_model = get_model('superresolution')
  File "F:\Projects\ai-art-generator\diffusion.py", line 1219, in get_model
    model, step = load_model_from_config(config, path_ckpt)
  File "F:\Projects\ai-art-generator\diffusion.py", line 1209, in load_model_from_config
    model = instantiate_from_config(config.model)
  File "F:\Projects\ai-art-generator\./latent-diffusion\ldm\util.py", line 85, in instantiate_from_config
    return get_obj_from_str(config["target"])(**config.get("params", dict()))
  File "F:\Projects\ai-art-generator\./latent-diffusion\ldm\util.py", line 93, in get_obj_from_str
    return getattr(importlib.import_module(module, package=None), cls)
  File "E:\anaconda\envs\ai-art\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "F:\Projects\ai-art-generator\./latent-diffusion\ldm\models\diffusion\ddpm.py", line 19, in <module>
    from pytorch_lightning.utilities.distributed import rank_zero_only
ModuleNotFoundError: No module named 'pytorch_lightning.utilities.distributed'

pytorch-lightning 2.0.4 is installed
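
The import path moved: the deprecation warning quoted in an earlier issue above says rank_zero_only can be imported from pytorch_lightning.utilities, and the old location was removed in pytorch-lightning 2.0. Either pin pytorch-lightning below 2.0 in the environment, or patch the import in latent-diffusion (a hypothetical compatibility shim, not an upstream fix):

  # hypothetical patch for ldm/models/diffusion/ddpm.py, line 19
  try:
      from pytorch_lightning.utilities.distributed import rank_zero_only  # PL < 2.0
  except ImportError:
      from pytorch_lightning.utilities import rank_zero_only  # PL >= 2.0, per the deprecation notice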

Reproducing the art style from people's pictures

Thank you for sharing your great work!
I wonder what settings you used to generate the second row of your examples (the two pictures of people rendered in an art style: sample03.jpg and sample04.jpg). Would you share the exact settings for that, please? I tried the ffhq transformer, but the results are quite different from yours and somewhat disappointing.

Smaller Chunks

Congrats, amazing stuff! The results I'm getting are mind-blowing!

Unfortunately, like you, I only have an 8GB VRAM GPU.
Is there a way to increase the output size to 720 or 1080 without having to upscale? Perhaps by using a smaller chunk size and waiting longer?

Thanks again for releasing this (and for spending the time to create detailed instructions)

CUDA out of memory issue

Hi,

I ran the first test successfully with the lighthouse prompt. It works great! But for the second time, I ran into this issue:

RuntimeError: CUDA out of memory. Tried to allocate 512.00 MiB (GPU 0; 23.69 GiB total capacity; 20.71 GiB already allocated; 187.50 MiB free; 20.78 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I don't understand why, because nvidia-smi shows that I have enough free memory. I have run torch.cuda.empty_cache() as well, but it doesn't seem to help.

Any help would be appreciated. Thanks, and great work!
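
One hedged observation: torch.cuda.empty_cache() only releases the allocator's cached blocks, not memory held by live tensors, so it won't help if the run itself allocates more than before (for example, a larger size or cut count than the lighthouse test). Both numbers can be inspected from inside the process:

  import torch
  print(torch.cuda.memory_allocated() // 2**20, 'MiB held by live tensors')
  print(torch.cuda.memory_reserved() // 2**20, 'MiB reserved by the caching allocator')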

no module named clip

I followed all the steps up to the test, and when I try 'python vqgan.py -s 128 128 -i 200 -p "a red apple" -o output/output.png', it says that there is no module named clip.
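
CLIP is typically installed straight from the OpenAI repository (pip install git+https://github.com/openai/CLIP.git); if that setup step was skipped, the module won't resolve. A quick check from inside the ai-art environment:

  # if this import fails, CLIP is not installed in the active environment
  import clip
  print(clip.available_models())  # should include 'ViT-B/32'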

Diffusion _pickle.UnpicklingError: invalid load key, '<'.

I am getting the following error while running diffusion.py

 Traceback (most recent call last):
  File "/home/ali/Desktop/ai-art-generator/diffusion.py", line 1650, in <module>
    secondary_model.load_state_dict(torch.load(f'{model_path}/secondary_model_imagenet_2.pth', map_location='cpu'))
  File "/home/ali/miniconda3/envs/ai-art/lib/python3.9/site-packages/torch/serialization.py", line 713, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/ali/miniconda3/envs/ai-art/lib/python3.9/site-packages/torch/serialization.py", line 920, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '<'.

The command I am running is

python diffusion.py -s 400 400 -i 250 -p "a shining lighthouse on the shore of a tropical island in a raging storm" -o output/diff.png

Thank you
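
As with the CLIP-guided diffusion issue above, an invalid load key of '<' indicates the checkpoint (here secondary_model_imagenet_2.pth) is an HTML page saved by a failed download; deleting and re-downloading it is the first thing to try.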

Error while running the test command

Working with z of shape (1, 256, 16, 16) = 65536 dimensions.
loaded pretrained LPIPS loss from taming/modules/autoencoder/lpips/vgg.pth
VQLPIPSWithDiscriminator running with hinge loss.
Restored from checkpoints/vqgan_imagenet_f16_16384.ckpt
Using device: cuda:0
Optimising using: Adam
Using text prompts: ['a red apple']
Using seed: 1260714667749887246
i: 0, loss: 0.906549, losses: 0.906549
0it [00:00, ?it/s]
Traceback (most recent call last):
  File "/home/gaurav/Desktop/ai_art_generators/ai-art-generator/vqgan.py", line 867, in <module>
    train(i)
  File "/home/gaurav/Desktop/ai_art_generators/ai-art-generator/vqgan.py", line 751, in train
    checkin(i, lossAll)
  File "/home/gaurav/anaconda3/envs/ai-art/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/gaurav/Desktop/ai_art_generators/ai-art-generator/vqgan.py", line 721, in checkin
    TF.to_pil_image(out[0].cpu()).save(args.output, pnginfo=info)
  File "/home/gaurav/anaconda3/envs/ai-art/lib/python3.9/site-packages/PIL/Image.py", line 2209, in save
    fp = builtins.open(filename, "w+b")
FileNotFoundError: [Errno 2] No such file or directory: 'output/output.png'
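
This is the same failure as the first issue on this page: the output directory doesn't exist in the working directory, so the save fails. Creating it (mkdir output, or os.makedirs as sketched above) resolves the error.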

Torch

How do you fix "AssertionError: Torch not compiled with CUDA enabled" and "ModuleNotFoundError: No module named 'torch._six'"?

Error code when running on Anaconda PowerShell

I am attempting to run this in Anaconda PowerShell, so maybe that is my first mistake, but everything was installing and running smoothly up until the curl commands under the "6] Download the default VQGAN pre-trained model checkpoint files:" step.

I am getting the following error immediately after entering curl -L -o checkpoints/vqgan_imagenet_f16_16384.yaml -C - "https://heibox.uni-heidelberg.de/d/a7530b09fed84f80a887/files/?p=%2Fconfigs%2Fmodel.yaml&dl=1":

Invoke-WebRequest : Parameter cannot be processed because the parameter name 'C' is ambiguous. Possible matches include: -Credential -CertificateThumbprint -Certificate -ContentType.
At line:1 char:54
+ curl -L -o checkpoints/vqgan_imagenet_f16_16384.yaml -C - "https://he ...
+                                                      ~~
    + CategoryInfo          : InvalidArgument: (:) [Invoke-WebRequest], ParameterBindingException
    + FullyQualifiedErrorId : AmbiguousParameter,Microsoft.PowerShell.Commands.InvokeWebRequestCommand

Thanks in advance!
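
This is a PowerShell quirk rather than a project problem: in PowerShell, curl is an alias for Invoke-WebRequest, which does not understand curl's -C flag. Invoking the real binary explicitly works, with the rest of the command unchanged:

  curl.exe -L -o checkpoints/vqgan_imagenet_f16_16384.yaml -C - "https://heibox.uni-heidelberg.de/d/a7530b09fed84f80a887/files/?p=%2Fconfigs%2Fmodel.yaml&dl=1"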

Tagging Outputs with Source Images?

EDIT: I had several questions originally, but most of them arose from being completely new to this; I'm reading more and more and starting to see what's there. One question still seems relevant:

Is it possible to have the outputs tagged with information on the source images the AI mixed together, and roughly what percentage of the image came from each of them?
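
Partially, yes, judging by the tracebacks elsewhere on this page: vqgan.py already embeds generation parameters in the PNG when saving (the pnginfo=info argument). Per-source-image percentages aren't available, since these models don't blend discrete source images, but the embedded parameters can be read back:

  from PIL import Image
  im = Image.open('output/output.png')  # path is illustrative
  print(im.text)  # PNG text chunks written via pnginfo at save time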

Up-Scaling and Diffusion Not Working?

Is there any way to upscale a generated image? Would that be done via diffusion? If so, the problem is that my diffusion run is missing the cv2 module, even though everything was installed correctly. In any case, this is a general question about how upscaling is done.
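
On the cv2 error specifically: a missing cv2 module usually just means OpenCV isn't installed in the active environment, and pip install opencv-python inside the ai-art environment is the usual fix.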

Style is a required field, not optional

First, thanks for the work on this project, it's exactly what I was looking for, and the instructions are really good.

I found an issue: if you don't have a style listed, make_art.sh queues up 0 items and exits. The examples file notes that styles are optional, but since the code must iterate over them to generate a prompt, they actually aren't.

for style in self.styles:

I think optional styles were your intention, and I'd like to be able to just list several complete prompts under "subject" without requiring the other fields.

Thanks!
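
A minimal workaround, assuming the quoted loop builds one prompt per style (a hypothetical patch; the surrounding code isn't shown here, and build_prompt/subject are illustrative names):

  # hypothetical patch: treat "no styles listed" as a single empty style
  # so that bare subjects still queue as work items
  styles = self.styles if self.styles else ['']
  for style in styles:
      build_prompt(subject, style)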

Training the AI?

How does training work? I know it uses checkpoints as trained data, but is there a way I can train it manually on my own art and specific images?

Output image corners/borders getting deformed.

Hi there. After messing around with the code in this repo, I noticed that some of my images were getting deformed corners/borders, mostly the top-right and bottom-left corners, but in general all the corners seem to have this issue. It happened a few times across multiple images, though there were other images where it didn't happen, so maybe it's some of the settings I'm using. I would appreciate any help, as I haven't been able to figure out the issue myself, and I definitely want to keep using this tool to see what I can generate with it. Here are some of the images I generated that have this issue; I'm not sure if GitHub will preserve the metadata on the images, but if it doesn't, let me know and I can share the parameters I used when running the code, so other people can easily reproduce the same or a similar result.
horror-village-with-dead-trees-a-snowy-tall-mountain-in-th-unreal-engine-high-detailk-ultra-hd-ray-tracing-realistic-photorealistic
painting-of-small-cabin-in-the-middle-of-snowy-mountains-in-the-winter-at-night-in-the-style-of-disney-trending-on-artstation-unreal-engine

Here are some images that do not have this issue:
landscape-wallpaper-wallpaper-5
landscape-wallpaper-wallpaper-4
landscape-wallpaper-wallpaper-3
landscape-wallpaper-wallpaper-2
landscape-wallpaper-wallpaper
