somewheresy / s2ml-generators Goto Github PK

View Code? Open in Web Editor NEW

178.0 8.0 50.0 48 KB

Multiple notebooks which allow the use of various machine learning methods to generate or modify multimedia content

License: MIT License

Jupyter Notebook 100.00%

ml generative-art vqgan-clip clip-guided-diffusion

s2ml-generators's Introduction

S2ML Generators

Changelog

Version 1.7

Working on a new notebook: Somewhere Diffusion. This notebook will combine three processes:
1. Generating a dataset from images retrieved by proximity to a text prompt in the CLIP latent space
1. Using that model, or a combination of models to generate images via Diffusion at a reasonable resolution
1. Upscaling that image with ESRGAN
The repository for this notebook is located here: https://github.com/somewheresy/somewhere-diffusion

Version 1.6.1

Fixed the directory for Guided Diffusion models
Minor clean-up, fixes, etc.

Version 1.6

No changes to code. People have asked if they can tip me for working on this for free. In light of a new employment opportunity, I am now accepting donations -- not here, but I will match any donation made to The Okra Project (https://www.artsbusinesscollaborative.org/fiscal-sponsorship/okra-project) up to $5000 annually. The donation match will start in June 2022 and will count retroactively towards any donations made from the date of this update.

Version 1.5.4

Fixed Issue #9 #9, ESRGAN upscaling will no longer transpose the colors of your image wrong
Added setting in diffusion method for enabling gradient checkpointing, which saves VRAM but takes longer to compute images (useful if you're having memory issues, or trying to load a heavy model)
Removed some informal text

Version 1.5.3

To make room for new notebooks I am forking from the S2ML Image Generator (neé S2ML Art Generator), the repo has been renamed to S2ML-Generators
S2ML Art Generator renamed to S2ML Image Generator
Keep an eye out for the S2ML Video Generator

Version 1.5.2

CLIP-Guided diffusion method now allows for a variable number of steps.

Version 1.5.1

Name change! Since this notebook contains methods which aren't constrained specifically to utilized GANs (generative adversarial networks), a new name has been chosen: S2ML Art Generator! Future tools which are in development will carry the S2ML prefix so long as those tools leverage machine learning, in order to build out the S2ML ecosystem.

Version 1.5.0

September 21, 2021

Removed ISR for image upscaling and replaced it with an ESRGAN implementation
Added the ability to upscale a folder of images or a single target image
Added the option to generate a video using ffmpeg using either default outputs or upscaled image sequence ({abs_root_path}/ESRGAN/results/ directory)
Fixed some markdown issues, removed bad wording from older notebooks
Added a block to delete all generated output for advanced troubleshooting & tidy-ness

Version 1.4.0

September 4, 2021

Fixed CLIP-guided diffusion method
Exposed new CLIP model selection for both VQGAN+CLIP and diffusion methods
Removed excess instructional text ahead of Wiki launch
Added the ability to generate a video regardless of method
Exposed new parameters in the Generate a Video block

Version 1.3.0

August 30, 2021

Moved changelog to README.md
VQGAN+CLIP and CLIP-guided diffusion blocks are now separate.
Parameters and Execution blocks merged into single blocks.
Fixed potential "interestingness" bug with VQGAN+CLIP method
Exposed four new experimental/advanced parameters for fine-tuning VQGAN+CLIP method
Updating ffmpeg block in 1.3.1 to work with CLIP-guided diffusion method, started fixing this in 1.3.0

Version 1.2.3

August 26, 2021

Bug Fixes
Added "VQGAN Classic" link to older notebook (legacy copy) at the top of the updated notebook

Version 1.2.2

August 23, 2021

Bug Fixes

Version 1.2.1

August 22, 2021

Fixed issues with temp filesystem not importing os, causing errors when not using Google Drive
Removed Wikiart 1024 dataset because the hosting provider went offline
Fixed ImageNet datasets to use new hosting provider

Version 1.2.0

August 21, 2021

Bug Fixes

Version 1.1.2

August 18, 2021

Forked notebook from the original copy
Integrated Katherine Crowson's CLIP-guided diffusion method as a secondary mode to the notebook
Removed automatic video naming and exposed it as a parameter instead.

s2ml-generators's People

Contributors

Stargazers

Watchers

s2ml-generators's Issues

Error when trying to run Download pre-trained models

Output:

Executing using VQGAN+CLIP method
Using device: cuda:0
Using text prompt: ['Happy flowers in a sunlit field']
Using image prompts: ['/content/test.jpg']
Using seed: 5476245293527641887
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
[<ipython-input-41-541d444115c2>](https://localhost:8080/#) in <module>()
     77 print('Using seed:', seed)
     78 
---> 79 model = load_vqgan_model(args.vqgan_config, args.vqgan_checkpoint).to(device)
     80 perceptor = clip.load(args.clip_model, jit=False)[0].eval().requires_grad_(False).to(device)
     81 

NameError: name 'load_vqgan_model' is not defined

Color problem in ESRGAN output

I thought ESRGAN was simply inverting the colors of the original but I took the image results of ESRGAN into a graphics program and this isn't the case. Hue is shifted in parts of the image, and maintained in other parts.
Examples attached. The original is in blue and green, ESRGAN's result gave a red background but foreground elements are maintained somehow. I have my own drive folder that I upscale images from, but ESRGAN still changes the colors even if you change the target directory to '../ESRGAN/LR/' to use the given example images.

name 'torch' is not defined

When trying to run this it comes with the issue of torch not being defined and as some one who knows little to nothing of code I do not know how to fix this. Many thanks for any help.

torch.cuda.empty_cache()
with torch.no_grad():
torch.cuda.empty_cache()

Diffusion model fails to load with smaller image size setting

Currently when adjusting the image size parameter on the guided diffusion model in the s2ML notebook, it will fail with the following error:

RuntimeError: Error(s) in loading state_dict for UNetModel:
	Missing key(s) in state_dict: "input_blocks.7.0.skip_connection.weight", "input_blocks.7.0.skip_connection.bias", "input_blocks.7.1.norm.weight", "input_blocks.7.1.norm.bias", "input_blocks.7.1.qkv.weight", "input_blocks.7.1.qkv.bias", "input_blocks.7.1.proj_out.weight", "input_blocks.7.1.proj_out.bias", "input_blocks.8.1.norm.weight", "input_blocks.8.1.norm.bias", "input_blocks.8.1.qkv.weight", "input_blocks.8.1.qkv.bias", "input_blocks.8.1.proj_out.weight", "input_blocks.8.1.proj_out.bias", "input_blocks.10.1.norm.weight", "input_blocks.10.1.norm.bias", "input_blocks.10.1.qkv.weight", "input_blocks.10.1.qkv.bias", "input_blocks.10.1.proj_out.weight", "input_blocks.10.1.proj_out.bias", "input_blocks.11.1.norm.weight", "input_blocks.11.1.norm.bias", "input_blocks.11.1.qkv.weight", "input_blocks.11.1.qkv.bias", "input_blocks.11.1.proj_out.weight", "input_blocks.11.1.proj_out.bias", "input_blocks.13.0.skip_connection.weight", "input_blocks.13.0.skip_connection.bias". 
	Unexpected key(s) in state_dict: "input_blocks.15.0.in_layers.0.weight", 
... [omitted for brevity]
"input_blocks.17.0.out_layers.3.weight", "input_blocks.17.0.out_layers....
	size mismatch for input_blocks.0.0.weight: copying a param with shape torch.Size([128, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 3, 3, 3]).
	size mismatch for input_blocks.0.0.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for input_blocks.1.0.in_layers.0.weight: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for input_blocks.1.0.in_layers.0.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for input_blocks.1.0.in_layers.2.weight: copying a param with shape torch.Size([128, 128, 3, 3]) from checkpoint, the shape in current model is torch.Size([256, 256, 3, 3]).
... [omitted for brevity]

I would be happy to take a stab at this if you have any ideas. But i'm not entirely sure this request even makes sense as I'm still learning about the diffusion model and how it works. Is it possible to have it generate smaller images? (and ultimately reduce the memory footprint of the model).

If you have other ideas of how to reduce vram usage i would be interested in hearing that as well / discussing further!

Any clue what to do with RuntimeError: CUDA out of memory. Tried to allocate…

After running the Colab once, on the second try, I get the message that CUDA is out of memory. What is the best practice to reset this? I've tried a few different solutions from StackOverflow and forums, but none that has worked.

'RuntimeError: CUDA out of memory' on P100

Hi there,

I'm using Colab Pro to do some ML experiments and often fail to initialise the RN50x4 CLIP model. Sometimes I even have trouble getting the RN101 model to load up. RN50x16 has never worked in my experience. The notebook is running on a P100 GPU and mentions that "x4 and x16 models for CLIP may not work reliably on lower-memory machines".

I'm just wondering if I need an even more capable GPU (in terms of VRAM) or if there is some problem with the code? I'm not an expert with Tensorflow/ML so apologies if there's a simple solution to this.

FileNotFoundError: [Errno 2] No such file or directory: 'content/drive/MyDrive/FolderName/ProjectName/vqgan-steps/0235.png'

when i try to use ISR Execution that error pops up... also if i try to use Generate a video after that error i get this error FileNotFoundError: [Errno 2] No such file or directory: '/content/vqgan-steps/0009.png'......... am i doing something wrong?

2 vqgan models don't load

Justin,
So many thanks for developing this version of vqgan art generator. I thought I'd send you a note on a persistent problem I'm having. I can't load vqgan models - wikiart_16384 & coco off the collab notebook.
Should I run this off Linux for better reliability?
Thanks again!

Running Diffusion makes each rendered frame replace the previous frame

I'm unable to render a video based on the frames made while having Diffusion checked, seems that each frame is replacing the former.

facehq not working

Hi everything is working but facehq model won't run for some reason - error message is -

ModuleNotFoundError Traceback (most recent call last)
in ()
77 print('Using seed:', seed)
78
---> 79 model = load_vqgan_model(args.vqgan_config, args.vqgan_checkpoint).to(device)
80 perceptor = clip.load(args.clip_model, jit=False)[0].eval().requires_grad_(False).to(device)
81

12 frames
/usr/lib/python3.7/importlib/_bootstrap.py in find_and_load_unlocked(name, import)

ModuleNotFoundError: No module named 'taming.modules.misc'

any fix for this?

video.mp4 not found

FileNotFoundError: [Errno 2] No such file or directory: 'video.mp4'

cant download 512 unconditional diffusion model

Hello,
It seems the-eye.eu is unreachable, therefore I cannot download the 512x512 unconditional diffusion model, and it is not provided on the openai repo (there is only the 256x256 one)

Any idea where i could find the model ?

Thanks a lot

Disconnect

i am trying to generate a image by collab clound processors, but it keeps disconnecing after 2 to 3 hours, i have tryed to use autoclick, but it dies nor work, some one knows how to stay more time processing?

`MakeCutouts` overwritten in notebook s2ml

This class is defined twice. I think one version is intended for CLIP without diffusion? if so perhaps different names for the classes is the solution?

ModuleNotFoundError: No module named 'tensorflow.python.keras.engine.keras_tensor'

I just started seeing this error when running this notebook on Colab recently, during the "Load libraries and definitions" block. Here's a screenshot...

I've run hundreds of cycles of this notebook in the past, but within the last week or two, it started failing every time.

I'm not an experienced python dev, so I'm a bit out of my element, but I can help troubleshoot from my side if that helps!

Error when trying to run VQGAN+CLIP