
eps696 / aphantasia

769 stars · 23 watchers · 105 forks · 36.06 MB

CLIP + FFT/DWT/RGB = text to image/video

License: MIT License

Languages: Jupyter Notebook 43.73%, Python 56.27%
Topics: text-to-image, clip, text-to-video

aphantasia's People

Contributors

dribnet, eps696, interfect


aphantasia's Issues

Alternate Subtraction Method, Faster

I was trying out ways of manipulating the encoded text, and one approach was subtracting the encoded negative text directly from the encoded text prompt. I tried four renders for each method and they look about the same, except that the renders operating on the encoded text show less of the subtracted prompt, which suggests to me that this is more effective at removing a concept. It also ends up using just one txt_enc rather than two, and a single cosine-similarity loss.

Prompt: "a photo of a human face" and Negative: "a photo of a face"

Subtracting the negative prompt's txt_enc0 from txt_enc resulted in these:
enc_sub

The existing negative method, which uses cosine similarity between the image and the negative prompt for the loss, resulted in these:
enc_neg

And for fun, using subtraction to increase the difference between the two, txt_enc + (txt_enc - txt_enc0), resulted in these:
enc_subdiff

The encoded text and images seem to be explorable like latent space.
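For reference, a minimal sketch of the embedding-level subtraction described above; the variable names txt_enc, txt_enc0, text_loss and the prompts are illustrative placeholders, not the actual code in clip_fft.py:

import torch
import clip

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model_clip, _ = clip.load('ViT-B/32', device=device)

with torch.no_grad():
    txt_enc  = model_clip.encode_text(clip.tokenize('a photo of a human face').to(device))
    txt_enc0 = model_clip.encode_text(clip.tokenize('a photo of a face').to(device))

# subtract the negative embedding once, then use a single cosine-similarity loss
target = txt_enc - txt_enc0
# "increase the difference" variant from the last example above
target_diff = txt_enc + (txt_enc - txt_enc0)

def text_loss(img_enc, txt_target):
    # one cosine similarity against the combined text embedding
    return -torch.cosine_similarity(img_enc, txt_target, dim=-1).mean()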

ReadTimeoutError when installing OpenAI CLIP

Following the instructions, pip install git+https://github.com/openai/CLIP.git results in the following error:

Collecting git+https://github.com/openai/CLIP.git
Cloning https://github.com/openai/CLIP.git to /tmp/pip-req-build-ham_skxz
Running command git clone -q https://github.com/openai/CLIP.git /tmp/pip-req-build-ham_skxz
Requirement already satisfied: ftfy in /home/steven/anaconda3/lib/python3.8/site-packages (from clip==1.0) (6.0.1)
Requirement already satisfied: regex in /home/steven/anaconda3/lib/python3.8/site-packages (from clip==1.0) (2020.6.8)
Requirement already satisfied: tqdm in /home/steven/anaconda3/lib/python3.8/site-packages (from clip==1.0) (4.47.0)
Collecting torch~=1.7.1
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out. (read timeout=15)")': /packages/1d/a9/f349273a0327fdf20a73188c9c3aa7dbce68f86fad422eadd366fd2ed7a0/torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl
WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out. (read timeout=15)")': /packages/1d/a9/f349273a0327fdf20a73188c9c3aa7dbce68f86fad422eadd366fd2ed7a0/torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl
WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out. (read timeout=15)")': /packages/1d/a9/f349273a0327fdf20a73188c9c3aa7dbce68f86fad422eadd366fd2ed7a0/torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl
WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out. (read timeout=15)")': /packages/1d/a9/f349273a0327fdf20a73188c9c3aa7dbce68f86fad422eadd366fd2ed7a0/torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl
WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out. (read timeout=15)")': /packages/1d/a9/f349273a0327fdf20a73188c9c3aa7dbce68f86fad422eadd366fd2ed7a0/torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl
ERROR: Could not install packages due to an EnvironmentError: HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Max retries exceeded with url: /packages/1d/a9/f349273a0327fdf20a73188c9c3aa7dbce68f86fad422eadd366fd2ed7a0/torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl (Caused by ReadTimeoutError("HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out. (read timeout=15)"))

TypeError: 'float' object is not subscriptable

Something in the new update seems to have broken IllusTrip3D.ipynb. Settings of possibly relevant parameters: zoom = 0.0005, shift = 0, animate_them = False.

Here is the stack trace:

using fast aug transforms
 using RGB method, 95 samples
 ref text:  ethereal cosmology
 ref style:  
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-09ff3d020318> in <module>()
    229 pbar = ProgressBar(glob_steps)
    230 for i in range(count):
--> 231   process(i)
    232 
    233 HTML(makevid(tempdir))

1 frames
<ipython-input-6-4929578fc46a> in depth_transform(img_t, img_np, depth_infer, depth_mask, size, depthX, scale, shift, colors, depth_dir, save_num)
     46     dY = 100. * shift[1] / size[0]
     47     # dZ = movement direction: 1 away (zoom out), 0 towards (zoom in), 0.5 stay
---> 48     dZ = 0.5 + 23. * (scale[0]-1)
     49     # dZ += 0.5 * float(math.sin(((save_num % 70)/70) * math.pi * 2))
     50 

TypeError: 'float' object is not subscriptable
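A guess at a minimal workaround, assuming the crash comes from scale and shift being passed to depth_transform as plain floats when animate_them = False, while depth_transform indexes them as sequences:

# hypothetical guard before depth_transform is called: make sure scale and shift
# are sequences even when animate_them = False and they arrive as plain floats
if not isinstance(scale, (list, tuple)):
    scale = [scale, scale]
if not isinstance(shift, (list, tuple)):
    shift = [shift, shift]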

init_image support

Do any of the drawing modules have support for initialising from an image? I looked but didn't see any inverse FFT code currently in the codebase. If not this might be an interesting feature to consider adding for any models that would easily support such an operation.
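If such an option were added, the FFT parameterisation could in principle be seeded from an image by taking its forward transform. A rough sketch under that assumption; the function name is hypothetical and the magnitude scaling used by fft_image is not reproduced here:

import torch
import torchvision.transforms.functional as TF
from PIL import Image

def image_to_spectrum(path, size):
    # load the init image and take a 2D real FFT per channel; the result could then
    # be used to initialise the spectrum parameters (frequency scaling not handled here)
    img = TF.to_tensor(Image.open(path).convert('RGB')).unsqueeze(0)   # [1,3,H,W]
    img = torch.nn.functional.interpolate(img, size=size, mode='bilinear',
                                          align_corners=False)
    spectrum = torch.fft.rfft2(img, norm='ortho')    # complex, [1,3,H,W//2+1]
    return torch.view_as_real(spectrum)              # real-valued, last dim = 2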

: cannot connect to X server - Kaggle?

I am unsuccessfully trying to run clip_fft.py in a Kaggle notebook. The error message is seen below:

Start...
/kaggle/working/aphantasia/clip_fft.py:128: UserWarning: The function torch.irfft is deprecated and will be removed in a future PyTorch release. Use the new torch.fft module functions, instead, by importing torch.fft and calling torch.fft.ifft or torch.fft.irfft. (Triggered internally at  /pytorch/aten/src/ATen/native/SpectralOps.cpp:602.)
  image = torch.irfft(scaled_spectrum_t, 2, normalized=True, signal_sizes=(h, w))
: cannot connect to X server 

Previously I had errors that seemed to stem from dependencies, which I resolved. However, cannot connect to X server is a dead end to me, although apparently it has something to do with displays. The last thing I tried was turning off verbose, thinking that displaying the previews might be the problem, but this did not solve the issue. I have no idea where to begin troubleshooting further, and I'm not even sure whether the issue is rooted in this tool or in Kaggle. The tool works fine in the intended Colab environment*, and I'm not sure if you (eps696) have used Kaggle, so I understand if this issue is out of scope. Nonetheless I would appreciate any insight you or others might have!

*except that I keep running out of GPU time - Kaggle displays the time limits, at least

Can't generate video

When it finishes, it now asks for some text in a text box, but I have no clue what it's asking for. Please help.

[Feature] Locational Prompts

What if image slices from the left and image slices from the right were compared to different prompts? Slices in the center and slices at the edge?

Perhaps if it was set to check for a jungle on the edges, and for a toucan in the center, it would create an image of a toucan in a jungle with a broad background of jungle.

While an interface for this could get convoluted quickly, as long as it's kept to simple options (left/out, right/in, and midpoint, with a toggle for left/right, edge/center, or up/down) it shouldn't get too unwieldy for a user.
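A rough sketch of what such a locational loss could look like, assuming the slice positions are known when slicing; the helper and its arguments are hypothetical, not part of the current slice_imgs code:

import torch

def locational_loss(img_encs, offsets, width, enc_edge, enc_center):
    # img_encs: CLIP encodings of individual slices; offsets: (x, y) slice centers
    loss = 0.
    for enc, (x, y) in zip(img_encs, offsets):
        near_center = abs(x - width / 2) < width / 4
        target = enc_center if near_center else enc_edge
        loss = loss - torch.cosine_similarity(enc, target, dim=-1).mean()
    return loss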

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 58.00 MiB (GPU 0; 2.00 GiB total capacity; 1.58 GiB already allocated; 0 bytes free; 1.74 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Hi @eps696

I keep getting the error below. I am unable to run the code even with 30 samples and 30 steps.

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 58.00 MiB (GPU 0; 2.00 GiB total capacity; 1.58 GiB already allocated; 0 bytes free; 1.74 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Can you please help me resolve this issue? I have spent almost a day on it without success.

I would like to run the model with 500 samples and steps to see a well-formed image generated.

Looking forward to hearing from you.

Thank you.
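For what it's worth, on a 2 GB card the realistic options are a smaller --size and fewer samples; the allocator hint from the error message is a generic PyTorch setting (not specific to this repo) and can be tried like this:

import os
# allocator hint from the error message; must be set before the first CUDA allocation
os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'max_split_size_mb:128'

import torch
torch.cuda.empty_cache()   # release cached blocks between runs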

DeepSpeed integration for training on local cheaper GPUs.

This is an awesome repository that you've made.
It would be helpful if you'd integrate DeepSpeed for ZeRO-Infinity NVMe offload, for beginners like me 😅 - I only have a 1660 Ti for training and am trying to generate Full HD images for my presentations.

Something changed since 04/19/2022

Now it won't create the video, and I get this in the Colab:

.. generating video ..
_out/invisible_man_behind_the_1%_curtain-Film_noir/%04d.jpg: No such file or directory

FileNotFoundError Traceback (most recent call last)
in <module>()
209 _ = pbar.upd()
210
--> 211 HTML(makevid(tempdir))
212 torch.save(params, tempdir + '.pt')
213 files.download(tempdir + '.pt')

in makevid(seq_dir, size)
67 get_ipython().system('ffmpeg -y -v warning -i $out_sequence -crf 20 $out_video')
68 # moviepy.editor.ImageSequenceClip(img_list(seq_dir), fps=25).write_videofile(out_video, verbose=False)
---> 69 data_url = "data:video/mp4;base64," + b64encode(open(out_video,'rb').read()).decode()
70 wh = '' if size is None else 'width=%d height=%d' % (size, size)
71 return """<video %s controls>""" % (wh, data_url)

FileNotFoundError: [Errno 2] No such file or directory: '_out/invisible_man_behind_the_1%_curtain-Film_noir.mp4'

Error when running "Generate"

Everything works well, until I click generate.
Here's the error:

 using 300 samples
100%|████████████████████████████████████████| 338M/338M [00:05<00:00, 62.9MiB/s]
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-3-6a6046cf5f2d> in <module>()
     85 use_jit = True if float(torch.__version__[:3]) < 1.8 else False
     86 model_clip, _ = clip.load(model, jit=use_jit)
---> 87 modsize = model_clip.visual.input_resolution
     88 xmem = {'ViT-B/16':0.25, 'RN50':0.5, 'RN50x4':0.16, 'RN50x16':0.06, 'RN101':0.33}
     89 if model in xmem.keys():

2 frames
/usr/local/lib/python3.7/dist-packages/torch/jit/_script.py in __getattr__(self, attr)
    755                 return script_method
    756 
--> 757             return super(RecursiveScriptModule, self).__getattr__(attr)
    758 
    759         def __setattr__(self, attr, value):

/usr/local/lib/python3.7/dist-packages/torch/jit/_script.py in __getattr__(self, attr)
    472         def __getattr__(self, attr):
    473             if "_actual_script_module" not in self.__dict__:
--> 474                 return super(ScriptModule, self).__getattr__(attr)
    475             return getattr(self._actual_script_module, attr)
    476 

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in __getattr__(self, name)
   1176                 return modules[name]
   1177         raise AttributeError("'{}' object has no attribute '{}'".format(
-> 1178             type(self).__name__, name))
   1179 
   1180     def __setattr__(self, name: str, value: Union[Tensor, 'Module']) -> None:

AttributeError: 'RecursiveScriptModule' object has no attribute 'input_resolution'

I've tried running it without changing any settings, and it does the same thing. Help?
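One likely culprit (an assumption, not confirmed in this thread): the check float(torch.__version__[:3]) truncates a version like '1.10.0' to '1.1', so the JIT archive gets loaded on newer PyTorch and .visual.input_resolution no longer exists. A hedged sketch of a more robust check; loading with jit=False is usually the simplest fix:

import torch
import clip

# parse the version properly instead of slicing the string
major, minor = (int(x) for x in torch.__version__.split('.')[:2])
use_jit = (major, minor) < (1, 8)

model_clip, _ = clip.load('ViT-B/32', jit=use_jit)
if use_jit:
    # the original CLIP notebook read this from the top level of the JIT archive
    modsize = model_clip.input_resolution.item()
else:
    modsize = model_clip.visual.input_resolution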

AssertionError: Torch not compiled with CUDA enabled

macOS, Python 3

Traceback (most recent call last):
  File "/Users/51pwn/MyWork/ai/aphantasia/illustrip.py", line 511, in <module>
    main()
  File "/Users/51pwn/MyWork/ai/aphantasia/illustrip.py", line 243, in main
    key_txt_encs = [enc_text(txt) for txt in texts]
  File "/Users/51pwn/MyWork/ai/aphantasia/illustrip.py", line 243, in <listcomp>
    key_txt_encs = [enc_text(txt) for txt in texts]
  File "/Users/51pwn/MyWork/ai/aphantasia/illustrip.py", line 203, in enc_text
    emb = model_clip.encode_text(clip.tokenize(subtxt).cuda()[:77])
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/torch/cuda/__init__.py", line 208, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
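The scripts assume a CUDA device. A hedged (untested, and likely very slow on CPU) workaround is to route everything through a device variable instead of the hard-coded .cuda() calls; subtxt here is just a placeholder prompt:

import torch
import clip

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model_clip, _ = clip.load('ViT-B/32', device=device, jit=False)

subtxt = 'some text'   # placeholder prompt
# instead of the hard-coded call in illustrip.py:
#   emb = model_clip.encode_text(clip.tokenize(subtxt).cuda()[:77])
emb = model_clip.encode_text(clip.tokenize(subtxt)[:77].to(device))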

Pytorch Import torch.irfft Update to torch.fft.irfft

You may need to update your function call as below:

clip_fft.py:137: UserWarning: The function torch.irfft is deprecated and will be removed in a future PyTorch release. Use the new torch.fft module functions, instead, by importing torch.fft and calling torch.fft.ifft or torch.fft.irfft. (Triggered internally at /pytorch/aten/src/ATen/native/SpectralOps.cpp:
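A hedged sketch of the usual migration, reusing the variable names from the warning above rather than a tested patch of clip_fft.py: the old torch.irfft with signal_ndim=2 and normalized=True roughly corresponds to torch.fft.irfftn with norm='ortho', after converting the trailing real/imaginary pair into a complex tensor:

import torch

# old (PyTorch <= 1.7):
#   image = torch.irfft(scaled_spectrum_t, 2, normalized=True, signal_sizes=(h, w))
# rough new-API equivalent (PyTorch >= 1.8), assuming the spectrum keeps the old
# layout with a trailing dimension of size 2 for the real/imaginary parts:
spectrum_c = torch.view_as_complex(scaled_spectrum_t.contiguous())
image = torch.fft.irfftn(spectrum_c, s=(h, w), dim=(-2, -1), norm='ortho')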

I have tried to use Illustra in Colab but it gives me the following error

ModuleNotFoundError Traceback (most recent call last)
in <module>()
73 get_ipython().system('git clone https://github.com/eps696/aphantasia')
74 get_ipython().magic('cd aphantasia/')
---> 75 from clip_fft import to_valid_rgb, fft_image, slice_imgs, checkout
76 from utils import pad_up_to, basename, file_list, img_list, img_read, txt_clean, plot_text
77 from progress_bar import ProgressIPy as ProgressBar

/content/aphantasia/clip_fft.py in <module>()
16 os.environ['KMP_DUPLICATE_LIB_OK']='True'
17 from sentence_transformers import SentenceTransformer
---> 18 import lpips
19
20 from utils import pad_up_to, basename, img_list, img_read, plot_text, txt_clean

ModuleNotFoundError: No module named 'lpips'

Specify GPU

Could something be added to specify the GPU to use?
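Until such an option exists, a common workaround is generic PyTorch/CUDA device selection, not a feature of this repo:

# either restrict visibility before launching:
#   CUDA_VISIBLE_DEVICES=1 python clip_fft.py -t "some text" --size 1280-720
# or at the very top of the script/notebook, before anything touches CUDA:
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '1'   # expose only GPU index 1

import torch
print(torch.cuda.get_device_name(0))       # index 0 now maps to the chosen card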

[Feature] Non-Random Slices for Locational Encoding of Input Image

Rather than doing multiple random slices of the input image to match using CLIP, what if it was an orderly grid covering the image? Such that each chunk of an image can be encoded and stored in an array, and later when encoding the output image to compare to the input image it can follow the same orderly grid and compare to entries in the array, with the intent of comparing the same locations of the images. I imagine some overlap in the slices would be required.

I reckon that with this, if CLIP detects a feature in a certain place, Aphantasia will try to match that same feature in that same place, instead of trying to match the overall description of an image.
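A rough sketch of deterministic, overlapping grid slicing; this is a hypothetical helper, not part of the existing slice_imgs:

import torch

def grid_slices(img, csize, overlap=0.5):
    """Cut an image tensor [B,C,H,W] into an orderly, overlapping grid of crops."""
    _, _, H, W = img.shape
    step = max(1, int(csize * (1. - overlap)))
    crops, coords = [], []
    for y in range(0, H - csize + 1, step):
        for x in range(0, W - csize + 1, step):
            crops.append(img[:, :, y:y + csize, x:x + csize])
            coords.append((y, x))   # keep positions so output crops match input crops
    return torch.cat(crops, 0), coords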

clip_fft.py won't start

I installed requirements.txt and git+https://github.com/openai/CLIP.git and after that I ran

python clip_fft.py -t "city" -t2 "gradient" --size 1280-720

And after that I got the error

c:\etc\aphantasia-master>python clip_fft.py -t "city" -t2 "gradient" --size 1280-720
Traceback (most recent call last):
  File "clip_fft.py", line 23, in <module>
    from utils import slice_imgs, derivat, sim_func, basename, img_list, img_read, plot_text, txt_clean, checkout, old_torch
  File "c:\etc\aphantasia-master\utils.py", line 13, in <module>
    from kornia.filters.sobel import spatial_gradient
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\kornia\__init__.py", line 19, in <module>    from kornia import jit
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\kornia\jit\__init__.py", line 9, in <module>
    spatial_soft_argmax2d = torch.jit.script(K.geometry.spatial_soft_argmax2d)
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\jit\__init__.py", line 1290, in script
    fn = torch._C._jit_script_compile(qualified_name, ast, _rcb, get_default_args(obj))
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\jit\_recursive.py", line 568, in try_compile_fn
    return torch.jit.script(fn, _rcb=rcb)
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\jit\__init__.py", line 1290, in script
    fn = torch._C._jit_script_compile(qualified_name, ast, _rcb, get_default_args(obj))
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\jit\_recursive.py", line 568, in try_compile_fn
    return torch.jit.script(fn, _rcb=rcb)
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\jit\__init__.py", line 1290, in script
    fn = torch._C._jit_script_compile(qualified_name, ast, _rcb, get_default_args(obj))
RuntimeError:
Unknown type name 'torch.dtype':
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\kornia\utils\grid.py", line 12
        normalized_coordinates: bool = True,
        device: Optional[torch.device] = torch.device('cpu'),
        dtype: torch.dtype = torch.float32) -> torch.Tensor:
               ~~~~~~~~~~~ <--- HERE
    """Generates a coordinate grid for an image.
'create_meshgrid' is being compiled since it was called from 'spatial_expectation2d'
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\kornia\geometry\subpix\dsnt.py", line 100
    # Create coordinates grid.
    grid: torch.Tensor = create_meshgrid(height, width, normalized_coordinates, input.device)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    grid = grid.to(input.dtype)
'spatial_expectation2d' is being compiled since it was called from 'spatial_soft_argmax2d'
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\kornia\geometry\subpix\spatial_soft_argmax.py", line 516
    """
    input_soft: torch.Tensor = dsnt.spatial_softmax2d(input, temperature)
    output: torch.Tensor = dsnt.spatial_expectation2d(input_soft, normalized_coordinates)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    return output

How can I make the program work?

Error

Hello, I'm trying to run the code but I get an error. How do I solve it? Thanks!

NameError Traceback (most recent call last)
in <module>()
2
3 get_ipython().system('rm -rf $tempdir')
----> 4 os.makedirs(tempdir, exist_ok=True)
5
6 sideX = 900 #@param {type:"integer"}

NameError: name 'tempdir' is not defined

New CLIP Models

OpenAI put out two new CLIP models, RN50x4 and RN101. I've tried them and I genuinely don't know what visible difference there is, but maybe you'll want to add them as options.
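For reference, the installed CLIP package can report which models it ships, which may be simpler than hard-coding new names as they appear:

import clip
print(clip.available_models())   # lists all model names the installed CLIP package can load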

Illustrip3D - problems with video output quality

Hi Vadim!

IllusTrip3D is an outstanding tool! Great! Unfortunately, the video generated directly by the Colab notebook is very pixelated and of poor quality. When I stitch the JPEGs together myself in a video editor, I get jumps and dropouts in between (it also seems to lack the 3D depth processing). How and where can I change the values in the notebook so that I get a decent HD video output? Thanks in advance for an answer!

All the best!
Patrick

[Feature/Trick] Image Slice Rotation

ViT-B/32 appears to only recognize things that are correctly oriented (a tree won't be crooked and a face won't be upside-down), so adding rotation to the slice_imgs section should result in wider diversity at low levels of random rotation, and something approaching chaos at high levels.

Alternatively, I suppose the slices could be rotated so that they all point toward the center of the image, making a radial design converging in the center rather than a rectangular one. I reckon that would work like (offsetx - csize * 0.5) and (offsety - csize * 0.5) to get the center of the slice, then taking those as slicex and slicey for atan2(slicex - centerx, slicey - centery) * (180.0/π) for the angle of rotation of the slice, or rather the angle to rotate the image before slicing and the inverse rotation after slicing.
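A small sketch of that radial-orientation idea; the helper is hypothetical, and the sign convention for the slice center depends on how offsetx/offsety are defined in slice_imgs (here they are assumed to be the slice's top-left corner, with sideX/sideY the image dimensions as in the notebooks):

import math

def radial_angle(offsetx, offsety, csize, sideY, sideX):
    # center of the slice and center of the image
    slicex = offsetx + csize * 0.5
    slicey = offsety + csize * 0.5
    centerx, centery = sideX * 0.5, sideY * 0.5
    # angle in degrees to rotate the image before slicing (apply the inverse after slicing)
    return math.atan2(slicex - centerx, slicey - centery) * (180.0 / math.pi)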

Colab notebook "How to just use Aphantasia"

Although this repository contains several Colab notebooks, they are all rather complex and cluttered. Would it be possible to make a simple sample notebook that just installs and runs Aphantasia, without tons of code and complicated Colab forms?

I've tried to create such a notebook...

!git clone https://github.com/eps696/aphantasia.git /content/aphantasia
%cd /content/aphantasia
!pip -q install -r requirements.txt
!pip install git+https://github.com/openai/CLIP.git
...
...
!python clip_fft.py -t "some text" --size 1280-720

...but it always ends up with a cannot connect to X server error. Of course no X server runs in Colab. Would it be possible to make a simplified, purely command-line version that doesn't need the X server?

New error with previous aphantasia versions

Hi, first of all thanks for your code and congrats on this amazing tool.
I have some customised versions of aphantasia, built from your previous Colab versions. They worked fine, but now when I try to run them I get the error below. Is this a bug? Or are those versions no longer usable with the library now? Is there a way of making them work again? :)
Thanks again


ModuleNotFoundError Traceback (most recent call last)
in <module>()
58 get_ipython().system('git clone https://github.com/eps696/aphantasia')
59 get_ipython().magic('cd /content/aphantasia/')
---> 60 from clip_fft import to_valid_rgb, fft_image, slice_imgs, checkout
61 from utils import pad_up_to, basename, img_list, img_read
62 from progress_bar import ProgressIPy as ProgressBar

/content/aphantasia/clip_fft.py in <module>()
15 import clip
16 os.environ['KMP_DUPLICATE_LIB_OK']='True'
---> 17 from sentence_transformers import SentenceTransformer
18 import pytorch_ssim as ssim
19

ModuleNotFoundError: No module named 'sentence_transformers'


IndexError in IllusTrip3D

When running the notebook, the program crashes while generating the 401st frame, producing an IndexError. Stack trace:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-7-2714cc85de30> in <module>()
    230 pbar = ProgressBar(glob_steps)
    231 for i in range(count):
--> 232   process(i)
    233 
    234 HTML(makevid(tempdir))

<ipython-input-7-2714cc85de30> in process(num)
    127 
    128     # transform frame for motion
--> 129     scale =       m_scale[glob_step]    if animate_them else 1-zoom
    130     trans = tuple(m_shift[glob_step])   if animate_them else [0, shift]
    131     angle =       m_angle[glob_step][0] if animate_them else rotate

IndexError: index 401 is out of bounds for axis 0 with size 401

It crashes on the 401st frame specifically (tested several times).
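This looks like an off-by-one between the number of frames requested and the length of the interpolated motion arrays. A minimal hedged workaround (a guess, using the names from the traceback) is to clamp the index:

# hypothetical patch inside process(num): never step past the end of the motion arrays
idx   = min(glob_step, len(m_scale) - 1)
scale = m_scale[idx]        if animate_them else 1 - zoom
trans = tuple(m_shift[idx]) if animate_them else [0, shift]
angle = m_angle[idx][0]     if animate_them else rotate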

Doesn't work with PyTorch 1.8

PyTorch recently had a 1.8 release, bringing much better support for backing torch.cuda tensors with AMD GPUs.

However, clip_fft.py at least hasn't been ported to PyTorch 1.8 yet.

In particular, it still uses the deprecated and now removed torch.irfft, which needs to be replaced with calls to methods in the torch.fft namespace to work on PyTorch 1.8.

Unfortunately, the PR that removed support for the old methods doesn't provide a recipe for translating calls that can be executed by someone who doesn't understand the finer points of FFTs. It seems to me that the square-root-of-a-bunch-of-stuff normalization method of the old function isn't available as any of the normalization modes of torch.fft.irfft, and I'm not sure of the number of dimensions involved here, or whether we have the input versus the output sizes handy.

integrate with Lightning ecosystem CI

Hello, and so happy to see you use PyTorch Lightning! 🎉
Just wondering if you have already heard about the fairly new PyTorch Lightning (PL) ecosystem CI, which we would like to invite you to join... You can check out our blog post about it: Stay Ahead of Breaking Changes with the New Lightning Ecosystem CI ⚡
As you use the PL framework for your cool project, we would like to enhance your experience and offer you safe updates for our future releases. At the moment you run tests against a particular PL version, but it may happen that the next version is incompatible with your project... 😕 We do not intend to change anything on our project side, but we do have a solution: the ecosystem CI tests both your latest development head and ours, so we can catch such breakage very early and avoid releasing a bad version.

What needs to be done?

What will you get?

  • scheduled nightly testing configured for development/stable versions
  • Slack notifications if something goes wrong, so it can be investigated
  • testing also on a multi-GPU machine, as our gift to you 🐰

cc: @Borda

How do you upload a photo to work with it?

NameError Traceback (most recent call last)
in <module>()
12 text = translator.translate(text, dest='en').text
13 if upload_image:
---> 14 uploaded = files.upload()

NameError: name 'files' is not defined
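The NameError suggests the cell that defines files never ran; in Colab that helper comes from google.colab, so (assuming the standard Colab notebook is being used) running this first should help:

from google.colab import files   # provides files.upload() / files.download() in Colab

if upload_image:                 # upload_image as defined in the notebook cell
    uploaded = files.upload()    # opens the browser upload dialog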

SSIM Alternative: DISTS

I'm trying DISTS as an alternative to SSIM, and so far it works to make the supplied image show up in the results. I don't know a whole lot about it; to be frank, it's just something different that works. I haven't decided if it's of any worth yet.

To get it running in Aphantasia I just needed to !pip install dists-pytorch, then in clip_fft.py add from DISTS_pytorch import DISTS and replace ssim_loss = ssim.SSIM(window_size = 11) with ssim_loss = DISTS(require_grad=True, batch_average=True).

The two images to compare might need to be blurred before comparison to focus more on shapes.
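A hedged sketch of that blur-before-compare idea using kornia (already a dependency of this repo); the helper name, kernel size, and sigma are arbitrary assumptions:

import kornia

def blurred_dists(dists_loss, img_out, img_in, ksize=11, sigma=3.):
    # soften both images so the structural comparison favours shapes over fine texture
    blur = lambda x: kornia.filters.gaussian_blur2d(x, (ksize, ksize), (sigma, sigma))
    return dists_loss(blur(img_out), blur(img_in))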

[Feature] Learning Rate Modified by Steps

I've experimented with a learning rate that changes as the steps increase, after seeing Aphantasia develop the general image very quickly but then slow down while working on small details. I believe this alternative puts more focus on larger shapes and less on details.

I expose the learning_rate variable and add a learning_rate_max variable in the Generate cell, remove the optimizer = torch.optim.Adam(params, learning_rate) line, and instead add this to def train(i):

learning_rate_new = learning_rate + (i / steps) * (learning_rate_max - learning_rate)
optimizer_new = torch.optim.Adam(params, learning_rate_new)

With this, I find that a learning_rate of 0.0001 and a learning_rate_max of 0.008 works well, at least for 300-400 steps and about 50 samples.
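One note on the design: recreating the Adam optimizer every step throws away its running moment estimates. An alternative sketch (untested against this notebook, reusing the variable names above) keeps one optimizer and only updates its learning rate:

import torch

optimizer = torch.optim.Adam(params, learning_rate)

def train(i):
    # linearly ramp the learning rate instead of rebuilding the optimizer each step
    lr_new = learning_rate + (i / steps) * (learning_rate_max - learning_rate)
    for group in optimizer.param_groups:
        group['lr'] = lr_new
    # ... rest of the training step as before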

Invalid Syntax when trying to run the first time

I followed all the instructions and still I'm getting this error -

aphantasia git:(master) python clip_fft.py -t "the text" --size 1280-720 
  File "clip_fft.py", line 112
    Ys = [torch.randn(*Y.shape).cuda() for Y in [Yl_in, *Yh_in]]
