
eps696 / aphantasia

769 stars · 23 watchers · 105 forks · 36.06 MB

CLIP + FFT/DWT/RGB = text to image/video

License: MIT License

Languages: Jupyter Notebook 43.73%, Python 56.27%
Topics: text-to-image, clip, text-to-video

aphantasia's People

Contributors

dribnet, eps696, interfect


aphantasia's Issues

Alternate Subtraction Method, Faster

I was trying out ways of manipulating the encoded text, and one approach was subtracting the encoded negative text directly from the encoded text prompt. I tried four renders for each method and they look about the same, except that the renders operating on the encoded text show less of the subtracted prompt, which suggests to me that this is more effective at removing a concept. It also ends up using just one txt_enc rather than two, and a single cosine-similarity loss.

Prompt: "a photo of a human face" and Negative: "a photo of a face"

Subtracting the negative prompt's txt_enc0 from txt_enc resulted in these:
enc_sub

The existing negative method, which uses cosine similarity between the image and the negative prompt for the loss, resulted in these:
enc_neg

And for fun, using subtraction to increase the difference between the two, txt_enc + (txt_enc - txt_enc0), resulted in these:
enc_subdiff

The encoded text and images seem to be explorable like latent space.
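For reference, a minimal sketch of the embedding-level subtraction described above; the variable names txt_enc, txt_enc0, text_loss and the prompts are illustrative placeholders, not the actual code in clip_fft.py:

import torch
import clip

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model_clip, _ = clip.load('ViT-B/32', device=device)

with torch.no_grad():
    txt_enc  = model_clip.encode_text(clip.tokenize('a photo of a human face').to(device))
    txt_enc0 = model_clip.encode_text(clip.tokenize('a photo of a face').to(device))

# subtract the negative embedding once, then use a single cosine-similarity loss
target = txt_enc - txt_enc0
# "increase the difference" variant from the last example above
target_diff = txt_enc + (txt_enc - txt_enc0)

def text_loss(img_enc, txt_target):
    # one cosine similarity against the combined text embedding
    return -torch.cosine_similarity(img_enc, txt_target, dim=-1).mean()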

ReadTimeoutError when installing OpenAI CLIP

Following the instructions, pip install git+https://github.com/openai/CLIP.git results in the following error:

Collecting git+https://github.com/openai/CLIP.git
Cloning https://github.com/openai/CLIP.git to /tmp/pip-req-build-ham_skxz
Running command git clone -q https://github.com/openai/CLIP.git /tmp/pip-req-build-ham_skxz
Requirement already satisfied: ftfy in /home/steven/anaconda3/lib/python3.8/site-packages (from clip==1.0) (6.0.1)
Requirement already satisfied: regex in /home/steven/anaconda3/lib/python3.8/site-packages (from clip==1.0) (2020.6.8)
Requirement already satisfied: tqdm in /home/steven/anaconda3/lib/python3.8/site-packages (from clip==1.0) (4.47.0)
Collecting torch~=1.7.1
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out. (read timeout=15)")': /packages/1d/a9/f349273a0327fdf20a73188c9c3aa7dbce68f86fad422eadd366fd2ed7a0/torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl
WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out. (read timeout=15)")': /packages/1d/a9/f349273a0327fdf20a73188c9c3aa7dbce68f86fad422eadd366fd2ed7a0/torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl
WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out. (read timeout=15)")': /packages/1d/a9/f349273a0327fdf20a73188c9c3aa7dbce68f86fad422eadd366fd2ed7a0/torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl
WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out. (read timeout=15)")': /packages/1d/a9/f349273a0327fdf20a73188c9c3aa7dbce68f86fad422eadd366fd2ed7a0/torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl
WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out. (read timeout=15)")': /packages/1d/a9/f349273a0327fdf20a73188c9c3aa7dbce68f86fad422eadd366fd2ed7a0/torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl
ERROR: Could not install packages due to an EnvironmentError: HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Max retries exceeded with url: /packages/1d/a9/f349273a0327fdf20a73188c9c3aa7dbce68f86fad422eadd366fd2ed7a0/torch-1.7.1-cp38-cp38-manylinux1_x86_64.whl (Caused by ReadTimeoutError("HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out. (read timeout=15)"))

TypeError: 'float' object is not subscriptable

Something in the new update seems to have broken IllusTrip3D.ipynb. Settings of possibly relevant parameters: zoom = 0.0005, shift = 0, animate_them = False.

Here is the stack trace:

using fast aug transforms
 using RGB method, 95 samples
 ref text:  ethereal cosmology
 ref style:  
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-09ff3d020318> in <module>()
    229 pbar = ProgressBar(glob_steps)
    230 for i in range(count):
--> 231   process(i)
    232 
    233 HTML(makevid(tempdir))

1 frames
<ipython-input-6-4929578fc46a> in depth_transform(img_t, img_np, depth_infer, depth_mask, size, depthX, scale, shift, colors, depth_dir, save_num)
     46     dY = 100. * shift[1] / size[0]
     47     # dZ = movement direction: 1 away (zoom out), 0 towards (zoom in), 0.5 stay
---> 48     dZ = 0.5 + 23. * (scale[0]-1)
     49     # dZ += 0.5 * float(math.sin(((save_num % 70)/70) * math.pi * 2))
     50 

TypeError: 'float' object is not subscriptable
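A guess at a minimal workaround, assuming the crash comes from scale and shift being passed to depth_transform as plain floats when animate_them = False, while depth_transform indexes them as sequences:

# hypothetical guard before depth_transform is called: make sure scale and shift
# are sequences even when animate_them = False and they arrive as plain floats
if not isinstance(scale, (list, tuple)):
    scale = [scale, scale]
if not isinstance(shift, (list, tuple)):
    shift = [shift, shift]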

init_image support

Do any of the drawing modules have support for initialising from an image? I looked but didn't see any inverse FFT code currently in the codebase. If not this might be an interesting feature to consider adding for any models that would easily support such an operation.
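If such an option were added, the FFT parameterisation could in principle be seeded from an image by taking its forward transform. A rough sketch under that assumption; the function name is hypothetical and the magnitude scaling used by fft_image is not reproduced here:

import torch
import torchvision.transforms.functional as TF
from PIL import Image

def image_to_spectrum(path, size):
    # load the init image and take a 2D real FFT per channel; the result could then
    # be used to initialise the spectrum parameters (frequency scaling not handled here)
    img = TF.to_tensor(Image.open(path).convert('RGB')).unsqueeze(0)   # [1,3,H,W]
    img = torch.nn.functional.interpolate(img, size=size, mode='bilinear',
                                          align_corners=False)
    spectrum = torch.fft.rfft2(img, norm='ortho')    # complex, [1,3,H,W//2+1]
    return torch.view_as_real(spectrum)              # real-valued, last dim = 2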

: cannot connect to X server - Kaggle?

I am unsuccessfully trying to run clip_fft.py in a Kaggle notebook. The error message is seen below:

Start...
/kaggle/working/aphantasia/clip_fft.py:128: UserWarning: The function torch.irfft is deprecated and will be removed in a future PyTorch release. Use the new torch.fft module functions, instead, by importing torch.fft and calling torch.fft.ifft or torch.fft.irfft. (Triggered internally at  /pytorch/aten/src/ATen/native/SpectralOps.cpp:602.)
  image = torch.irfft(scaled_spectrum_t, 2, normalized=True, signal_sizes=(h, w))
: cannot connect to X server 

Previously I had errors that seemed to stem from dependencies, which I resolved. However, cannot connect to X server is a dead end to me, although apparently it has something to do with displays. The last thing I tried was turning off verbose, thinking that displaying the previews might be the problem, but this did not solve the issue. I have no idea where to begin troubleshooting further, and I'm not even sure whether the issue is rooted in this tool or in Kaggle. The tool works fine in the intended Colab environment*, and I'm not sure if you (eps696) have used Kaggle, so I understand if this issue is out of scope. Nonetheless I would appreciate any insight you or others might have!

*except that I keep running out of GPU time - Kaggle displays the time limits, at least

Can't generate video

When it finishes, it now asks for some text in a text box, but I have no clue what it's asking for. Please help.

[Feature] Locational Prompts

What if image slices from the left and image slices from the right were compared to different prompts? Slices in the center and slices at the edge?

Perhaps if it was set to check for a jungle on the edges, and for a toucan in the center, it would create an image of a toucan in a jungle with a broad background of jungle.

While an interface for this could get convoluted quickly, as long as it's kept to simple options (left/out, right/in, and midpoint, with a toggle for left/right, edge/center, or up/down) it shouldn't get too unwieldy for a user.
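A rough sketch of what such a locational loss could look like, assuming the slice positions are known when slicing; the helper and its arguments are hypothetical, not part of the current slice_imgs code:

import torch

def locational_loss(img_encs, offsets, width, enc_edge, enc_center):
    # img_encs: CLIP encodings of individual slices; offsets: (x, y) slice centers
    loss = 0.
    for enc, (x, y) in zip(img_encs, offsets):
        near_center = abs(x - width / 2) < width / 4
        target = enc_center if near_center else enc_edge
        loss = loss - torch.cosine_similarity(enc, target, dim=-1).mean()
    return loss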

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 58.00 MiB (GPU 0; 2.00 GiB total capacity; 1.58 GiB already allocated; 0 bytes free; 1.74 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Hi @eps696

I keep getting the error below. I am unable to run the code even with 30 samples and 30 steps.

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 58.00 MiB (GPU 0; 2.00 GiB total capacity; 1.58 GiB already allocated; 0 bytes free; 1.74 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Can you please help me resolve this issue? I have spent almost a day on it without success.

I would like to run the model with 500 samples and steps to see a well-formed image generated.

Looking forward to hearing from you.

Thank you.
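For what it's worth, on a 2 GB card the realistic options are a smaller --size and fewer samples; the allocator hint from the error message is a generic PyTorch setting (not specific to this repo) and can be tried like this:

import os
# allocator hint from the error message; must be set before the first CUDA allocation
os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'max_split_size_mb:128'

import torch
torch.cuda.empty_cache()   # release cached blocks between runs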

DeepSpeed integration for training on local cheaper GPUs.

This is an awesome repository that you've made.
It would be helpful if you'd integrate DeepSpeed for ZeRO-Infinity NVMe offload, for beginners like me 😅 - I only have a 1660 Ti for training and am trying to generate Full HD images for my presentations.

Something changed since 04/19/2022

Now it won't create the video, and I get this in the Colab:

.. generating video ..
_out/invisible_man_behind_the_1%_curtain-Film_noir/%04d.jpg: No such file or directory

FileNotFoundError Traceback (most recent call last)
in <module>()
209 _ = pbar.upd()
210
--> 211 HTML(makevid(tempdir))
212 torch.save(params, tempdir + '.pt')
213 files.download(tempdir + '.pt')

in makevid(seq_dir, size)
67 get_ipython().system('ffmpeg -y -v warning -i $out_sequence -crf 20 $out_video')
68 # moviepy.editor.ImageSequenceClip(img_list(seq_dir), fps=25).write_videofile(out_video, verbose=False)
---> 69 data_url = "data:video/mp4;base64," + b64encode(open(out_video,'rb').read()).decode()
70 wh = '' if size is None else 'width=%d height=%d' % (size, size)
71 return """<video %s controls>""" % (wh, data_url)

FileNotFoundError: [Errno 2] No such file or directory: '_out/invisible_man_behind_the_1%_curtain-Film_noir.mp4'

Error when running "Generate"

Everything works well, until I click generate.
Here's the error:

 using 300 samples
100%|████████████████████████████████████████| 338M/338M [00:05<00:00, 62.9MiB/s]
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-3-6a6046cf5f2d> in <module>()
     85 use_jit = True if float(torch.__version__[:3]) < 1.8 else False
     86 model_clip, _ = clip.load(model, jit=use_jit)
---> 87 modsize = model_clip.visual.input_resolution
     88 xmem = {'ViT-B/16':0.25, 'RN50':0.5, 'RN50x4':0.16, 'RN50x16':0.06, 'RN101':0.33}
     89 if model in xmem.keys():

2 frames
/usr/local/lib/python3.7/dist-packages/torch/jit/_script.py in __getattr__(self, attr)
    755                 return script_method
    756 
--> 757             return super(RecursiveScriptModule, self).__getattr__(attr)
    758 
    759         def __setattr__(self, attr, value):

/usr/local/lib/python3.7/dist-packages/torch/jit/_script.py in __getattr__(self, attr)
    472         def __getattr__(self, attr):
    473             if "_actual_script_module" not in self.__dict__:
--> 474                 return super(ScriptModule, self).__getattr__(attr)
    475             return getattr(self._actual_script_module, attr)
    476 

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in __getattr__(self, name)
   1176                 return modules[name]
   1177         raise AttributeError("'{}' object has no attribute '{}'".format(
-> 1178             type(self).__name__, name))
   1179 
   1180     def __setattr__(self, name: str, value: Union[Tensor, 'Module']) -> None:

AttributeError: 'RecursiveScriptModule' object has no attribute 'input_resolution'

I've tried running it without changing any settings, and it does the same thing. Help?
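One likely culprit (an assumption, not confirmed in this thread): the check float(torch.__version__[:3]) truncates a version like '1.10.0' to '1.1', so the JIT archive gets loaded on newer PyTorch and .visual.input_resolution no longer exists. A hedged sketch of a more robust check; loading with jit=False is usually the simplest fix:

import torch
import clip

# parse the version properly instead of slicing the string
major, minor = (int(x) for x in torch.__version__.split('.')[:2])
use_jit = (major, minor) < (1, 8)

model_clip, _ = clip.load('ViT-B/32', jit=use_jit)
if use_jit:
    # the original CLIP notebook read this from the top level of the JIT archive
    modsize = model_clip.input_resolution.item()
else:
    modsize = model_clip.visual.input_resolution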

AssertionError: Torch not compiled with CUDA enabled

macOS, Python 3

Traceback (most recent call last):
  File "/Users/51pwn/MyWork/ai/aphantasia/illustrip.py", line 511, in <module>
    main()
  File "/Users/51pwn/MyWork/ai/aphantasia/illustrip.py", line 243, in main
    key_txt_encs = [enc_text(txt) for txt in texts]
  File "/Users/51pwn/MyWork/ai/aphantasia/illustrip.py", line 243, in <listcomp>
    key_txt_encs = [enc_text(txt) for txt in texts]
  File "/Users/51pwn/MyWork/ai/aphantasia/illustrip.py", line 203, in enc_text
    emb = model_clip.encode_text(clip.tokenize(subtxt).cuda()[:77])
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/torch/cuda/__init__.py", line 208, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
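The scripts assume a CUDA device. A hedged (untested, and likely very slow on CPU) workaround is to route everything through a device variable instead of the hard-coded .cuda() calls; subtxt here is just a placeholder prompt:

import torch
import clip

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model_clip, _ = clip.load('ViT-B/32', device=device, jit=False)

subtxt = 'some text'   # placeholder prompt
# instead of the hard-coded call in illustrip.py:
#   emb = model_clip.encode_text(clip.tokenize(subtxt).cuda()[:77])
emb = model_clip.encode_text(clip.tokenize(subtxt)[:77].to(device))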

Pytorch Import torch.irfft Update to torch.fft.irfft

You may need to update your function call as below:

clip_fft.py:137: UserWarning: The function torch.irfft is deprecated and will be removed in a future PyTorch release. Use the new torch.fft module functions, instead, by importing torch.fft and calling torch.fft.ifft or torch.fft.irfft. (Triggered internally at /pytorch/aten/src/ATen/native/SpectralOps.cpp:
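A hedged sketch of the usual migration, reusing the variable names from the warning above rather than a tested patch of clip_fft.py: the old torch.irfft with signal_ndim=2 and normalized=True roughly corresponds to torch.fft.irfftn with norm='ortho', after converting the trailing real/imaginary pair into a complex tensor:

import torch

# old (PyTorch <= 1.7):
#   image = torch.irfft(scaled_spectrum_t, 2, normalized=True, signal_sizes=(h, w))
# rough new-API equivalent (PyTorch >= 1.8), assuming the spectrum keeps the old
# layout with a trailing dimension of size 2 for the real/imaginary parts:
spectrum_c = torch.view_as_complex(scaled_spectrum_t.contiguous())
image = torch.fft.irfftn(spectrum_c, s=(h, w), dim=(-2, -1), norm='ortho')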

I have tried to use Illustra in Colab but it gives me the following error

ModuleNotFoundError Traceback (most recent call last)
in <module>()
73 get_ipython().system('git clone https://github.com/eps696/aphantasia')
74 get_ipython().magic('cd aphantasia/')
---> 75 from clip_fft import to_valid_rgb, fft_image, slice_imgs, checkout
76 from utils import pad_up_to, basename, file_list, img_list, img_read, txt_clean, plot_text
77 from progress_bar import ProgressIPy as ProgressBar

/content/aphantasia/clip_fft.py in <module>()
16 os.environ['KMP_DUPLICATE_LIB_OK']='True'
17 from sentence_transformers import SentenceTransformer
---> 18 import lpips
19
20 from utils import pad_up_to, basename, img_list, img_read, plot_text, txt_clean

ModuleNotFoundError: No module named 'lpips'

Specify GPU

Could something be added to specify the GPU to use?
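Until such an option exists, a common workaround is generic PyTorch/CUDA device selection, not a feature of this repo:

# either restrict visibility before launching:
#   CUDA_VISIBLE_DEVICES=1 python clip_fft.py -t "some text" --size 1280-720
# or at the very top of the script/notebook, before anything touches CUDA:
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '1'   # expose only GPU index 1

import torch
print(torch.cuda.get_device_name(0))       # index 0 now maps to the chosen card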

[Feature] Non-Random Slices for Locational Encoding of Input Image

Rather than doing multiple random slices of the input image to match using CLIP, what if it was an orderly grid covering the image? Such that each chunk of an image can be encoded and stored in an array, and later when encoding the output image to compare to the input image it can follow the same orderly grid and compare to entries in the array, with the intent of comparing the same locations of the images. I imagine some overlap in the slices would be required.

I reckon that with this, if CLIP detects a feature in a certain place, Aphantasia will try to match that same feature in that same place, instead of trying to match the overall description of an image.
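A rough sketch of deterministic, overlapping grid slicing; this is a hypothetical helper, not part of the existing slice_imgs:

import torch

def grid_slices(img, csize, overlap=0.5):
    """Cut an image tensor [B,C,H,W] into an orderly, overlapping grid of crops."""
    _, _, H, W = img.shape
    step = max(1, int(csize * (1. - overlap)))
    crops, coords = [], []
    for y in range(0, H - csize + 1, step):
        for x in range(0, W - csize + 1, step):
            crops.append(img[:, :, y:y + csize, x:x + csize])
            coords.append((y, x))   # keep positions so output crops match input crops
    return torch.cat(crops, 0), coords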

clip_fft.py won't start

I installed requirements.txt and git+https://github.com/openai/CLIP.git and after that I ran

python clip_fft.py -t "city" -t2 "gradient" --size 1280-720

And after that I got the error

c:\etc\aphantasia-master>python clip_fft.py -t "city" -t2 "gradient" --size 1280-720
Traceback (most recent call last):
  File "clip_fft.py", line 23, in <module>
    from utils import slice_imgs, derivat, sim_func, basename, img_list, img_read, plot_text, txt_clean, checkout, old_torch
  File "c:\etc\aphantasia-master\utils.py", line 13, in <module>
    from kornia.filters.sobel import spatial_gradient
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\kornia\__init__.py", line 19, in <module>    from kornia import jit
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\kornia\jit\__init__.py", line 9, in <module>
    spatial_soft_argmax2d = torch.jit.script(K.geometry.spatial_soft_argmax2d)
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\jit\__init__.py", line 1290, in script
    fn = torch._C._jit_script_compile(qualified_name, ast, _rcb, get_default_args(obj))
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\jit\_recursive.py", line 568, in try_compile_fn
    return torch.jit.script(fn, _rcb=rcb)
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\jit\__init__.py", line 1290, in script
    fn = torch._C._jit_script_compile(qualified_name, ast, _rcb, get_default_args(obj))
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\jit\_recursive.py", line 568, in try_compile_fn
    return torch.jit.script(fn, _rcb=rcb)
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\jit\__init__.py", line 1290, in script
    fn = torch._C._jit_script_compile(qualified_name, ast, _rcb, get_default_args(obj))
RuntimeError:
Unknown type name 'torch.dtype':
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\kornia\utils\grid.py", line 12
        normalized_coordinates: bool = True,
        device: Optional[torch.device] = torch.device('cpu'),
        dtype: torch.dtype = torch.float32) -> torch.Tensor:
               ~~~~~~~~~~~ <--- HERE
    """Generates a coordinate grid for an image.
'create_meshgrid' is being compiled since it was called from 'spatial_expectation2d'
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\kornia\geometry\subpix\dsnt.py", line 100
    # Create coordinates grid.
    grid: torch.Tensor = create_meshgrid(height, width, normalized_coordinates, input.device)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    grid = grid.to(input.dtype)
'spatial_expectation2d' is being compiled since it was called from 'spatial_soft_argmax2d'
  File "C:\Users\user\AppData\Local\Programs\Python\Python38\lib\site-packages\kornia\geometry\subpix\spatial_soft_argmax.py", line 516
    """
    input_soft: torch.Tensor = dsnt.spatial_softmax2d(input, temperature)
    output: torch.Tensor = dsnt.spatial_expectation2d(input_soft, normalized_coordinates)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    return output

How can I make the program work?

Error

Hello, I'm trying to run the code but I get an error. How do I solve it? Thanks!

NameError Traceback (most recent call last)
in <module>()
2
3 get_ipython().system('rm -rf $tempdir')
----> 4 os.makedirs(tempdir, exist_ok=True)
5
6 sideX = 900 #@param {type:"integer"}

NameError: name 'tempdir' is not defined

New CLIP Models

OpenAI put out two new CLIP models, RN50x4 and RN101. I've tried them and I genuinely don't know what visible difference there is, but maybe you'll want to add them as options.
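For reference, the installed CLIP package can report which models it ships, which may be simpler than hard-coding new names as they appear:

import clip
print(clip.available_models())   # lists all model names the installed CLIP package can load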

Illustrip3D - problems with video output quality

Hi Vadim!

IllusTrip3D is an outstanding tool! Great! Unfortunately, the video generated directly by the Colab notebook is very pixelated and of poor quality. When I stitch the JPEGs together myself in a video editor, I get jumps and dropouts in between (it also seems to lack the 3D depth processing). How and where can I change the values in the notebook so that I get a decent HD video output? Thanks in advance for an answer!

All the best!
Patrick

[Feature/Trick] Image Slice Rotation

ViT-B/32 appears to only recognize things that are correctly oriented (a tree won't be crooked and a face won't be upside-down), so adding rotation to the slice_imgs section should result in wider diversity at low levels of random rotation, and something approaching chaos at high levels.

Alternatively, I suppose the slices could be rotated so that they all point toward the center of the image, making a radial design converging in the center rather than a rectangular one. I reckon that would work like (offsetx - csize * 0.5) and (offsety - csize * 0.5) to get the center of the slice, then taking those as slicex and slicey for atan2(slicex - centerx, slicey - centery) * (180.0/π) for the angle of rotation of the slice, or rather the angle to rotate the image before slicing and the inverse rotation after slicing.
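A small sketch of that radial-orientation idea; the helper is hypothetical, and the sign convention for the slice center depends on how offsetx/offsety are defined in slice_imgs (here they are assumed to be the slice's top-left corner, with sideX/sideY the image dimensions as in the notebooks):

import math

def radial_angle(offsetx, offsety, csize, sideY, sideX):
    # center of the slice and center of the image
    slicex = offsetx + csize * 0.5
    slicey = offsety + csize * 0.5
    centerx, centery = sideX * 0.5, sideY * 0.5
    # angle in degrees to rotate the image before slicing (apply the inverse after slicing)
    return math.atan2(slicex - centerx, slicey - centery) * (180.0 / math.pi)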

Colab notebook "How to just use Aphantasia"

Although this repository contains several Colab notebooks, they are all rather complex and cluttered. Would it be possible to make a simple sample notebook that just installs and runs Aphantasia, without tons of code and complicated Colab forms?

I've tried to create such a notebook...

!git clone https://github.com/eps696/aphantasia.git /content/aphantasia
%cd /content/aphantasia
!pip -q install -r requirements.txt
!pip install git+https://github.com/openai/CLIP.git
...
...
!python clip_fft.py -t "some text" --size 1280-720

...but it always ends up with a cannot connect to X server error. Of course no X server runs in Colab. Would it be possible to make a simplified, purely command-line version that doesn't need the X server?

New error with previous aphantasia versions

Hi, first of all thanks for your code and congrats on this amazing tool.
I have some customised versions of aphantasia, built from your previous Colab versions. They worked fine, but now when I try to run them I get the error below. Is this a bug? Or are those versions no longer usable with the library now? Is there a way of making them work again? :)
Thanks again


ModuleNotFoundError Traceback (most recent call last)
in <module>()
58 get_ipython().system('git clone https://github.com/eps696/aphantasia')
59 get_ipython().magic('cd /content/aphantasia/')
---> 60 from clip_fft import to_valid_rgb, fft_image, slice_imgs, checkout
61 from utils import pad_up_to, basename, img_list, img_read
62 from progress_bar import ProgressIPy as ProgressBar

/content/aphantasia/clip_fft.py in <module>()
15 import clip
16 os.environ['KMP_DUPLICATE_LIB_OK']='True'
---> 17 from sentence_transformers import SentenceTransformer
18 import pytorch_ssim as ssim
19

ModuleNotFoundError: No module named 'sentence_transformers'


IndexError in IllusTrip3D

When running the notebook, the program crashes while generating the 401st frame, producing an IndexError. Stack trace:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-7-2714cc85de30> in <module>()
    230 pbar = ProgressBar(glob_steps)
    231 for i in range(count):
--> 232   process(i)
    233 
    234 HTML(makevid(tempdir))

<ipython-input-7-2714cc85de30> in process(num)
    127 
    128     # transform frame for motion
--> 129     scale =       m_scale[glob_step]    if animate_them else 1-zoom
    130     trans = tuple(m_shift[glob_step])   if animate_them else [0, shift]
    131     angle =       m_angle[glob_step][0] if animate_them else rotate

IndexError: index 401 is out of bounds for axis 0 with size 401

It crashes on the 401st frame specifically (tested several times).
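This looks like an off-by-one between the number of frames requested and the length of the interpolated motion arrays. A minimal hedged workaround (a guess, using the names from the traceback) is to clamp the index:

# hypothetical patch inside process(num): never step past the end of the motion arrays
idx   = min(glob_step, len(m_scale) - 1)
scale = m_scale[idx]        if animate_them else 1 - zoom
trans = tuple(m_shift[idx]) if animate_them else [0, shift]
angle = m_angle[idx][0]     if animate_them else rotate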

Doesn't work with PyTorch 1.8

PyTorch recently had a 1.8 release, bringing much better support for backing torch.cuda tensors with AMD GPUs.

However, clip_fft.py at least hasn't been ported to PyTorch 1.8 yet.

In particular, it still uses the deprecated and now removed torch.irfft, which needs to be replaced with calls to methods in the torch.fft namespace to work on PyTorch 1.8.

Unfortunately, the PR that removed support for the old methods doesn't provide a recipe for translating calls that can be executed by someone who doesn't understand the finer points of FFTs. It seems to me that the square-root-of-a-bunch-of-stuff normalization method of the old function isn't available as any of the normalization modes of torch.fft.irfft, and I'm not sure of the number of dimensions involved here, or whether we have the input versus the output sizes handy.

integrate with Lightning ecosystem CI

Hello, and so happy to see you use PyTorch Lightning! 🎉
Just wondering if you have already heard about the fairly new PyTorch Lightning (PL) ecosystem CI, which we would like to invite you to join... You can check out our blog post about it: Stay Ahead of Breaking Changes with the New Lightning Ecosystem CI ⚡
As you use the PL framework for your cool project, we would like to enhance your experience and offer you safe updates for our future releases. At the moment you run tests against a particular PL version, but it may happen that the next version is incompatible with your project... 😕 We do not intend to change anything on our project side, but we do have a solution: the ecosystem CI tests both your latest development head and ours, so we can catch such breakage very early and avoid releasing a bad version.

What needs to be done?

What will you get?

  • scheduled nightly testing configured for development/stable versions
  • Slack notifications if something goes wrong, so it can be investigated
  • testing also on a multi-GPU machine, as our gift to you 🐰

cc: @Borda

How do you upload a photo to work with it?

NameError Traceback (most recent call last)
in <module>()
12 text = translator.translate(text, dest='en').text
13 if upload_image:
---> 14 uploaded = files.upload()

NameError: name 'files' is not defined
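The NameError suggests the cell that defines files never ran; in Colab that helper comes from google.colab, so (assuming the standard Colab notebook is being used) running this first should help:

from google.colab import files   # provides files.upload() / files.download() in Colab

if upload_image:                 # upload_image as defined in the notebook cell
    uploaded = files.upload()    # opens the browser upload dialog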

SSIM Alternative: DISTS

I'm trying DISTS as an alternative to SSIM, and so far it works to make the supplied image show up in the results. I don't know a whole lot about it; to be frank, it's just something different that works. I haven't decided if it's of any worth yet.

To get it running in Aphantasia I just needed to !pip install dists-pytorch, then in clip_fft.py add from DISTS_pytorch import DISTS and replace ssim_loss = ssim.SSIM(window_size = 11) with ssim_loss = DISTS(require_grad=True, batch_average=True).

The two images to compare might need to be blurred before comparison to focus more on shapes.
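A hedged sketch of that blur-before-compare idea using kornia (already a dependency of this repo); the helper name, kernel size, and sigma are arbitrary assumptions:

import kornia

def blurred_dists(dists_loss, img_out, img_in, ksize=11, sigma=3.):
    # soften both images so the structural comparison favours shapes over fine texture
    blur = lambda x: kornia.filters.gaussian_blur2d(x, (ksize, ksize), (sigma, sigma))
    return dists_loss(blur(img_out), blur(img_in))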

[Feature] Learning Rate Modified by Steps

I've experimented with a learning rate that changes as the steps increase, after seeing Aphantasia develop the general image very quickly but then slow down while working on small details. I believe this alternative puts more focus on larger shapes and less on details.

I expose the learning_rate variable and add a learning_rate_max variable in the Generate cell, remove the optimizer = torch.optim.Adam(params, learning_rate) line, and instead add this to def train(i):

learning_rate_new = learning_rate + (i / steps) * (learning_rate_max - learning_rate)
optimizer_new = torch.optim.Adam(params, learning_rate_new)

With this, I find that a learning_rate of 0.0001 and a learning_rate_max of 0.008 works well, at least for 300-400 steps and about 50 samples.
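One note on the design: recreating the Adam optimizer every step throws away its running moment estimates. An alternative sketch (untested against this notebook, reusing the variable names above) keeps one optimizer and only updates its learning rate:

import torch

optimizer = torch.optim.Adam(params, learning_rate)

def train(i):
    # linearly ramp the learning rate instead of rebuilding the optimizer each step
    lr_new = learning_rate + (i / steps) * (learning_rate_max - learning_rate)
    for group in optimizer.param_groups:
        group['lr'] = lr_new
    # ... rest of the training step as before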

Invalid Syntax when trying to run the first time

I followed all the instructions and still I'm getting this error -

aphantasia git:(master) python clip_fft.py -t "the text" --size 1280-720 
  File "clip_fft.py", line 112
    Ys = [torch.randn(*Y.shape).cuda() for Y in [Yl_in, *Yh_in]]
