Comments (4)
- good point, will move the subtraction out of the training loop.
- your method of "increasing the difference" in fact just decreases the effect of the subtraction (like applying a weight < 1): here 2x-y ~ x-0.5y, since cosine similarity ignores the overall scale of a vector. and the examples did show that - some kind of "faces" appeared with such down-weighting.
- sure; in my understanding, any continuous embedding is a latent vector by definition. we just don't have a decoder for it, like the one from proper dall-e (not the stripped-down published version, but the photorealistic one from the article), so we have to move around with optimization techniques instead.
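To make the 2x-y ~ x-0.5y point concrete, here is a minimal sketch in plain Python. The vectors `x`, `y` and `img` are made-up toy values standing in for two text encodings and an image encoding (they are not real CLIP outputs), and `cosine` is a hypothetical helper, not a function from the repo.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy stand-ins for encodings (assumed values, for illustration only).
x = [0.9, 0.1, 0.3]    # "target" text encoding
y = [0.2, 0.8, 0.1]    # "subtracted" text encoding
img = [0.5, 0.4, 0.7]  # image encoding

doubled = [2 * a - b for a, b in zip(x, y)]    # 2x - y
halved = [a - 0.5 * b for a, b in zip(x, y)]   # x - 0.5y
plain = [a - b for a, b in zip(x, y)]          # x - y

score_doubled = cosine(img, doubled)
score_halved = cosine(img, halved)
score_plain = cosine(img, plain)

# 2x - y is just 2 * (x - 0.5y), so the first two scores coincide,
# while the unweighted difference x - y gives a different score.
print(score_doubled, score_halved, score_plain)
```

Under these assumptions the first two printed values match, which is why doubling the target encoding behaves like halving the weight of the subtracted one.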
from aphantasia.
Ha! Whoops, I was so focused on trying to do something involving the tendency for CLIP to label an image with a face as "a photo of a human face" with a higher score than "a photo of a human face" that I done went and did 2*enc1-enc2, shit. Back to the drawing board.
from aphantasia.
regarding the preliminary text subtraction txt_enc - text_enc0: on second thought, it's not the same. when we compare the losses after cosine similarity, we check how far from or close to those prompts/concepts we are (that's what we probably want). if we subtract the encodings up front, we instead check how close we are to the difference between the two, essentially losing the position of the "center of mass" of the pair (in the embedding space). so the resulting vector may have nothing in common with either of the prompts, and most likely we'd get something rather different.
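A small sketch of the "center of mass" argument, again with made-up toy vectors rather than real CLIP encodings: shifting both prompts by the same offset `c` leaves their difference untouched, so a score against the pre-subtracted vector cannot see the shift, while the two separate similarities can. All names here (`cosine`, `separate_score`, `direct_score`) are hypothetical helpers for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def separate_score(img, pos, neg):
    """Compare the image with each prompt, then subtract the similarities."""
    return cosine(img, pos) - cosine(img, neg)

def direct_score(img, pos, neg):
    """Subtract the encodings first, then compare the image with the result."""
    return cosine(img, [p - n for p, n in zip(pos, neg)])

# Toy stand-ins (assumed values, for illustration only).
img = [1.0, 0.0, 1.0]
a = [1.0, 0.0, 0.0]
b = [0.0, 1.0, 0.0]
c = [0.0, 0.0, 2.0]  # common offset that moves the pair's "center of mass"

a_shifted = [p + q for p, q in zip(a, c)]
b_shifted = [p + q for p, q in zip(b, c)]

direct_before = direct_score(img, a, b)
direct_after = direct_score(img, a_shifted, b_shifted)
separate_before = separate_score(img, a, b)
separate_after = separate_score(img, a_shifted, b_shifted)

# The direct score is blind to the shift; the separate score is not.
print(direct_before, direct_after)
print(separate_before, separate_after)
```

In this toy setup the direct score stays the same after the shift, while the separate score changes noticeably, which matches the observation that the two approaches are not interchangeable.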
from aphantasia.
just to make sure - i've tried the direct subtraction method on a few meaningful sentences, and it predictably drifted completely away from the main topic. and just to make it clear - encoded embeddings are NOT losses; summing/subtracting them has a different effect.
finally, the cossim comparison is just a single op; it's probably a few orders of magnitude faster than encoding (and even slicing), so the "time savings" should not be measurable.
from aphantasia.