dangeng / visual_anagrams
Code for the paper "Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models"
License: MIT License
I can run the example scripts, but I only get static dual-image results.
How can I generate the animated/rotating GIF examples shown in the repo and on the https://dangeng.github.io/visual_anagrams/ page?
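(A hedged pointer, inferred from a command that appears further down this page rather than from the repo docs: the animations seem to be produced by a separate animate.py script run on a finished sample, e.g.)
python animate.py --im_path results/rotate_cw.village.horse/0000/sample_256.png --metadata_path results/rotate_cw.village.horse/metadata.pkl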
Hi,
Is there an alternative we can use instead of DeepFloyd? Thanks!
Can you please share the code for the random orthogonal transformation in Figure 7 of the paper?
Many thanks!
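Not the authors' implementation, but here is a minimal sketch of one concrete family of random orthogonal transformations in pixel space: a random permutation of pixels. A permutation matrix is orthogonal, so the inverse view is simply the inverse permutation. All names below are hypothetical.

import torch

def make_random_permutation_view(size=64, seed=0):
    # Draw one fixed random permutation of all pixel locations.
    g = torch.Generator().manual_seed(seed)
    perm = torch.randperm(size * size, generator=g)
    inv = torch.argsort(perm)  # inverse permutation: perm[inv[i]] == i

    def view(im):
        # im: (C, H, W) tensor; shuffle pixels, keeping channels aligned.
        c, h, w = im.shape
        return im.reshape(c, -1)[:, perm].reshape(c, h, w)

    def inverse_view(im):
        c, h, w = im.shape
        return im.reshape(c, -1)[:, inv].reshape(c, h, w)

    return view, inverse_view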
Hello, I'm running into an issue when cloning the repository and following the steps to install the conda environment. I've pasted the output below. Any help that you can provide is appreciated!
conda env create -f environment.yml
Channels:
- pytorch
- nvidia
- conda-forge
- defaults
Platform: osx-arm64
Collecting package metadata (repodata.json): done
Solving environment: failed
PackagesNotFoundError: The following packages are not available from current channels:
- _libgcc_mutex==0.1=main
- _openmp_mutex==5.1=1_gnu
- blas==1.0=mkl
- bottleneck==1.3.5=py39h7deecbd_0
- brotli==1.0.9=h9c3ff4c_4
- brotlipy==0.7.0=py39h27cfd23_1003
- bzip2==1.0.8=h7b6447c_0
- ca-certificates==2023.7.22=hbcca054_0
- cffi==1.15.1=py39h5eee18b_3
- contourpy==1.0.5=py39hdb19cb5_0
- cryptography==41.0.3=py39hdda0065_0
- cuda-cudart==12.1.105=0
- cuda-cupti==12.1.105=0
- cuda-libraries==12.1.0=0
- cuda-nvrtc==12.1.105=0
- cuda-nvtx==12.1.105=0
- cuda-opencl==12.2.140=0
- cuda-runtime==12.1.0=0
- cyrus-sasl==2.1.28=h52b45da_1
- dbus==1.13.18=hb2f20db_0
- expat==2.5.0=h6a678d5_0
- ffmpeg==4.3=hf484d3e_0
- fontconfig==2.14.1=h4c34cd2_2
- freetype==2.12.1=h4a9f257_0
- giflib==5.2.1=h5eee18b_3
- glib==2.69.1=he621ea3_2
- gmp==6.2.1=h295c915_3
- gmpy2==2.1.2=py39heeb90bb_0
- gnutls==3.6.15=he1e5248_0
- gst-plugins-base==1.14.1=h6a678d5_1
- gstreamer==1.14.1=h5eee18b_1
- icu==58.2=hf484d3e_1000
- idna==3.4=py39h06a4308_0
- intel-openmp==2023.1.0=hdb19cb5_46305
- jinja2==3.1.2=py39h06a4308_0
- jpeg==9e=h5eee18b_1
- kiwisolver==1.4.4=py39h6a678d5_0
- krb5==1.20.1=h143b758_1
- lame==3.100=h7b6447c_0
- lcms2==2.12=h3be6417_0
- ld_impl_linux-64==2.38=h1181459_1
- lerc==3.0=h295c915_0
- libclang==14.0.6=default_hc6dbbc7_1
- libclang13==14.0.6=default_he11475f_1
- libcublas==12.1.0.26=0
- libcufft==11.0.2.4=0
- libcufile==1.7.2.10=0
- libcups==2.4.2=h2d74bed_1
- libcurand==10.3.3.141=0
- libcusolver==11.4.4.55=0
- libcusparse==12.0.2.55=0
- libdeflate==1.17=h5eee18b_1
- libedit==3.1.20221030=h5eee18b_0
- libevent==2.1.12=hdbd6064_1
- libffi==3.4.4=h6a678d5_0
- libgcc-ng==11.2.0=h1234567_1
- libgfortran-ng==13.2.0=h69a702a_0
- libgfortran5==13.2.0=ha4646dd_0
- libgomp==11.2.0=h1234567_1
- libiconv==1.16=h7f8727e_2
- libidn2==2.3.4=h5eee18b_0
- libjpeg-turbo==2.0.0=h9bf148f_0
- libllvm14==14.0.6=hdb19cb5_3
- libnpp==12.0.2.50=0
- libnvjitlink==12.1.105=0
- libnvjpeg==12.1.1.14=0
- libpng==1.6.39=h5eee18b_0
- libpq==12.15=hdbd6064_1
- libstdcxx-ng==11.2.0=h1234567_1
- libtasn1==4.19.0=h5eee18b_0
- libtiff==4.5.1=h6a678d5_0
- libunistring==0.9.10=h27cfd23_0
- libuuid==1.41.5=h5eee18b_0
- libwebp==1.3.2=h11a3e52_0
- libwebp-base==1.3.2=h5eee18b_0
- libxcb==1.15=h7f8727e_0
- libxkbcommon==1.0.1=h5eee18b_1
- libxml2==2.10.4=hcbfbd50_0
- libxslt==1.1.37=h2085143_0
- llvm-openmp==14.0.6=h9e868ea_0
- lz4-c==1.9.4=h6a678d5_0
- matplotlib==3.7.2=py39h06a4308_0
- matplotlib-base==3.7.2=py39h1128e8f_0
- mkl==2023.1.0=h213fc3f_46343
- mkl-service==2.4.0=py39h5eee18b_1
- mkl_fft==1.3.8=py39h5eee18b_0
- mkl_random==1.2.4=py39hdb19cb5_0
- mpc==1.1.0=h10f8cd9_1
- mpfr==4.0.2=hb69a4c5_1
- mpmath==1.3.0=py39h06a4308_0
- mysql==5.7.24=h721c034_2
- ncurses==6.4=h6a678d5_0
- nettle==3.7.3=hbbd107a_1
- networkx==3.1=py39h06a4308_0
- numexpr==2.8.7=py39h85018f9_0
- numpy==1.26.0=py39h5f9d8c6_0
- numpy-base==1.26.0=py39hb5e798b_0
- openh264==2.1.1=h4ff587b_0
- openjpeg==2.4.0=h3ad879b_0
- openssl==3.0.11=h7f8727e_2
- pandas==2.1.1=py39h1128e8f_0
- pcre==8.45=h9c3ff4c_0
- pillow==10.0.1=py39ha6cbd5a_0
- pip==23.2.1=py39h06a4308_0
- pyopenssl==23.2.0=py39h06a4308_0
- pyqt==5.15.7=py39h6a678d5_1
- pyqt5-sip==12.11.0=py39h6a678d5_1
- pysocks==1.7.1=py39h06a4308_0
- python==3.9.18=h955ad1f_0
- pytorch==2.1.0=py3.9_cuda12.1_cudnn8.9.2_0
- pytorch-cuda==12.1=ha16c6d3_5
- qt-main==5.15.2=h7358343_9
- qt-webengine==5.15.9=hbbf29b9_6
- qtwebkit==5.212=h3fafdc1_5
- readline==8.2=h5eee18b_0
- requests==2.31.0=py39h06a4308_0
- scipy==1.11.3=py39h5f9d8c6_0
- setuptools==68.0.0=py39h06a4308_0
- sip==6.6.2=py39h6a678d5_0
- sqlite==3.41.2=h5eee18b_0
- statsmodels==0.14.0=py39ha9d4c09_0
- tbb==2021.8.0=hdb19cb5_0
- tk==8.6.12=h1ccaba5_0
- torchaudio==2.1.0=py39_cu121
- torchtriton==2.1.0=py39
- torchvision==0.16.0=py39_cu121
- tornado==6.1=py39hb9d737c_3
- typing_extensions==4.7.1=py39h06a4308_0
- wheel==0.41.2=py39h06a4308_0
- xz==5.4.2=h5eee18b_0
- yaml==0.2.5=h7b6447c_0
- zlib==1.2.13=h5eee18b_0
- zstd==1.5.5=hc292b87_0
Current channels:
- https://conda.anaconda.org/pytorch/osx-arm64
- https://conda.anaconda.org/nvidia/osx-arm64
- https://conda.anaconda.org/conda-forge/osx-arm64
- https://repo.anaconda.com/pkgs/main/osx-arm64
- https://repo.anaconda.com/pkgs/r/osx-arm64
To search for alternate channels that may provide the conda package you're
looking for, navigate to
https://anaconda.org
and use the search bar at the top of the page.
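For context: the packages that fail to resolve (cuda-*, libgcc-ng, mkl, and so on) are linux-64 builds, while the report shows Platform: osx-arm64, so conda cannot satisfy them on Apple Silicon. Below is a rough, untested sketch of a looser environment file one might try on macOS; the package set is an assumption, and the model itself will still want a GPU:

name: visual_anagrams
channels:
  - pytorch
  - conda-forge
dependencies:
  - python=3.9
  - pytorch
  - torchvision
  - pip
  - pip:
      - diffusers
      - transformers
      - imageio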
Running locally on an NVIDIA 3080 Max-Q with 16 GB VRAM
python generate.py --name rotate_cw.village.horse --prompts "a snowy mountain village" "a horse" --style "an oil painting of" --views identity rotate_cw --num_samples 10 --num_inference_steps 30 --guidance_scale 10.0
runs as expected, but
python animate.py --im_path results/rotate_cw.village.horse/0000/sample_256.png --metadata_path results/rotate_cw.village.horse/metadata.pkl
fails with the error:
100%|████████████████████████████████████████████████████████████████████████| 45/45 [00:00<00:00, 352.56it/s]
Making video...
Traceback (most recent call last):
File "/home/dan/anagram/visual_anagrams/animate.py", line 169, in
animate_two_view(
File "/home/dan/anagram/visual_anagrams/animate.py", line 123, in animate_two_view
imageio.mimsave(save_video_path, image_array, fps=30)
File "/home/dan/miniconda3/envs/visual_anagrams/lib/python3.9/site-packages/imageio/v2.py", line 494, in mimwrite
with imopen(uri, "wI", **imopen_args) as file:
File "/home/dan/miniconda3/envs/visual_anagrams/lib/python3.9/site-packages/imageio/core/imopen.py", line 281, in imopen
raise err_type(err_msg)
ValueError: Could not find a backend to open `results/rotate_cw.village.horse/0000/sample_256.mp4`` with iomode `wI`.
Based on the extension, the following plugins might add capable backends:
FFMPEG: pip install imageio[ffmpeg]
pyav: pip install imageio[pyav]
despite having imageio with the ffmpeg and pyav plugins installed in the conda environment.
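Not a confirmed fix, but a common cause is the plugins being installed into a different Python environment than the one animate.py runs in. With the visual_anagrams env activated, the following should reinstall the backend and confirm it is importable:
python -m pip install "imageio[ffmpeg]"
python -c "import imageio_ffmpeg; print(imageio_ffmpeg.get_ffmpeg_exe())"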
When running the cell:
image_64 = sample_stage_1(stage_1,
                          prompt_embeds,
                          negative_prompt_embeds,
                          views,
                          num_inference_steps=30,
                          guidance_scale=10.0,
                          reduction='mean',
                          generator=None)
mp.show_images([im_to_np(view.view(image_64[0])) for view in views])
There is an error:
NameError Traceback (most recent call last)
in <cell line: 1>()
----> 1 image_64 = sample_stage_1(stage_1,
2 prompt_embeds,
3 negative_prompt_embeds,
4 views,
5 num_inference_steps=30,
NameError: name 'stage_1' is not defined
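The NameError just means the setup cell that defines stage_1 never ran in this session (e.g. after a runtime restart). For reference, a sketch of the kind of cell that has to run first; the checkpoint name and dtype are assumptions based on standard DeepFloyd IF usage in diffusers, not necessarily what this notebook uses:

import torch
from diffusers import DiffusionPipeline

# Stage 1 of DeepFloyd IF (the 64x64 base model). Checkpoint choice is hypothetical.
stage_1 = DiffusionPipeline.from_pretrained(
    "DeepFloyd/IF-I-M-v1.0",
    variant="fp16",
    torch_dtype=torch.float16,
).to("cuda")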
When installing this repository with
conda env create -f environment.yml
I get the error
ERROR: Could not find a version that satisfies the requirement clip==1.0 (from versions: 0.0.1, 0.1.0, 0.2.0)
When I remove the line "- clip==1.0" from environment.yml, it works fine.
For pip, "clip" by default refers to an old clipboard manager: https://pypi.org/project/clip/
If you mean OpenAI CLIP, the line should perhaps be changed to reference git+https://github.com/openai/CLIP.git instead.
However, because the package doesn't seem to be needed, I would suggest removing the line entirely.
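If the dependency were kept, the pip section of environment.yml would presumably look something like this (the exact layout is an assumption):

  - pip:
      - git+https://github.com/openai/CLIP.git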
Hi! Firstly, many thanks for sharing this project - it's fascinating! :)
I'm trying to understand how/if negative prompts can be added to better guide the generation of each view... but I'm having some issues with my understanding. As far as I can see, the generate.py script generates a list of matching positive and negative prompt embeddings from the supplied command line prompts:
prompts = [f'{args.style} {p}'.strip() for p in args.prompts]
prompt_embeds = [stage_1.encode_prompt(p) for p in prompts]
prompt_embeds, negative_prompt_embeds = zip(*prompt_embeds)
prompt_embeds = torch.cat(prompt_embeds)
negative_prompt_embeds = torch.cat(negative_prompt_embeds) # These are just null embeds
(generate.py, lines 50 to 54 at commit 491b76b)
The final comment suggests that the negative prompt embedding is null, but the diffusers library states:
negative_prompt_embeds (`torch.Tensor`, *optional*):
Pre-generated negative text embeddings. Can be used to easily tweak text inputs, *e.g.* prompt
weighting. If not provided, negative_prompt_embeds will be generated from `negative_prompt` input
argument.
So... should I be providing a negative_prompt input to sample_stage_1 instead? Any advice, or an example of how to add negative prompts, would be greatly appreciated!
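Not tested against this repo, but diffusers' IF pipeline does accept a negative_prompt argument in encode_prompt, so one sketch mirroring the generate.py lines quoted above would be (negative_prompts is a hypothetical list with one entry per view; stage_1, prompts, and torch come from the surrounding script):

negative_prompts = ['blurry, low quality'] * len(prompts)
embeds = [stage_1.encode_prompt(p, negative_prompt=n)
          for p, n in zip(prompts, negative_prompts)]
prompt_embeds, negative_prompt_embeds = zip(*embeds)
prompt_embeds = torch.cat(prompt_embeds)
negative_prompt_embeds = torch.cat(negative_prompt_embeds)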
Thanks for a very interesting paper.
When do you expect to release the code for Factorized Diffusion?
I am very much looking forward to it!
256px seems to be the maximum image size. Is there any argument we can pass for higher-resolution output?
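Not a documented repo feature as far as this page shows, but DeepFloyd IF is commonly paired with the Stable Diffusion x4 upscaler as a third stage, which would take a 256px sample to 1024px. A hedged sketch, with the caveat that the upscaler is a latent model and may not preserve the illusion under the second view:

import torch
from diffusers import DiffusionPipeline
from PIL import Image

stage_3 = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler",
    torch_dtype=torch.float16,
).to("cuda")

image_256 = Image.open("results/rotate_cw.village.horse/0000/sample_256.png")
image_1024 = stage_3(
    prompt="an oil painting of a snowy mountain village",
    image=image_256,
    noise_level=100,
).images[0]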
Interesting work!
Is there any plan to release the training dataset and code?