huanngzh / epidiff

Stars: 44 · Forks: 4 · Size: 34.15 MB

[CVPR 2024] EpiDiff: Enhancing Multi-View Synthesis via Localized Epipolar-Constrained Diffusion

Home Page: https://huanngzh.github.io/EpiDiff/

License: MIT License

Python 100.00%
3d-generation diffusion-models generative-model single-view-reconstruction

epidiff's People

Contributors: huanngzh


epidiff's Issues

Questions about implementation

Hi dear Authors!

If possible, I would like to ask about a few details that are not fully clear from the paper:

  • Where in the UNet are the ECA blocks inserted? (Between the self-attention and cross-attention, or after the cross-attention?)
  • Is the ray self-attention output added as a residual to the preceding block's output in the UNet?
  • What are the hyperparameters of the ECA block (number of heads, dimensions, etc.)?
  • Is the weighted sum for the ray self-attention computed by summing directly over the values along the sample dimension after the softmax?
  • What are the hyperparameters of the positional encoding for the "harmonic transformation"? How many frequencies?

thank you!
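To make the fourth question concrete, here is a minimal NumPy sketch of that reading: softmax over the per-ray sample scores, then a direct weighted sum of the values along the sample dimension. All names and shapes here are assumptions for illustration, not EpiDiff's actual code.

```python
import numpy as np

def ray_weighted_sum(scores, values):
    """Hypothetical 'weighted sum over the sample dimension after softmax'.

    scores: (rays, samples)      attention logits per epipolar sample
    values: (rays, samples, dim) per-sample features
    """
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)       # softmax over samples
    return (w[..., None] * values).sum(axis=1)  # sum over the sample dim

out = ray_weighted_sum(np.zeros((4, 16)), np.ones((4, 16, 8)))  # -> (4, 8)
```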

Error: "CUDA out of memory"

Hello author, thank you for your excellent work and outstanding contributions!

While reproducing your code, I encountered a CUDA out-of-memory error during the inference step. I ran it on an RTX 4090 with 24 GB of VRAM; the CUDA version is 12.2.

The complete error message is as follows:

(base) root@autodl-container-ba494bb15c-be0ac141:~/EpiDiff# python inference.py --config configs/baseline.yaml --ckpt /root/autodl-tmp/baseline-n16f16-pabs.ckpt --input_img testset/3D_Dollhouse_Lamp.webp --output_dir outputs --elevation 30 --seed 0 --device cuda
Global seed set to 0
The config attributes {'dropout': 0.0, 'reverse_transformer_layers_per_block': None} were passed to UNet2DConditionModel, but are not expected and will be ignored. Please verify your config.json configuration file.
Loaded model from /root/autodl-tmp/baseline-n16f16-pabs.ckpt
Traceback (most recent call last):
File "inference.py", line 141, in <module>
main()
File "inference.py", line 121, in main
images_pred = model._generate_images(data)
File "/root/miniconda3/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/root/EpiDiff/epidiff/systems/i2mv_system.py", line 454, in _generate_images
noise_pred = self._forward_cls_free(
File "/root/miniconda3/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/root/EpiDiff/epidiff/systems/i2mv_system.py", line 420, in _forward_cls_free
noise_pred = model(latents, _timestep, _prompt_embd, meta)
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/EpiDiff/epidiff/models/mv_model.py", line 274, in forward
cp_block_states = self.cp_blocks_decoder[current_upblock_id](
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/EpiDiff/epidiff/models/extras/feature_aggregator.py", line 123, in forward
out = self.T1(query, t=t_emb, context=context, attention_mask=mask)
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/EpiDiff/epidiff/models/blocks/transformer_t.py", line 188, in forward
x = block(x, t, context=context, mask=attention_mask)
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/EpiDiff/epidiff/models/blocks/transformer_t.py", line 138, in forward
x = self.attn2(self.norm2(x, t), context=context, mask=mask) + x
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/EpiDiff/epidiff/models/blocks/transformer_t.py", line 35, in forward
q, k, v = map(lambda t: rearrange(t, "b n (h d) -> (b h) n d", h=h), (q, k, v))
File "/root/EpiDiff/epidiff/models/blocks/transformer_t.py", line 35, in <lambda>
q, k, v = map(lambda t: rearrange(t, "b n (h d) -> (b h) n d", h=h), (q, k, v))
File "/root/miniconda3/lib/python3.8/site-packages/einops/einops.py", line 483, in rearrange
return reduce(cast(Tensor, tensor), pattern, reduction='rearrange', **axes_lengths)
File "/root/miniconda3/lib/python3.8/site-packages/einops/einops.py", line 412, in reduce
return _apply_recipe(recipe, tensor, reduction_type=reduction)
File "/root/miniconda3/lib/python3.8/site-packages/einops/einops.py", line 241, in _apply_recipe
return backend.reshape(tensor, final_shapes)
File "/root/miniconda3/lib/python3.8/site-packages/einops/_backends.py", line 84, in reshape
return x.reshape(shape)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.34 GiB (GPU 0; 23.65 GiB total capacity; 20.71 GiB already allocated; 52.06 MiB free; 23.12 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
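One plausible reason the allocation lands in this particular frame: `rearrange("b n (h d) -> (b h) n d")` splits heads, transposes, and then reshapes; the final reshape on a non-contiguous tensor materializes a full copy, which is where the 2.34 GiB is requested. A NumPy equivalent of the pattern (shapes are illustrative assumptions):

```python
import numpy as np

def split_heads(x, h):
    """NumPy equivalent of rearrange(x, "b n (h d) -> (b h) n d", h=h)."""
    b, n, hd = x.shape
    d = hd // h
    x = x.reshape(b, n, h, d)      # b n (h d) -> b n h d
    x = x.transpose(0, 2, 1, 3)    # b n h d   -> b h n d (non-contiguous view)
    return x.reshape(b * h, n, d)  # this reshape copies the tensor

q = split_heads(np.zeros((2, 6, 8)), h=4)  # -> shape (8, 6, 2)
```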

I have tried modifying some parameters in configs/baseline.yaml, such as reducing num_workers in the data section, but I still couldn't solve this problem.

I would like to ask for your suggestions on how to resolve this issue and if you could provide some configuration advice for the project.
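As a first step while waiting for project-specific advice, the allocator hint printed in the error message itself can be tried. A minimal sketch (the `max_split_size_mb` value is an assumption to experiment with, not a recommended setting for this repo):

```python
import os

# Suggested by PyTorch's OOM message: tune the caching allocator to reduce
# fragmentation. The variable must be set before `import torch` runs.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# ... then: import torch and run inference.py's main logic as before.
```

Equivalently, it can be exported in the shell before launching `python inference.py`. If fragmentation is not the cause, reducing the number of views or the latent resolution in the config (if the project exposes them) is the usual next lever.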

Once again, thank you for your excellent work and contributions.

Code Release

When do you plan to release the code? Looking forward to your reply.
