envision-research / LucidDreamer
Official implementation of "LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching"
License: MIT License
Hi, congrats on LucidDreamer! It would be great to set up a Gradio demo for it on Hugging Face.
In case you're interested, this is a step-by-step guide explaining the process of creating a Gradio SDK Space on Spaces. This is our docs on Community GPU Grants. cc: @yvrjsharma
How can I get the 3d mesh after training?
I was following the instructions and everything installs, but it gives this error at the end:
ERROR: Could not find a version that satisfies the requirement triton (from versions: none)
ERROR: No matching distribution found for triton
Any ideas? I'm using Anaconda on Windows 11 with an RTX 4090.
Hi, thanks for the impressive work! I checked Eq. (11) in the latest arXiv paper but found that it is not consistent with the original DDIM process.
I also noticed #15, but there still seems to be a mistake.
In the paper "Denoising diffusion implicit models", Eq.(12) shows:
However, from the paper and #15, the DDIM process seems to be:
I think they are not equivalent. In the original DDIM process, we have:
In the paper's version of DDIM, we have:
Is there a mistake in the paper, or am I missing something?
Update: sorry, I checked again and found the
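For readers without the DDIM paper at hand, this is Eq. (12) of "Denoising Diffusion Implicit Models" (in that paper's notation, where α denotes the cumulative product), which is the reference point the question above compares against:

```latex
x_{t-1} = \sqrt{\alpha_{t-1}}
          \left( \frac{x_t - \sqrt{1-\alpha_t}\,\epsilon_\theta^{(t)}(x_t)}{\sqrt{\alpha_t}} \right)
        + \sqrt{1-\alpha_{t-1}-\sigma_t^2}\;\epsilon_\theta^{(t)}(x_t)
        + \sigma_t \epsilon_t
```

Setting σ_t = 0 gives the deterministic DDIM sampler.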
Thanks for the great work!
In the main text, it is mentioned that details are provided in the supplementary material. However, it cannot be found in the arXiv version.
Could you please update the paper to add the supplementary material? Thx :)
I want to load a trained LucidDreamer model and then proceed with further fine-tuning; what should I do?
safetensors_rust.SafetensorError: Error while deserializing header: MetadataIncompleteBuffer
Hi authors, thanks a lot for the excellent work, and I appreciate the public code!
As the title shows, it runs into a problem when the algorithm reaches diff_gaussian_rasterization, specifically the following line:
num_rendered, color, depth, radii, geomBuffer, binningBuffer, imgBuffer = _C.rasterize_gaussians(*args)
in the "diff_gaussian_rasterization/__init__.py" file.
I notice that someone has already mentioned this issue (no solution provided), but I cannot find a solution.
Is there any workaround to fix the problem?
Could you please help with that?
Thanks in advance!!
Thanks for sharing this excellent work!
I would like to ask how I can visualize the Gaussian points and the rendered results in real time. Are there any relevant tutorials or a README?
Looking forward to your reply!
Hi, Really great work.
I was wondering if your codebase was based on threestudio? Further, do you have an intended timescale for your release? And do you know which license you intend to use?
First, you may need to change model_key: in configs/<config_file>.yaml to point to the local pretrained diffusion model (Stable Diffusion 2.1-base by default).
I wonder how to find the config.json file. After downloading the model to "/home/xxx/LucidDreamer-main/stabilityai/stable-diffusion-2-1-base", I came across this error.
When I create the env, I face "packages are not available from current channels".
The generated characters have a yellowish tint; is there any way to fix this by changing the prompt or parameters?
Hi,
Thanks for your great work.
I tried to reproduce the figures such as Fig. 1 in the paper, following the training scripts in ./configs with some modifications, but the results are less than satisfactory.
Could you give more training configs of Fig. 1 in the paper?
Thanks.
I have installed the GaussianDreamer environment using:
pip install submodules/diff-gaussian-rasterization/
pip install submodules/simple-knn/
but when I run python train.py --opt './configs/bagel.yaml', I hit the problems shown in the image above.
For (1), I modified 'rendered_image, radii, depth = rasterizer(xxx)' to 'rendered_image, radii, depth, _ = rasterizer(xxx)' and it can run,
but then I encountered (2). Can you give me some advice? Thank you!
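For what it's worth, the arity error in (1) typically means the installed fork of diff-gaussian-rasterization returns a different number of values than the calling code expects. A version-tolerant unpacking sketch (the stub rasterizer below is hypothetical, just to keep the snippet self-contained; the real one returns CUDA tensors):

```python
def rasterizer(*args):
    # Hypothetical stub: newer forks return an extra alpha/depth buffer.
    return ("image", "radii", "depth", "alpha")

out = rasterizer()
if len(out) == 4:
    # newer fork: 4 return values; discard the extra one
    rendered_image, radii, depth, _ = out
else:
    # older fork: 3 return values
    rendered_image, radii, depth = out

print(depth)  # -> "depth" regardless of which fork is installed
```

This avoids hardcoding the tuple length, so the same training code runs against either build of the extension.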
Hi, excellent work with impressive results.
When I read the paper, I have some questions.
In Algorithm 1, lines 6 and 7 use a notation j, but the discussion of it is missing. I guess
Besides, line 3 claims that
For example, if
Thanks! :)
Thank you for your great work! In Algorithm 1, the added noise is fixed (determined by the U-Net and x_0). However, in the "train_step_perpneg" function, a random noise is added, which differs from Algorithm 1.
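To make the distinction concrete: in Algorithm 1 the noisy latent comes from DDIM inversion, which is a deterministic function of x_0 through the noise predictor, rather than from sampling ε ~ N(0, I). A toy 1-D sketch of one deterministic inversion step (the eps_model below is a hypothetical stand-in for the U-Net, and the alpha schedule is made up):

```python
import math

def eps_model(x, t):
    # Hypothetical stand-in for the U-Net noise prediction.
    return 0.1 * x

def ddim_invert_step(x_s, s, t, alphas):
    """One deterministic DDIM inversion step from timestep s to a noisier t."""
    a_s, a_t = alphas[s], alphas[t]
    eps = eps_model(x_s, s)                                      # predicted noise at s
    x0_pred = (x_s - math.sqrt(1 - a_s) * eps) / math.sqrt(a_s)  # implied clean sample
    return math.sqrt(a_t) * x0_pred + math.sqrt(1 - a_t) * eps   # re-noise toward t

alphas = {0: 0.99, 1: 0.95}  # hypothetical cumulative-alpha schedule
x_t = ddim_invert_step(1.0, 0, 1, alphas)
# Identical inputs always yield the identical x_t: the "noise" is deterministic.
assert ddim_invert_step(1.0, 0, 1, alphas) == x_t
```

If train_step_perpneg draws fresh Gaussian noise instead, it indeed departs from this deterministic construction, which is presumably what the question is pointing at.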
Thank you for your amazing work. However, I am having serious issues with the installation of requirements, even after following the same instructions.
Would you mind sharing the environment.yml file over email?
Hi all, thanks for the great work. I was using the template demo to reproduce the results; however, during the training process, Loss=nan
showed up while the program kept going on and on.
the command was like
python train.py --opt configs/bagel.yaml
python train.py --opt configs/cat_armor.yaml
[ITER 1] Video Save Done! [03/06 17:25:29]
Training progress:  10% | 490/5000 [03:31<30:44, 2.45it/s, Loss=nan]
scale up theta_range to: [45, 105] [03/06 17:29:00]
scale up radius_range to: [4.9399999999999995, 5.225] [03/06 17:29:00]
scale up phi_range to: [-180, 180] [03/06 17:29:00]
scale up fovy_range to: [0.24, 0.6] [03/06 17:29:00]
Training progress:  20% | 990/5000 [06:57<26:05, 2.56it/s, Loss=nan]
scale up theta_range to: [45, 105] [03/06 17:32:26]
scale up radius_range to: [4.693, 5.0] [03/06 17:32:26]
scale up phi_range to: [-180, 180] [03/06 17:32:26]
scale up fovy_range to: [0.18, 0.6] [03/06 17:32:26]
Training progress:  20% | 1000/5000 [07:01<26:54, 2.48it/s, Loss=nan]
May I ask what the data format is for the final point_cloud_rgb.txt and point_cloud.ply, and whether it is possible to provide code for extracting a mesh / obtaining normals?
Thanks for sharing the code and presenting such a great paper!
I have a question about how Equation 7 is derived from Equation 5, and how the gradient computation and the additional gamma term are included in the equation. Could you please provide some insights or explanations on this?
Thank you!
Hello, I've been attempting to replicate the results from your paper, specifically the text "A portrait of Hatsune Miku, robot" using the Civitai model. Unfortunately, the outcomes I'm getting are quite poor and don't resemble the results shown in the paper.
I am unsure if there is a specific configuration that I might be missing; could you provide a config file that reproduces the results as they appear in the publication?
Thank you very much for your assistance.
After installing requirements.txt, I encountered "ModuleNotFoundError: No module named 'point_e'". Where can I get this package? Thank you for your help.
Thanks for sharing this excellent work!
I would like to ask how I can get the generated 3D model afterwards.
Looking forward to your reply!
I ran into the following exception when I made some modifications on learning rates.
Training progress:  21% | 1030/5000 [15:14<1:43:53, 1.57s/it, Loss=0.9026756]
Error executing job with overrides: ['+wandb_key=xxx']
Traceback (most recent call last):
  File "/LucidDreamer/train.py", line 622, in main
    training(lp, op, pp, gcp, gp, hg_params, cfg.test_iterations, cfg.save_iterations, cfg.checkpoint_iterations,
  File "/LucidDreamer/train.py", line 349, in training
    render_pkg = render(viewpoint_cam, gaussians, pipe, background,
  File "/LucidDreamer/gaussian_renderer/__init__.py", line 146, in render
    rendered_image, radii, depth_alpha = rasterizer(
  File "/LucidDreamer/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/LucidDreamer/venv/lib/python3.10/site-packages/diff_gaussian_rasterization/__init__.py", line 186, in forward
    return rasterize_gaussians(
  File "/LucidDreamer/venv/lib/python3.10/site-packages/diff_gaussian_rasterization/__init__.py", line 28, in rasterize_gaussians
    return _RasterizeGaussians.apply(
  File "/LucidDreamer/venv/lib/python3.10/site-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/LucidDreamer/venv/lib/python3.10/site-packages/diff_gaussian_rasterization/__init__.py", line 78, in forward
    num_rendered, color, depth, radii, geomBuffer, binningBuffer, imgBuffer = _C.rasterize_gaussians(*args)
RuntimeError: numel: integer multiplication overflow
Specifically, the optimization params that I use are as follows:
as_latent_ratio: 0.2
densification_interval: 100
densify_from_iter: 100
densify_grad_threshold: 0.00075
densify_until_iter: 3000
feature_lr: 0.01
feature_lr_final: 0.0005
fovy_scale_up_factor:
- 0.75
- 1.1
geo_iter: 0
iterations: 5000
lambda_scale: 0.0
lambda_tv: 0.0
opacity_lr: 0.01
opacity_reset_interval: 300
percent_dense: 0.003
phi_scale_up_factor: 1.5
position_lr_delay_mult: 0.01
position_lr_final: 1.6e-06
position_lr_init: 0.00016
position_lr_max_steps: 30000
pro_frames_num: 600
pro_render_45: false
progressive_view_init_ratio: 0.2
progressive_view_iter: 500
rotation_lr: 0.01
rotation_lr_final: 0.0005
save_process: true
scale_up_cameras_iter: 500
scale_up_factor: 0.95
scaling_lr: 0.01
scaling_lr_final: 0.0005
use_control_net_iter: 10000000
use_progressive: false
warmup_iter: 1500
I've also checked this issue from the original gaussian-splatting repo, with little help: graphdeco-inria/gaussian-splatting#24
I wonder if similar issues have been encountered before, and what possible methods there are to mitigate this?
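One possible reading of "numel: integer multiplication overflow" (a hypothesis, not confirmed from the rasterizer source): an aggressive densify_grad_threshold can multiply the number of Gaussians, and the rasterizer's buffer sizes grow with Gaussians × overlapped screen tiles, which can exceed what a 32-bit index holds. A back-of-the-envelope check with purely made-up numbers:

```python
INT32_MAX = 2**31 - 1

# Hypothetical figures: Gaussian count after aggressive densification,
# and the average number of screen tiles each Gaussian overlaps.
num_gaussians = 5_000_000
tiles_per_gaussian = 600

num_rendered = num_gaussians * tiles_per_gaussian  # buffer element count
print(num_rendered > INT32_MAX)  # True: well past the 32-bit limit
```

If this is the failure mode, raising densify_grad_threshold or lowering densify_until_iter would shrink the Gaussian count and the buffers with it.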
According to the multi-step DDIM sampling, Section 3.2 mentions that Eqn. (13) is derived from Eqn. (11).
However, this is quite confusing, since Eqn. (11) seems incorrect.
The DDIM sampling seems to be:
Since
Also, the notation of the sampling latents
You mentioned LoRA in the paper, so I'm just asking for clarification, since running LucidDreamer takes almost twice as much time as running ProlificDreamer.
Hello,
I'm currently engaging with the LucidDreamer project and have been following the installation instructions in the Gradio Demo guide. I would like to report some issues I encountered during this process, along with the solutions that worked for me.
Initial Setup:
As per the guide's instructions, I started by creating a new Conda environment with the following command:
conda create -n LD_Demo python=3.9.16 cudatoolkit=11.8 -y
This step was completed successfully, setting up an environment with Python 3.9.16 and CUDA Toolkit 11.8.
Dependency Installation Issues:
However, I encountered problems when trying to install specific dependencies:
pip install git+https://github.com/YixunLiang/diff-gaussian-rasterization.git
pip install git+https://github.com/YixunLiang/simple-knn.git
The error message I received was:
raise RuntimeError(CUDA_MISMATCH_MESSAGE.format(cuda_str_version, torch.version.cuda))
RuntimeError: The detected CUDA version (12.3) mismatches the version that was used to compile PyTorch (11.7). Please make sure to use the same CUDA versions.
Despite my system having CUDA version 12.3, I anticipated that creating the Conda environment with cudatoolkit=11.8
would resolve any version conflicts. To address this issue, I had to uninstall and then reinstall PyTorch and its associated libraries within the Conda environment:
pip uninstall torch torchvision torchaudio
pip install torch torchvision torchaudio
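Before rebuilding the extensions, it may help to confirm exactly which two CUDA versions disagree. A minimal sketch with hardcoded stand-in version strings (in practice, read them from python -c "import torch; print(torch.version.cuda)" and from nvcc --version):

```shell
# Hypothetical version strings standing in for the real queries above.
torch_cuda="11.7"    # CUDA version PyTorch was compiled against
system_cuda="12.3"   # CUDA toolkit version found on the system

if [ "$torch_cuda" != "$system_cuda" ]; then
  echo "mismatch: reinstall PyTorch built for CUDA $system_cuda, or install toolkit $torch_cuda"
fi
```

A plain `pip install torch` pulls whatever wheel is current, so pinning the wheel index to the intended CUDA build (as documented on the PyTorch install page) is usually the more reproducible fix.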
After these adjustments, I was able to successfully install the dependencies and run the Gradio demo. It's also worth noting that the command mentioned in the documentation seems to be outdated; the correct command now appears to be python app.py --cuda $LD_CUDA, not gradio.demo.py.
Training Issues:
During training, I encountered several errors related to xFormers:
`flshattF` is not supported because:
    xFormers wasn't build with CUDA support
    dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
    Operator wasn't built - see `python -m xformers.info` for more info
I resolved this issue by following the solution in this thread:
pip install -U xformers --no-deps -qq
I hope this information helps in improving the setup process for future users. Any updates to the documentation or advice on these issues would be greatly appreciated.
Thank you for your time and effort in maintaining this project.
Best regards,
leo4life
When I run this code, the warning is as follows:
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.0.1+cu118 with CUDA 1108 (you have 1.12.1+cu113)
Python 3.9.16 (you have 3.9.18)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details
but in the environment.yml,
the PyTorch version is not the same as the one required by xformers; will this have an impact on the results?
I can't find the training code for the zero-shot avatar.
This code is only for the head. Do you have plans to update it for zero-shot avatar generation?
Hi, thanks for the awesome work and the code; it brings me lots of valuable insights about SDS!
From the paper, I think the multi-step DDIM baseline can also solve the low feature consistency and low-quality problems of the vanilla SDS loss; the proposed ISM is a sped-up version of this baseline method. What about comparisons between the proposed ISM and this multi-step DDIM baseline in terms of running time and quality of the text-to-3D results?