envision-research / LucidDreamer
Official implementation of "LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching"
License: MIT License
Hi, congrats on LucidDreamer! It would be great to set up a Gradio demo for it on Hugging Face.
In case you're interested, this is a step-by-step guide explaining the process of creating a Gradio SDK Space on Spaces. This is our docs on Community GPU Grants. cc: @yvrjsharma
How can I get the 3d mesh after training?
I was following the instructions and everything installs, but it gives this error at the end:
ERROR: Could not find a version that satisfies the requirement triton (from versions: none)
ERROR: No matching distribution found for triton
Any ideas? I'm using Anaconda on Windows 11 with an RTX 4090.
Hi, thanks for the impressive work! I checked Eq. (11) in the latest arXiv paper but found that it is not consistent with the original DDIM process.
I also noticed #15, but there still seems to be a mistake.
In the paper "Denoising diffusion implicit models", Eq.(12) shows:
However, from the paper and #15, the DDIM process seems to be:
I think they are not equivalent. In the original DDIM process, we have:
In the paper's version of DDIM, we have:
Is there a mistake in the paper, or am I missing something?
Update: sorry, I checked again and found the
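For readers without the DDIM paper at hand, this is Eq. (12) of "Denoising Diffusion Implicit Models" (in that paper's notation, where α denotes the cumulative product), which is the reference point the question above compares against:

```latex
x_{t-1} = \sqrt{\alpha_{t-1}}
          \left( \frac{x_t - \sqrt{1-\alpha_t}\,\epsilon_\theta^{(t)}(x_t)}{\sqrt{\alpha_t}} \right)
        + \sqrt{1-\alpha_{t-1}-\sigma_t^2}\;\epsilon_\theta^{(t)}(x_t)
        + \sigma_t \epsilon_t
```

Setting σ_t = 0 gives the deterministic DDIM sampler.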
Thanks for the great work!
In the main text, it is mentioned that details are provided in the supplementary material. However, it cannot be found in the arXiv version.
Could you please update the paper to add the supplementary material? Thx :)
I want to load a trained LucidDreamer model and then proceed with further fine-tuning; what should I do?
safetensors_rust.SafetensorError: Error while deserializing header: MetadataIncompleteBuffer
Hi authors, thanks a lot for the excellent work, and I appreciate the public code!
As the title shows, it runs into a problem when the algorithm reaches diff_gaussian_rasterization, specifically the following line:
num_rendered, color, depth, radii, geomBuffer, binningBuffer, imgBuffer = _C.rasterize_gaussians(*args)
in the "diff_gaussian_rasterization/__init__.py" file.
I notice that someone has already mentioned this issue (no solution provided), but I cannot find a solution.
Is there any workaround to fix the problem?
Could you please help with that?
Thanks in advance!!
Thanks for sharing this excellent work!
I would like to ask how I can visualize the Gaussian points and the rendered results in real time. Are there any relevant tutorials or a README?
Looking forward to your reply!
Hi, Really great work.
I was wondering if your codebase was based on threestudio? Further, do you have an intended timescale for your release? And do you know which license you intend to use?
First, you may need to change model_key: in configs/<config_file>.yaml to point to the local pretrained diffusion model (Stable Diffusion 2.1-base by default).
I wonder how to find the config.json file. After downloading the model to "/home/xxx/LucidDreamer-main/stabilityai/stable-diffusion-2-1-base", I came across this error.
When I create the env, I face "packages are not available from current channels".
The generated characters have a yellowish tint; is there any way to fix this by changing the prompt or parameters?
Hi,
Thanks for your great work.
I tried to reproduce the figures such as Fig. 1 in the paper, following the training scripts in ./configs with some modifications, but the results are less than satisfactory.
Could you give more training configs of Fig. 1 in the paper?
Thanks.
I have installed the GaussianDreamer environment using:
pip install submodules/diff-gaussian-rasterization/
pip install submodules/simple-knn/
but when I run python train.py --opt './configs/bagel.yaml', I hit the problems shown in the image above.
For (1), I modified 'rendered_image, radii, depth = rasterizer(xxx)' to 'rendered_image, radii, depth, _ = rasterizer(xxx)' and it can run,
but then I encountered (2). Can you give me some advice? Thank you!
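For what it's worth, the arity error in (1) typically means the installed fork of diff-gaussian-rasterization returns a different number of values than the calling code expects. A version-tolerant unpacking sketch (the stub rasterizer below is hypothetical, just to keep the snippet self-contained; the real one returns CUDA tensors):

```python
def rasterizer(*args):
    # Hypothetical stub: newer forks return an extra alpha/depth buffer.
    return ("image", "radii", "depth", "alpha")

out = rasterizer()
if len(out) == 4:
    # newer fork: 4 return values; discard the extra one
    rendered_image, radii, depth, _ = out
else:
    # older fork: 3 return values
    rendered_image, radii, depth = out

print(depth)  # -> "depth" regardless of which fork is installed
```

This avoids hardcoding the tuple length, so the same training code runs against either build of the extension.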
Hi, excellent work with impressive results.
When I read the paper, I have some questions.
In Algorithm 1, lines 6 and 7 use a notation j, but the discussion of it is missing. I guess
Besides, line 3 claims that
For example, if
Thanks! :)
Thank you for your great work! In Algorithm 1, the added noise is fixed (determined by the U-Net and x_0). However, in the "train_step_perpneg" function, a random noise is added, which differs from Algorithm 1.
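To make the distinction concrete: in Algorithm 1 the noisy latent comes from DDIM inversion, which is a deterministic function of x_0 through the noise predictor, rather than from sampling ε ~ N(0, I). A toy 1-D sketch of one deterministic inversion step (the eps_model below is a hypothetical stand-in for the U-Net, and the alpha schedule is made up):

```python
import math

def eps_model(x, t):
    # Hypothetical stand-in for the U-Net noise prediction.
    return 0.1 * x

def ddim_invert_step(x_s, s, t, alphas):
    """One deterministic DDIM inversion step from timestep s to a noisier t."""
    a_s, a_t = alphas[s], alphas[t]
    eps = eps_model(x_s, s)                                      # predicted noise at s
    x0_pred = (x_s - math.sqrt(1 - a_s) * eps) / math.sqrt(a_s)  # implied clean sample
    return math.sqrt(a_t) * x0_pred + math.sqrt(1 - a_t) * eps   # re-noise toward t

alphas = {0: 0.99, 1: 0.95}  # hypothetical cumulative-alpha schedule
x_t = ddim_invert_step(1.0, 0, 1, alphas)
# Identical inputs always yield the identical x_t: the "noise" is deterministic.
assert ddim_invert_step(1.0, 0, 1, alphas) == x_t
```

If train_step_perpneg draws fresh Gaussian noise instead, it indeed departs from this deterministic construction, which is presumably what the question is pointing at.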
Thank you for your amazing work. However, I am having serious issues with the installation of requirements, even after following the same instructions.
Would you mind sharing the environment.yml file over email?
Hi all, thanks for the great work. I was using the template demo to reproduce the results; however, during the training process, Loss=nan
showed up while the program kept going on and on.
the command was like
python train.py --opt configs/bagel.yaml
python train.py --opt configs/cat_armor.yaml
[ITER 1] Video Save Done! [03/06 17:25:29]
Training progress:  10% | 490/5000 [03:31<30:44, 2.45it/s, Loss=nan]
scale up theta_range to: [45, 105] [03/06 17:29:00]
scale up radius_range to: [4.9399999999999995, 5.225] [03/06 17:29:00]
scale up phi_range to: [-180, 180] [03/06 17:29:00]
scale up fovy_range to: [0.24, 0.6] [03/06 17:29:00]
Training progress:  20% | 990/5000 [06:57<26:05, 2.56it/s, Loss=nan]
scale up theta_range to: [45, 105] [03/06 17:32:26]
scale up radius_range to: [4.693, 5.0] [03/06 17:32:26]
scale up phi_range to: [-180, 180] [03/06 17:32:26]
scale up fovy_range to: [0.18, 0.6] [03/06 17:32:26]
Training progress:  20% | 1000/5000 [07:01<26:54, 2.48it/s, Loss=nan]
May I ask what the data format is for the final point_cloud_rgb.txt and point_cloud.ply, and whether it is possible to provide code for extracting a mesh / obtaining normals?
Thanks for sharing the code and presenting such a great paper!
I have a question about how Equation 7 is derived from Equation 5, and how the gradient computation and the additional gamma term are included in the equation. Could you please provide some insights or explanations on this?
Thank you!
Hello, I've been attempting to replicate the results from your paper, specifically the text "A portrait of Hatsune Miku, robot" using the Civitai model. Unfortunately, the outcomes I'm getting are quite poor and don't resemble the results shown in the paper.
I am unsure if there is a specific configuration that I might be missing; could you provide a config file that reproduces the results as they appear in the publication?
Thank you very much for your assistance.
After installing requirements.txt, I encountered "ModuleNotFoundError: No module named 'point_e'". Where can I get this package? Thank you for your help.
Thanks for sharing this excellent work!
I would like to ask how I can get the generated 3D model afterwards.
Looking forward to your reply!
I ran into the following exception when I made some modifications on learning rates.
Training progress:  21% | 1030/5000 [15:14<1:43:53, 1.57s/it, Loss=0.9026756]
Error executing job with overrides: ['+wandb_key=xxx']
Traceback (most recent call last):
  File "/LucidDreamer/train.py", line 622, in main
    training(lp, op, pp, gcp, gp, hg_params, cfg.test_iterations, cfg.save_iterations, cfg.checkpoint_iterations,
  File "/LucidDreamer/train.py", line 349, in training
    render_pkg = render(viewpoint_cam, gaussians, pipe, background,
  File "/LucidDreamer/gaussian_renderer/__init__.py", line 146, in render
    rendered_image, radii, depth_alpha = rasterizer(
  File "/LucidDreamer/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/LucidDreamer/venv/lib/python3.10/site-packages/diff_gaussian_rasterization/__init__.py", line 186, in forward
    return rasterize_gaussians(
  File "/LucidDreamer/venv/lib/python3.10/site-packages/diff_gaussian_rasterization/__init__.py", line 28, in rasterize_gaussians
    return _RasterizeGaussians.apply(
  File "/LucidDreamer/venv/lib/python3.10/site-packages/torch/autograd/function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/LucidDreamer/venv/lib/python3.10/site-packages/diff_gaussian_rasterization/__init__.py", line 78, in forward
    num_rendered, color, depth, radii, geomBuffer, binningBuffer, imgBuffer = _C.rasterize_gaussians(*args)
RuntimeError: numel: integer multiplication overflow
Specifically, the optimization params that I use are as follows:
as_latent_ratio: 0.2
densification_interval: 100
densify_from_iter: 100
densify_grad_threshold: 0.00075
densify_until_iter: 3000
feature_lr: 0.01
feature_lr_final: 0.0005
fovy_scale_up_factor:
- 0.75
- 1.1
geo_iter: 0
iterations: 5000
lambda_scale: 0.0
lambda_tv: 0.0
opacity_lr: 0.01
opacity_reset_interval: 300
percent_dense: 0.003
phi_scale_up_factor: 1.5
position_lr_delay_mult: 0.01
position_lr_final: 1.6e-06
position_lr_init: 0.00016
position_lr_max_steps: 30000
pro_frames_num: 600
pro_render_45: false
progressive_view_init_ratio: 0.2
progressive_view_iter: 500
rotation_lr: 0.01
rotation_lr_final: 0.0005
save_process: true
scale_up_cameras_iter: 500
scale_up_factor: 0.95
scaling_lr: 0.01
scaling_lr_final: 0.0005
use_control_net_iter: 10000000
use_progressive: false
warmup_iter: 1500
I've also checked this issue from the original gaussian-splatting repo, with little help: graphdeco-inria/gaussian-splatting#24
I wonder if similar issues have been encountered before, and what possible methods there are to mitigate this?
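One possible reading of "numel: integer multiplication overflow" (a hypothesis, not confirmed from the rasterizer source): an aggressive densify_grad_threshold can multiply the number of Gaussians, and the rasterizer's buffer sizes grow with Gaussians × overlapped screen tiles, which can exceed what a 32-bit index holds. A back-of-the-envelope check with purely made-up numbers:

```python
INT32_MAX = 2**31 - 1

# Hypothetical figures: Gaussian count after aggressive densification,
# and the average number of screen tiles each Gaussian overlaps.
num_gaussians = 5_000_000
tiles_per_gaussian = 600

num_rendered = num_gaussians * tiles_per_gaussian  # buffer element count
print(num_rendered > INT32_MAX)  # True: well past the 32-bit limit
```

If this is the failure mode, raising densify_grad_threshold or lowering densify_until_iter would shrink the Gaussian count and the buffers with it.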
According to the multi-step DDIM sampling, Section 3.2 mentions that Eqn. (13) is derived from Eqn. (11).
However, this is quite confusing, since Eqn. (11) seems incorrect.
The DDIM sampling seems to be:
Since
Also, the notation of the sampling latents
You mentioned LoRA in the paper, so I'm just asking for clarification, since running LucidDreamer takes almost twice as much time as running ProlificDreamer.
Hello,
I'm currently engaging with the LucidDreamer project and have been following the installation instructions in the Gradio Demo guide. I would like to report some issues I encountered during this process, along with the solutions that worked for me.
Initial Setup:
As per the guide's instructions, I started by creating a new Conda environment with the following command:
conda create -n LD_Demo python=3.9.16 cudatoolkit=11.8 -y
This step was completed successfully, setting up an environment with Python 3.9.16 and CUDA Toolkit 11.8.
Dependency Installation Issues:
However, I encountered problems when trying to install specific dependencies:
pip install git+https://github.com/YixunLiang/diff-gaussian-rasterization.git
pip install git+https://github.com/YixunLiang/simple-knn.git
The error message I received was:
raise RuntimeError(CUDA_MISMATCH_MESSAGE.format(cuda_str_version, torch.version.cuda))
RuntimeError: The detected CUDA version (12.3) mismatches the version that was used to compile PyTorch (11.7). Please make sure to use the same CUDA versions.
Despite my system having CUDA version 12.3, I anticipated that creating the Conda environment with cudatoolkit=11.8
would resolve any version conflicts. To address this issue, I had to uninstall and then reinstall PyTorch and its associated libraries within the Conda environment:
pip uninstall torch torchvision torchaudio
pip install torch torchvision torchaudio
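Before rebuilding the extensions, it may help to confirm exactly which two CUDA versions disagree. A minimal sketch with hardcoded stand-in version strings (in practice, read them from python -c "import torch; print(torch.version.cuda)" and from nvcc --version):

```shell
# Hypothetical version strings standing in for the real queries above.
torch_cuda="11.7"    # CUDA version PyTorch was compiled against
system_cuda="12.3"   # CUDA toolkit version found on the system

if [ "$torch_cuda" != "$system_cuda" ]; then
  echo "mismatch: reinstall PyTorch built for CUDA $system_cuda, or install toolkit $torch_cuda"
fi
```

A plain `pip install torch` pulls whatever wheel is current, so pinning the wheel index to the intended CUDA build (as documented on the PyTorch install page) is usually the more reproducible fix.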
After these adjustments, I was able to successfully install the dependencies and run the Gradio demo. It's also worth noting that the command mentioned in the documentation seems to be outdated; the correct command now appears to be python app.py --cuda $LD_CUDA, not gradio.demo.py.
Training Issues:
During training, I encountered several errors related to xFormers:
`flshattF` is not supported because:
    xFormers wasn't build with CUDA support
    dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
    Operator wasn't built - see `python -m xformers.info` for more info
I resolved this issue by following the solution in this thread:
pip install -U xformers --no-deps -qq
I hope this information helps in improving the setup process for future users. Any updates to the documentation or advice on these issues would be greatly appreciated.
Thank you for your time and effort in maintaining this project.
Best regards,
leo4life
When I run this code, the warning is as follows:
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.0.1+cu118 with CUDA 1108 (you have 1.12.1+cu113)
Python 3.9.16 (you have 3.9.18)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details
but in the environment.yml,
the PyTorch version is not the same as the one required by xformers; will this have an impact on the results?
I can't find the training code for the zero-shot avatar.
This code is only for the head. Do you have plans to update it for zero-shot avatar generation?
Hi, thanks for the awesome work and the code; it brings me lots of valuable insights about SDS!
From the paper, I think the multi-step DDIM baseline can also solve the low feature consistency and low-quality problems of the vanilla SDS loss; the proposed ISM is a sped-up version of this baseline method. What about comparisons between the proposed ISM and this multi-step DDIM baseline in terms of running time and quality of the text-to-3D results?