
GaussianCube: A Structured and Explicit Radiance Representation for 3D Generative Modeling

By Bowen Zhang, Yiji Cheng, Jiaolong Yang, Chunyu Wang, Feng Zhao, Yansong Tang, Dong Chen, Baining Guo.

Paper | Project Page | Code

Abstract

We introduce a radiance representation that is both structured and fully explicit and thus greatly facilitates 3D generative modeling. Existing radiance representations either require an implicit feature decoder, which significantly degrades the modeling power of the representation, or are spatially unstructured, making them difficult to integrate with mainstream 3D diffusion methods. We derive GaussianCube by first using a novel densification-constrained Gaussian fitting algorithm, which yields high-accuracy fitting using a fixed number of free Gaussians, and then rearranging these Gaussians into a predefined voxel grid via Optimal Transport. Since GaussianCube is a structured grid representation, it allows us to use a standard 3D U-Net as our backbone in diffusion modeling without elaborate designs. More importantly, the high-accuracy fitting of the Gaussians allows us to achieve a high-quality representation with one to two orders of magnitude fewer parameters than previous structured representations of comparable quality. The compactness of GaussianCube greatly eases the difficulty of 3D generative modeling. Extensive experiments conducted on unconditional and class-conditioned object generation, digital avatar creation, and text-to-3D synthesis all show that our model achieves state-of-the-art generation results both qualitatively and quantitatively, underscoring the potential of GaussianCube as a highly accurate and versatile radiance representation for 3D generative modeling.
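
To make the rearrangement step concrete, here is a toy sketch; it is not the authors' implementation. With N Gaussians and N voxels of equal mass, Optimal Transport reduces to a one-to-one assignment that minimizes the total transport cost. The squared-distance cost, the [-1, 1]^3 grid extent, the 2x2x2 grid size, and all function names are assumptions made for illustration, and the brute-force search is only feasible at this toy scale.

```python
from itertools import permutations, product


def voxel_centers(n, lo=-1.0, hi=1.0):
    """Return the centers of an n x n x n voxel grid spanning [lo, hi]^3."""
    step = (hi - lo) / n
    axis = [lo + step * (i + 0.5) for i in range(n)]
    return list(product(axis, repeat=3))


def squared_distance(a, b):
    """Squared Euclidean distance between two 3D points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_to_voxels(positions, centers):
    """Brute-force the one-to-one assignment with minimal total squared distance.

    Returns (assignment, cost), where assignment[i] is the index of the voxel
    that receives Gaussian i. Only feasible for a handful of points.
    """
    if len(positions) != len(centers):
        raise ValueError("need exactly one Gaussian per voxel")
    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(len(centers))):
        cost = sum(squared_distance(p, centers[j]) for p, j in zip(positions, perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return list(best_perm), best_cost


centers = voxel_centers(2)  # 8 voxels
# Toy "Gaussian" positions: the voxel centers in reverse order, slightly offset.
positions = [(x + 0.1, y - 0.1, z + 0.05) for x, y, z in reversed(centers)]
assignment, cost = assign_to_voxels(positions, centers)
print(assignment, round(cost, 4))
```

At realistic sizes the number of Gaussians is far too large for brute force, so a dedicated assignment or transport solver would be needed instead.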

Environment Setup

We recommend Linux for performance and compatibility reasons. We use conda to manage the environment. Please install conda first if you haven't already.

git clone https://github.com/GaussianCube/GaussianCube.git
cd GaussianCube
conda env create -f environment.yml
conda activate gaussiancube

Model Download

Please download the model checkpoints and dataset statistics (pre-computed mean and std files) from the following links:

Huggingface

Model | Task | Download
OmniObject3D | Class-conditioned Generation | 🤗 Hugging Face
ShapeNet Car | Unconditional Generation | 🤗 Hugging Face
ShapeNet Chair | Unconditional Generation | 🤗 Hugging Face

Inference

Class-conditioned Generation on OmniObject3D

To run inference with the pretrained OmniObject3D model, save the downloaded model checkpoint and dataset statistics to ./OmniObject3D/, then run:

python inference.py --exp_name /tmp/OmniObject3D_test --config configs/omni_class_cond.yml  --rescale_timesteps 300 --ckpt ./OmniObject3D/OmniObject3D_ckpt.pt  --mean_file ./OmniObject3D/mean.pt --std_file ./OmniObject3D/std.pt  --bound 1.0 --num_samples 10 --render_video --class_cond

Unconditional Generation on ShapeNet

To run inference with the pretrained ShapeNet Car model, save the downloaded model checkpoint and dataset statistics to ./shapenet_car/, then run:

python inference.py --exp_name /tmp/shapenet_car_test --config configs/shapenet_uncond.yml  --rescale_timesteps 300 --ckpt ./shapenet_car/shapenet_car_ckpt.pt  --mean_file ./shapenet_car/mean.pt  --std_file ./shapenet_car/std.pt  --bound 0.45 --num_samples 10 --render_video

To run inference with the pretrained ShapeNet Chair model, save the downloaded model checkpoint and dataset statistics to ./shapenet_chair/, then run:

python inference.py --exp_name /tmp/shapenet_chair_test --config configs/shapenet_uncond.yml  --rescale_timesteps 300 --ckpt ./shapenet_chair/shapenet_chair_ckpt.pt  --mean_file ./shapenet_chair/mean.pt  --std_file ./shapenet_chair/std.pt  --bound 0.35 --num_samples 10 --render_video
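
The three inference commands above differ only in the model directory, the config file, the --bound value, and the --class_cond flag. As a convenience, a small Python sketch can assemble them from a table. The helper name build_inference_command and the MODELS list are hypothetical; the flags and paths are copied from the commands above, and the sketch assumes the checkpoints and statistics were saved to the directories described there.

```python
import shlex


def build_inference_command(name, config, bound, class_cond=False,
                            num_samples=10, out_root="/tmp"):
    """Assemble the inference.py argument list for one pretrained model.

    Assumes files are stored as ./<name>/<name>_ckpt.pt, ./<name>/mean.pt
    and ./<name>/std.pt, matching the commands in this README.
    """
    cmd = [
        "python", "inference.py",
        "--exp_name", f"{out_root}/{name}_test",
        "--config", config,
        "--rescale_timesteps", "300",
        "--ckpt", f"./{name}/{name}_ckpt.pt",
        "--mean_file", f"./{name}/mean.pt",
        "--std_file", f"./{name}/std.pt",
        "--bound", str(bound),
        "--num_samples", str(num_samples),
        "--render_video",
    ]
    if class_cond:
        cmd.append("--class_cond")
    return cmd


# (directory name, config file, bound, class-conditioned?)
MODELS = [
    ("OmniObject3D", "configs/omni_class_cond.yml", 1.0, True),
    ("shapenet_car", "configs/shapenet_uncond.yml", 0.45, False),
    ("shapenet_chair", "configs/shapenet_uncond.yml", 0.35, False),
]

for name, config, bound, class_cond in MODELS:
    print(shlex.join(build_inference_command(name, config, bound, class_cond)))
```

To launch a command instead of printing it, the returned list can be passed to subprocess.run(cmd, check=True) from the repository root.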

Mesh Conversion

We provide a script that converts a generated GaussianCube to a mesh, following LGM. First, install the additional dependencies:

# for mesh extraction
pip install nerfacc
pip install git+https://github.com/NVlabs/nvdiffrast
# install diff_gauss for alpha rendering
git clone --recurse-submodules https://github.com/slothfulxtx/diff-gaussian-rasterization.git 
cd diff-gaussian-rasterization
python setup.py install

Then, from the GaussianCube repository root, run the following command to convert the generated results to a mesh:

python scripts/convert_mesh.py --test_path /tmp/shapenet_car_test/rank_00_0000.pt --cam_radius 1.2 --bound 0.45 --mean_file ./shapenet_car/mean.pt --std_file ./shapenet_car/std.pt
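
The command above converts a single sample. To convert every sample in an output directory, a sketch like the following can build one command per file. The helper name build_convert_commands is hypothetical, and the rank_*.pt glob pattern is an assumption inferred from the single file name shown above; adjust it to match the files that inference actually produced.

```python
from pathlib import Path


def build_convert_commands(result_dir, stats_dir, bound, cam_radius=1.2,
                           pattern="rank_*.pt"):
    """Return one convert_mesh.py argument list per generated sample file."""
    stats_dir = Path(stats_dir)
    commands = []
    # Sort so the samples are processed in a stable, predictable order.
    for sample in sorted(Path(result_dir).glob(pattern)):
        commands.append([
            "python", "scripts/convert_mesh.py",
            "--test_path", str(sample),
            "--cam_radius", str(cam_radius),
            "--bound", str(bound),
            "--mean_file", str(stats_dir / "mean.pt"),
            "--std_file", str(stats_dir / "std.pt"),
        ])
    return commands


for cmd in build_convert_commands("/tmp/shapenet_car_test", "./shapenet_car", bound=0.45):
    print(" ".join(cmd))
```

Use the same --bound and statistics files that were used at inference time for that model, as in the single-file command.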

Training

Data Preparation

Please refer to data_construction to prepare the training data. Then arrange the data in the following structure (using ShapeNet as an example):

example_data
├── shapenet
│   ├── mean_volume_act.pt
│   ├── std_volume_act.pt
│   ├── shapenet_train.txt
│   ├── volume_act
│   └── shapenet_rendering_512
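
Before starting a long training run, it can help to confirm that this layout is in place. The following is a minimal sketch that checks for the entries listed in the tree above; the function name missing_entries is hypothetical, and the check only tests that each entry exists, not that its contents are valid.

```python
from pathlib import Path

# Entries expected directly under example_data/shapenet, per the tree above.
EXPECTED_ENTRIES = [
    "mean_volume_act.pt",
    "std_volume_act.pt",
    "shapenet_train.txt",
    "volume_act",
    "shapenet_rendering_512",
]


def missing_entries(dataset_root, expected=EXPECTED_ENTRIES):
    """Return the expected files or directories that are absent from dataset_root."""
    root = Path(dataset_root)
    return [name for name in expected if not (root / name).exists()]


missing = missing_entries("./example_data/shapenet")
if missing:
    print("Missing from ./example_data/shapenet:", ", ".join(missing))
else:
    print("Data layout looks complete.")
```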

Unconditional Diffusion Training on ShapeNet Car or ShapeNet Chair

Run the following command to train the model:

python main.py --log_interval 100 --batch_size 8 --lr 5e-5 --exp_name ./output/shapenet_diffusion_training --save_interval 5000 --config configs/shapenet_uncond.yml --use_tensorboard --use_vgg --load_camera 1 --render_l1_weight 10 --render_lpips_weight 10 --use_fp16 --mean_file ./example_data/shapenet/mean_volume_act.pt --std_file ./example_data/shapenet/std_volume_act.pt --data_dir ./example_data/shapenet/volume_act --cam_root_path ./example_data/shapenet/shapenet_rendering_512/ --txt_file ./example_data/shapenet/shapenet_train.txt --bound 0.45 --start_idx 0 --end_idx 100 --clip_input

Class-conditioned Diffusion Training on OmniObject3D

Run the following command to train the model:

python main.py --log_interval 100 --batch_size 8 --lr 5e-5 --exp_name ./output/omniobject3d_diffusion_training --save_interval 5000 --config configs/omni_class_cond.yml --use_tensorboard --use_vgg --load_camera 1 --render_l1_weight 10 --render_lpips_weight 10 --use_fp16 --mean_file ./example_data/omniobject3d/mean_volume_act.pt --std_file ./example_data/omniobject3d/std_volume_act.pt --data_dir ./example_data/omniobject3d/volume_act --cam_root_path ./example_data/omniobject3d/Omniobject3d_rendering_512/ --txt_file ./example_data/omniobject3d/omni_train.txt --uncond_p 0.2 --bound 1.0 --start_idx 0 --end_idx 100 --clip_input --omni

Text-conditioned Diffusion Training on Objaverse

Extract the CLIP features of text captions and put them under ./example_data/objaverse/ using the following script:

python scripts/encode_text_features.py

Then run the following command to train the model:

python main.py --log_interval 100 --batch_size 8 --lr 5e-5 --weight_decay 0 --exp_name ./output/objaverse_diffusion_training --save_interval 5000 --config configs/objaverse_text_cond.yml --use_tensorboard --use_vgg --load_camera 1 --render_l1_weight 10 --render_lpips_weight 10 --use_fp16 --data_dir ./example_data/objaverse/volume_act/ --start_idx 0 --end_idx 100 --txt_file ./example_data/objaverse/objaverse_train.txt --mean_file ./example_data/objaverse/mean_volume_act.pt --std_file ./example_data/objaverse/std_volume_act.pt --cam_root_path ./example_data/objaverse/objaverse_rendering_512/ --bound 0.5 --uncond_p 0.2 --objaverse --clip_input --text_feature_root ./example_data/objaverse/objaverse_text_feature/

Acknowledgement

This codebase is built upon improved-diffusion; we thank the authors for their great work. We also thank the authors of Cap3D and VolumeDiffusion for the text captions of the Objaverse dataset.

Citation

If you find this work useful, please consider citing:

@article{zhang2024gaussiancube,
  title={GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling},
  author={Zhang, Bowen and Cheng, Yiji and Yang, Jiaolong and Wang, Chunyu and Zhao, Feng and Tang, Yansong and Chen, Dong and Guo, Baining},
  journal={arXiv preprint arXiv:2403.19655},
  year={2024}
}

Todo

  • Release the inference code.
  • Release all pretrained models.
  • Release the data construction code.
  • Release the diffusion training code.
