3DTopia

A two-stage text-to-3D generation model. The first stage uses a diffusion model to quickly generate candidates. The second stage refines the assets chosen from the first stage.

News

[2024/03/10] Our captions for Objaverse are released here.

[2024/03/04] Our technical report is released here.

[2024/01/18] We release a text-to-3D model 3DTopia!

Citation

@article{hong20243dtopia,
  title={3DTopia: Large Text-to-3D Generation Model with Hybrid Diffusion Priors},
  author={Hong, Fangzhou and Tang, Jiaxiang and Cao, Ziang and Shi, Min and Wu, Tong and Chen, Zhaoxi and Wang, Tengfei and Pan, Liang and Lin, Dahua and Liu, Ziwei},
  journal={arXiv preprint arXiv:2403.02234},
  year={2024}
}

1. Quick Start

1.1 Install Environment for this Repository

We recommend using Anaconda to manage the environment.

conda env create -f environment.yml

1.2 Install Second Stage Refiner

Please refer to threefiner to install our second-stage mesh refiner. We have tested installing both environments together with PyTorch 1.12.0 and CUDA 11.3.

1.3 Download Checkpoints [Optional]

Both gradio_demo.py and sample_stage1.py download the checkpoint automatically. If you prefer to download it manually, get 3dtopia_diffusion_state_dict.ckpt or model.safetensors from Hugging Face.

Q&A

  • If you encounter the error ImportError: /lib64/libc.so.6: version 'GLIBC_2.25' not found in the second stage, try installing an older version of pymeshlab with pip install pymeshlab==0.2.
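The error above suggests that the system glibc is older than 2.25. As a rough pre-check, the following sketch reads the glibc version with Python's standard `platform.libc_ver()` and reports whether the pinned pymeshlab is likely to be needed. The helper name `glibc_older_than` is hypothetical and not part of this repository, and the check is only a heuristic based on the error message.

```python
import platform


def glibc_older_than(version, minimum=(2, 25)):
    """Return True if a glibc version string such as "2.17" is below `minimum`."""
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    return major_minor < minimum


if __name__ == "__main__":
    # libc_ver() returns e.g. ("glibc", "2.31") on Linux and ("", "") elsewhere.
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        if glibc_older_than(version):
            print(f"glibc {version} < 2.25: consider pip install pymeshlab==0.2")
        else:
            print(f"glibc {version}: the GLIBC_2.25 error is unlikely")
    else:
        print("Could not detect glibc on this platform")
```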

2. Inference

2.1 First Stage

Run the following command to generate a first-stage sample for the prompt "a robot". Results are written to the results folder.

python -u sample_stage1.py --text "a robot" --samples 1 --sampler ddim --steps 200 --cfg_scale 7.5 --seed 0

Arguments:

  • --ckpt specifies checkpoint file path;
  • --test_folder sets the subfolder in which all results are placed;
  • --seed fixes the random seed;
  • --sampler can be set to ddim for DDIM sampling (by default, 1000-step DDPM sampling is used);
  • --steps sets the number of sampling steps (DDIM only);
  • --samples controls number of samples;
  • --text is the input text;
  • --no_video and --no_mcubes disable multi-view video rendering and marching cubes, both of which are enabled by default;
  • --mcubes_res controls the resolution of the 3D volume sampled for marching cubes; lower it to save GPU memory;
  • --render_res controls the resolution of the rendered video.
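To sample several prompts in a row, the command above can be assembled programmatically. The following is a minimal sketch rather than part of the repository: the helper `build_stage1_command` and its defaults are hypothetical, and it uses only the flags shown in the example command. It assumes the script is launched from the repository root so that `sample_stage1.py` resolves.

```python
import shlex


def build_stage1_command(text, samples=1, seed=0, use_ddim=True,
                         steps=200, cfg_scale=7.5, python="python"):
    """Assemble the sample_stage1.py invocation for one prompt (hypothetical helper)."""
    cmd = [python, "-u", "sample_stage1.py",
           "--text", text,
           "--samples", str(samples)]
    if use_ddim:
        # --steps only applies to DDIM; without --sampler the script
        # falls back to its default 1000-step DDPM sampling.
        cmd += ["--sampler", "ddim", "--steps", str(steps)]
    cmd += ["--cfg_scale", str(cfg_scale), "--seed", str(seed)]
    return cmd


if __name__ == "__main__":
    for prompt in ["a robot", "a wooden chair"]:
        # Print the shell-quoted command; nothing is executed here.
        print(shlex.join(build_stage1_command(prompt)))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` to actually launch sampling.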

2.2 Second Stage

The second-stage refinement consists of two steps. Here is a simple example. Please refer to threefiner for more detailed usage.

# step 1
threefiner sd --mesh results/default/stage1/a_robot_0_0.ply --prompt "a robot" --text_dir --front_dir='-y' --outdir results/default/stage2/ --save a_robot_0_0_sd.glb
# step 2
threefiner if2 --mesh results/default/stage2/a_robot_0_0_sd.glb --prompt "a robot" --outdir results/default/stage2/ --save a_robot_0_0_if2.glb

The resulting mesh can be found at results/default/stage2/a_robot_0_0_if2.glb
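The file names in the two steps follow a simple pattern: the first-stage mesh stem gets an `_sd` suffix after step 1 and an `_if2` suffix after step 2. The sketch below derives both commands from a first-stage mesh path. The helper `build_stage2_commands` is hypothetical, and it reuses only the threefiner flags from the example above; see threefiner for the authoritative options.

```python
import shlex
from pathlib import PurePosixPath


def build_stage2_commands(stage1_mesh, prompt, outdir="results/default/stage2/"):
    """Return the two threefiner invocations for one first-stage mesh (hypothetical helper)."""
    stem = PurePosixPath(stage1_mesh).stem  # e.g. "a_robot_0_0"
    sd_name = f"{stem}_sd.glb"
    if2_name = f"{stem}_if2.glb"
    # Step 2 takes the output of step 1 as its input mesh.
    sd_mesh = str(PurePosixPath(outdir) / sd_name)
    step1 = ["threefiner", "sd", "--mesh", stage1_mesh, "--prompt", prompt,
             "--text_dir", "--front_dir=-y", "--outdir", outdir, "--save", sd_name]
    step2 = ["threefiner", "if2", "--mesh", sd_mesh, "--prompt", prompt,
             "--outdir", outdir, "--save", if2_name]
    return step1, step2


if __name__ == "__main__":
    for cmd in build_stage2_commands("results/default/stage1/a_robot_0_0.ply", "a robot"):
        # Print the shell-quoted command; nothing is executed here.
        print(shlex.join(cmd))
```

Run the two commands in order, for example with `subprocess.run(cmd, check=True)`, since step 2 reads the file that step 1 saves.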

3. Acknowledgement

We thank the community for building and open-sourcing the foundation of this work. Specifically, we want to thank EG3D and Stable Diffusion for their code. We also want to thank Objaverse for the wonderful dataset.
