tqtqliu / mvsgaussian

[ECCV 2024] MVSGaussian: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo

Home Page: https://mvsgaussian.github.io/

License: MIT License

Python 99.34% Shell 0.66%
eccv2024 gaussian-splatting generalizable multi-view-stereo novel-view-synthesis

mvsgaussian's Introduction

MVSGaussian: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo

1Huazhong University of Science and Technology  2Nanyang Technological University
3Great Bay University  4Shanghai AI Laboratory

TL;DR: MVSGaussian is a Gaussian-based method designed for efficient reconstruction of unseen scenes from sparse views in a single forward pass. It offers high-quality initialization for fast training and real-time rendering.

⚡ Updates

  • [2024.07.16] The latest code supports multi-batch training (details) and inference; a single RTX 3090 GPU is sufficient to reproduce all of our experimental results.
  • [2024.07.16] Added a Demo (Custom Data) that only requires multi-view images as input.
  • [2024.07.10] Code and checkpoints are released.
  • [2024.07.01] Our work was accepted to ECCV 2024.
  • [2024.05.21] Project Page | arXiv | YouTube released.

🌟 Abstract

We present MVSGaussian, a new generalizable 3D Gaussian representation approach derived from Multi-View Stereo (MVS) that can efficiently reconstruct unseen scenes. Specifically, 1) we leverage MVS to encode geometry-aware Gaussian representations and decode them into Gaussian parameters. 2) To further enhance performance, we propose a hybrid Gaussian rendering that integrates an efficient volume rendering design for novel view synthesis. 3) To support fast fine-tuning for specific scenes, we introduce a multi-view geometric consistent aggregation strategy to effectively aggregate the point clouds generated by the generalizable model, serving as the initialization for per-scene optimization. Compared with previous generalizable NeRF-based methods, which typically require minutes of fine-tuning and seconds of rendering per image, MVSGaussian achieves real-time rendering with better synthesis quality for each scene. Compared with the vanilla 3D-GS, MVSGaussian achieves better view synthesis with less training computational cost. Extensive experiments on DTU, Real Forward-facing, NeRF Synthetic, and Tanks and Temples datasets validate that MVSGaussian attains state-of-the-art performance with convincing generalizability, real-time rendering speed, and fast per-scene optimization.

🔨 Installation

  • Clone our repository

    git clone https://github.com/TQTQliu/MVSGaussian.git
    cd MVSGaussian
    
  • Set up the python environment

    conda create -n mvsgs python=3.7.13
    conda activate mvsgs
    pip install -r requirements.txt
    pip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1 -f https://download.pytorch.org/whl/torch_stable.html
    
  • Install Gaussian Splatting renderer

    git clone https://github.com/graphdeco-inria/gaussian-splatting --recursive
    pip install gaussian-splatting/submodules/diff-gaussian-rasterization
    pip install gaussian-splatting/submodules/simple-knn
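
Once the steps above finish, a quick import check can confirm that the environment is usable. The sketch below is not part of the repository. It assumes the two renderer packages install under the module names `diff_gaussian_rasterization` and `simple_knn`, and it only reports whether each module can be located, without importing CUDA code.

```python
import importlib.util


def check_modules(names):
    """Return {module_name: bool} telling whether each module can be located."""
    status = {}
    for name in names:
        try:
            status[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or partially installed packages
            status[name] = False
    return status


if __name__ == "__main__":
    # Module names are assumptions based on the packages installed above.
    required = ["torch", "torchvision", "diff_gaussian_rasterization", "simple_knn"]
    for name, ok in check_modules(required).items():
        print(f"{name:30s} {'found' if ok else 'MISSING'}")
```

A module reported as MISSING usually means the corresponding `pip install` step did not complete.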
    

🤗 Demo (Custom Data)

First, prepare the multi-view images, then run COLMAP to estimate camera poses. Here we use examples/scene1 (examples data) as an example:

python lib/colmap/imgs2poses.py -s examples/scene1

Then run the following command to render novel views:

python run.py --type evaluate --cfg_file configs/mvsgs/colmap_eval.yaml test_dataset.data_root examples/scene1

or videos:

python run.py --type evaluate --cfg_file configs/mvsgs/colmap_eval.yaml test_dataset.data_root examples/scene1 save_video True
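
To process several custom scenes in one go, the two demo commands can be wrapped in a small driver. This is only a sketch: `build_eval_command` and `run_demo` are hypothetical helper names, and the script is assumed to be launched from the repository root so that the relative paths from the commands above resolve.

```python
import subprocess
import sys


def build_eval_command(data_root, save_video=False,
                       cfg_file="configs/mvsgs/colmap_eval.yaml"):
    """Assemble the run.py argument list used in the demo commands above."""
    cmd = [sys.executable, "run.py", "--type", "evaluate",
           "--cfg_file", cfg_file,
           "test_dataset.data_root", data_root]
    if save_video:
        cmd += ["save_video", "True"]
    return cmd


def run_demo(scene_dirs, save_video=False):
    """Run pose estimation, then evaluation, for each scene; stop on the first failure."""
    for scene in scene_dirs:
        subprocess.run([sys.executable, "lib/colmap/imgs2poses.py", "-s", scene], check=True)
        subprocess.run(build_eval_command(scene, save_video), check=True)
```

For instance, `run_demo(["examples/scene1"], save_video=True)` would issue the same two commands shown above for the example scene.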

📦 Datasets

🚂 Training

  • Train generalizable model

    To train a generalizable model from scratch on DTU, specify data_root in configs/mvsgs/dtu_pretrain.yaml first and then run:

    python train_net.py --cfg_file configs/mvsgs/dtu_pretrain.yaml train.batch_size 4
    

    You can specify the GPUs in configs/mvsgs/dtu_pretrain.yaml.

    Our code also supports multi-GPU training. The released pretrained model (paper) was trained on 4 RTX 3090 GPUs with a batch size of 1 per GPU:

    python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/mvsgs/dtu_pretrain.yaml distributed True gpus 0,1,2,3 train.batch_size 1
    

    You can also use 4 GPUs, with a batch size of 4 for each GPU:

    python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/mvsgs/dtu_pretrain.yaml distributed True gpus 0,1,2,3 train.batch_size 4
    

    We provide the results as a reference below:

    GPU     Batch  DTU                  Real Forward-facing  NeRF Synthetic       Tanks and Temples    Training time  Training  Checkpoint
    number  size   PSNR   SSIM   LPIPS  PSNR   SSIM   LPIPS  PSNR   SSIM   LPIPS  PSNR   SSIM   LPIPS  (per epoch)    memory
    1       4      28.23  0.963  0.075  24.19  0.860  0.164  26.57  0.948  0.070  23.50  0.879  0.137  ~12min         ~22G      1gpu_4batch
    4       1      28.21  0.963  0.076  24.07  0.857  0.164  26.46  0.948  0.071  23.29  0.878  0.139  ~5min          ~7G       4gpu_1batch (paper)
    4       4      28.56  0.964  0.073  24.02  0.858  0.165  26.28  0.947  0.072  23.14  0.876  0.147  ~14min         ~23G      4gpu_4batch
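
To compare the three configurations at a glance, the PSNR columns of the table can be put into a small script. The numbers below are copied from the table; the helper names are hypothetical, and the "effective batch size" is simply the number of GPUs times the per-GPU batch size.

```python
# PSNR values copied from the reference table above.
RESULTS = {
    "1gpu_4batch": {"gpus": 1, "batch": 4, "psnr": {
        "DTU": 28.23, "Real Forward-facing": 24.19,
        "NeRF Synthetic": 26.57, "Tanks and Temples": 23.50}},
    "4gpu_1batch": {"gpus": 4, "batch": 1, "psnr": {
        "DTU": 28.21, "Real Forward-facing": 24.07,
        "NeRF Synthetic": 26.46, "Tanks and Temples": 23.29}},
    "4gpu_4batch": {"gpus": 4, "batch": 4, "psnr": {
        "DTU": 28.56, "Real Forward-facing": 24.02,
        "NeRF Synthetic": 26.28, "Tanks and Temples": 23.14}},
}


def effective_batch_size(entry):
    """Total samples per optimization step across all GPUs."""
    return entry["gpus"] * entry["batch"]


def best_by_psnr(results, dataset):
    """Name of the configuration with the highest PSNR on the given dataset."""
    return max(results, key=lambda name: results[name]["psnr"][dataset])


if __name__ == "__main__":
    for dataset in RESULTS["1gpu_4batch"]["psnr"]:
        print(f"{dataset:20s} best: {best_by_psnr(RESULTS, dataset)}")
```

On these numbers, the largest effective batch (4 GPUs × 4) gives the highest PSNR on DTU, the training dataset, while the single-GPU run is highest on the three other benchmarks.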
  • Per-scene optimization

      One strategy is to optimize only the initial Gaussian point cloud provided by the generalizable model.

      bash scripts/mvsgs/llff_ft.sh
      bash scripts/mvsgs/nerf_ft.sh
      bash scripts/mvsgs/tnt_ft.sh
      

      We provide optimized Gaussian point clouds for each scene here.

      You can also run the following command to get the results of vanilla 3D-GS, whose initialization is obtained via COLMAP.

      bash scripts/3dgs/llff_ft.sh
      bash scripts/3dgs/nerf_ft.sh
      bash scripts/3dgs/tnt_ft.sh
      

      Note that for the LLFF dataset, the point cloud in the original dataset is obtained using all views. For a fair comparison, we regenerate the point cloud using only the training views, so we recommend downloading the LLFF dataset we processed.

      (Optional) Another approach is to optimize the entire pipeline, similar to NeRF-based methods.

      Here we take the fern scene from LLFF as an example:

      cd ./trained_model/mvsgs
      mkdir llff_ft_fern
      cp dtu_pretrain/latest.pth llff_ft_fern
      cd ../..
      python train_net.py --cfg_file configs/mvsgs/llff/fern.yaml
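
The copy step above can also be scripted when fine-tuning several scenes. The following is a minimal sketch: `prepare_finetune_dir` is a hypothetical helper, and the directory naming (`llff_ft_fern`) simply mirrors the example above.

```python
import shutil
from pathlib import Path


def prepare_finetune_dir(model_root, scene_tag, pretrain_tag="dtu_pretrain"):
    """Copy <model_root>/<pretrain_tag>/latest.pth into <model_root>/<scene_tag>/."""
    model_root = Path(model_root)
    src = model_root / pretrain_tag / "latest.pth"
    if not src.is_file():
        raise FileNotFoundError(f"pretrained checkpoint not found: {src}")
    dst_dir = model_root / scene_tag
    dst_dir.mkdir(parents=True, exist_ok=True)
    # copy2 keeps file metadata and returns the destination path
    return Path(shutil.copy2(src, dst_dir / "latest.pth"))


if __name__ == "__main__":
    print(prepare_finetune_dir("trained_model/mvsgs", "llff_ft_fern"))
```

Failing early when the pretrained checkpoint is missing avoids starting a fine-tuning run from random weights by accident.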
      

🎯 Evaluation

    • Evaluation on DTU

      Download the pretrained model and save it as trained_model/mvsgs/dtu_pretrain/latest.pth

      Use the following command to evaluate the pretrained model on DTU:

      python run.py --type evaluate --cfg_file configs/mvsgs/dtu_pretrain.yaml mvsgs.cas_config.render_if False,True mvsgs.cas_config.volume_planes 48,8 mvsgs.eval_depth True
      

      The rendered images will be saved in result/mvsgs/dtu_pretrain.

    • Evaluation on Real Forward-facing

      python run.py --type evaluate --cfg_file configs/mvsgs/llff_eval.yaml
      
    • Evaluation on NeRF Synthetic

      python run.py --type evaluate --cfg_file configs/mvsgs/nerf_eval.yaml
      
    • Evaluation on Tanks and Temples

      python run.py --type evaluate --cfg_file configs/mvsgs/tnt_eval.yaml
      
    • Render videos

      Add the save_video True argument to save videos, such as:

      python run.py --type evaluate --cfg_file configs/mvsgs/llff_eval.yaml save_video True
      

      For optimized Gaussians, add -v to save videos, such as:

      python lib/render.py -m output/$scene -p $dir_ply -v
      

      See scripts/mvsgs/nerf_ft.sh for $scene and $dir_ply.

📝 Citation

    If you find our work useful for your research, please cite our paper.

    @article{liu2024mvsgaussian,
        title={MVSGaussian: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo},
        author={Liu, Tianqi and Wang, Guangcong and Hu, Shoukang and Shen, Liao and Ye, Xinyi and Zang, Yuhang and Cao, Zhiguo and Li, Wei and Liu, Ziwei},
        journal={arXiv preprint arXiv:2405.12218},
        year={2024}
    }
    

😃 Acknowledgement

    This project is built on source code shared by Gaussian-Splatting, ENeRF, MVSNeRF and LLFF. Many thanks for their excellent contributions!

📧 Contact

    If you have any questions, please feel free to contact Tianqi Liu (tq_liu at hust.edu.cn).

mvsgaussian's Issues

code issue

I used the open-source code you provided to run the MVSGaussian model. Unfortunately, under your experimental configuration, training on the DTU dataset for 300 epochs only reached 27.58 dB. I noticed that in the released code, the number of sampling points is the same for both levels (one sample), and the rotation network of the Gaussian module is not enabled for training (it is set to a fixed value). I am not sure whether the issue lies with my environment or with the released code. I would be very grateful if you could address my questions.

Issues on two-stage cascaded framework

Great work! Is the number of sampling points in your NeRF module the same as the number of points in 3DGS, or are they the same points? Is the number of sampling points in the final level 2? Is the first level used only for depth estimation, without introducing 3DGS? How do you handle the density of Gaussian points: is it predicted by an MLP or mapped using a PDF?

How to run on Custom data?

Hey,

thanks for your amazing work. I was wondering how I can run your method on the output from COLMAP.

Basically I have data in the following format:
data/
  images/
  sparse/
    0/
      cameras.bin, images.bin, points3D.bin
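
Before feeding such a folder to any pipeline, it can help to verify that the layout is complete. The check below is a sketch based only on the layout described in this post; whether the repository's loader accepts this layout directly is not confirmed here (the README demo instead runs lib/colmap/imgs2poses.py on a scene folder), and `check_colmap_layout` is a hypothetical helper.

```python
from pathlib import Path

# File names taken from the layout described in the post above.
EXPECTED_SPARSE_FILES = ("cameras.bin", "images.bin", "points3D.bin")


def check_colmap_layout(data_dir):
    """Return a list of problems; an empty list means the layout is complete."""
    data_dir = Path(data_dir)
    problems = []
    images_dir = data_dir / "images"
    if not images_dir.is_dir():
        problems.append("missing directory: images/")
    elif not any(images_dir.iterdir()):
        problems.append("images/ is empty")
    sparse_dir = data_dir / "sparse" / "0"
    for name in EXPECTED_SPARSE_FILES:
        if not (sparse_dir / name).is_file():
            problems.append(f"missing file: sparse/0/{name}")
    return problems


if __name__ == "__main__":
    for problem in check_colmap_layout("data") or ["layout looks complete"]:
        print(problem)
```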

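About more detailed steps in the workflow

Thank you very much for sharing such great work. I have a small suggestion:
the workflow could be described in more detail, namely the train and test steps, including how the pretrained generalizable model can be used. Thank you very much.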

Hello! I encountered an error!

(mvsgs) PS E:\3dgs\MVSGaussian> pip install gaussian-splatting/submodules/diff-gaussian-rasterization
Processing e:\3dgs\mvsgaussian\gaussian-splatting\submodules\diff-gaussian-rasterization
Preparing metadata (setup.py) ... done
Building wheels for collected packages: diff-gaussian-rasterization
Building wheel for diff-gaussian-rasterization (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [11 lines of output]
running bdist_wheel
D:\anconda\envs\mvsgs\lib\site-packages\torch\utils\cpp_extension.py:476: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
warnings.warn(msg.format('we could not find ninja.'))
running build
running build_py
running build_ext
D:\anconda\envs\mvsgs\lib\site-packages\torch\utils\cpp_extension.py:358: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified.
warnings.warn(f'Error checking compiler version for {compiler}: {error}')
building 'diff_gaussian_rasterization._C' extension
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin\nvcc" -c cuda_rasterizer/backward.cu -o build\temp.win-amd64-cpython-37\Release\cuda_rasterizer/backward.obj -ID:\anconda\envs\mvsgs\lib\site-packages\torch\include -ID:\anconda\envs\mvsgs\lib\site-packages\torch\include\torch\csrc\api\include -ID:\anconda\envs\mvsgs\lib\site-packages\torch\include\TH -ID:\anconda\envs\mvsgs\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\include" -ID:\anconda\envs\mvsgs\include -ID:\anconda\envs\mvsgs\Include "-ID:\Visual Studio 2022 Community\VC\Tools\MSVC\14.38.33130\include" "-ID:\Visual Studio 2022 Community\VC\Tools\MSVC\14.38.33130\ATLMFC\include" "-ID:\Visual Studio 2022 Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcompiler /EHsc -Xcompiler /wd4190 -Xcompiler /wd4018 -Xcompiler /wd4275 -Xcompiler /wd4267 -Xcompiler /wd4244 -Xcompiler /wd4251 -Xcompiler /wd4819 -Xcompiler /MD -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -IE:\3dgs\MVSGaussian\gaussian-splatting\submodules\diff-gaussian-rasterization\third_party/glm/ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --use-local-env
error: command 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin\nvcc.exe' failed with exit code 1
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for diff-gaussian-rasterization
Running setup.py clean for diff-gaussian-rasterization
Failed to build diff-gaussian-rasterization
Installing collected packages: diff-gaussian-rasterization
Running setup.py install for diff-gaussian-rasterization ... error
error: subprocess-exited-with-error

× Running setup.py install for diff-gaussian-rasterization did not run successfully.
│ exit code: 1
╰─> [20 lines of output]
running install
D:\anconda\envs\mvsgs\lib\site-packages\setuptools\command\install.py:37: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
setuptools.SetuptoolsDeprecationWarning,
running build
running build_py
creating build
creating build\lib.win-amd64-cpython-37
creating build\lib.win-amd64-cpython-37\diff_gaussian_rasterization
copying diff_gaussian_rasterization\__init__.py -> build\lib.win-amd64-cpython-37\diff_gaussian_rasterization
running build_ext
D:\anconda\envs\mvsgs\lib\site-packages\torch\utils\cpp_extension.py:476: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
warnings.warn(msg.format('we could not find ninja.'))
D:\anconda\envs\mvsgs\lib\site-packages\torch\utils\cpp_extension.py:358: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified.
warnings.warn(f'Error checking compiler version for {compiler}: {error}')
building 'diff_gaussian_rasterization._C' extension
creating build\temp.win-amd64-cpython-37
creating build\temp.win-amd64-cpython-37\Release
creating build\temp.win-amd64-cpython-37\Release\cuda_rasterizer
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin\nvcc" -c cuda_rasterizer/backward.cu -o build\temp.win-amd64-cpython-37\Release\cuda_rasterizer/backward.obj -ID:\anconda\envs\mvsgs\lib\site-packages\torch\include -ID:\anconda\envs\mvsgs\lib\site-packages\torch\include\torch\csrc\api\include -ID:\anconda\envs\mvsgs\lib\site-packages\torch\include\TH -ID:\anconda\envs\mvsgs\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\include" -ID:\anconda\envs\mvsgs\include -ID:\anconda\envs\mvsgs\Include "-ID:\Visual Studio 2022 Community\VC\Tools\MSVC\14.38.33130\include" "-ID:\Visual Studio 2022 Community\VC\Tools\MSVC\14.38.33130\ATLMFC\include" "-ID:\Visual Studio 2022 Community\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcompiler /EHsc -Xcompiler /wd4190 -Xcompiler /wd4018 -Xcompiler /wd4275 -Xcompiler /wd4267 -Xcompiler /wd4244 -Xcompiler /wd4251 -Xcompiler /wd4819 -Xcompiler /MD -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -IE:\3dgs\MVSGaussian\gaussian-splatting\submodules\diff-gaussian-rasterization\third_party/glm/ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --use-local-env
error: command 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin\nvcc.exe' failed with exit code 1
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> diff-gaussian-rasterization

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
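
The log above contains two warnings before nvcc fails: ninja could not be found, and the compiler version check for `cl` failed with WinError 2 (file not found). One plausible reading, which the log alone does not confirm as the root cause, is that the MSVC compiler `cl.exe` is not on PATH in the shell where pip was run. The sketch below only checks whether the relevant tools are discoverable; the tool list is an assumption drawn from the log.

```python
import shutil


def find_build_tools(names=("cl", "nvcc", "ninja")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for tool, path in find_build_tools().items():
        print(f"{tool:6s} -> {path or 'not found on PATH'}")
```

If `cl` is reported as not found, running the install from a Visual Studio developer command prompt is a common way to make the compiler visible to the build.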

Code Release

Hi, I'm very interested in your MVSGaussian, and when do you plan to release your code?

Code Release

Great work! When do you plan to release your code?

How can I find suitable values for "depth" and "range" in the .txt file, and for "zfar" and "znear", for my custom dataset?

Thanks for your amazing work, but I have one problem.

When I tested with your dataset, the results were quite good. With my custom dataset, however, I cannot find suitable values for "depth" and "range" in the *.txt file, so the results are poor (see the images below).
I am currently using depth and range = 425 - 905 and "znear" - "zfar" = 0.01 - 100 (the defaults for the DTU dataset and your code).
Could you help me with this? Thank you so much.
image
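
One common heuristic in COLMAP-based pipelines, offered here as a sketch rather than as the maintainers' answer, is to derive a per-view depth range from the sparse 3D points: transform the points into the camera frame and take low and high percentiles of their depths. The code assumes a world-to-camera pose convention (x_cam = R x + t); the percentile values and the safety margin are hypothetical defaults to tune per scene.

```python
def camera_depths(points_world, R, t):
    """Depth (camera-frame z) of each world point, for x_cam = R @ x + t."""
    return [R[2][0] * x + R[2][1] * y + R[2][2] * z + t[2]
            for x, y, z in points_world]


def percentile(sorted_vals, q):
    """Linearly interpolated percentile (q in [0, 100]) of a sorted list."""
    if not sorted_vals:
        raise ValueError("no values to take a percentile of")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def estimate_depth_range(points_world, R, t, low_q=1.0, high_q=99.0, margin=0.1):
    """Return (near, far) from percentiles of positive depths, widened by a margin."""
    # Points behind the camera (non-positive depth) are ignored.
    depths = sorted(d for d in camera_depths(points_world, R, t) if d > 0)
    near = percentile(depths, low_q)
    far = percentile(depths, high_q)
    return near * (1 - margin), far * (1 + margin)
```

Using percentiles instead of the raw minimum and maximum makes the range less sensitive to outlier points in the sparse reconstruction. The resulting values are in the units of the COLMAP reconstruction, so they will generally differ from the DTU defaults quoted above.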
