
enigmatisms / nerf


Reproduction of ECCV 2020 NeRF and its subsequent works (mip NeRF, 360, Ref NeRF, info-NeRF) in PyTorch.

Home Page: https://enigmatisms.github.io/2022/03/27/NeRF%E8%AE%BA%E6%96%87%E5%A4%8D%E7%8E%B0/

License: Apache License 2.0

Python 98.76% Shell 1.24%
nerf novel-view-synthesis pytorch

nerf's Introduction

NeRFs


(README Update 2023.8.18): Note that this repo is a little bit old; the backbone and main logic were implemented last year. For a (much) better integrated project, I recommend nerfstudio. You can also check out our competition results in nerfstudio-giga.

This repo contains the following reproductions:

(1) CVPR 2022 best student paper honorable mention: Ref-NeRF: Structured View-Dependent Appearance for Neural Radiance Fields. The Ref NeRF part can be turned on or off with one flag: -t. Ref NeRF is implemented on top of the (proposal network + NeRF) framework. Currently, the result is not as satisfying as I expected. This may be caused by insufficient training time (limited training device: 6GB global memory, which allows a batch size of at most 2^9 rays, while the paper uses 2^14).

(2) The idea from ICCV 2021: Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields, which samples conical frustums instead of pure points to address aliasing under multi-resolution settings. The coarse network has been removed.

(3) The idea from CVPR 2022: Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields, which currently has no open-sourced code.

(4) Original NeRF paper: ECCV 2020: NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. Well, actually, this implementation is buried somewhere in the commit history.

(5) Info-NeRF (an information-theory-based regularizer that improves results for "few-shot" learning in NeRF): InfoNeRF: Ray Entropy Minimization for Few-Shot Neural Volume Rendering. The reproduction is implemented by Dinngger, in the branch infonerf.
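To make the "ray entropy" idea in (5) concrete, here is a minimal sketch in plain Python, assuming the usual formulation in which each sample's opacity is alpha_i = 1 - exp(-sigma_i * delta_i) and the opacities along one ray are normalized into a probability distribution. The function name ray_entropy and the toy densities are illustrative assumptions, not code from the infonerf branch; a real training loop would compute this on batched tensors.

```python
import math

def ray_entropy(sigmas, deltas, eps=1e-10):
    """Shannon entropy of the normalized opacity distribution along one ray."""
    # Opacity of each sample: alpha_i = 1 - exp(-sigma_i * delta_i)
    alphas = [1.0 - math.exp(-s * d) for s, d in zip(sigmas, deltas)]
    total = sum(alphas) + eps
    # Normalize opacities into a discrete distribution over the samples
    probs = [a / total for a in alphas]
    return -sum(p * math.log(p + eps) for p in probs)

deltas = [0.1] * 8
# Density spread evenly along the ray ("foggy") vs. concentrated at one sample
foggy = ray_entropy([1.0] * 8, deltas)
sharp = ray_entropy([0.0] * 3 + [50.0] + [0.0] * 4, deltas)
print(f"foggy={foggy:.3f}, sharp={sharp:.3f}")
```

Minimizing this quantity pushes each ray toward the "sharp" case, which is why it can help when only a few training views are available.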

  • Uses a shallower proposal network, distilled from the rendering weights of the NeRF MLP, to reduce the evaluation time for coarse samples.
  • Ref NeRF is built together with the proposal network proposed in mip NeRF 360, which makes the model harder to train. The reason (I suppose) is that normal prediction in Ref NeRF uses a "back-culling strategy" (orientation loss), which prevents foggy artifacts behind semi-transparent surfaces. This strategy both concentrates density and has some strange (magical) effect on the gradients of the proposal network. I experimented with the original NeRF framework, and things seem to work out fine, with no mosaic-like noise.
  • Uses a weight regularizer that concentrates the computed weights in a smaller region, making them more "delta-function-like". The final output is more accurate. There are some results and a small comparison below.
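The README does not say which weight regularizer is used. As one illustration of a loss that makes ray weights more "delta-function-like", the sketch below implements the distortion loss described in the Mip-NeRF 360 paper for a single ray, in plain Python. Treat the choice of loss, the function name distortion_loss, and the toy inputs as assumptions rather than a description of this repo's code.

```python
def distortion_loss(weights, edges):
    """Mip-NeRF 360 style distortion loss for one ray.

    weights: per-interval rendering weights, length N
    edges:   interval boundaries along the ray, length N + 1
    """
    mids = [(a + b) / 2 for a, b in zip(edges[:-1], edges[1:])]
    widths = [b - a for a, b in zip(edges[:-1], edges[1:])]
    # Pairwise term: penalizes weight mass placed at distant midpoints
    pairwise = sum(
        wi * wj * abs(mi - mj)
        for wi, mi in zip(weights, mids)
        for wj, mj in zip(weights, mids)
    )
    # Self term: penalizes weight spread inside a single wide interval
    within = sum(w * w * d for w, d in zip(weights, widths)) / 3.0
    return pairwise + within

edges = [i / 4 for i in range(5)]  # four equal intervals on [0, 1]
spread = distortion_loss([0.25, 0.25, 0.25, 0.25], edges)
peaked = distortion_loss([0.0, 1.0, 0.0, 0.0], edges)
print(f"spread={spread:.4f}, peaked={peaked:.4f}")
```

The peaked weights score lower than the evenly spread ones, so adding such a term to the training loss rewards weight distributions that collapse onto a narrow region of the ray.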

If you are interested in the implementation of NeRFs and you don't want to read tedious code, then this repo offers more comprehensible code (yeah, clearer logic, but not necessarily good-looking, efficient, or even scalable; it depends on how you see things). For more detailed info about this repo and the reproduction, please refer to the home page linked above.

Some Ref NeRF results: (Latest-commit e4907564) Shiny blender "helmet" dataset trained for 3 hours (not completed, PSNR around 27). Oops, the gif file is too big to upload. Fine, just imagine the output: it's better than the older commit (in terms of normal prediction).

(Older-commit 847fdb9d) Shiny blender "helmet" dataset trained for 6-7 hours (not completed, PSNR around 28.5).

Some old results (2022.6): Mip NeRF proposal network distillation with amp speed-up (yeah, this is faster)

Lego trained for 2.5h, Hotdog trained for 30min

Some older results (2022.4):

Spherical views (400 * 400), Comparison (no regularizer - left), Proposal network distillation

Side Notes

This repo contains:

  • CUDA implemented functions, like inverse transform sampling, image sampler, positional encoding module, etc.
  • A simpler version (in terms of readability) of NeRF (compared with the official NeRF implementation, which is written in TensorFlow)
  • Simple APEX accelerated version of NeRF
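For readers unfamiliar with two of the functions named in the list above, here is a minimal plain-Python sketch of positional encoding (in the form given in the NeRF paper, with frequencies 2^k * pi) and of inverse transform sampling from a piecewise-constant distribution over ray intervals. It is not the repo's CUDA code; the function names, arguments, and toy inputs are illustrative assumptions.

```python
import bisect
import math
import random

def positional_encoding(x, num_freqs):
    """Return [sin(2^k pi x), cos(2^k pi x)] for k = 0 .. num_freqs - 1."""
    out = []
    for k in range(num_freqs):
        angle = (2.0 ** k) * math.pi * x
        out.append(math.sin(angle))
        out.append(math.cos(angle))
    return out

def inverse_transform_sample(edges, weights, n, rng=random):
    """Draw n depths from the piecewise-constant PDF given by weights over edges."""
    total = sum(weights)
    # Build the cumulative distribution at the interval boundaries
    cdf = [0.0]
    for w in weights:
        cdf.append(cdf[-1] + w / total)
    samples = []
    for _ in range(n):
        u = rng.random()
        # Locate the interval whose CDF range contains u
        i = min(bisect.bisect_right(cdf, u) - 1, len(weights) - 1)
        span = cdf[i + 1] - cdf[i]
        t = (u - cdf[i]) / span if span > 0 else 0.0
        # Interpolate linearly inside the chosen interval
        samples.append(edges[i] + t * (edges[i + 1] - edges[i]))
    return samples

if __name__ == "__main__":
    coarse_edges = [2.0, 3.0, 4.0, 5.0, 6.0]
    coarse_weights = [0.05, 0.8, 0.1, 0.05]  # most weight in [3, 4]
    fine = inverse_transform_sample(coarse_edges, coarse_weights, 5, random.Random(0))
    print(sorted(fine))
    print(len(positional_encoding(0.3, 10)))
```

In a coarse-to-fine setup, the weights from the coarse (or proposal) pass play the role of coarse_weights, so the fine samples cluster where the scene is likely to be opaque.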

Requirements

  • To enable APEX auto mixed precision, you need NVIDIA/apex; just follow the apex instructions and you are good to go. APEX support is disabled by default. You need to set -s or --scale when running train.py
  • To test CUDA implementation, make sure:
    • Libtorch has the corresponding version (the same as your CUDA version)
    • The CUDA version of PyTorch should be the same as that of CUDA
    • libeigen3-dev is required. However, for Ubuntu 18.04 users, the default version of libeigen3-dev in apt is 3.3.4, which is too low for CUDA 11+; when compiling, an error might be thrown ("<math_functions.hpp> not found"). To correctly compile the CUDA extensions, you would need to download Eigen 3.4.0 and compile it manually. After installing Eigen 3.4.0, setup.py in cuda/ should be modified if the CMAKE_INSTALL_PREFIX is not /usr/local/include
  • Other requirements
    • PyTorch: 1.7+ recommended
    • torchvision: (depends on the version of PyTorch)
    • argparse: 1.1 recommended
    • tensorboard: 1.15+
    • numpy/PIL: ...
    • scipy: optional

Repo Structure

There are some folders YOU MUST HAVE in the root folder! (See below: the compulsory ones are not included in this cloud repo because they are git-ignored.)

.
├── logs/ --- tensorboard log storage (compulsory)
├── model/ --- folder from and to which the models are loaded & stored (compulsory)
├── check_points/ --- check_points folder (compulsory)
├── train.py --- Python main module
├── test --- quick testing script
	 ├── ...
├── cuda
	 ├── src
	 		├── ... (cuda implemented functions)
	 └── setup.py	--- PyTorch CUDA extension compiling and exporting script
└── py/ 
	 ├── addtional.py  --- For mip nerf (proposal network and regularizer)
	 ├── mip_methods.py  --- Though current implementation uses no cone sampling (mip part), the mip functions are retained.
	 ├── mip_model.py  --- Mip NeRF (no ref), or rather: NeRF model definition
	 ├── nerf_base.py  --- Both Ref NeRF and Mip NeRF inherit from this base class
	 ├── procedures.py  --- Functions like rendering a whole image, rendering with orbital camera views, **argparse default settings**
	 ├── ref_func.py  --- Ref NeRF spherical harmonics (modified and adopted from Ref NeRF official repo)
	 ├── ref_model.py  --- Ref NeRF model definition
	 ├── dataset.py --- Custom dataset for loading nerf-synthetic dataset
	 ├── model.py --- NeRF model, main. (include model definition and rendering)
	 ├── timer.py --- TicToc timer.
	 └── utils.py --- Some utility functions
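Because the compulsory folders in the tree above are git-ignored, a fresh clone will not have them. The following is a minimal sketch, not part of the repo, that creates them if they are missing; the helper name ensure_required_dirs is hypothetical. The run instructions further down also mention an output/ folder, so add it to the tuple if your checkout needs it.

```python
import os

# Folder names marked "compulsory" in the tree above
REQUIRED_DIRS = ("logs", "model", "check_points")

def ensure_required_dirs(root="."):
    """Create any missing compulsory folders under root; return the names created."""
    created = []
    for name in REQUIRED_DIRS:
        path = os.path.join(root, name)
        if not os.path.isdir(path):
            os.makedirs(path)
            created.append(name)
    return created

if __name__ == "__main__":
    # Run from the repo root before the first training run
    print("created:", ensure_required_dirs("."))
```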

Compile & Run

I. With CUDA extension (Deprecated)

The CUDA extension has not been in use for a long time (since dozens of commits ago). But if you insist...

To build the PyTorch extension for Python, run:

cd cuda/
python ./setup.py install --user

There may be a compilation error, which might be caused by the include directory specified in setup.py. Note that:

  • If you are using arch_70+ (for example, I am using an RTX 3060, whose architecture is sm_86), Eigen 3.3.7 or higher is required. Otherwise you will get an error: No such file or directory <math_functions.hpp>. I'm using a manually installed Eigen 3.4.0, which is located in /usr/local/include/eigen3
  • If the GPU architecture is lower than arch_70, the normal libeigen3-dev (version 3.3.4 on Ubuntu 18.04) will suffice.

Actually, the CUDA extension is not well supported after debugging (positional encoding can still be used; the scripts for running all other functions have changed substantially, yet they were all tested previously and should work fine), and it is not compatible with APEX or other auto mixed precision libs (like torch.amp).

II. Run, directly

Just run it directly. So how exactly do you run it?

To run the training, make sure you have the output/ and check_points/ folders in the root dir, otherwise an error will be thrown. Then:

cd . 		# cd to the root dir
python ./train.py -s 		# -s enables apex O1 optimization, which is 30%-40% faster during training

For other configurations, please refer to python ./train.py --help.


Results (Outdated)

Please just refer to the "demo" part.

Apart from the dynamic results (full resolution) listed above, there are some additional results (from nerf-blender-synthetic dataset (lego)):

Iterated for 20 epochs (100 images selected, which takes 260s to train, 13s per epoch on RTX 3060)

Iterated for ?? epochs (I don't remember). The top-left-most image is rendered by coarse network, therefore it is more blurred.

Iterated for hundreds of epochs (3+ hours of training, apex O2, half resolution; rendering is slow, 42s for eight images). The top-left-most image is rendered by the coarse network, therefore it is more blurred.

I am not so patient, therefore all of the models aren't trained thoroughly (also I think it is meaningless to do that).

nerf's People

Contributors

dependabot[bot], enigmatisms, sleep2hours


nerf's Issues

meaning for spa_info_b in ref_model.py

I read the code and got confused by the meaning of spa_info_b in ref_model.py. I have read the Ref-NeRF paper and understand diffuse and specular reflection, but I still don't understand spa_info_b. I think it is a kind of density, which is shown in figure 4 of the paper.
Thank you very much!

Cannot find reference 'encoding' in 'nerf_helper'.py'.

Dear expert, hello! I want to learn about the details of "Cannot find reference 'encoding' in 'nerf_helper'.py'. ". I saw the description of pybind11 in the CUDA extension. But the message "Cannot find reference 'encoding' in 'nerf_helper'.py'. " appears for model.py when running. Thank you very much!

model.py
from nerf_helper import encoding

nerf_extension.cu
PYBIND11_MODULE (TORCH_EXTENSION_NAME, nerf_helper)
{
    nerf_helper.def ("comservativeSampling", &cudaSampler, conservative_sampler_docs.c_str());
    nerf_helper.def ("sampling", &easySampler, sampler_docs.c_str());
    nerf_helper.def ("encoding", &positionalEncode, pe_docs.c_str());  // This !! Thank you !!!
    nerf_helper.def ("invTransformSample", &inverseTransformSample, inv_tf_docs.c_str());
    nerf_helper.def ("invTransformSamplePt", &inverseTransformSamplePt, inv_tf_pt_docs.c_str());
    nerf_helper.def ("imageSampling", &imageSampler, image_sampler_doc.c_str());
    nerf_helper.def ("validSampling", &validSampler, valid_sampler_doc.c_str());
}

The model yields blank images during training

Hi!! Thanks for your efforts to reproduce the paper!
I ran into a problem when training on the materials dataset. The dataset looks like:
image

The model outputs blank, meaningless images. How can I fix it?
image

loss=0 in mip-nerf-360

Your work is great! Thank you for the helpful project!
But I ran into some problems with the mip-nerf-360 code you provided when running mip_train.py.
First, when reading data, an error was raised at the following location:

File "mip_train.py", line 353, in <module>
    main()
  File "mip_train.py", line 206, in main
    for i, (train_img, train_tf, ext_img, ext_tf) in enumerate(train_loader):
ValueError: not enough values to unpack (expected 4, got 2)

So I deleted the two variables ext_img and ext_tf, changing the code from:

for i, (train_img, train_tf, ext_img, ext_tf) in enumerate(train_loader):
    train_timer.tic()
    train_img = train_img.cuda().squeeze(0)
    train_tf = train_tf.cuda().squeeze(0)
    ext_img = ext_img.cuda().squeeze(0)
    ext_tf = ext_tf.cuda().squeeze(0)
    now_crop = (center_crop if train_cnt < center_crop_iter else 1.)
    valid_pixels, valid_coords = randomFromOneImage(train_img, now_crop)
    ext_valid_pixels, ext_valid_coords = randomFromOneImage(ext_img, now_crop)
    # sample one more t to form (coarse_sample_pnum) proposal interval
    coarse_samples, coarse_lengths, rgb_targets, coarse_cam_rays = validSampler(
        valid_pixels, valid_coords, train_tf, sample_ray_num, coarse_sample_pnum,
        400, 400, train_focal, near_t, far_t, True
    )
    ext_coarse_lengths, ext_rgb_targets, ext_coarse_cam_rays = validSampler(
        ext_valid_pixels, ext_valid_coords, ext_tf, args.N_entropy, coarse_sample_pnum + 1, 400, 400, train_focal, near_t, far_t, False
    )
    coarse_lengths = torch.cat([coarse_lengths, ext_coarse_lengths], 0)
    rgb_targets = torch.cat([rgb_targets, ext_rgb_targets], 0)
    coarse_cam_rays = torch.cat([coarse_cam_rays, ext_coarse_cam_rays], 0)

to:

for i, (train_img, train_tf) in enumerate(train_loader):
    train_timer.tic()
    train_img = train_img.cuda().squeeze(0)
    train_tf = train_tf.cuda().squeeze(0)
    now_crop = (center_crop if train_cnt < center_crop_iter else 1.)
    valid_pixels, valid_coords = randomFromOneImage(train_img, now_crop)

    # sample one more t to form (coarse_sample_pnum) proposal interval
    coarse_samples, coarse_lengths, rgb_targets, coarse_cam_rays = validSampler(
        valid_pixels, valid_coords, train_tf, sample_ray_num, coarse_sample_pnum, 
        400, 400, train_focal, near_t, far_t, True
    )

I learned from the code that ext_img and ext_tf are randomly selected. What is the purpose of selecting these two? Is it safe to directly delete these two variables?
At least the code now runs through, so I started training the model. The initial loss is:
image

But after a few batches, prop_loss and entropy_ray_zvals_loss begin to steadily output 0:
image
May I ask why this is happening? The dataset I am using is 'lego' from the official nerf_synthetic dataset. How can I solve the problem?
