sfchng / gaussian-activated-radiance-fields

[ECCV 2022 Oral] GARF: Gaussian Activated Radiance Fields for High Fidelity Reconstruction & Pose Estimation

Home Page: https://sfchng.github.io/garf/


gaussian-activated-radiance-fields's Introduction

GARF : Gaussian Activated Radiance Fields for High Fidelity Reconstruction & Pose Estimation

Shin-Fang Chng ¹, Sameera Ramasinghe ², Jamie Sherrah ¹, Simon Lucey ¹.

¹ Australian Institute for Machine Learning (AIML), University of Adelaide, ² Amazon, Australia


Overview

We provide the PyTorch implementation for training NeRF (Gaussian-based) and GARF models, along with a Colab demo for an image fitting task.

🕵️ Google Colab

If you want to explore Gaussian activation, please check out our Colab notebook, which lets you experiment with it on a neural image representation task.
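
For a flavour of what the notebook demonstrates, here is a minimal sketch of a Gaussian-activated coordinate MLP for image fitting, assuming the activation exp(-x^2 / (2*sigma^2)); the layer widths and sigma below are illustrative, not the notebook's exact settings.

import torch
import torch.nn as nn

class GaussianActivation(nn.Module):
    # Element-wise Gaussian activation: exp(-x^2 / (2 * sigma^2)).
    def __init__(self, sigma=0.1):
        super().__init__()
        self.sigma = sigma

    def forward(self, x):
        return torch.exp(-x.pow(2) / (2 * self.sigma ** 2))

# Coordinate MLP mapping 2D pixel coordinates to RGB, with no positional encoding.
model = nn.Sequential(
    nn.Linear(2, 256), GaussianActivation(),
    nn.Linear(256, 256), GaussianActivation(),
    nn.Linear(256, 3),
)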

🛠️ Installation Steps

Assuming a fresh Anaconda environment, you can install the dependencies by running:

pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
pip install -r requirements.txt

💿 Training data

1. LLFF

You can download the real-world dataset by running

gdown 16VnMcF1KJYxN9QId6TClMsZRahHNMW5g
unzip nerf_llff_data.zip
rm -f nerf_llff_data.zip
mv nerf_llff_data data/llff

You can download the synthetic forward-facing dataset by running

wget https://www.robots.ox.ac.uk/~ryan/nerfmm2021/BLEFF.tar.gz
tar -xzvf BLEFF.tar.gz
mv BLEFF data/bleff

The data directory should contain the subdirectories llff and bleff. If you have already downloaded the datasets elsewhere, you can instead create soft links to them inside the data directory, as shown below.
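
For example, assuming the datasets already live elsewhere on disk (the paths below are illustrative):

ln -s /path/to/nerf_llff_data data/llff
ln -s /path/to/BLEFF data/bleff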

⏳ Training

By default, models and TensorBoard event files are saved to ~/output/<GROUP>/<NAME>. This can be modified using the --output_root flag.
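For example, training curves can then be inspected with TensorBoard (assuming TensorBoard is installed; the path below is the default location mentioned above):

tensorboard --logdir ~/output/<GROUP>/<NAME>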

Full MLP training (GARF):

To optimize GARF from scratch, initializing camera poses with the identity:

  1. For the LLFF dataset:
python train.py --model=garf --yaml=garf_llff --group=<GROUP> --name=<NAME> --data.dataset=llff --data.scene="fern" --optim.sched=!
  2. For the BLEFF dataset:
python train.py --model=garf --yaml=garf_bleff --group=<GROUP> --name=<NAME> --data.dataset=bleff --data.scene="balls1" --data.mode="mix_rt/t000r000" --optim.sched=!

To optimize GARF with camera poses initialized from COLMAP estimates:

  1. For the LLFF dataset:
python train.py --model=garf --yaml=garf_llff --group=<GROUP> --name=<NAME> --data.dataset=llff --data.scene="fern" --optim.sched=! --init.pose=True --init.pose_warmup=2000
  2. For the BLEFF dataset:
python train.py --model=garf --yaml=garf_bleff --group=<GROUP> --name=<NAME> --data.dataset=bleff --data.scene="balls1" --data.mode="mix_rt/t000r000" --optim.sched=! --init.pose=True --init.pose_warmup=2000

Spherical Harmonics-based training (GARF-SH):

python train.py --model=garf_sh --yaml=garf_sh_llff --group=<GROUP> --name=<NAME> --data.dataset=llff --data.scene="fern" --optim.sched=!

Gaussian-activated NeRF training:

python train.py --model=nerf_gaussian --yaml=nerf_gaussian_llff --group=<GROUP> --name=<NAME> --data.dataset=llff --data.scene="fern" --optim.sched=!

🔎 Evaluation

This script evaluates camera poses and image quality metrics (PSNR/LPIPS/SSIM) on the test set, and renders novel views. If no value is provided for --resume=<NUM_ITER>, the most recent checkpoint is loaded automatically.

python evaluate.py --model=garf --yaml=garf_llff --group=<GROUP> --name=<NAME> --data.dataset=llff --data.scene="fern" --optim.sched=! --resume=<NUM_ITER>

🙇 Special Thanks

This codebase draws heavily upon the amazing codebase of BARF: Bundle Adjusting Neural Radiance Fields. We thank Chen-Hsuan Lin, Huangying Zhan and Tonghe for their insightful discussions.

👩‍💻 Citation

This code is for non-commercial use. If you find our work useful in your research, please consider citing our papers:

@inproceedings{chng2022gaussian,
  title         = {Gaussian activated neural radiance fields for high fidelity reconstruction and pose estimation},
  author        = {Chng, Shin-Fang and Ramasinghe, Sameera and Sherrah, Jamie and Lucey, Simon},
  booktitle     = {European Conference on Computer Vision (ECCV)},
  year          = {2022}
}

@inproceedings{ramasinghe2022beyond,
  title         = {Beyond periodicity: towards a unifying framework for activations in coordinate-MLPs},
  author        = {Ramasinghe, Sameera and Lucey, Simon},
  booktitle     = {European Conference on Computer Vision (ECCV)},
  year          = {2022}
}


gaussian-activated-radiance-fields's Issues

question for the paper: learnable sigma

Hi,
I have enjoyed the series of papers your group has published on Gaussian activation functions applied to implicit neural representations.
Thank you for all the hard work.
I really appreciate that your work may open another horizon for INR research, except for one thing: the hyperparameter $\sigma$.

The hyperparameter $\sigma$, which controls the bandwidth of the signals output by the Gaussian activation function, is what keeps me from applying your idea in my next projects.
Since finding a proper sigma is important, I wonder whether it is feasible to make $\sigma$ a learnable parameter;
that might make the Gaussian activation function easier to use.
Is there any ablation study on a learnable $\sigma$ (e.g., does it fail to converge easily)?
Could I ask about any experiments on that? Thank you for your time!
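
As an illustration of what a learnable $\sigma$ could look like (a hypothetical PyTorch sketch, not code from the GARF repository), one could register $\sigma$ as a parameter and keep it positive through a softplus:

import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableGaussian(nn.Module):
    # Gaussian activation exp(-x^2 / (2 * sigma^2)) with sigma as a learnable parameter.
    def __init__(self, init_sigma=0.1):
        super().__init__()
        # Unconstrained parameter; softplus keeps the effective sigma strictly positive.
        # log(expm1(s)) is the inverse of softplus, so training starts at sigma = init_sigma.
        self.raw_sigma = nn.Parameter(torch.log(torch.expm1(torch.tensor(init_sigma))))

    def forward(self, x):
        sigma = F.softplus(self.raw_sigma)
        return torch.exp(-x.pow(2) / (2 * sigma ** 2))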

New idea.

Update: I tried some Gaussian-like functions as activations, and some worked well. If the key is the bell shape, then I guess anything that produces a bell shape can also work well enough.

Original comment:
If you see this, please comment anything to let me know. Thanks.
I saw the activation, and it's interesting. I tried this:
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-5, 5, 201)
sig = 3   # also tried 2, 1, 0.1, 0.3, 0.5
l = 6     # also tried 4, 2

numerator = -np.power(x, l)
denominator = 2 * sig * sig
result = np.exp(numerator / denominator)

fig, ax = plt.subplots(1)
ax.plot(x, result)
plt.show()

If this is exactly what is in your code, then I guess "result*cos(x)" could work even better.
Good luck.

Code Release Plan

The results in your paper are very interesting and I can't wait to try them out.
Any plans to release code?

By the way, if I want to implement your method, is replacing the positional encoding with a Gaussian activation function the only change I would need to make?

Thanks.

Using custom data with colmap initialized poses.

I recorded some simple custom data and obtained the COLMAP poses.
When I run BARF or GARF, the pose errors decrease (R about 96 degrees, t about 18), but the PSNR is around 18 and the generated output images are messy.
I don't understand why the R and t errors are so high, and why the images are messy.

An issue regarding BLEFF dataloader

Thanks for releasing the code.
In your BLEFF dataloader, the c2w data is read from the gt_metas.json file. However, some scenes have strange c2w data in that file; for example, the bed scene contains the following c2w matrix:

    [[0.14657878875732422, -0.0010548068676143885, -0.2047007977962494, -1.303330659866333],
     [-0.20465244352817535, -0.00637618824839592, -0.14651136100292206, -0.7851966619491577],
     [-0.004570294171571732, 0.25168848037719727, -0.004569551907479763, 1.4914559125900269],
     [0.0, 0.0, 0.0, 1.0]]

If you check whether it is valid (determinant equal to 1 and an orthonormal rotation block), you will notice that it is not. In contrast, poses_bounds.npy seems to contain valid data (the official nerfmm implementation also uses that file).
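
A quick NumPy sketch of that check, using the (rounded) matrix from the snippet above:

import numpy as np

c2w = np.array([
    [ 0.14657879, -0.00105481, -0.20470080, -1.30333066],
    [-0.20465244, -0.00637619, -0.14651136, -0.78519666],
    [-0.00457029,  0.25168848, -0.00456955,  1.49145591],
    [ 0.0,         0.0,         0.0,         1.0       ],
])

R = c2w[:3, :3]
print(np.linalg.det(R))                            # ~1 for a valid rotation; here it is not
print(np.allclose(R @ R.T, np.eye(3), atol=1e-4))  # True only if the rows are orthonormal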

RuntimeError

I am encountering the following error in my conda environment:

    raise RuntimeError("Cowardly refusing to serialize non-leaf tensor which requires_grad, "
RuntimeError: Cowardly refusing to serialize non-leaf tensor which requires_grad, since autograd does not support crossing process boundaries.  If you just want to transfer the data, call detach() on the tensor before serializing (e.g., putting it on the queue).
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\e0309726\Anaconda3\envs\nerf-env\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Users\e0309726\Anaconda3\envs\nerf-env\lib\multiprocessing\spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
[W ..\torch\csrc\CudaIPCTypes.cpp:21] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]

I am using the versions recommended by the authors. Is there something I have to modify in the provided code?

Edit:
Could you let me know which version of Python was used? Thank you!
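
For context (an assumption based on the traceback, not a confirmed fix for this repository): this error usually comes from a multiprocessing DataLoader worker, typically on Windows, trying to pickle a tensor that still requires grad. A generic PyTorch workaround is to detach such tensors before they enter the dataset, or to keep data loading in the main process:

import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset for illustration; the relevant part is num_workers=0, which avoids
# spawning worker processes, so no tensors are pickled across process boundaries.
dataset = TensorDataset(torch.randn(8, 3))
loader = DataLoader(dataset, batch_size=2, num_workers=0)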
