
TILTED

Project page | arXiv

Code release for our ICCV 2023 paper:

Brent Yi¹, Weijia Zeng¹, Sam Buchanan², and Yi Ma¹. Canonical Factors for Hybrid Neural Fields. International Conference on Computer Vision (ICCV), 2023.
¹UC Berkeley, ²TTI-Chicago

Overview

We study neural field architectures that rely on factored feature volumes, by (1) analyzing factored grids in 2D to characterize undesirable biases for axis-aligned signals, and (2) using the resulting insights to study TILTED, a family of hybrid neural field architectures that removes these biases.

This repository is structured as follows:

.
├── tilted
│   ├── core                - Code shared between experiments. Factored grid
│   │                         and neural decoder implementations.
│   ├── nerf                - Neural radiance field rendering, training, and
│   │                         dataloading utilities.
│   ├── rgb2d               - 2D image reconstruction data and training
│   │                         utilities.
│   └── sdf                 - Signed distance field dataloading, training, and
│                             meshing infrastructure.
│
├── paper_commands          - Commands used for running paper experiments (NeRF).
├── paper_results           - Output files used to generate paper tables (NeRF).
│                             Contains hyperparameters, evaluation metrics,
│                             runtimes, etc.
│
├── tables_nerf.ipynb       - Table generation notebook for NeRF experiments.
│
├── train_nerf.py           - Training script for neural radiance field experiments.
├── visualize_nerf.py       - Visualize trained neural radiance fields.
│
└── requirements.txt        - Python dependencies.

Note that training scripts for 2D and SDF experiments have not yet been released. Feel free to reach out if you need these.

Running

Setup

This repository has been tested with Python 3.8, jax==0.4.9, and jaxlib==0.4.9+cuda11.cudnn86. We recommend first installing JAX via their official instructions: https://github.com/google/jax#installation
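For a CUDA 11 setup like the one we tested against, the JAX install looked roughly like the following at the time of writing (a sketch only; install commands change between JAX releases, so defer to the official instructions linked above):

# Installs jax and a CUDA 11 jaxlib wheel; check JAX's README for the
# command currently recommended for your CUDA version.
pip install --upgrade "jax[cuda11_pip]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html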

We've packaged dependencies into a requirements.txt file:

pip install -r requirements.txt

Visualization

We use TensorBoard for logging.

After training, radiance fields can be interactively visualized. Helptext for the visualization script can be found via:

python visualize_nerf.py --help

As a runnable example, we've uploaded trained checkpoints for the Kitchen dataset here.

This can be unzipped in tilted/ and visualized via:

# Checkpoints can be selected via the dropdown on the right.
# The 'Reset Up Direction' button will also be helpful when orbiting / panning!
python visualize_nerf.py ./example_checkpoints
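Concretely, assuming the downloaded archive is named example_checkpoints.zip (a hypothetical name; substitute whatever file you downloaded), the full sequence might be:

# Unzip in the repository root, then point the viewer at the result.
unzip example_checkpoints.zip
python visualize_nerf.py ./example_checkpoints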

The visualization script supports RGB, PCA, and feature norm visualization:

(Demo video: TILTED.Visualizer.mov)

The core viewer infrastructure has been moved into nerfstudio-project/viser, which may be helpful if you're interested in visualization for other projects.

Datasets

Meshes for SDF experiments were downloaded from alecjacobson/common-3d-test-models/.

All NeRF datasets were downloaded using nerfstudio's ns-download-data command:

# Requires nerfstudio installation.
ns-download-data blender
ns-download-data nerfstudio all

Training

Commands we used for training NeRF models in the paper can be found in paper_commands/.

Here are two examples, which should run at ~65 it/sec on an RTX 4090:

# Train a model on a synthetic scene.
python train_nerf.py blender-kplane-32c-axis-aligned --dataset-path {path_to_data}

# Train a model on a real scene.
python train_nerf.py nerfstudio-kplane-32c-axis-aligned --dataset-path {path_to_data}

The --help flag can also be passed in to print helptext.

Notes

This is research code, so parts of it may be chaotic. We've put effort into refactoring and cleanup before release, but there's always more work to do here! If you have questions or comments, please reach out.

Some notes:

  • The global orientation can have a large impact on the performance of baselines. --render-config.global-rotate-seed INT can be set in train_nerf.py to try a different global orientation; paper results sweep across 0, 1, and 2 for each synthetic scene (see the sketch after this list).
  • To speed things up, the bottleneck training step count can be dropped significantly without hurting performance. This is dictated by --bottleneck.optim.projection-decay-start and --bottleneck.optim.projection-decay-steps; bottleneck training stops as soon as the projection learning rate hits 0.
  • Runtimes can vary significantly between machines. Our experiments were run using JAX 0.4.9 and CUDA 11.8 on RTX 4090 GPUs.
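As an illustration, sweeping the three global orientation seeds used for the paper's synthetic-scene results might look like this (a sketch; {path_to_data} is a placeholder as above):

# Train one model per global orientation seed.
for seed in 0 1 2; do
    python train_nerf.py blender-kplane-32c-axis-aligned \
        --dataset-path {path_to_data} \
        --render-config.global-rotate-seed $seed
done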

This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant DGE 2146752. YM acknowledges partial support from the ONR grant N00014-22-1-2102, the joint Simons Foundation-NSF DMS grant 2031899, and a research grant from TBSI.

If any of it is useful, please consider citing:

@inproceedings{tilted2023,
    author = {Yi, Brent and Zeng, Weijia and Buchanan, Sam and Ma, Yi},
    title = {Canonical Factors for Hybrid Neural Fields},
    booktitle = {International Conference on Computer Vision (ICCV)},
    year = {2023},
}


Issues

non-differentiable

Hi,

Thank you for this exciting and simple paper.

I want to replace the original VM representation with the learnable VM representation.

If I understand correctly, all I need to do is first use the angle-axis parametrization

self.angleaxis = torch.nn.Parameter(torch.tensor([0, 0, 0], dtype=torch.float32))

and then apply the rotation matrix from axis_angle_to_matrix(self.angleaxis) to the original input 3D points.


Unfortunately, I find that the rotation parameters cannot be optimized (grad=None, with ValueError: can't optimize a non-leaf Tensor). Could you give me a hint as to why gradients cannot back-propagate to the rotation parameters?

Thank you very much!
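For readers hitting the same error: below is a minimal, self-contained PyTorch sketch of a leaf-parameter setup that optimizers accept. This is an illustration only, not the TILTED codebase (which is implemented in JAX); the Rodrigues helper stands in for pytorch3d's axis_angle_to_matrix.

import torch

class LearnableRotation(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        # Register the angle-axis vector directly as a Parameter. It must
        # remain a leaf tensor: wrapping a derived tensor (for example, one
        # moved with .cuda() after creation) is what typically triggers
        # "can't optimize a non-leaf Tensor".
        self.angleaxis = torch.nn.Parameter(torch.zeros(3))

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # Rodrigues' formula: a differentiable angle-axis -> rotation matrix.
        theta = self.angleaxis.norm() + 1e-8
        axis = self.angleaxis / theta
        zero = torch.zeros((), dtype=points.dtype)
        K = torch.stack([
            torch.stack([zero, -axis[2], axis[1]]),
            torch.stack([axis[2], zero, -axis[0]]),
            torch.stack([-axis[1], axis[0], zero]),
        ])
        R = torch.eye(3) + torch.sin(theta) * K + (1.0 - torch.cos(theta)) * (K @ K)
        return points @ R.T

rotation = LearnableRotation()
# parameters() yields the registered leaf Parameters, which Adam accepts.
optimizer = torch.optim.Adam(rotation.parameters(), lr=1e-3)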

Why JAX?

Thanks for the great work! I want to ask a question unrelated to the paper: why did you choose JAX? Is it faster or easier to use than PyTorch?

Visualizing method

Hi Brent,

First off, thanks for sharing this excellent and robust work!

Is it possible to share the method for visualizing "the structure-revealing L2-norm of interpolated features" in Figure 8? I am trying to draw inspiration from your work.

best regards,
chieh
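One plausible reading of that kind of visualization, as a rough sketch (this assumes features have already been interpolated into an (H, W, C) slice, and is not necessarily how the paper's figure was produced):

import jax.numpy as jnp

def feature_norm_image(features: jnp.ndarray) -> jnp.ndarray:
    # features: (H, W, C) slice of interpolated features.
    norms = jnp.linalg.norm(features, axis=-1)
    # Normalize to [0, 1] for display; structure appears as spatial
    # variation in the per-location feature norm.
    return (norms - norms.min()) / (norms.max() - norms.min() + 1e-8)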

Training scripts for SDF experiments

Thanks for this wonderful work.

As mentioned in the README, I am reaching out for the training scripts for the SDF experiments. Could you please share them?

Thanks in advance.

Intuition on "4.1. Applying Transformations"

Hi Brent,

Thanks for sharing this very interesting work! I have a question about how you apply transformations to the learnable factors, as elaborated in Section 4.1. I understand it as follows: say we have 3 planes with 64 channels each, and we sample 8 transformations; then each transformation is responsible for 8 of the 64 channels. That is, we project the queried points onto 8 different groups of channels with 8 different transformation matrices, then concatenate the 8 groups of projected features. The hope is that at least one of the 8 groups of features corresponds to the "canonical factors". This is confusing to me, as I had expected a RANSAC-like method that treats the 8 transformation matrices equally and keeps only the one with the best reconstruction loss. What you actually did relies on the assumption that one of the 8 sampled groups has sufficient representation capacity to recover the structure of the scene. Correct me if my understanding is wrong; I look forward to hearing your thoughts.

best regards,
Shengyu
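For readers following the mechanism described above, here is a schematic JAX sketch of per-group transformations. It illustrates the idea only and is not the TILTED implementation; interpolate is a hypothetical feature-grid lookup supplied by the caller.

import jax.numpy as jnp

def query_features(points, grids, rotations, interpolate):
    # points: (N, 3); rotations: (T, 3, 3); grids: T feature volumes, each
    # holding C/T of the channels. Each transformation "owns" one group.
    groups = []
    for R, grid in zip(rotations, grids):
        transformed = points @ R.T  # query coordinates under this transform
        groups.append(interpolate(grid, transformed))  # (N, C/T) features
    # Concatenate the per-transformation groups back into all C channels.
    return jnp.concatenate(groups, axis=-1)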

Question about the figures on the project page

Wonderful work!
On the project page, two GIFs demonstrate the white square's optimization process under the sentence "establish that an alternating minimization approach converges linearly to the square's true appearance and pose parameters, despite significant nonconvexity in the objective landscape." I wonder what the difference between these two GIFs is. Different optimization methods? Thanks.
