PlanarRecon: Real-time 3D Plane Detection and Reconstruction from Posed Monocular Videos


Yiming Xie, Matheus Gadelha, Fengting Yang, Xiaowei Zhou, Huaizu Jiang
CVPR 2022



How to Use

Installation

conda env create -f environment.yaml
conda activate planarrecon

Follow the instructions in the torchsparse repository to install torchsparse.
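
Before moving on, it can help to confirm that the environment actually exposes the packages the later steps rely on. The snippet below is a minimal sketch, not part of the PlanarRecon codebase: the helper name `missing_packages` is hypothetical, and the package list (`torch`, `torchsparse`) is an assumption based on the installation steps above.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be located on the import path."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed package list; adjust it to match your environment.yaml.
    required = ["torch", "torchsparse"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Run it inside the activated `planarrecon` environment. It only checks that the modules can be located; it does not import them or verify CUDA support.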

Pretrained Model on ScanNet

Download the pretrained weights and put them under PROJECT_PATH/checkpoints/release. You can also use gdown to download them from the command line:

gdown --id 1XLL5X2M5BPo89An4jom5s0zhQOiyS_h8

Data Preparation for ScanNet

Download and extract ScanNet by following the instructions provided at http://www.scan-net.org/.

You can obtain the train/val/test split information from here.

Expected directory structure of ScanNet:

DATAROOT
└───scannet
    ├───scans
    │   └───scene0000_00
    │       ├───color
    │       │   ├───0.jpg
    │       │   ├───1.jpg
    │       │   └───...
    │       └───...
    ├───scans_raw
    │   └───scene0000_00
    │       ├───scene0000_00.aggregation.json
    │       ├───scene0000_00_vh_clean_2.labels.ply
    │       ├───scene0000_00_vh_clean_2.0.010000.segs.json
    │       └───...
    ├───scannetv2_test.txt
    ├───scannetv2_train.txt
    ├───scannetv2_val.txt
    └───scannetv2-labels.combined.tsv
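
A quick check of the top-level layout can catch path mistakes before the data preparation script runs. The following is a minimal sketch that only tests for the entries named in the tree above; the function name `check_scannet_layout` is hypothetical and not part of this repository, and per-scene contents are not inspected.

```python
from pathlib import Path

# Entries taken from the directory tree above, relative to DATAROOT/scannet.
EXPECTED_DIRS = ["scans", "scans_raw"]
EXPECTED_FILES = [
    "scannetv2_test.txt",
    "scannetv2_train.txt",
    "scannetv2_val.txt",
    "scannetv2-labels.combined.tsv",
]


def check_scannet_layout(dataroot):
    """Return the expected paths that are missing under dataroot/scannet."""
    root = Path(dataroot) / "scannet"
    # Directories first, then files, in the order listed above.
    missing = [str(root / d) for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [str(root / f) for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    import sys

    # Usage: python check_layout.py DATAROOT
    dataroot = sys.argv[1] if len(sys.argv) > 1 else "."
    for path in check_scannet_layout(dataroot):
        print("missing:", path)
```

An empty result means every listed directory and file exists. It does not guarantee that each scene folder is complete.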

Next, run the data preparation script, which parses the raw data format into the processed pickle format. This script also generates the ground truth planes. The plane generation code is modified from PlaneRCNN.

[Data preparation script]
# Change PATH_TO_SCANNET accordingly.
# For the training/val split:
python tools/generate_gt.py --data_path PATH_TO_SCANNET --save_name planes_9/ --window_size 9 --n_proc 2 --n_gpu 1

Inference on ScanNet val-set

python main.py --cfg ./config/test.yaml

The planes will be saved to PROJECT_PATH/results.

Evaluation on ScanNet val-set

Evaluate 3D geometry:

python tools/eval3d_geo_ins.py --model ./results/scene_scannet_release_68 --n_proc 16

Training on ScanNet

Start training by running ./train.sh.

[train.sh]
#!/usr/bin/env bash
export CUDA_VISIBLE_DEVICES=0,1
python -m torch.distributed.launch --nproc_per_node=2 main.py --cfg ./config/train.yaml

Similar to NeuralRecon, training is separated into three phases, and switching between them is currently controlled manually:
  • Phase 1 (epochs 0-20): training on single fragments.
    MODEL.FUSION.FUSION_ON=False, MODEL.TRACKING=False

  • Phase 2 (epochs 21-35): with GRUFusion.
    MODEL.FUSION.FUSION_ON=True, MODEL.TRACKING=False

  • Phase 3 (the remaining epochs, 35-50): with Matching/Fusion.
    MODEL.FUSION.FUSION_ON=True, MODEL.TRACKING=True
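
Because the phases are switched by hand, a small lookup can help keep track of which flag values belong to a given epoch. This is a minimal sketch, not code from the repository: the function name `phase_overrides` is hypothetical, and since the list above names epoch 35 in both Phase 2 and Phase 3, the sketch assumes epoch 35 still belongs to Phase 2.

```python
def phase_overrides(epoch):
    """Return the config values listed above for the phase containing `epoch`."""
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    if epoch <= 20:
        # Phase 1: single fragments, no fusion, no tracking.
        fusion_on, tracking = False, False
    elif epoch <= 35:
        # Phase 2: GRUFusion enabled (assumption: epoch 35 counts as Phase 2).
        fusion_on, tracking = True, False
    else:
        # Phase 3: fusion plus matching/tracking.
        fusion_on, tracking = True, True
    return {"MODEL.FUSION.FUSION_ON": fusion_on, "MODEL.TRACKING": tracking}


if __name__ == "__main__":
    # Print the flag values at the start of each phase.
    for epoch in (0, 21, 36):
        print(epoch, phase_overrides(epoch))
```

The returned keys mirror the option names in the list. How they are applied, for example by editing ./config/train.yaml before resuming training, is left to the reader.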

More info about training to be added soon.

Real-time Demo on Custom Data with Camera Poses from ARKit

We provide a demo of PlanarRecon running on self-captured ARKit data. Please refer to DEMO.md for details. We also provide example data captured with an iPhone XR. Incremental saving and visualization are not enabled in PlanarRecon for now.

Citation

If you find this code useful for your research, please use the following BibTeX entry.

@article{xie2022planarrecon,
  title={{PlanarRecon}: Real-Time {3D} Plane Detection and Reconstruction from Posed Monocular Videos},
  author={Xie, Yiming and Gadelha, Matheus and Yang, Fengting and Zhou, Xiaowei and Jiang, Huaizu},
  journal={CVPR},
  year={2022}
}

Acknowledgment

Some of the code and the installation guide in this repo are borrowed from NeuralRecon. We also thank Atlas for the 3D geometry evaluation.
