GithubHelp home page GithubHelp logo

reginalluna / backtoreality Goto Github PK

View Code? Open in Web Editor NEW

This project forked from wyf-accept/backtoreality

0.0 0.0 0.0 19.87 MB

[CVPR 2022] Back to Reality: Weakly-supervised 3D Object Detection with Shape-guided Label Enhancement

License: MIT License

Python 93.27% Shell 0.01% C 0.48% C++ 2.62% Cuda 3.62%

backtoreality's Introduction

Back To Reality: Weakly-supervised 3D Object Detection with Shape-guided Label Enhancement

Announcement ๐Ÿ”ฅ

We have not tested the code yet. We will finish this project by April!!!

Introduction

This repo contains PyTorch implementation for paper Back To Reality: Weakly-supervised 3D Object Detection with Shape-guided Label Enhancement (CVPR2022)

overview

@inproceedings{xu2022br,
author = {Xu, Xiuwei and Wang, Yifan and Zheng, Yu and Rao, Yongming and Zhou, Jie and Lu, Jiwen},
title = {Back To Reality: Weakly-supervised 3D Object Detection with Shape-guided Label Enhancement},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2022}
}

Other papers related to 3D object detection with synthetic shape:

  • RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection (ICCV 2021)

New dataset ๐Ÿ’ฅ

We conduct additional experiment on the more challenging Matterport3D dataset. From ModelNet40 and Matterport3D, we select all 13 shared categories, each containing more than 80 object instances in Matterport3D training set, to construct our benchmark (Matterport3d-md40). Below is the performance of FSB, WSB and BR (point-version) based on Votenet: overview

Note that we use OpenCV to estimate the rotated bounding boxes (RBB) as ground-truth, instead of the axis-aligned bounding boxes used in ScanNet-md40 benchmark.

ScanNet-md40 and Matterport3d-md40 are two more challenging benckmarks for indoor 3D object detection. We hope they will promote future research on small object detection and synthetic-to-real scene understanding.

Dependencies

We evaluate this code with Pytorch 1.8.1 (cuda11), which is based on the official implementation of Votenet and GroupFree3D. Please follow the requirements of them to prepare the environment. Other packages can be installed using:

pip install open3d sklearn tqdm

Current code base is tested under following environment:

  1. Python 3.6.13
  2. PyTorch 1.8.1
  3. numpy 1.19.2
  4. open3d 0.12.0
  5. opencv-python 4.5.1.48
  6. plyfile 0.7.3
  7. scikit-learn 0.24.1

Data preparation

ScanNet

To start from the raw data, you should:

  • Follow the README under GroupFree3D/scannet or Votenet/scannet to generate the real scenes.
  • Follow the README under ./data_generation/ScanNet to generate the virtual scenes.

The processed data can also be downloaded from here. They should be placed to paths:

./detection/Votenet/scannet/
./detection/GroupFree3D/scannet/

After that, the file directory should be like:

...
โ””โ”€โ”€ Votenet (or GroupFree3D)
    โ”œโ”€โ”€ ...
    โ””โ”€โ”€ scannet
        โ”œโ”€โ”€ ...
        โ”œโ”€โ”€ scannet_train_detection_data_md40
        โ”œโ”€โ”€ scannet_train_detection_data_md40_obj_aug
        โ””โ”€โ”€ scannet_train_detection_data_md40_obj_mesh_aug

Matterport3D

To start from the raw data, you should:

  • Follow the README under Votenet/matterport to generate the real scenes.
  • Follow the README under ./data_generation/Matterport3D to generate the virtual scenes.

The processed data can also be downloaded from here.

The file directory should be like:

...
โ””โ”€โ”€ Votenet
    โ”œโ”€โ”€ ...
    โ””โ”€โ”€ matterport
        โ”œโ”€โ”€ ...
        โ”œโ”€โ”€ matterport_train_detection_data_md40
        โ”œโ”€โ”€ matterport_train_detection_data_md40_obj_aug
        โ””โ”€โ”€ matterport_train_detection_data_md40_obj_mesh_aug

Usage

Please follow the instructions below to train different models on ScanNet. Change --dataset scannet to --dataset matterport for training on Matterport3D.

Votenet

1. Fully-Supervised Baseline

To train the Fully-Supervised Baseline (FSB) on Scannet data:

# Recommended GPU num: 1

cd Votenet

CUDA_VISIBLE_DEVICES=0 python train_Votenet_FSB.py --dataset scannet --log_dir log_Votenet_FSB --num_point 40000

2. Weakly-Supervised Baseline

To train the Weakly-Supervised Baseline (WSB) on Scannet data:

# Recommended num of GPUs: 1

CUDA_VISIBLE_DEVICES=0 python train_Votenet_WSB.py --dataset scannet --log_dir log_Votenet_WSB --num_point 40000

3. Back To Reality

To train BR (mesh-version) on Scannet data, please run:

# Recommended num of GPUs: 2

CUDA_VISIBLE_DEVICES=0,1 python train_Votenet_BR.py --dataset scannet --log_dir log_Votenet_BRM --num_point 40000

CUDA_VISIBLE_DEVICES=0,1 python train_Votenet_BR_CenterRefine --dataset scannet --log_dir log_Votenet_BRM_Refine --num_point 40000 --checkpoint_path log_Votenet_BRM/train_BR.tar

To train BR (point-version) on Scannet data, please run:

# Recommended num of GPUs: 2

CUDA_VISIBLE_DEVICES=0,1 python train_Votenet_BR.py --dataset scannet --log_dir log_Votenet_BRP --num_point 40000 --dataset_without_mesh

CUDA_VISIBLE_DEVICES=0,1 python train_Votenet_BR_CenterRefine --dataset scannet --log_dir log_Votenet_BRP_Refine --num_point 40000 --checkpoint_path log_Votenet_BRP/train_BR.tar --dataset_without_mesh

GroupFree3D

1. Fully-Supervised Baseline

To train the Fully-Supervised Baseline (FSB) on Scannet data:

# Recommended num of GPUs: 4

cd GroupFree3D

python -m torch.distributed.launch --master_port <port_num> --nproc_per_node <num_of_gpus_to_use> train_GF_FSB.py --num_point 50000 --num_decoder_layers 6 --size_delta 0.111111111111 --center_delta 0.04 --learning_rate 0.006 --decoder_learning_rate 0.0006 --weight_decay 0.0005 --dataset scannet --log_dir log_GF_FSB --batch_size 4

2. Weakly-Supervised Baseline

To train the Weakly-Supervised Baseline (WSB) on Scannet data:

# Recommended num of GPUs: 4

python -m torch.distributed.launch --master_port <port_num> --nproc_per_node <num_of_gpus_to_use> train_GF_WSB.py --num_point 50000 --num_decoder_layers 6 --size_delta 0.111111111111 --center_delta 0.04 --learning_rate 0.006 --decoder_learning_rate 0.0006 --weight_decay 0.0005 --dataset scannet --log_dir log_GF_WSB --batch_size 4

3. Back To Reality

To train BR (mesh-version) on Scannet data, please run:

# Recommended num of GPUs: 4

python -m torch.distributed.launch --master_port <port_num> --nproc_per_node <num_of_gpus_to_use> train_GF_BR.py --num_point 50000 --num_decoder_layers 6 --size_delta 0.111111111111 --center_delta 0.04 --learning_rate 0.006 --decoder_learning_rate 0.0006 --weight_decay 0.0005 --dataset scannet --log_dir log_GF_BRM --batch_size 4

# Recommended num of GPUs: 6

python -m torch.distributed.launch --master_port <port_num> --nproc_per_node <num_of_gpus_to_use> train_GF_BR_CenterRefine.py --num_point 50000 --num_decoder_layers 6 --size_delta 0.111111111111 --center_delta 0.04 --learning_rate 0.001 --decoder_learning_rate 0.0006 --weight_decay 0.0005 --dataset scannet --log_dir log_GF_BRM_Refine --checkpoint_path <[checkpoint_path_of_groupfree3D]/ckpt_epoch_last.pth> --max_epoch 120 --val_freq 10 --save_freq 20 --batch_size 2

To train BR (point-version) on Scannet data, please run:

# Recommended num of GPUs: 4

python -m torch.distributed.launch --master_port <port_num> --nproc_per_node <num_of_gpus_to_use> train_GF_BR.py --num_point 50000 --num_decoder_layers 6 --size_delta 0.111111111111 --center_delta 0.04 --learning_rate 0.006 --decoder_learning_rate 0.0006 --weight_decay 0.0005 --dataset scannet --log_dir log_GF_BRP --batch_size 4 --dataset_without_mesh

# Recommended num of GPUs: 6

python -m torch.distributed.launch --master_port <port_num> --nproc_per_node <num_of_gpus_to_use> train_GF_BR_CenterRefine.py --num_point 50000 --num_decoder_layers 6 --size_delta 0.111111111111 --center_delta 0.04 --learning_rate 0.001 --decoder_learning_rate 0.0006 --weight_decay 0.0005 --dataset scannet --log_dir log_GF_BRP_Refine --checkpoint_path <[checkpoint_path_of_groupfree3D]/ckpt_epoch_last.pth> --max_epoch 120 --val_freq 10 --save_freq 20 --batch_size 2 --dataset_without_mesh

Acknowledgements

We thank a lot for the flexible codebase of Votenet and GroupFree3D.

backtoreality's People

Contributors

holycomfort avatar wyf-accept avatar xuxw98 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.