whuhxb / centerpoint-maps

This project is a fork of mon95/centerpoint-maps.


Leveraging HD Maps for 3D object detection

Home Page: https://github.com/mon95/centerpoint-maps/blob/master/ACV_Project_FinalReport.pdf

License: MIT License


Leveraging HD Maps for object detection

In our work, we experiment with HD maps to see how they can benefit 3D object detection. To this end, we run several experiments using HD maps from the nuScenes dataset to augment the CenterPoint 3D object detection model.

Our main theme is to experiment with ideas that allow extra information to flow from the HD maps to the center head or the bounding-box head of CenterPoint. In other words, we leave the 3D backbone of CenterPoint as is and study how maps can improve detection outputs (i.e., bounding box, orientation, etc.).
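To make this concrete, here is a minimal sketch of one way such a fusion could look: map layers rasterized into bird's-eye-view (BEV) masks are concatenated with the backbone's BEV features just before the center heatmap head. This is an illustrative assumption, not our actual implementation; the module and parameter names (MapFusedHead, num_map_layers) are hypothetical, and the real map format is the one specified in our script.

# Hedged sketch: fusing rasterized HD-map layers into the BEV features that
# feed a CenterPoint-style detection head. All names here are illustrative.
import torch
import torch.nn as nn

class MapFusedHead(nn.Module):
    def __init__(self, bev_channels=64, num_map_layers=2, num_classes=10):
        super().__init__()
        # 1x1 conv mixes the extra map channels back down to the BEV width.
        self.fuse = nn.Conv2d(bev_channels + num_map_layers, bev_channels, 1)
        # Center heatmap head (one channel per class), as in CenterPoint.
        self.heatmap = nn.Conv2d(bev_channels, num_classes, 3, padding=1)

    def forward(self, bev_feats, map_raster):
        # bev_feats:  (B, C, H, W) from the unchanged 3D backbone
        # map_raster: (B, M, H, W) binary masks (e.g., drivable area, lanes)
        x = torch.cat([bev_feats, map_raster], dim=1)
        x = torch.relu(self.fuse(x))
        return self.heatmap(x).sigmoid()

Because the fusion happens after the backbone, the backbone weights and architecture stay untouched, which matches the constraint described above.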

Our code is forked from the CenterPoint repository and modified to incorporate our changes. The setup required to run our code is the same as for CenterPoint, except that the data directory additionally requires HD maps to be generated and stored in the format specified in our script.

For more details, refer to our project report.

Note: This repo has been moved (partially) from our internal private repository to serve as documentation. For more info, please reach out over email.

Below is the README for the original CenterPoint project. Please refer to the original CenterPoint repository for any details!

Center-based 3D Object Detection and Tracking

3D Object Detection and Tracking using center points in the bird's-eye view.

Center-based 3D Object Detection and Tracking,
Tianwei Yin, Xingyi Zhou, Philipp Krähenbühl,
arXiv technical report (arXiv 2006.11275)

@article{yin2020center,
  title={Center-based 3D Object Detection and Tracking},
  author={Yin, Tianwei and Zhou, Xingyi and Kr{\"a}henb{\"u}hl, Philipp},
  journal={arXiv:2006.11275},
  year={2020},
}

Updates

[2020-08-10] NEW: We now support vehicle detection on Waymo with SOTA performance. Please stay tuned for more updates in the fall.

Contact

Any questions or discussion are welcome!

Tianwei Yin [email protected]
Xingyi Zhou [email protected]

Abstract

Three-dimensional objects are commonly represented as 3D boxes in a point cloud. This representation mimics the well-studied image-based 2D bounding-box detection, but comes with additional challenges. Objects in a 3D world do not follow any particular orientation, and box-based detectors have difficulties enumerating all orientations or fitting an axis-aligned bounding box to rotated objects. In this paper, we instead propose to represent, detect, and track 3D objects as points. We use a keypoint detector to find centers of objects, and simply regress to other attributes, including 3D size, 3D orientation, and velocity. In our center-based framework, 3D object tracking simplifies to greedy closest-point matching. The resulting detection and tracking algorithm is simple, efficient, and effective. On the nuScenes dataset, our point-based representation performs 3-4 mAP higher than the box-based counterparts for 3D detection, and 6 AMOTA higher for 3D tracking. Our real-time model runs end-to-end 3D detection and tracking at 30 FPS with 54.2 AMOTA and 48.3 mAP, while the best single model achieves 60.3 mAP for 3D detection and 63.8 AMOTA for 3D tracking.
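As a concrete illustration of the greedy closest-point matching that the abstract reduces tracking to, here is a minimal sketch. The velocity projection follows the paper's description; the distance gate (max_dist) and the array layout are assumptions for illustration only, not values from the paper.

# Minimal sketch of greedy closest-point track association.
# prev_centers: (T, 2) BEV centers of existing tracks
# curr_centers: (D, 2) detected centers; curr_velocities: (D, 2)
import numpy as np

def greedy_match(prev_centers, curr_centers, curr_velocities, dt, max_dist=2.0):
    # Project detections back in time using their predicted velocities.
    projected = curr_centers - curr_velocities * dt
    # Pairwise distances between projected detections and previous tracks.
    dists = np.linalg.norm(projected[:, None] - prev_centers[None, :], axis=-1)
    matches = []
    for i in np.argsort(dists.min(axis=1)):      # closest detections first
        j = int(np.argmin(dists[i]))
        if dists[i, j] < max_dist:
            matches.append((int(i), j))          # detection i -> track j
            dists[:, j] = np.inf                 # each track matched at most once
    return matches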

Highlights

  • Simple: Two-sentence method summary: We use a standard 3D point cloud encoder with a few convolutional layers in the head to produce a bird's-eye-view heatmap and other dense regression outputs, including the offset to centers in the previous frame. Detection is a simple local peak extraction (see the sketch after this list), and tracking is a closest-distance matching.

  • Fast: Our PointPillars model runs simultaneous 3D detection and tracking at 30 FPS on the nuScenes dataset, with 48.3 mAP / 59.1 NDS for detection and 54.2 AMOTA for tracking.

  • Accurate: Our best single model achieves 60.3 mAP and 67.3 NDS on the nuScenes detection test set.

  • Extensible: A simple baseline into which you can swap your own backbone and novel algorithms.
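Here is a minimal sketch of the local peak extraction step mentioned above, using max pooling as the peak filter. The 3x3 window and the 0.1 score threshold are assumed values for illustration, not the paper's exact settings.

# Minimal sketch: detection as local peak extraction on a BEV heatmap.
import torch
import torch.nn.functional as F

def extract_peaks(heatmap, threshold=0.1):
    # heatmap: (B, num_classes, H, W), scores in [0, 1]
    pooled = F.max_pool2d(heatmap, kernel_size=3, stride=1, padding=1)
    # A cell is a detection if it is the maximum of its 3x3 neighborhood
    # and its score clears the threshold.
    peaks = (heatmap == pooled) & (heatmap > threshold)
    return peaks.nonzero()  # (batch, class, y, x) indices of detections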

Main results

3D detection

Model                    Split  mAP   NDS   FPS
PointPillars-512         Val    48.3  59.1  30.3
VoxelNet-1024            Val    55.4  63.8  14.5
VoxelNet-1440_dcn_flip   Val    59.1  67.1  2.2
VoxelNet-1440_dcn_flip   Test   60.3  67.3  2.2

3D Tracking

Model                            Split  Tracking time  Total time  AMOTA ↑  AMOTP ↓
CenterPoint_pillar_512           Val    1 ms           34 ms       54.2     0.680
CenterPoint_voxel_1024           Val    1 ms           70 ms       62.6     0.630
CenterPoint_voxel_1440_dcn_flip  Val    1 ms           451 ms      65.9     0.567
CenterPoint_voxel_1440_dcn_flip  Test   1 ms           451 ms      63.8     0.555

All results are tested on a Titan Xp GPU with batch size 1. More models and details can be found in MODEL_ZOO.md.

Third-party resources

  • AFDet: another work inspired by CenterNet that achieves good performance on the KITTI and Waymo datasets.

Use CenterPoint

We provide a demo with the PointPillars model for 3D object detection on the nuScenes dataset.

Basic Installation

# basic python libraries
conda create --name centerpoint python=3.6
conda activate centerpoint
conda install pytorch==1.1.0 torchvision==0.3.0 cudatoolkit=10.0 -c pytorch
git clone https://github.com/tianweiy/CenterPoint.git
cd CenterPoint
pip install -r requirements.txt

# add CenterPoint to PYTHONPATH by adding the following line to ~/.bashrc (change the path accordingly)
export PYTHONPATH="${PYTHONPATH}:PATH_TO_CENTERPOINT"

First, download the model (by default, centerpoint_pillar_512) from the Model Zoo and put it in work_dirs/centerpoint_pillar_512_demo.

We provide a driving sequence clip from the nuScenes dataset. Download the folder and put it in the main directory.
Then run the demo with python tools/demo.py. If set up correctly, you will see an output video in which red boxes are the ground-truth objects and blue boxes are the predictions.

Advanced Installation

For more advanced usage, please refer to INSTALL to set up more libraries needed for distributed training and sparse convolution.

Benchmark Evaluation and Training

Please refer to GETTING_START to prepare the data, then follow the instructions there to reproduce our detection and tracking results. All detection configurations are included in configs, and we provide scripts for all tracking experiments in tracking_scripts. The pretrained models, logs, and each model's prediction files are provided in MODEL_ZOO.md.

License

CenterPoint is released under the MIT license (see LICENSE). It is developed based on a forked version of det3d. We also incorporate a large amount of code from CenterNet and CenterTrack. See the NOTICE for details. Note that the nuScenes dataset is free of charge for non-commercial activities. Please contact the nuScenes team for commercial usage.

Acknowledgement

This project would not be possible without multiple great open-sourced codebases. We list some notable examples below.

CenterPoint is deeply influenced by the following projects. Please consider citing the relevant papers.

@article{zhu2019classbalanced,
  title={Class-balanced Grouping and Sampling for Point Cloud 3D Object Detection},
  author={Zhu, Benjin and Jiang, Zhengkai and Zhou, Xiangxin and Li, Zeming and Yu, Gang},
  journal={arXiv:1908.09492},
  year={2019}
}

@article{lang2019pillar,
   title={PointPillars: Fast Encoders for Object Detection From Point Clouds},
   journal={CVPR},
   author={Lang, Alex H. and Vora, Sourabh and Caesar, Holger and Zhou, Lubing and Yang, Jiong and Beijbom, Oscar},
   year={2019},
}

@article{zhou2018voxelnet,
   title={VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection},
   journal={CVPR},
   author={Zhou, Yin and Tuzel, Oncel},
   year={2018},
}

@article{yan2018second,
  title={SECOND: Sparsely Embedded Convolutional Detection},
  author={Yan, Yan and Mao, Yuxing and Li, Bo},
  journal={Sensors},
  year={2018},
}

@article{zhou2019objects,
  title={Objects as Points},
  author={Zhou, Xingyi and Wang, Dequan and Kr{\"a}henb{\"u}hl, Philipp},
  journal={arXiv:1904.07850},
  year={2019}
}

@article{zhou2020tracking,
  title={Tracking Objects as Points},
  author={Zhou, Xingyi and Koltun, Vladlen and Kr{\"a}henb{\"u}hl, Philipp},
  journal={arXiv:2004.01177},
  year={2020}
}
