Code for the paper "3D Sensor-based Pedestrian Detection by Integrating Improved HHA Encoding and Two-branch Feature Fusion."
The code consists of two parts: the improved HHA encoding (Improved HHA) and the RGB-D pedestrian detection network (RGBD_Detect).
- The code is implemented in MATLAB on Windows 10.
- A depth image and the camera intrinsics are used as input.
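For orientation, the three HHA channels (horizontal disparity, height above ground, and angle between the surface normal and the gravity direction) computed from a depth image and intrinsics can be sketched as follows. This is a rough Python illustration of standard HHA encoding, not the paper's improved MATLAB implementation; the gravity direction and the per-channel rescaling are simplifying assumptions:

```python
import numpy as np

def hha_encode(depth, K):
    """Illustrative HHA encoding: depth is (H, W) in meters, K is the 3x3 intrinsics.

    Returns an (H, W, 3) uint8 image with channels
    (horizontal disparity, height above ground, angle with gravity).
    """
    h, w = depth.shape
    z = np.clip(depth, 1e-3, None)  # avoid division by zero

    # Channel 1: disparity is inversely proportional to depth.
    disparity = 1.0 / z

    # Back-project pixel rows to camera-frame y coordinates.
    v = np.arange(h, dtype=float)[:, None]          # pixel row index
    y = (v - K[1, 2]) * z / K[1, 1]

    # Channel 2: height above the lowest observed point,
    # assuming gravity along the camera's +y axis (image y points down).
    height = y.max() - y

    # Channel 3: angle between the surface normal and the assumed up
    # direction, with normals estimated from depth gradients.
    dzdv, dzdu = np.gradient(z)
    n = np.stack([-dzdu, -dzdv, np.ones_like(z)], axis=-1)
    n /= np.linalg.norm(n, axis=-1, keepdims=True)
    up = np.array([0.0, -1.0, 0.0])                 # assumed up direction
    angle = np.degrees(np.arccos(np.clip(n @ up, -1.0, 1.0)))

    # Rescale each channel to [0, 255] for storage as an image.
    hha = np.stack([disparity, height, angle], axis=-1)
    mins = hha.reshape(-1, 3).min(0)
    maxs = hha.reshape(-1, 3).max(0)
    hha = (hha - mins) / np.maximum(maxs - mins, 1e-6) * 255.0
    return hha.astype(np.uint8)
```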
- The algorithm is based on the work "Learning Rich Features from RGB-D Images for Object Detection and Segmentation", and the code also refers to the official code provided by its authors.
- The improved HHA encoding is faster and its results are more consistent. We validated our HHA encoding on several RGB-D datasets, including KITTI, EPFL, KTP, and UNIHall; detailed comparison results can be found in the paper.
We propose a two-branch feature fusion extraction module (TFFEM) to capture both modalities' local and global features. Based on TFFEM, an RGB-D pedestrian detection network is designed to locate pedestrians, with RGB and HHA images as inputs.
The detection code is built on mmdetection. Install mmdetection first by following its official guidelines, then replace its `mmdet` directory with the one we provide.
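The install-and-replace step can be sketched as below; the clone location and the path to this repository are illustrative, and you should follow the official mmdetection installation guide for your environment and versions:

```shell
# Install mmdetection from source per its official instructions.
git clone https://github.com/open-mmlab/mmdetection.git
cd mmdetection
pip install -v -e .

# Replace mmdetection's mmdet package with the one provided in this
# repository (the source path here is a placeholder).
rm -rf mmdet
cp -r /path/to/this_repo/mmdet ./mmdet
```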
Take the KITTI dataset as an example. First download the dataset and create the following directory structure:
```
└── KITTI_DATASET_ROOT
    ├── training      <-- 7481 training samples
    │   ├── images    <-- (image_2)
    │   ├── hha
    │   └── labels    <-- (label_2)
    └── testing       <-- 7518 test samples
        ├── images    <-- (image_2)
        ├── hha
        └── labels    <-- (label_2)
```
The HHA images can be generated with the encoding code from the previous section, or downloaded directly from here (extraction code: TFFE).
Then create the annotation JSON file and place it in the mmdetection project:
```
python createjson.py
python train.py
```
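The annotation conversion performed by createjson.py is not shown in this README. As a rough sketch (function name, default image size, and category naming are assumptions, not taken from the repository), KITTI-style label files can be turned into a COCO-format JSON like this:

```python
import json
import os

def kitti_to_coco(label_dir, out_json, width=1242, height=375):
    """Sketch: convert KITTI-style label .txt files to a COCO-format JSON.

    Keeps only the 'Pedestrian' class; the default image size is a typical
    KITTI value and is an assumption, not read from the actual images.
    """
    coco = {"images": [], "annotations": [],
            "categories": [{"id": 1, "name": "pedestrian"}]}
    ann_id = 1
    labels = sorted(f for f in os.listdir(label_dir) if f.endswith(".txt"))
    for img_id, fname in enumerate(labels, start=1):
        stem = os.path.splitext(fname)[0]
        coco["images"].append({"id": img_id, "file_name": stem + ".png",
                               "width": width, "height": height})
        with open(os.path.join(label_dir, fname)) as f:
            for line in f:
                parts = line.split()
                if not parts or parts[0] != "Pedestrian":
                    continue
                # KITTI 2D box: left, top, right, bottom (columns 4-7).
                x1, y1, x2, y2 = map(float, parts[4:8])
                coco["annotations"].append({
                    "id": ann_id, "image_id": img_id, "category_id": 1,
                    "bbox": [x1, y1, x2 - x1, y2 - y1],
                    "area": (x2 - x1) * (y2 - y1), "iscrowd": 0})
                ann_id += 1
    with open(out_json, "w") as f:
        json.dump(coco, f)
    return coco
```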