Progressive Coordinate Transforms for Monocular 3D Object Detection

This repository is the official implementation of PCT.

Introduction

In this paper, we propose a novel and lightweight approach, dubbed Progressive Coordinate Transforms (PCT), to facilitate learning coordinate representations for monocular 3D object detection. Specifically, a localization boosting mechanism with a confidence-aware loss is introduced to progressively refine the localization prediction. In addition, semantic image representations are exploited to compensate for the use of patch proposals. Despite being lightweight and simple, our strategy establishes a new state of the art among monocular 3D detectors on the competitive KITTI benchmark. At the same time, PCT generalizes well to most coordinate-based 3D detection frameworks.

(Figure: overview of the PCT architecture)
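
For intuition, here is a minimal, hypothetical sketch of the progressive refinement idea (illustrative only, not the authors' implementation): a stack of lightweight stages repeatedly updates a coarse 3D location, and each stage's localization loss is weighted by a predicted confidence.

```python
# Hypothetical sketch of progressive coordinate refinement with a
# confidence-aware loss; module names, shapes, and the loss form are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RefineStage(nn.Module):
    """One lightweight stage: predicts a residual 3D offset and a confidence logit."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(feat_dim + 3, 128), nn.ReLU(),
                                 nn.Linear(128, 4))  # 3 offsets + 1 confidence logit

    def forward(self, feat, coarse_xyz):
        out = self.mlp(torch.cat([feat, coarse_xyz], dim=-1))
        delta, conf_logit = out[:, :3], out[:, 3]
        return coarse_xyz + delta, conf_logit

def confidence_aware_loss(pred_xyz, conf_logit, gt_xyz):
    """L1 localization error weighted by a learned confidence, plus a log penalty
    that keeps the confidence from collapsing to zero."""
    conf = torch.sigmoid(conf_logit)
    loc = F.l1_loss(pred_xyz, gt_xyz, reduction='none').sum(dim=-1)
    return (conf * loc - torch.log(conf + 1e-6)).mean()

# Progressive refinement over several stages.
stages = nn.ModuleList([RefineStage() for _ in range(3)])
feat = torch.randn(8, 256)   # per-proposal image features
xyz = torch.randn(8, 3)      # coarse 3D centre predictions
gt = torch.randn(8, 3)
total_loss = 0.0
for stage in stages:
    xyz, conf_logit = stage(feat, xyz)
    total_loss = total_loss + confidence_aware_loss(xyz, conf_logit, gt)
```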

Requirements

Installation

Download this repository (tested with Python 3.7, PyTorch 1.3.1, and Ubuntu 16.04.7). Additional dependencies such as cv2, yaml, and tqdm are also required; please install them accordingly:

cd #root
pip install -r requirements

Then, you need to compile the evaluation script:

cd #root/tools/kitti_eval
sh compile.sh

Prepare your data

First, you should download the KITTI dataset, and organize the data as follows (* indicates an empty directory to store the data generated in subsequent steps):


#ROOT
  |data
    |KITTI
      |2d_detections
      |ImageSets
      |pickle_files *
      |object
        |training
          |calib
          |image_2
          |label_2
          |depth *
          |pseudo_lidar (optional for Pseudo-LiDAR)*
          |velodyne (optional for FPointNet)
        |testing
          |calib
          |image_2
          |depth *
          |pseudo_lidar (optional for Pseudo-LiDAR)*
          |velodyne (optional for FPointNet)
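
As a quick sanity check of this layout, a hypothetical helper (assuming it is run from #ROOT) could be:

```python
# Hypothetical sanity check of the expected KITTI directory layout.
import os

required = [
    'data/KITTI/2d_detections',
    'data/KITTI/ImageSets',
    'data/KITTI/pickle_files',
    'data/KITTI/object/training/calib',
    'data/KITTI/object/training/image_2',
    'data/KITTI/object/training/label_2',
    'data/KITTI/object/training/depth',
    'data/KITTI/object/testing/calib',
    'data/KITTI/object/testing/image_2',
    'data/KITTI/object/testing/depth',
]
missing = [d for d in required if not os.path.isdir(d)]
print('all directories present' if not missing else f'missing: {missing}')
```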

Second, you need to prepare your depth maps and put them in data/KITTI/object/training/depth. For ease of use, we also provide the estimated depth maps (generated with the pretrained models provided by DORN and Pseudo-LiDAR):

| Monocular (DORN) | Stereo (PSMNet) |
| --- | --- |
| trainval (~1.6G), test (~1.6G) | trainval (~2.5G) |

Then, you need to generate 2D image features for the 2D bounding boxes and put them in data/KITTI/pickle_files/org. We train the 2D detector following RTM3D; you can also use your own 2D detector for training and inference.
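
The exact feature format expected by the dataloader should be taken from the repository; purely as an illustration of the general pattern (ROI-pool backbone features for each 2D box and dump them per image), a hypothetical sketch is shown below. The backbone, box coordinates, and file naming are placeholders.

```python
# Purely illustrative: extract per-box features from a generic backbone and
# dump them to data/KITTI/pickle_files/org. The real file layout and feature
# dimensions must match what the PCT dataloader expects.
import os
import pickle
import torch
import torchvision
from torchvision.ops import roi_align

backbone = torchvision.models.resnet18(pretrained=True)
backbone = torch.nn.Sequential(*list(backbone.children())[:-2]).eval()  # stride-32 feature map

image = torch.randn(1, 3, 384, 1280)                    # a KITTI-sized image tensor
boxes = torch.tensor([[0., 100., 200., 300., 400.]])    # (batch_idx, x1, y1, x2, y2)

with torch.no_grad():
    feat_map = backbone(image)
    # spatial_scale maps image coordinates onto the stride-32 feature map
    box_feats = roi_align(feat_map, boxes, output_size=(7, 7), spatial_scale=1 / 32)

os.makedirs('data/KITTI/pickle_files/org', exist_ok=True)
with open('data/KITTI/pickle_files/org/000000.pickle', 'wb') as f:
    pickle.dump(box_feats.numpy(), f)
```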

Finally, generate the training data using the provided scripts:

cd #root/tools/data_prepare
python patch_data_prepare_val.py --gen_train --gen_val --gen_val_detection --car_only
mv *.pickle ../../data/KITTI/pickle_files

Prepare Waymo dataset

We also provide Waymo usage instructions for monocular 3D detection on the Waymo Open Dataset.

Training

Move to the workspace and train the model (you also need to modify the path to the pickle files in the config file; a sketch for doing this programmatically follows the commands below):

 cd #root
 cd experiments/pct
 python ../../tools/train_val.py --config config_val.yaml
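
If you prefer to patch the config programmatically rather than by hand, a small hypothetical helper is sketched below; the actual key that holds the pickle path must be checked against your copy of config_val.yaml.

```python
# Hypothetical helper: load config_val.yaml, inspect its keys, and point the
# dataset entry at your pickle_files directory before writing the file back.
import yaml

with open('config_val.yaml') as f:
    cfg = yaml.safe_load(f)

print(cfg.keys())  # find the dataset / path entries
# Example edit (the key name is a placeholder; adjust to the real one):
# cfg['dataset']['pickle_path'] = '/abs/path/to/data/KITTI/pickle_files'

with open('config_val.yaml', 'w') as f:
    yaml.safe_dump(cfg, f)
```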

Evaluation

Generate the results using the trained model:

 python ../../tools/train_val.py --config config_val.yaml --e

and evaluate the generated results using:

../../tools/kitti_eval/evaluate_object_3d_offline_ap11 ../../data/KITTI/object/training/label_2 ./output

or

../../tools/kitti_eval/evaluate_object_3d_offline_ap40 ../../data/KITTI/object/training/label_2 ./output
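
The two binaries differ in how average precision is interpolated: the AP11 variant averages precision over 11 equally spaced recall points (0, 0.1, ..., 1.0), while the AP40 variant averages over 40 points (1/40, 2/40, ..., 1.0). A small self-contained illustration with a hypothetical precision-recall curve:

```python
# Illustration of KITTI's 11-point vs 40-point interpolated average precision.
import numpy as np

def interpolated_ap(recall, precision, points):
    """Average the best precision achieved at or beyond each sampled recall."""
    ap = 0.0
    for r in points:
        mask = recall >= r
        ap += precision[mask].max() if mask.any() else 0.0
    return ap / len(points)

recall = np.linspace(0, 1, 101)                 # hypothetical PR curve
precision = np.clip(1.0 - 0.7 * recall, 0, 1)

ap11 = interpolated_ap(recall, precision, np.linspace(0.0, 1.0, 11))
ap40 = interpolated_ap(recall, precision, np.linspace(1.0 / 40, 1.0, 40))
print(f'AP11 = {ap11:.4f}, AP40 = {ap40:.4f}')
```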

Because the data preparation process is tedious, we also provide the generated results for evaluation. Unzip output.zip and then execute the above evaluation commands. The results are:

| Models | AP3D11@mod. | AP3D11@easy | AP3D11@hard |
| --- | --- | --- | --- |
| PatchNet + PCT | 27.53 / 34.65 | 38.39 / 47.16 | 24.44 / 28.47 |

Acknowledgements

This code benefits from the excellent work of PatchNet and uses the off-the-shelf models provided by DORN and RTM3D.

Citation

@article{wang2021pct,
  title={Progressive Coordinate Transforms for Monocular 3D Object Detection},
  author={Wang, Li and Zhang, Li and Zhu, Yi and Zhang, Zhi and He, Tong and Li, Mu and Xue, Xiangyang},
  journal={arXiv preprint arXiv:2108.05793},
  year={2021}
}

Contact

For questions regarding PCT-3D, feel free to post an issue here or contact the authors directly ([email protected]).

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.


Issues

How much time does it take to convert Waymo to KITTI format?

Thank you for the amazing work. I wanted to know how much time it takes to convert Waymo to KITTI format using the script

python converter.py --save_dir datasets/waymo_open_organized/ --split validation

The validation split seems to take a lot of time on my machine, so I wanted to confirm.

Waymo Results: mAP for all classes or only for the vehicle class

Hi PCT authors,
I had a small query regarding the Waymo results. Table 7 of your paper reports the mAP on the Waymo dataset. Do you report the mAP/mAPH of all the classes, or only the mAP/mAPH for the vehicle (car) class?

PS: Another paper, CaDDN, only reports mAP on the vehicle (car) class in its Table 2.

Waymo evaluation: Metrics of all Level 1 Objects same as Metrics of [0, 30) Level 1 Objects

Hi PCT authors,
I am using your waymo_eval.py for evaluating my Waymo model. Here is the output

OBJECT_TYPE_TYPE_VEHICLE_LEVEL_1/AP: 0.34
OBJECT_TYPE_TYPE_VEHICLE_LEVEL_1/APH: 0.33
OBJECT_TYPE_TYPE_VEHICLE_LEVEL_2/AP: 0.02
OBJECT_TYPE_TYPE_VEHICLE_LEVEL_2/APH: 0.02
RANGE_TYPE_VEHICLE_[0, 30)_LEVEL_1/AP: 0.34
RANGE_TYPE_VEHICLE_[0, 30)_LEVEL_1/APH: 0.33
RANGE_TYPE_VEHICLE_[0, 30)_LEVEL_2/AP: 0.04
RANGE_TYPE_VEHICLE_[0, 30)_LEVEL_2/APH: 0.04
RANGE_TYPE_VEHICLE_[30, 50)_LEVEL_1/AP: 0.12
RANGE_TYPE_VEHICLE_[30, 50)_LEVEL_1/APH: 0.12
RANGE_TYPE_VEHICLE_[30, 50)_LEVEL_2/AP: 0.00
RANGE_TYPE_VEHICLE_[30, 50)_LEVEL_2/APH: 0.00
RANGE_TYPE_VEHICLE_[50, +inf)_LEVEL_1/AP: 0.05
RANGE_TYPE_VEHICLE_[50, +inf)_LEVEL_1/APH: 0.05
RANGE_TYPE_VEHICLE_[50, +inf)_LEVEL_2/AP: 0.00
RANGE_TYPE_VEHICLE_[50, +inf)_LEVEL_2/APH: 0.00

You will quickly notice that the AP for all Level 1 vehicles (0.34) is the same as the AP for [0, 30) Level 1 vehicles (0.34). This strange behavior also shows up for the Level 1 vehicle APH and other Level 1 classes (not shown here). Generally, the AP for all Level 1 vehicles should be less than the AP for [0, 30) Level 1 vehicles, as correctly reported in Table 7 of your paper.

I am unable to understand this behavior, so I wanted to ask if you saw something similar on your end.

PS: Level 2 metrics do NOT show this behavior; e.g., in the above output, the AP for all Level 2 objects (0.02) is less than the AP for [0, 30) Level 2 objects (0.04), as expected.

I am using Anaconda, and the following packages are in my conda environment:

blas                      1.0                         mkl    anaconda
cudatoolkit               10.1.243             h6bb024c_0    anaconda
cudnn                     7.6.5                cuda10.1_0    anaconda
google-auth               1.22.1                     py_0    anaconda
google-auth-oauthlib      0.4.1                      py_2    anaconda
google-pasta              0.2.0                      py_0    anaconda
protobuf                  3.13.0.1         py36he6710b0_1    anaconda
py-opencv                 3.4.2            py36hb342d67_1
python                    3.6.13               h12debd9_1  
tensorboard               2.2.1              pyh532a8cf_0    anaconda
tensorflow                2.1.0           gpu_py36h2e5cdaa_0    anaconda
tensorflow-gpu            2.1.0                h0d30ee6_0    anaconda

Could you provide the output.zip file or pretrained model checkpoints?

Hi. I noticed that you mentioned the following in the README.md file:

Because the data preparation process is tedious, we also provide the generated results for evaluation. Unzip output.zip and then execute the above evaluation commands.
...

However, I did not find a link to the result file. Would you be willing to share the detection results or pretrained model with us? Thank you very much.

Some confusion and a request

First of all, thank you for your excellent work, but I have some points of confusion and a request.
1. At line 27 of kitti_dataset, you load labels from 'ddmp' rather than the provided "label_2". What preprocessing did you apply to the labels? I did not find an explanation in your paper.
2. As you state in the paper, the performance of the 2D detector has no positive correlation with the final 3D detection accuracy. How, then, should I choose a 2D detector, given that I cannot simply pick the best one?
3. Have you run experiments with other coordinate-based detectors? The paper only reports PatchNet + PCT.
4. Could you provide the 2D detection feature files used in training and testing so that I can run the code?

About the Waymo results

Hi, you mentioned that you use AdaBins trained on Waymo. How did you do that, since Waymo does not provide ground-truth depth maps? Another question: did you train the whole model (depth completion, 2D detection, and 3D detection) in an end-to-end manner?

ModuleNotFoundError: No module named 'lib.helpers.decorator_helper_level'

When I run python ../../tools/train_val.py --config config_val.yaml,
I get the following error:

Traceback (most recent call last):
  File "../../tools/train_val.py", line 19, in <module>
    from lib.helpers.trainer_helper import Trainer
  File "/newnfs/zzwu/08_3d_code/progressive-coordinate-transforms/lib/helpers/trainer_helper.py", line 11, in <module>
    from lib.helpers.decorator_helper_level import decorator_level
ModuleNotFoundError: No module named 'lib.helpers.decorator_helper_level'

About generating the 2D detection features

Hi, thanks for sharing your great work!
Could you share your 2D detection feature file?
Or could you tell me which layer's features should be saved from RTM3D?

you need to generate image 2D features for the 2D bounding boxes and put them to data/KITTI/pickle_files/org
