
Batch3DMOT - Offline 3D Multi-Object Tracking

arXiv | IEEE Xplore | Project website: http://batch3dmot.cs.uni-freiburg.de

This repository is the official implementation of the paper:

3D Multi-Object Tracking Using Graph Neural Networks with Cross-Edge Modality Attention

Martin Büchner and Abhinav Valada.

IEEE Robotics and Automation Letters (RA-L), Vol. 7, Issue 4, pp. 9707-9714, 2022

Overview of the Batch3DMOT architecture (figure).

If you find our work useful, please consider citing our paper:

@article{buchner20223d,
  title={3D Multi-Object Tracking Using Graph Neural Networks With Cross-Edge Modality Attention},
  author={B{\"u}chner, Martin and Valada, Abhinav},
  journal={IEEE Robotics and Automation Letters},
  volume={7},
  number={4},
  pages={9707--9714},
  year={2022},
  publisher={IEEE}
}

📔 Abstract

Online 3D multi-object tracking (MOT) has witnessed significant research interest in recent years, largely driven by demand from the autonomous systems community. However, 3D offline MOT is relatively less explored. Labeling 3D trajectory scene data at a large scale while not relying on high-cost human experts is still an open research question. In this work, we propose Batch3DMOT, which follows the tracking-by-detection paradigm and represents real-world scenes as directed, acyclic, and category-disjoint tracking graphs that are attributed using various modalities such as camera, LiDAR, and radar. We present a multi-modal graph neural network that uses a cross-edge attention mechanism mitigating modality intermittence, which translates into sparsity in the graph domain. Additionally, we present attention-weighted convolutions over frame-wise k-NN neighborhoods as suitable means to allow information exchange across disconnected graph components. We evaluate our approach using various sensor modalities and model configurations on the challenging nuScenes and KITTI datasets. Extensive experiments demonstrate that our proposed approach yields an overall improvement of 2.8% in the AMOTA score on nuScenes, thereby setting a new benchmark for 3D tracking methods, and successfully enhances false positive filtering.
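
For intuition, the cross-edge attention idea can be illustrated with a small, self-contained PyTorch sketch: per-modality edge features are fused using attention weights, and modalities that are absent on a given edge are masked out before the softmax. This is a conceptual illustration under assumed tensor shapes, not the module used in the paper; the class name and shapes below are placeholders.

```python
import torch
import torch.nn as nn

class CrossEdgeModalityFusion(nn.Module):
    """Conceptual sketch (not the paper's implementation): fuse per-modality
    edge features with attention weights, masking modalities that are missing
    on a given edge (modality intermittence)."""

    def __init__(self, feat_dim: int):
        super().__init__()
        # One scalar attention score per modality, computed from its edge feature.
        self.score = nn.Linear(feat_dim, 1)

    def forward(self, edge_feats: torch.Tensor, valid_mask: torch.Tensor) -> torch.Tensor:
        # edge_feats: (num_edges, num_modalities, feat_dim)
        # valid_mask: (num_edges, num_modalities), True where the modality is present
        scores = self.score(edge_feats).squeeze(-1)              # (E, M)
        scores = scores.masked_fill(~valid_mask, float("-inf"))  # ignore missing modalities
        attn = torch.softmax(scores, dim=-1)                     # (E, M)
        attn = torch.nan_to_num(attn)                            # edges with no modality at all
        return (attn.unsqueeze(-1) * edge_feats).sum(dim=1)      # (E, feat_dim)
```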

πŸ‘¨β€πŸ’» Code Release

Installation

  • Download the nuScenes dataset from the official nuScenes website.
  • Download the Megvii and CenterPoint detections. You may use src/utils/concat_jsons.py to obtain mini-split results.
  • Define the relevant paths in *_config.yaml (a minimal config-loading sketch is given after this list).
    • The tmp-folder holds the preprocessed graph data, while the data-folder holds the raw nuScenes dataset.
    • Adjust the package paths to match your local setup.
  • Generate the 2D image annotations by running python nuscenes/scripts/export_2d_annotations_as_json.py --dataroot=/path/to/nuscdata --version=v1.0-trainval and place the resulting file under the nuScenes data directory.
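
Before running the preprocessing scripts, it can help to verify that the configured paths actually exist. The following is a minimal sketch, assuming the config file is plain YAML; the key names data_dir and tmp_dir are hypothetical placeholders for whatever keys your *_config.yaml actually defines.

```python
# Minimal sketch (not part of the repository): sanity-check the paths defined
# in a *_config.yaml before preprocessing. The keys 'data_dir' and 'tmp_dir'
# are hypothetical placeholders -- substitute the keys used in your config.
import os
import yaml

with open("cl_config.yaml", "r") as f:
    cfg = yaml.safe_load(f)

for key in ("data_dir", "tmp_dir"):  # hypothetical key names
    path = cfg.get(key)
    if path is None or not os.path.isdir(path):
        print(f"Warning: '{key}' is missing or not an existing directory: {path}")
    else:
        print(f"{key}: {path} OK")
```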

Preprocessing

  • Generate metadata and GT for feature encoder training:
    • python batch_3dmot/preprocessing/preprocess_img.py --config cl_config.yaml
    • python batch_3dmot/preprocessing/preprocess_lidar.py --config cl_config.yaml
    • python batch_3dmot/preprocessing/preprocess_radar.py --config cl_config.yaml
  • Train feature encoders:
    • python batch_3dmot/preprocessing/train_resnet_ae.py --config cl_config.yaml
    • python batch_3dmot/preprocessing/train_pointnet.py --config cl_config.yaml
    • python batch_3dmot/preprocessing/train_radarnet.py --config cl_config.yaml
  • Construct category-disjoint, directed tracking graphs, either with or without modality features (a conceptual sketch of the graph construction follows this list):
    • python batch_3dmot/preprocessing/construct_detection_graphs_disjoint_parallel.py --config cl_config.yaml
    • python batch_3dmot/preprocessing/construct_detection_graphs_disjoint_parallel_only_poses.py --config pose_config.yaml
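
For intuition, the sketch below illustrates the general idea behind a category-disjoint, directed, acyclic tracking graph: detections of one category become nodes, and directed edges point from detections in earlier frames to detections in later frames within a limited frame gap. This is a conceptual illustration only, not the repository's construction code; Detection, build_category_graph, and max_frame_gap are illustrative assumptions.

```python
# Conceptual sketch only -- not the repository's graph construction code.
# Detections of a single category become graph nodes; directed edges point
# forward in time, which keeps the graph acyclic.
from dataclasses import dataclass
import networkx as nx

@dataclass
class Detection:
    det_id: int
    frame: int
    category: str
    center: tuple  # (x, y, z) box center

def build_category_graph(detections, category, max_frame_gap=2):
    """Build a directed, acyclic tracking graph for a single category."""
    dets = [d for d in detections if d.category == category]
    g = nx.DiGraph()
    for d in dets:
        g.add_node(d.det_id, frame=d.frame, center=d.center)
    # Connect each detection to detections in later frames within the gap.
    for a in dets:
        for b in dets:
            if 0 < b.frame - a.frame <= max_frame_gap:
                g.add_edge(a.det_id, b.det_id)  # edge features would be attached here
    return g
```

In Batch3DMOT itself, such graphs are additionally attributed with pose and modality features (cf. the abstract), which this sketch omits.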

Training and Evaluation

  • Train Batch3DMOT (poses-only or using modalities):
    • python batch_3dmot/train_poses_only.py --config pose_config.yaml
    • python batch_3dmot/wandb_train.py --config cl_config.yaml
  • Perform inference using a trained model (poses only, pose + camera, or additional modalities):
    • python batch_3dmot/predict_detections_poses.py --config pose_config.yaml
    • python batch_3dmot/predict_detctions_img.py --config cl_config.yaml
    • python batch_3dmot/predict_detections.py --config cl_config.yaml
  • Evaluate the produced tracking results (an evaluation sketch follows this list):
    • python batch_3dmot/eval/eval_nuscenes.py --config ***_config.yaml
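
As a point of reference, the snippet below is a minimal sketch of how a tracking result JSON can be scored directly with the official nuScenes tracking metrics. It is independent of eval_nuscenes.py; all paths are placeholders, the split and version must be adjusted to your setup, and the calls shown assume the standard nuscenes-devkit tracking evaluation API.

```python
# Minimal sketch, independent of eval_nuscenes.py: scoring a tracking result
# JSON with the nuscenes-devkit tracking evaluation. All paths are placeholders.
from nuscenes.eval.common.config import config_factory
from nuscenes.eval.tracking.evaluate import TrackingEval

cfg = config_factory("tracking_nips_2019")        # official tracking metric configuration
nusc_eval = TrackingEval(
    config=cfg,
    result_path="/path/to/tracking_result.json",  # produced by the predict_* scripts
    eval_set="val",
    output_dir="/path/to/eval_output",
    nusc_version="v1.0-trainval",
    nusc_dataroot="/path/to/nuscdata",
    verbose=True,
)
metrics = nusc_eval.main(render_curves=False)     # reports AMOTA, AMOTP, etc.
```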

πŸ‘©β€βš–οΈ License

For academic usage, the code is released under the GPLv3 license. For any commercial purpose, please contact the authors.


