
xinshuoweng / ab3dmot

Stars: 1.6K · Watchers: 48 · Forks: 400 · Size: 185.14 MB

(IROS 2020, ECCVW 2020) Official Python Implementation for "3D Multi-Object Tracking: A Baseline and New Evaluation Metrics"

Home Page: http://www.xinshuoweng.com/

License: Other

Python 100.00%
computer-vision machine-learning robotics tracking 3d-tracking multi-object-tracking real-time evaluation-metrics evaluation 3d-multi-object-tracking

ab3dmot's Introduction

Hi there 👋, welcome to Xinshuo Weng's GitHub page!

I am a Ph.D. candidate at the Robotics Institute of Carnegie Mellon University, advised by Kris Kitani, and an incoming research scientist at NVIDIA Autonomous Vehicle Research. I received my master's degree (2016-17) from CMU. Prior to CMU, I worked at Facebook Reality Lab as a research engineer, helping to build "Photorealistic Telepresence". I received my bachelor's degree from Wuhan University. My research interests lie in generative models and 3D computer vision for autonomous systems. I have developed 3D multi-object tracking systems such as AB3DMOT, which has received more than 1,300 stars on GitHub. I am also leading several autonomous driving workshops at top conferences such as NeurIPS 2021, IJCAI 2021, ICCV 2021, and IROS 2021, and I have been awarded a number of fellowships and nominations, such as the Qualcomm Innovation Fellowship (2020) and Facebook Fellowship Finalist (2021).


ab3dmot's People

Contributors

akretz, cclauss, dependabot[bot], xinshuoweng


ab3dmot's Issues

evaluate on KITTI

Hi, I'd like to ask how to evaluate on the KITTI test data. What I mean is: after training a model on the training data, how do I submit it for official evaluation so that the results are recognized by the benchmark? I couldn't find the relevant submission entry on the official website. Could you please give me some pointers? Thank you.

Detections data format

Here you commented that the detections format is [x, y, z, theta, l, w, h]. But I found that the detections are reordered before being passed into convert_3dbox_to_8corner, which requires [x, y, z, theta, l, w, h] as input. So I think the format of the detections is actually [h, w, l, x, y, z, theta]. Is that right?
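For illustration, a minimal NumPy sketch of that reordering (the index array below is my own reading of the code; compare it against self.reorder in main.py before relying on it):

import numpy as np

# One detection in the on-disk order [h, w, l, x, y, z, theta] (values made up).
det_hwl = np.array([1.5, 1.6, 3.9, 2.0, 1.0, 20.0, 0.3])

# Index array mapping [h, w, l, x, y, z, theta] -> [x, y, z, theta, l, w, h],
# i.e. the order convert_3dbox_to_8corner expects.
reorder = [3, 4, 5, 6, 2, 1, 0]
det_xyztlwh = det_hwl[reorder]
print(det_xyztlwh)   # [ 2.   1.  20.   0.3  3.9  1.6  1.5]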

visualization of ground truth and tracking outputs

Hello,

Is there any way to visualize the labels (ground truth) and the tracking bounding boxes on the images simultaneously, similar to visualization.py, which overlays only the 3D bounding boxes from tracking? I would like to visualize the labels on the images as well.

Question about first detection-prediction association

Hey there!

I have implemented a 3D tracker based on your method, but it raised questions for me about the association of trackers and detections for the very first and second detections.

So imagine a scenario where we have a small sign. At t0 it is detected correctly and a Kalman tracker is created for it. At t1 it is also detected correctly and we want to match the prediction of the Kalman tracker with the detection. But the tracker has only one measurement so far and can not predict any next state because it has no information about velocity. So the tracker's predicted state is basically the same as the t0 detection. But because the sign is so small and the scene is moving fast, the 3D IoU between the tracker's state (basically the detection at t0) and the t1 detection is zero. So the detection and tracker are not associated. Because the detection is unmatched, a new tracker is created and the problem repeats itself as the tracker never gets state updates thus never having any velocity information.

Am I missing something here? How does the tracking work if t0 and t1 detection have an IoU of zero? Isn't the whole purpose of using Kalman filters for state prediction to mitigate such cases?
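For illustration, here is a hedged sketch of an IoU-gated assignment (not the repo's exact code): with an IoU of zero between the prediction and the detection, the pair falls below any positive threshold, so the detection stays unmatched and spawns a new tracker, which is exactly the loop described above.

import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(iou_matrix, iou_threshold=0.1):
    # Maximize total IoU between trackers (rows) and detections (columns),
    # then discard any pair whose IoU falls below the gate.
    rows, cols = linear_sum_assignment(-iou_matrix)
    matches, unmatched_dets = [], set(range(iou_matrix.shape[1]))
    for r, c in zip(rows, cols):
        if iou_matrix[r, c] >= iou_threshold:
            matches.append((r, c))
            unmatched_dets.discard(c)
    return matches, sorted(unmatched_dets)

# One tracker, one detection, zero overlap: no match, detection index 0 unmatched.
print(associate(np.array([[0.0]])))   # ([], [0])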

Some questions about the paper

In Table 1, how do you reproduce the results of Complexer-YOLO and FANTrack, since they haven't published their code yet?

In Fig. 4 you seem to have their results on the test set as well. How can I get these?

clarification of provided detections.

Hello,

First, thanks for providing this work. You already provide the PointRCNN detections on the KITTI MOT dataset, and you mention in the paper that "We directly adopt their models pre-trained on the training set of the KITTI 3D object detection benchmark."

Assuming so, if PointRCNN were trained on the MOT dataset rather than the object detection dataset, would that improve the tracking results? Or were the provided detections generated by a model already trained on the MOT dataset?

I am a bit confused, as I intend to use your provided detections with other trackers on KITTI MOT for comparison, since I don't have the computing power to train a detector myself.

clarification AMOTA

Hello,

A brief question on the AMOTA definition. Per the paper and the code, AMOTA is:
[image of the AMOTA formula from the paper: a sum of MOTA over recall thresholds, divided by L]

So what happens if my model only has true positives with confidence values from, say, 1 to 0.50 instead of 1 to 0? In the current code, the summation of MOTA is always divided by L = 40, as per the paper.

However, if I only have 24 recall values (0.60 is the maximum recall), is it still correct to divide by 40, or should you divide by 24? I am asking because, with different confidence scores, I currently get a different AMOTA due to having fewer recall values while always dividing by 40, and I am not sure whether that is the intended behavior. Since L is the number of recall values, I would think the sum should be divided by 24 in this case, rather than by 40 regardless of the recall values.

Would love clarification.

Thanks
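For concreteness, a tiny sketch of the two readings in the question, using made-up MOTA values (24 reachable recall thresholds out of 40):

# Hypothetical per-recall MOTA values; only 24 of the 40 thresholds are
# reachable because the maximum recall is 0.60.
mota_per_recall = [0.5] * 24

amota_fixed_L = sum(mota_per_recall) / 40                    # 0.30, current behavior
amota_reached = sum(mota_per_recall) / len(mota_per_recall)  # 0.50, dividing by 24

print(amota_fixed_L, amota_reached)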

Shouldn't you project the 3D coordinates for KITTI 2D MOT evaluation?

Hello,

Your paper and code are fantastic! I learned a lot.
Quick question:
In the additional_info variable, among other things, you store the 2D bounding box obtained directly from the detector module (PointRCNN).
You then use this 2D box directly and pass it to the output files to be evaluated on the KITTI server.

On the other hand, the 3D coordinates obtained from the detector are used in your tracking pipeline to associate detections and tracks, and are therefore updated in the process (Kalman filter update, orientation correction, etc.).

Shouldn't you project the 3D coordinates obtained from your tracking pipeline onto the image plane and use that 2D bounding box for evaluation on the KITTI 2D MOT benchmark, instead of using the bounding box given by the detection?

Thanks!
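For reference, a minimal sketch of the projection suggested above, assuming the eight box corners are already expressed in the camera frame and P2 is the usual 3x4 KITTI projection matrix (the function name is mine, not from the repo):

import numpy as np

def corners_to_2d_bbox(corners_3d, P2):
    # corners_3d: (8, 3) camera-frame corners; P2: (3, 4) projection matrix.
    corners_hom = np.hstack([corners_3d, np.ones((8, 1))])   # (8, 4)
    pts = (P2 @ corners_hom.T).T                             # (8, 3)
    pts = pts[:, :2] / pts[:, 2:3]                           # perspective divide
    x1, y1 = pts.min(axis=0)
    x2, y2 = pts.max(axis=0)
    return x1, y1, x2, y2                                    # axis-aligned 2D box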

Format of other info?

Hi. You haven't explained the format of the other info:

def update(self, dets_all):
    """
    Params:
      dets_all: dict
        dets - a numpy array of detections in the format [[x,y,z,theta,l,w,h],[x,y,z,theta,l,w,h],...]
        info - an array of other info for each detection
    Requires: this method must be called once for each frame even with empty detections.
    Returns a similar array, where the last column is the object ID.
    NOTE: The number of objects returned may differ from the number of detections provided.
    """

It is very difficult to decipher it from your code. Please let me know what information (the proper sequence of data) is to be provided.
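Reading the docstring literally, a call for one frame with two detections might look like the sketch below (all values are made up; note that the "Detections data format" issue above suggests the on-disk order is [h, w, l, x, y, z, theta], so double-check against self.reorder before copying this):

import numpy as np

# Each row: [x, y, z, theta, l, w, h], as the docstring states.
dets = np.array([[ 2.0, 1.0, 20.0, 0.3, 3.9, 1.6, 1.5],
                 [-1.5, 1.1, 35.0, 1.6, 4.2, 1.7, 1.4]])

# One row of auxiliary values per detection (e.g. a confidence score);
# its exact contents are up to your pipeline.
info = np.array([[0.95], [0.80]])

dets_all = {'dets': dets, 'info': info}
# tracked = mot_tracker.update(dets_all)   # hypothetical tracker instance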

Problem with file organization and data format when evaluating on 3d mot

Hi, I tried to evaluate the results following the instructions in the repo. However, I got messages indicating a data format error when evaluating both my own results and the results provided in this repo (previously in /results/car_3d_det_val; now it seems to be gone).

My file organization is:
~/AB3DMOT/results/car_3d_det_result/0000.txt

and a sample line of the text file is:
421 389 Car 0 0 -10 -205.93 195.66 276.22 421.86 1.57 1.73 4.73 -5.55 1.89 7.911 1.50 10.18
which I think should be in accordance with the KITTI MOT devkit.
Is there anything I'm missing?

Ego motion is not considered

Hi, thanks for the project.

I have a question regarding the ego motion of the vehicle.

From your paper it is clear that the baseline algorithm does not consider the ego motion of the vehicle (i.e., the motion of the car on which the LiDAR is mounted).

If that is the case, how do you justify it? For example, if the ego vehicle and a target vehicle are both moving parallel to each other at constant speed, then in this scenario the target vehicle will be detected as a static object with zero velocity.
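As an aside, one common mitigation (not part of this baseline, which tracks directly in the sensor frame) is to move each detection into a fixed world frame using the ego pose before filtering, so that a target moving parallel to the ego vehicle no longer appears static; a hedged sketch:

import numpy as np

def to_world(center_sensor, T_world_from_sensor):
    # Transform a 3D box center from the sensor/ego frame into a fixed world
    # frame, given a 4x4 ego pose (e.g. from GPS/IMU odometry).
    p = np.append(center_sensor, 1.0)          # homogeneous coordinates
    return (T_world_from_sensor @ p)[:3]

# Track in world coordinates, then map the tracker output back to the sensor
# frame with the inverse pose of the current frame when writing results.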

Tracking with oriented objects

@cclauss @xinshuoweng Thanks for the source code. Is tracking of oriented objects possible with your implementation? If I have bounding box values with orientation information, will I get the tracked output with oriented bounding boxes?

about this tracker's tracking style

Hi,
very nice work!
I want to know this tracker's tracking style: tracking-by-detection, or independent tracking?

Tracking-by-detection means the tracking depends heavily on the detection algorithm: if a target goes undetected for several consecutive frames (for example 10 frames) after it is first detected, the tracking result becomes very bad. The multi-target tracker SORT works this way.

Independent tracking means the targets to track are given in the first frame, and the tracking algorithm can keep tracking them until they disappear, without needing detection information. The single-target tracker DaSiamRPN works this way.

I would like to know which style this multi-target tracker uses. Thank you!

orientation correction technique

@xinshuoweng Hi, thanks for sharing. I have a question about the orientation correction technique mentioned in your paper, which adds π to the orientation. However, your code is self.kf.x[3] = -1 * self.kf.x[3] instead of self.kf.x[3] += np.pi. Could you please explain why?
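For context, here is a hedged sketch of the π-flip correction as described in the paper (illustrative only; the actual branches in the repo's update step also handle the wrap-around case the quoted line deals with):

import numpy as np

def wrap(angle):
    # Wrap an angle to [-pi, pi).
    return (angle + np.pi) % (2 * np.pi) - np.pi

def correct_orientation(theta_pred, theta_obs):
    # If prediction and observation disagree by more than 90 degrees, flip the
    # prediction by pi so the filter does not rotate the box the long way round.
    if abs(wrap(theta_obs - theta_pred)) > np.pi / 2:
        theta_pred = wrap(theta_pred + np.pi)
    return theta_pred

print(correct_orientation(3.0, -3.0))   # 3.0: the angles already agree up to wrapping, so no flip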

Can you share the detector (PointRCNN) that produced the detection results?

Hello, xinshuoweng. I used the detector model pointrcnn_7870.pth (https://drive.google.com/file/d/1BCX9wMn-GYAfSOPpyxf6Iv6fc0qKLSiU/view?usp=sharing) from OpenPCDet (https://github.com/open-mmlab/OpenPCDet) to generate detection results, then converted them to your detection format. Here is my car_val result:
=================evaluation: best results with single threshold=================
Multiple Object Tracking Accuracy (MOTA) 0.7444
Multiple Object Tracking Precision (MOTP) 0.7839
Multiple Object Tracking Accuracy (MOTAL) 0.7444
Multiple Object Detection Accuracy (MODA) 0.7444
Multiple Object Detection Precision (MODP) 0.8568

Recall 0.8105
Precision 0.9610
F1 0.8793
False Alarm Rate 0.0809

Mostly Tracked 0.7081
Partly Tracked 0.2324
Mostly Lost 0.0595

True Positives 7804
Ignored True Positives 1250
False Positives 317
False Negatives 1825
Ignored False Negatives 1221
ID-switches 0
Fragmentations 37

Ground Truth Objects (Total) 10850
Ignored Ground Truth Objects 2471
Ground Truth Trajectories 210

Tracker Objects (Total) 8564
Ignored Tracker Objects 443
Tracker Trajectories 406

========================evaluation: average over recall=========================
sAMOTA AMOTA AMOTP
0.7763 0.3392 0.6484

Thank you for participating in our benchmark!
There is a big gap between my results and yours. Could you share your PointRCNN detector, or explain how you trained it? I would really appreciate your reply!

error while x < 0

Thanks for your work. I am using it on my own data now, but there are some problems.

I deleted the question.

@xinshuoweng thanks again.

Tracking for n frames and additional functionality

@cclauss @xinshuoweng I have a few queries:

  1. I want to track objects for n consecutive frames. With the reference code, by changing the age and hits values I am able to obtain predictions, but if new objects enter the frame, the tracker module does not assign an ID for that frame.
  2. Can I use your approach for motion forecasting? If so, what changes do I have to make?
  3. Can the 3D Kalman filter be replaced with other types of Kalman filters?

Undefined names: split, is2dpts, isfloat

flake8 testing of https://github.com/xinshuoweng/AB3DMOT on Python 3.7.1

$ flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics

./visualization.py:77:81: F821 undefined name 'split'
		print("No image data is provided for %s, please download the KITTI dataset" % split)
                                                                                ^
./visualization.py:82:81: F821 undefined name 'split'
		print("No image data is provided for %s, please download the KITTI dataset" % split)
                                                                                ^
./utils.py:37:9: F821 undefined name 'is2dpts'
	return is2dpts(range_test) and range_test[0] <= range_test[1]
        ^
./utils.py:40:40: F821 undefined name 'isfloat'
	try: return isinteger(scalar_test) or isfloat(scalar_test)
                                       ^
4    F821 undefined name 'split'
4

visualization of results

Cannot run: $ python trk_conf_threshold.py car_3d_det_test
It gives the error that the input folder is not found.
However, I was able to run: $ python trk_conf_threshold.py car_3d_det_val
Maybe this is because the output folder name for the test sequence is stored incorrectly in the results folder.

After correcting the folder name I ran:
$ python trk_conf_threshold.py car_3d_det_test
$ python visualization.py car_3d_det_test_thres

It was successful, but the bounding boxes on the images were not correct.

Directories for KITTI dataset evaluation

I would like to evaluate the tracker performance with the recently downloaded KITTI dataset. Unfortunately, I could not figure out where to put the image_02 folder.

Suppose I have the following folder structure from KITTI:

│   ├── 2011_09_26_drive_0028_sync
│   │   ├── image_00
│   │   │   ├── data [430 entries exceeds filelimit, not opening dir]
│   │   │   └── timestamps.txt
│   │   ├── image_01
│   │   │   ├── data [430 entries exceeds filelimit, not opening dir]
│   │   │   └── timestamps.txt
│   │   ├── image_02
│   │   │   ├── data [430 entries exceeds filelimit, not opening dir]
│   │   │   └── timestamps.txt
│   │   ├── image_03
│   │   │   ├── data [430 entries exceeds filelimit, not opening dir]
│   │   │   └── timestamps.txt
│   │   ├── oxts
│   │   │   ├── data [430 entries exceeds filelimit, not opening dir]
│   │   │   ├── dataformat.txt
│   │   │   └── timestamps.txt
│   │   └── velodyne_points
│   │       ├── data [430 entries exceeds filelimit, not opening dir]
│   │       ├── timestamps_end.txt
│   │       ├── timestamps_start.txt
│   │       └── timestamps.txt

Then, which folders and/or other files should I move to ./data/KITTI/resources/? Thank you in advance.

PS. I would also like to visualize the results with $ python visualization.py

Code clarification

Hi @xinshuoweng , thanks for open-sourcing your 3d tracking code. Very nice to find this. I have a few questions for you about the coordinate system and dimension ordering:

  1. The docstring here doesn't look correct -- mentions image projection.
    https://github.com/xinshuoweng/AB3DMOT/blob/master/main.py#L115

  2. It looks like the tracking reference frame is the camera coordinate frame, rather than an egovehicle, LiDAR, or world frame. Might be helpful to clarify this in the docstrings. Unlike KITTI, most new datasets now annotate in an egovehicle, LiDAR, or world frame, instead of the camera frame, meaning x is now associated with length, y with width, and z with height, e.g. Argoverse or Nuscenes
    https://github.com/xinshuoweng/AB3DMOT/blob/master/main.py#L134

  3. What is the reason for the reordering of the bounding box information back and forth? It might be helpful to indicate the expected order (since this threw me off initially).
    https://github.com/xinshuoweng/AB3DMOT/blob/master/main.py#L348

  4. Any particular reason not to use shapely.geometry for polygon intersection? I see that you've provided your own function here:
    https://github.com/xinshuoweng/AB3DMOT/blob/master/main.py#L48

Not all detections are tracked

I have pipelined the object location data from the Carla simulator to AB3DMOT. I have noticed that not all detections are tracked. Initially I had the same error with Qhull as described by @johnwlambert in #32, but adding the code he provided solved the issue. Still, only a small proportion of all detected cars is tracked.

Here are the detection and tracking vectors as specified in the source files ([x,y,z,theta,l,w,h] and [x,y,z,theta,l,w,h,id,confidence], respectively) for several successive frames:
link to pastebin
In the first frame the tracker identifies all of the detections, but then for some reason it tracks only 3 of 10.

PointCloud datasets for visualization

Hi,

Thanks for sharing the code, as mentioned earlier. Now I want to visualize the results on point clouds using Mayavi.
In data/KITTI only the annotations are given, but I would like to get the Velodyne point clouds for the same annotation data. I am seeking your help on this.

evaluate on KITTI

Sorry to bother you again. I have looked at the website you sent. One question: do I need to email Andreas Geiger to confirm my information before I can get the test data? The explanation on the official website is a bit confusing to me.
Thanks for your reply!

xinshuo_io

Thank you for the excellent work! I wonder how I can import xinshuo_io in a conda environment using PyCharm. I can only import it in the terminal.

where did you use the velocity state?

Hi, thanks for your project!

I read the code and it is great, but I didn't find where you use the velocity of the object. Could you please point me to the function that uses the vx, vy, and vz of the object?

thanks!
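As a pointer, a hedged sketch of where vx, vy, and vz enter a constant-velocity Kalman filter: they appear only in the state-transition matrix F, which adds them onto the position at every predict step (the state layout here is assumed to be [x, y, z, theta, l, w, h, vx, vy, vz]):

import numpy as np

dim = 10                              # [x, y, z, theta, l, w, h, vx, vy, vz]
F = np.eye(dim)
F[0, 7] = F[1, 8] = F[2, 9] = 1.0     # x += vx, y += vy, z += vz at each predict

# Only the first 7 entries are measured; the Kalman gain corrects the
# velocities indirectly during the update step.
H = np.zeros((7, dim))
H[:, :7] = np.eye(7)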

about source bin/activate

I have done the previous steps, but when I ran "source bin/activate", it showed the error below:
[screenshot of the error]

I then ran pip install source and it worked, but when I ran the command again, the same problem appeared. Has anyone else met this problem?

Different types of KF

@cclauss @xinshuoweng Hello, thanks for the source code. Can we replace the KF for the KITTI dataset with an extended Kalman filter or an unscented Kalman filter in the existing source code? If so, what additional inputs are required for the extended Kalman filter?

Which variables are in "additional_info"?

In main.py the tracker gets updated with dets_all: dets_all = {'dets': dets, 'info': additional_info}.

What is additional_info?
Or, more generally, what is each column in the detection .txt (for example 0000.txt)?

  • column 0 is frame number

  • column 1 is type (2=car)

  • columns 7 to 13 are x, y, z, theta, l, w, h

Is this correct?

What are columns 2 to 6?

Am I right that additional_info is not used by the main algorithm?

C++ implementation

@xinshuoweng Thanks for the wonderful source code. I just wanted to know whether there is any C++ version available, or a reference C++ implementation. Any pointers are welcome.

Wrong Visualization Path

Hi, thanks for your project!

Today, when I wanted to visualize the qualitative results of your 3D MOT system on the images shown in the paper, it couldn't find the images. I found that the path is wrong.

The right command is:
python visualization.py pointrcnn_Car_test

Maybe this will help.

Wrong detection

Hi, thanks for your nice code.
I'm new to your code. When I tried to run it on some images, I got wrong detection positions in the output images. I followed all the instructions in your readme step by step, then put some images in data\KITTI\resources\testing\image_02\0003 and got wrong results. I figured out that your calibration data is specific to the KITTI dataset; am I right?
So, I have some questions now:

0- I want to process video from a camera at a different angle, not at the front of a car but a little higher and angled at 30 degrees. How do I create calibration data for my camera?
1- How do I process video using your Xinshuo_PyToolbox to find vehicles? Do I need to repeat some instructions to process my own video? How do I create calibration data for my videos?
2- Do you have any idea how to measure vehicle speed using your car coordinates?
3- Does your code need any laser sensor data to detect and track vehicles?

What happens if there are no detections in a frame

I am wondering what happens if there are no detections in a frame. In that case I will be calling the update function with the following:

dets_all = {'dets': [], 'info': []}

It throws an error because it is trying to extract info from an empty list: dets = dets[:, self.reorder]

How should empty detections be handled?
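One hedged workaround (not necessarily how the repo intends empty frames to be handled) is to pass a correctly shaped empty array instead of a plain list, so the fancy-indexing line still works:

import numpy as np

dets = np.empty((0, 7))                 # zero detections, still 7 columns
info = np.empty((0, 1))                 # shape of info is up to your pipeline
dets_all = {'dets': dets, 'info': info}

# dets[:, reorder] on a (0, 7) array is valid and simply returns another
# (0, 7) array, so the update call no longer crashes on an empty frame.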

Evaluation Error

I ran main.py to generate the result data files. When I run evaluate_kitti3dmot.py with these result files, it always fails in loadTracker(), which prints:
Feel free to contact us, if you receive this error message:
Caught exception while loading result data.

What are the possible reasons? Thanks for your attention!

Custom data and architecture

@xinshuoweng Hi, thanks for the wonderful code base. I have a few queries:

  1. As mentioned in the paper, the object detectors used are PointRCNN and a monocular 3D method, which produce the output given in the text files. I am using an architecture that detects from a BEV image of the point cloud. Can I use your method for tracking?
  2. On CPU, as mentioned in the paper, I am getting 214.9 FPS, but my object detection runs at 30 FPS, so will the tracking still be real time?
  3. When I read your paper, you back-project the 3D data into the 2D image; can this be applied to the person and other classes too?

Inconsistency in sMOTA definition

Thanks for putting up this code, this is really great.

I have a question about the definition of the scaled-MOTA metric.

In your original paper, you define it as:
[image of the sMOTA formula from the paper]

But in this repo in the evaluate_kitti3dmot.py file, it is defined as
self.sMOTA = min(1, max(0, 1 - (self.fn + self.fp + self.id_switches - (1 - recall_thres) * self.n_gt) / float(recall_thres * self.n_gt)))
Your NuScenes tracking challenge also defines it that way:
[image of the sMOTA formula from the nuScenes tracking challenge]

In your later work and in other authors' works, it has been noted that your implementation differs from other sMOTA implementations. So how should sMOTA be defined?
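For easier comparison, the expression implemented by the line of code quoted above can be written as:

\mathrm{sMOTA}_r = \min\!\left(1,\ \max\!\left(0,\ 1 - \frac{\mathrm{FN}_r + \mathrm{FP}_r + \mathrm{IDS}_r - (1 - r)\,\mathrm{GT}}{r \cdot \mathrm{GT}}\right)\right)

where r is the recall threshold (recall_thres in the code) and GT is the number of ground-truth objects at that threshold (n_gt).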
