
Comments (6)

Po-Jen avatar Po-Jen commented on June 24, 2024 1

@erksch OK, sorry, I think I misunderstood your question. I believe it is still an open research problem. Here's a paragraph from the literature review of FANTrack:

Tracking algorithms have used distance functions such as Euclidean [21] and Mahalanobis distance [22] as matching costs for data association. Other similarity measures include color-based appearance features [23], SIFT-like features [24], and linear and non-linear motion models and their various weighted combinations [25]. These tediously hand-crafted features fail to generalize across complex scenarios and backgrounds, however.
Recent works explore learning pairwise costs using deep structured SVM [3], CNNs [26], and RNNs [27]. For CNNs, similarity learning often exploits Siamese networks. Leal-Taixé et al. [28] and Frossard et al. [29] use them to learn descriptors for matching with multi-modal inputs. While we also use Siamese networks to learn generalized and discriminative features from 3D object configurations and visual information conditioned on similarity, we adapt our objective function to use the cosine-similarity metric with hard-mining which has a positive impact on convergence.

Still, it's interesting that AB3DMOT uses only IoU yet achieves SOTA performance. Maybe the case you mentioned does not happen often (I haven't played with the KITTI or Waymo datasets yet)?

from ab3dmot.

Po-Jen avatar Po-Jen commented on June 24, 2024

Interesting question, let me know whether my understanding makes sense to you.

Basically, AB3DMOT doesn't add a new detection to the trackers immediately. To prevent the case you mentioned, a new detection needs to be detected 3 times before it can be added. You can see this by looking at:

if ((trk.time_since_update < self.max_age) and (trk.hits >= self.min_hits or self.frame_count <= self.min_hits)):

It says trk.hits >= self.min_hits is one of the criteria for adding a tracker to ret (although your concern does apply in the first 3 frames because of self.frame_count <= self.min_hits).

From this line, you can see trk.hits is set to 1 when it's first detected:

self.hits = 1 # number of total hits including the first detection

And min_hits defaults to 3:

def __init__(self, max_age=2, min_hits=3): # max age will preserve the bbox does not appear no more than 2 frames, interpolate the detection
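To make the condition above easier to follow, here is a minimal sketch of the gating logic on its own. The `Track` class and the `is_reported` helper are hypothetical names made up for illustration, not part of AB3DMOT; only the boolean expression and the default values `max_age=2` and `min_hits=3` are taken from the lines quoted above.

```python
class Track:
    """Hypothetical stand-in for a tracker object; only the counters matter here."""

    def __init__(self):
        self.hits = 1               # the first detection counts as a hit
        self.time_since_update = 0  # frames since the last matched detection


def is_reported(trk, frame_count, max_age=2, min_hits=3):
    """Mirror the quoted condition: recently updated AND (enough hits OR warm-up)."""
    recently_updated = trk.time_since_update < max_age
    confirmed = trk.hits >= min_hits
    warming_up = frame_count <= min_hits  # the first few frames bypass the hit check
    return recently_updated and (confirmed or warming_up)


new_track = Track()
first_seen_late = is_reported(new_track, frame_count=10)   # one hit only
new_track.hits = 3
after_three_hits = is_reported(new_track, frame_count=10)  # now confirmed
during_warm_up = is_reported(Track(), frame_count=2)       # warm-up exception

print(first_seen_late, after_three_hits, during_warm_up)  # False True True
```

So a track with a single hit is held back unless the sequence is still within its first `min_hits` frames.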


erksch avatar erksch commented on June 24, 2024

Yes, I know about the hits. But as far as I can tell, they are only an additional condition for whether a tracker should be returned as a valid object. The paper states this too: "However, to avoid creating false positive trajectories, a new trajectory Tp_new will not be created for the unmatched detection Dp_unmatch until Dp_unmatch has been continually matched in the next Bir_min frames".

The hits do not affect the problem I mentioned: if an object's next position does not overlap with the tracker's first prediction, the tracker and the object are not associated, so there is no hit in the first place. The general problem is that a Kalman tracker serves no purpose if it only has one detection and the matching requires the first two detections of an object in a trajectory to overlap.
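A rough sketch of the situation, with made-up numbers: it assumes axis-aligned bird's-eye-view boxes (AB3DMOT itself works with oriented 3D boxes) and that the tracker's first prediction simply stays at the first detection, since no velocity has been estimated yet. The helper names `iou_axis_aligned` and `shifted` are hypothetical.

```python
def iou_axis_aligned(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    overlap_x = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_y = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = overlap_x * overlap_y
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def shifted(box, dx):
    """Return the same box moved by dx along x."""
    return (box[0] + dx, box[1], box[2] + dx, box[3])


car = (0.0, 0.0, 4.0, 2.0)   # 4 m x 2 m footprint (illustrative)
sign = (0.0, 0.0, 0.5, 0.5)  # 0.5 m x 0.5 m footprint (illustrative)
dx = 1.0                     # apparent displacement between two frames

car_iou = iou_axis_aligned(car, shifted(car, dx))
sign_iou = iou_axis_aligned(sign, shifted(sign, dx))
print(car_iou, sign_iou)  # 0.6 0.0
```

With the same displacement the large box still overlaps its previous position, while the small box has IoU 0, so an IoU-only matcher never gives it a second hit.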


erksch avatar erksch commented on June 24, 2024

Yeah, it's astounding that such a simple approach works so well. But if you think about it, while 2D tracking seems extremely hard to me because of constantly overlapping objects and occlusion, 3D tracking is actually pretty easy: objects never overlap in 3D space (at least big ones like cars), and if your detections are fine you have no problem with occlusion either. Just checking the IoU of all boxes between the previous frame and the current frame obviously already gives good results, as long as objects do not move too much within one frame.


Po-Jen avatar Po-Jen commented on June 24, 2024

That makes sense 😃

Hopefully the author will have some time to share her thoughts on this.


xinshuoweng avatar xinshuoweng commented on June 24, 2024

Hi, great to see such a good discussion here. What both of you are saying makes sense. For your case, where the sign is small, we would typically not use an IoU-based criterion; a distance-based criterion (plus size or anything else useful) would make more sense. The reason we can get good performance with IoU on KITTI is that car objects are big in KITTI and the data is collected at a high frame rate. For nuScenes, for example, which has a lower frame rate, we would prefer a distance-based criterion.
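As a rough illustration of what a distance-based criterion could look like, here is a small sketch that matches tracks to detections by the Euclidean distance between box centres. It is not the AB3DMOT implementation: the function name `associate_by_distance`, the greedy matching strategy, and the `max_dist` gate of 2.0 m are all assumptions chosen for the example.

```python
import math


def associate_by_distance(track_centers, det_centers, max_dist=2.0):
    """Greedily match tracks to detections by centre distance.

    Returns a sorted list of (track_index, detection_index) pairs.
    Pairs farther apart than max_dist (an assumed gate) are never matched.
    """
    candidates = []
    for t, tc in enumerate(track_centers):
        for d, dc in enumerate(det_centers):
            dist = math.dist(tc, dc)
            if dist <= max_dist:
                candidates.append((dist, t, d))

    candidates.sort()  # closest pairs are considered first
    matches, used_tracks, used_dets = [], set(), set()
    for dist, t, d in candidates:
        if t in used_tracks or d in used_dets:
            continue
        matches.append((t, d))
        used_tracks.add(t)
        used_dets.add(d)
    return sorted(matches)


tracks = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
dets = [(10.5, 0.2, 0.0), (1.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
matches = associate_by_distance(tracks, dets)
print(matches)  # [(0, 1), (1, 0)]
```

Unlike IoU, the centre distance still yields a usable cost when the boxes do not overlap at all, which is the small-object, low-frame-rate case described above. A globally optimal assignment (for example the Hungarian algorithm) could replace the greedy loop.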

