
Comments (18)

LinasKo commented on September 27, 2024

Hi @CodingMechineer 👋

Let's do one quick test - does installing supervision==0.21.0.rc5 change anything?

from supervision.

CodingMechineer commented on September 27, 2024

That's unfortunate. Here are the next steps to try:

  1. I assume you're already using supervision==0.21.0.rc5 - only later versions have pad_boxes. If not, you should switch to supervision==0.21.0.rc5.

  2. Next, trying out a few values of parameters might help, especially since I think padding already captures 99% of the cases (on my machine the padding worked really well, applied the same way you did)

    1. Try changing px and py in pad_boxes
    2. Try setting a different track_activation_threshold and minimum_matching_threshold in tracker. If the expected FPS is different than 30, you should set it too, as a tracker argument.

Are you always running this on videos, or on a stream too? If both, I wonder if it performs similarly on the live stream and the video of the same stream.

I may run this on a stream in the future. Currently, I only run it on video files. I also tried the Ultralytics library for the same task. That works completely fine, so I will continue with it.

The code looks something like this:

import cv2
from ultralytics import YOLO

model_path = 'best.pt'
video_path = '../001 - DATA/099 - Test_videos/Test_video_0.avi'

cap = cv2.VideoCapture(video_path)
model = YOLO(model_path)

while cap.isOpened():
  success, frame = cap.read()
  if not success:
    break  # end of the video or a read error

  # persist=True keeps the track IDs across successive calls
  results = model.track(frame, persist=True)

  if results[0].boxes.id is not None:
    boxes = results[0].boxes.xyxy.cpu()
    track_ids = results[0].boxes.id.int().cpu().tolist()
    clss = results[0].boxes.cls.cpu().tolist()
    confs = results[0].boxes.conf.cpu().tolist()

  # Do all the plotting and processing

cap.release()

Maybe this can help you. Please let me know if I can do anything else for you.

LinasKo commented on September 27, 2024

@SkalskiP, no. I dug for an hour or so, plotted sequential detections (there's typically >50% overlap, yet they disappear). I played with some values, but I'd need to plot/print out steps of the algorithm to learn how it sees the world.

rolson24 commented on September 27, 2024

Hi @CodingMechineer, @SkalskiP, and @LinasKo

I took a look at it with @CodingMechineer's code, and it seems the tracker is working as designed. Unfortunately, it fails for @CodingMechineer because the motion predictor (Kalman filter) in the tracker needs the first 2 frames of a track to determine the speed and direction of an object. For the first association, between the initial detection and the detection in the second frame, the tracker therefore uses the overlap between the two detections, NOT between a prediction and the second detection. In this specific video the objects move very quickly, so for the first 2 frames of some tracks there is almost no overlap (for the objects that are not tracked it is <30%, and it needs to be >30%), meaning no track gets established. The tracker must then start over with an entirely new track for that object because it could not establish a motion model, and on the next frame it has no hope of the overlap between frame 1 and frame 3 being more than 30% in this example.
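
To make the overlap argument concrete, here is a minimal sketch in plain Python (it is not supervision's code). It computes the IoU between a box and a copy of itself shifted horizontally, which is roughly what a first association has to work with before any motion estimate exists. The 40-pixel box and the per-frame displacements are made-up numbers for illustration; only the 30% figure comes from the comment above.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def shifted(box, dx):
    """Return the same box moved dx pixels to the right."""
    x1, y1, x2, y2 = box
    return (x1 + dx, y1, x2 + dx, y2)


# Hypothetical 40x40 detection in the first frame of a track.
first_frame = (100.0, 100.0, 140.0, 140.0)

# Hypothetical displacements between two consecutive frames.
for dx in (5, 15, 25, 35):
    overlap = iou(first_frame, shifted(first_frame, dx))
    status = "above" if overlap > 0.30 else "below"
    print(f"moved {dx:2d} px -> IoU {overlap:.2f} ({status} the 30% figure)")
```

The faster the object moves relative to its own size, the quicker the overlap between two raw detections drops below the level needed to start a track.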

I improved the performance for this example by setting minimum_matching_threshold to 0.9 when initializing ByteTrack(). I improved it further by changing the parameter for activating a new track to 0.9, so that the initial 2-frame overlap only needs to be greater than 10% rather than 30%.
This second change is in the source code and was mainly a test of my theory about what is happening; I would not recommend changing it.
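
The numbers in this comment suggest that the threshold acts as an upper bound on 1 - IoU, so 0.7 corresponds to "more than 30% overlap" and 0.9 to "more than 10% overlap". The sketch below only encodes that reading of the comment. `required_iou` and `would_match` are hypothetical helper names, not supervision functions, and the real tracker may combine the overlap with other scores before matching.

```python
def required_iou(matching_threshold):
    """Minimum overlap implied by a threshold on the distance 1 - IoU (assumed reading)."""
    return 1.0 - matching_threshold


def would_match(iou_value, matching_threshold):
    """True if the IoU distance stays under the threshold (assumed rule)."""
    return (1.0 - iou_value) < matching_threshold


for threshold in (0.7, 0.8, 0.9):
    print(f"threshold {threshold:.1f} -> needs IoU above {required_iou(threshold):.0%}")

# A hypothetical pair of detections that overlap by 20%.
print(would_match(0.20, 0.7))
print(would_match(0.20, 0.9))
```

Under this reading, raising the threshold loosens the match: a pair with 20% overlap is rejected at 0.7 but accepted at 0.9.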

For your specific example @CodingMechineer, if the tracking performance is essential to your project, I believe you would need to either record at a higher framerate or slow down the device used to sort the peanuts. Both options would reduce how far an object moves between frames. The last thing we could try, if this specific location is the only place you need to run this code, is to initialize the motion model to assume motion generally from left to right. This would allow the tracker to better pick up on objects in the first two frames. Unfortunately, it would require changing the source code a bit and messing around with the Kalman filter. That is something I am willing to help with, but we would probably not be able to put it into the supervision API.
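
As a rough way to estimate how much framerate is "enough", here is a small sketch under simplifying assumptions: the box keeps its size, moves purely horizontally at constant speed, and only the raw overlap between two consecutive detections matters. For a box of width w shifted by d pixels (d < w), the IoU is (w - d) / (w + d), so requiring an IoU of at least t gives d <= w * (1 - t) / (1 + t). The box width and speed below are hypothetical; the 0.30 figure is the one mentioned in this thread.

```python
def max_shift_for_iou(box_width, min_iou):
    """Largest horizontal shift (px) that keeps IoU >= min_iou for a same-size box."""
    return box_width * (1.0 - min_iou) / (1.0 + min_iou)


def min_frame_rate(speed_px_per_s, box_width, min_iou):
    """Frame rate at which the per-frame shift equals the largest allowed shift."""
    return speed_px_per_s / max_shift_for_iou(box_width, min_iou)


box_width = 40.0   # hypothetical object width in pixels
speed = 600.0      # hypothetical object speed in pixels per second
min_iou = 0.30     # overlap figure quoted in the discussion

shift = max_shift_for_iou(box_width, min_iou)
fps = min_frame_rate(speed, box_width, min_iou)
print(f"largest shift per frame: {shift:.1f} px")
print(f"frame rate needed: about {fps:.1f} fps")
```

The same formula shows the trade-off described above: halving the speed of the device halves the required frame rate.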

@LinasKo @SkalskiP This issue seems to come from how ByteTrack is designed to be flexible for varied and unexpected tracking scenarios. If we wanted to fix the tracking for these types of repetitive, predictable computer vision tasks, it would be better to design a second tracker that can better handle high-speed and predictable types of motion for tasks like this.

SkalskiP commented on September 27, 2024

@CodingMechineer, we accidentally shipped a tracking bug in supervision==0.20.0. Try using supervision==0.19.0 or supervision==0.21.0.rc5 pre-release.

CodingMechineer commented on September 27, 2024

@LinasKo @SkalskiP I installed both supervision==0.21.0.rc5 and supervision==0.19.0, but I have the same problem with both versions.

Top: YOLO predictions
Bottom: Supervision tracker
Screenshot 2024-05-21 153420

SkalskiP commented on September 27, 2024

@CodingMechineer Could you share with us the exact version of the model and the video file you are using?

CodingMechineer commented on September 27, 2024

@SkalskiP Sure! Please let me know if there is an issue on my end.
I zipped the video, the model, the code and a requirements.txt file. Unfortunately, the video and model files are too large, so GitHub doesn't let me upload everything.

https://1drv.ms/u/s!AjTS76M8DCeYm8djbuYtiGXfXFNvsQ?e=GleN38

LinasKo commented on September 27, 2024

Hi @CodingMechineer 👋

The tracker uses detection overlap and the motion model's prediction to estimate which detections represent the same object in sequential frames. It then filters out what it can't match. While the details are a bit complicated, a quick way to influence the result is to increase the object area shown to the tracker.

So, my quick suggestion: check if padding the boxes solves your problem.

That means, insert detections.xyxy = sv.pad_boxes(detections.xyxy, px=10, py=10) between the calls to from_ultralytics and update_with_detections.
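
To illustrate why padding can help, here is a minimal sketch in plain Python. `pad_box` is a stand-in written for this example that mimics what sv.pad_boxes is assumed to do (grow each box by px horizontally and py vertically on every side); it is not the library function. The box coordinates and the 15-pixel movement are made-up numbers.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def pad_box(box, px, py):
    """Hypothetical stand-in for padding: grow the box by px and py on each side."""
    x1, y1, x2, y2 = box
    return (x1 - px, y1 - py, x2 + px, y2 + py)


# A small, fast object: 20x20 px, moving 15 px between two frames (made-up values).
frame_1 = (100.0, 100.0, 120.0, 120.0)
frame_2 = (115.0, 100.0, 135.0, 120.0)

raw_overlap = iou(frame_1, frame_2)
padded_overlap = iou(pad_box(frame_1, 10, 10), pad_box(frame_2, 10, 10))

print(f"IoU without padding: {raw_overlap:.2f}")
print(f"IoU with 10 px padding: {padded_overlap:.2f}")
```

The movement is the same in both cases, but the padded boxes are larger relative to it, so the overlap the tracker sees is higher.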

Here's how it looks on my end. This way, all holes are detected, even after tracking.
image

Does that solve your problem? 😉

CodingMechineer commented on September 27, 2024

Unfortunately not, some spots and peanuts are still not tracked.

image

LinasKo commented on September 27, 2024

That's unfortunate. Here are the next steps to try:

  1. I assume you're already using supervision==0.21.0.rc5 - only later versions have pad_boxes. If not, you should switch to supervision==0.21.0.rc5.
  2. Next, trying out a few values of parameters might help, especially since I think padding already captures 99% of the cases (on my machine the padding worked really well, applied the same way you did)
    1. Try changing px and py in pad_boxes
    2. Try setting a different track_activation_threshold and minimum_matching_threshold in tracker. If the expected FPS is different than 30, you should set it too, as a tracker argument.
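
One way to organize these experiments is a small parameter grid. The sketch below only builds the combinations in plain Python; the candidate values are arbitrary starting points, not recommendations, and `frame_rate` is assumed to be the keyword that carries the expected FPS. The commented lines show where the supervision calls named in this thread would go.

```python
from itertools import product

# Arbitrary candidate values to try, not tuned recommendations.
padding_values = [5, 10, 20]
activation_thresholds = [0.1, 0.25]
matching_thresholds = [0.8, 0.9]
video_fps = 30  # replace with the real frame rate of the video

configs = [
    {
        "px": pad,
        "py": pad,
        "tracker_kwargs": {
            "track_activation_threshold": activation,
            "minimum_matching_threshold": matching,
            "frame_rate": video_fps,  # assumed keyword name for the expected FPS
        },
    }
    for pad, activation, matching in product(
        padding_values, activation_thresholds, matching_thresholds
    )
]

for config in configs:
    # With supervision installed, one experiment could look roughly like:
    #   tracker = sv.ByteTrack(**config["tracker_kwargs"])
    #   detections.xyxy = sv.pad_boxes(detections.xyxy, px=config["px"], py=config["py"])
    #   detections = tracker.update_with_detections(detections)
    print(config)
```

Counting how many detections keep a tracker ID in each run gives a simple way to compare the combinations.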

Are you always running this on videos, or on a stream too? If both, I wonder if it performs similarly on the live stream and the video of the same stream.

SkalskiP commented on September 27, 2024

@LinasKo, do you have any idea why this happens?

SkalskiP commented on September 27, 2024

@LinasKo this should not happen. I'm worried because I have no idea why it's happening. @rolson24 would you have time to take a look?

SkalskiP commented on September 27, 2024

@CodingMechineer btw, if you use model.track in ultralytics, you can still use detections = sv.Detections.from_ultralytics(results), and the tracker_id will be extracted from the result object.

SkalskiP commented on September 27, 2024

@LinasKo and @CodingMechineer, is that issue still active?

LinasKo commented on September 27, 2024

Yup, we'll need to look at this in the future.

rolson24 commented on September 27, 2024

@LinasKo this should not happen. I'm worried because I have no idea why it's happening. @rolson24 would you have time to take a look?

Hi @SkalskiP,
Sorry I have been super busy with school. I can take a look at this and try to see what is going on.

CodingMechineer commented on September 27, 2024

Thank you for your detailed investigation @rolson24!

I made the same observation regarding the framerate. With the video from the example, I had the stated problem only with the Roboflow library, not with Ultralytics. However, when the device runs faster at the same framerate, I get the same problems with the Ultralytics library. So I must make sure the objects move little enough between frames for the tracking to work satisfactorily. Probably the object movement between frames in the example video was in a range that was small enough for Ultralytics but too large for Roboflow.

In summary, I need to make sure my framerate is high enough that the object overlap between frames is big enough for the tracking to work. Hence, there is no need to change the source code.

Thanks everybody for your help!
