
Comments (6)

serkansulun commented on August 23, 2024

Looking at the code, I may have worked out a few things; please tell me if I'm wrong anywhere.

1- We only use annotated frames, e.g., only frames 6 and 11. Frame 6 becomes the previous frame and frame 11 becomes the current frame.

2- In augmentation, we shift and scale the ground-truth box in the current frame and crop twice the area around the shifted box, solely to define a new synthetic search region. When computing the labels (coordinates relative to the new search region), we still use the unshifted ground-truth box coordinates.

Are these correct?

I'm still not clear about question 3.

Thanks

from goturn.

davheld commented on August 23, 2024

1-2) That is correct.

  3. After figuring out which part of the image to crop, we crop that region and then resize the cropped region to the network's fixed input size (227 x 227).
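The augmentation confirmed in points 1-2 can be sketched as below. This is a minimal illustration, not the goturn code: the `Box` class, `search_region`, `label_in_region`, and the example coordinates are all hypothetical. The context factor of 2 is assumed here to apply to the width and the height separately (the thread only says "twice the area"), and handling of regions that extend past the image border is left out.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Box:
    """Axis-aligned box in pixel coordinates, (x1, y1) to (x2, y2)."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def width(self) -> float:
        return self.x2 - self.x1

    @property
    def height(self) -> float:
        return self.y2 - self.y1

    @property
    def center(self) -> Tuple[float, float]:
        return ((self.x1 + self.x2) / 2.0, (self.y1 + self.y2) / 2.0)


def search_region(shifted: Box, context: float = 2.0) -> Box:
    """Region centered on the shifted box, `context` times its width and height.

    Assumption: the factor applies per side, not to the area.
    """
    cx, cy = shifted.center
    half_w = context * shifted.width / 2.0
    half_h = context * shifted.height / 2.0
    return Box(cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def label_in_region(gt: Box, region: Box) -> Box:
    """Express the unshifted ground-truth box relative to the region's top-left corner."""
    return Box(gt.x1 - region.x1, gt.y1 - region.y1,
               gt.x2 - region.x1, gt.y2 - region.y1)


gt = Box(100, 100, 140, 160)      # annotated box in the current frame
shifted = Box(110, 95, 150, 155)  # hypothetical augmented (shifted) copy of gt
region = search_region(shifted)   # defines where to crop
label = label_in_region(gt, region)  # the label still comes from the unshifted box
```

The point of the sketch is that the shifted box is used only to place the crop; the regression target is always derived from the original annotation.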


serkansulun commented on August 23, 2024

Thanks for the quick answer.

  3. Do we do this by matching the shorter side to 227 and zero-padding the remaining area? Do the ground-truth bounding box coordinates stay the same?

Thanks


davheld commented on August 23, 2024

We do it by resizing both sides to 227x227. You can try the padding approach if you wish. The ground-truth bounding box coordinates are resized accordingly.
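As a rough sketch of what "resized accordingly" implies for the coordinates: when both sides of the crop are stretched to 227 independently, the x and y coordinates get different scale factors. The function name `resize_box` and the example numbers are hypothetical, and the image interpolation itself is not shown.

```python
def resize_box(box, crop_w, crop_h, out_size=227):
    """Scale a box given relative to a crop into an out_size x out_size image.

    `box` is (x1, y1, x2, y2) in crop pixels. Width and height are scaled
    independently, so the aspect ratio of the crop is not preserved.
    """
    sx = out_size / crop_w  # horizontal scale factor
    sy = out_size / crop_h  # vertical scale factor
    x1, y1, x2, y2 = box
    return (x1 * sx, y1 * sy, x2 * sx, y2 * sy)


crop_w, crop_h = 80.0, 120.0            # hypothetical crop size in pixels
box_in_crop = (10.0, 35.0, 50.0, 95.0)  # hypothetical box relative to the crop
box_resized = resize_box(box_in_crop, crop_w, crop_h)
```

With zero padding, by contrast, a single scale factor plus an offset would be needed, which is why the two approaches produce different labels for the same box.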


serkansulun commented on August 23, 2024

Hi again David,
To test my implementation, I want to use accuracy and robustness errors instead of ranking. I couldn't find a direct definition of these metrics in your paper, but I have a rough idea, so please let me know if I'm wrong anywhere.
1- Accuracy is simply the intersection-over-union (IoU) value between the ground-truth and predicted boxes.
2- When the accuracy becomes zero (no intersection), the tracker is reinitialized, meaning the ground-truth box is provided for the next frame. Or is it reinitialized n frames later?
3- Robustness error is the number of reinitializations divided by the number of frames.
4- Are these two measures averaged first over the frames of a single sequence and then over all sequences, or over all frames in all sequences at once?
Thanks in advance
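To make the proposed definitions concrete, here is a minimal sketch. `iou` is the standard intersection-over-union; `failure_rate` encodes only my guess from point 3 (zero-overlap frames divided by total frames), which has not been confirmed in this thread. Both function names and the sample values are hypothetical, and the averaging order from point 4 is deliberately left open.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def failure_rate(overlaps):
    """Fraction of frames with zero overlap (unconfirmed guess at 'robustness error')."""
    failures = sum(1 for o in overlaps if o == 0.0)
    return failures / len(overlaps)


overlap = iou((0, 0, 10, 10), (5, 5, 15, 15))  # partially overlapping boxes
rate = failure_rate([0.5, 0.0, 0.7, 0.0])      # hypothetical per-frame overlaps
```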


davheld commented on August 23, 2024

