Can sam-pt automatically track? (sam-pt issue, closed)

syscv commented on September 16, 2024
Can sam-pt automatically track?

from sam-pt.

Comments (4)

m43 commented on September 16, 2024

Hi, thank you for your question. Yes, SAM-PT can track an object that is defined in the initial frame. It outputs segmentation masks for all subsequent frames in the video. For the first frame, the target object is defined using "query" points. Have I understood your question correctly? I'm not sure what you meant by "automatically".

rruiz-s commented on September 16, 2024

Hi, thank you so much for sharing SAM-PT and your explanation.

I've experimented with SAM-PT and, as @m43 kindly explained, SAM-PT uses the points from the first frame to track the object automatically in all subsequent frames of the video.

Possibly related to @jimmylihui's initial question: I was wondering if there is a way to add new query points for subsequent frames in the video while keeping the initial query points. In my case, I had some problems because new objects that were not in the first frame appeared later in the video. So while the objects from the first frame were tracked automatically, the new objects from subsequent frames were not.

Again, congratulations on the work and thank you for sharing.

Edited: I shortened the comment to remain within the limits of the initial question

m43 commented on September 16, 2024

Thanks for the question!

Yes, the model supports query points with arbitrary and varying timesteps for the same mask (see here). The input query points are defined as a tensor of shape (num_masks, n_points_per_mask, 3), where each entry holds the (t, x, y) (timestep, x location, y location) of a query point. For example, if you track one mask with (1920, 360) at timestep 0 and (960, 720) at timestep 2, then you would have a query points tensor like torch.tensor([[[0, 1920, 360], [2, 960, 720]]]).
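
To make the layout concrete, here is a minimal sketch that writes the same query points as a plain nested list and checks the resulting shape. The helper name `nested_shape` is hypothetical and not part of SAM-PT; it only illustrates the (num_masks, n_points_per_mask, 3) convention described above.

```python
def nested_shape(points):
    """Return (num_masks, n_points_per_mask, 3) for a rectangular nested list.

    Hypothetical helper for illustration only; not part of SAM-PT.
    """
    num_masks = len(points)
    n_points = len(points[0])
    for mask_points in points:
        if len(mask_points) != n_points:
            raise ValueError("every mask needs the same number of query points")
        for point in mask_points:
            if len(point) != 3:
                raise ValueError("each query point must be (t, x, y)")
    return (num_masks, n_points, 3)


# One mask with two query points, each stored as (t, x, y).
query_points = [
    [[0, 1920, 360], [2, 960, 720]],
]

shape = nested_shape(query_points)

# Unpack the second query point of the first mask.
t, x, y = query_points[0][1]

print(shape)
print(t, x, y)
```

Wrapping the list in `torch.tensor(query_points)` should give the tensor from the example. Whether the model expects integer or floating-point coordinates is not stated in this thread, so the dtype may need adjusting.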

However, the simple demo doesn't use this functionality: I fixed the query timestep of all points to 0 here for simplicity. You could adapt the code for your use case (and contribute the change), or perhaps I could update the demo sometime.
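
One way such an adaptation might look, as a rough sketch only: instead of fixing every timestep to 0, each object carries the frame index at which its points were clicked. The function `build_query_points`, the `objects` structure, and all coordinates below are assumptions made for illustration and do not come from the SAM-PT demo code. The sketch also assumes that a separate mask may start at a later timestep, which matches the answer above but is not spelled out in detail in this thread.

```python
def build_query_points(objects):
    """Turn per-object clicks into a (num_masks, n_points_per_mask, 3) nested list.

    `objects` is a list of (timestep, [(x, y), ...]) pairs, one per mask.
    Hypothetical helper for illustration; not part of the SAM-PT demo.
    """
    counts = {len(clicks) for _, clicks in objects}
    if len(counts) != 1:
        # A rectangular tensor needs the same number of points for every mask.
        raise ValueError("all objects must have the same number of query points")
    return [[[t, x, y] for (x, y) in clicks] for t, clicks in objects]


objects = [
    (0, [(100, 200), (120, 210)]),   # object visible in the first frame
    (15, [(400, 300), (410, 320)]),  # object that first appears at frame 15
]

query_points = build_query_points(objects)
print(query_points)
```

The resulting nested list can be passed to `torch.tensor(...)` in the same way as the single-mask example. Because the tensor is rectangular, every mask needs the same number of query points.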

rruiz-s commented on September 16, 2024

Thank you very much @m43 for your clear and detailed answer!

I feel it gives relevant information for this thread, particularly regarding the timestep element of the query points and its possibilities.
