GithubHelp home page GithubHelp logo

jazzlee008 / sam-pt Goto Github PK

View Code? Open in Web Editor NEW

This project forked from syscv/sam-pt

0.0 0.0 0.0 53.08 MB

SAM-PT: Extending SAM to zero-shot video segmentation with point-based tracking.

Home Page: https://arxiv.org/abs/2307.01197

sam-pt's Introduction

Segment Anything Meets Point Tracking

Segment Anything Meets Point Tracking
Frano Rajič, Lei Ke, Yu-Wing Tai, Chi-Keung Tang, Martin Danelljan, Fisher Yu
ETH Zürich, HKUST, EPFL

SAM-PT design

We propose SAM-PT, an extension of the Segment Anything Model (SAM) for zero-shot video segmentation. Our work offers a simple yet effective point-based perspective in video object segmentation research. For more details, refer to our paper. Our code, models, and evaluation tools will be released soon. Stay tuned!

Interactive Video Segmentation Demo

Annotators only provide a few points to denote the target object at the first video frame to get video segmentation results. Please visit our project page for more visualizations, including qualitative results on DAVIS 2017 videos and more Avatar clips.

street bees avatar horsejump-high

Semi-supervised Video Object Segmentation (VOS)

SAM-PT is the first method to utilize sparse point tracking combined with SAM for video segmentation. With such compact mask representation, we achieve the highest $\mathcal{J\&F}$ scores on the DAVIS 2016 and 2017 validation subsets among methods that do not utilize any video segmentation data during training. table-1

Quantitative results in semi-supervised VOS on the validation subsets of DAVIS 2017, YouTube-VOS 2018, and MOSE 2023: table-3 table-4 table-5

Video Instance Segmentation (VIS)

On the validation split of UVO VideoDenseSet v1.0, SAM-PT outperforms TAM even though the former was not trained on any video segmentation data. TAM is a concurrent approach combining SAM and XMem, where XMem was pre-trained on BL30K and trained on DAVIS and YouTube-VOS, but not on UVO. On the other hand, SAM-PT combines SAM with the PIPS point tracking method, both of which have not been trained on any video segmentation tasks. table-6

Acknowledgments

We want to thank SAM, PIPS, HQ-SAM, and XMem for publicly releasing their code and pretrained models.

Citation

If you find SAM-PT useful in your research or if you refer to the results mentioned in this repository, please star ⭐ this repository and consider citing 📝:

@article{sam-pt,
  title   = {Segment Anything Meets Point Tracking},
  author  = {Rajič, Frano and Ke, Lei and Tai, Yu-Wing and Tang, Chi-Keung and Danelljan, Martin and Yu, Fisher},
  journal = {arXiv:2307.01197},
  year    = {2023}
}

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.