isaac0424 / co-tracker

This project forked from facebookresearch/co-tracker

CoTracker is a model for tracking any point (pixel) on a video.

License: Other

co-tracker's Introduction

CoTracker: It is Better to Track Together

Meta AI Research, FAIR; University of Oxford, VGG

Nikita Karaev, Ignacio Rocco, Benjamin Graham, Natalia Neverova, Andrea Vedaldi, Christian Rupprecht

[Paper] [Project] [BibTeX]

CoTracker is a fast transformer-based model that can track any point in a video. It brings some of the benefits of optical flow to point tracking.

CoTracker can track:

  • Every pixel within a video
  • Points sampled on a regular grid on any video frame
  • Manually selected points

Try these tracking modes for yourself with our Colab demo.
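
As a minimal sketch of the "manually selected points" mode, the snippet below represents each point as a (frame index, x, y) triple and checks it against the video bounds. The triple layout and the helper name make_manual_queries are assumptions made for illustration; they are not taken from the CoTracker code, so check the Colab demo for the exact input format.

```python
from typing import Iterable, List, Tuple

# Assumed layout for one query point: (frame index, x in pixels, y in pixels).
Query = Tuple[int, float, float]


def make_manual_queries(
    points: Iterable[Tuple[float, float, float]],
    num_frames: int,
    width: int,
    height: int,
) -> List[Query]:
    """Validate hand-picked points and normalise them to (int, float, float)."""
    queries: List[Query] = []
    for frame, x, y in points:
        # The point must start on a frame that exists in the video.
        if not 0 <= frame < num_frames:
            raise ValueError(f"frame {frame} is outside [0, {num_frames})")
        # The point must lie inside the image.
        if not (0 <= x < width and 0 <= y < height):
            raise ValueError(f"point ({x}, {y}) is outside {width}x{height}")
        queries.append((int(frame), float(x), float(y)))
    return queries


if __name__ == "__main__":
    # Two hypothetical points: one on the first frame, one on frame 10.
    print(make_manual_queries([(0, 400, 350), (10, 600, 500)], 24, 1280, 720))
```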

Installation Instructions

Ensure that both PyTorch and TorchVision are installed on your system. Follow the instructions here for the installation. We strongly recommend installing both with CUDA support.

Steps to Install CoTracker and its dependencies:

git clone https://github.com/facebookresearch/co-tracker
cd co-tracker
pip install -e .
pip install opencv-python einops timm matplotlib moviepy flow_vis

Model Weights Download:

mkdir checkpoints
cd checkpoints
wget https://dl.fbaipublicfiles.com/cotracker/cotracker_stride_4_wind_8.pth
wget https://dl.fbaipublicfiles.com/cotracker/cotracker_stride_4_wind_12.pth
wget https://dl.fbaipublicfiles.com/cotracker/cotracker_stride_8_wind_16.pth
cd ..
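
The download steps above can also be sketched in Python, which is convenient on systems without wget. The file names and base URL come from the commands above. The helper names are hypothetical, and reading "stride" and "wind" as the model stride and the sliding window length is an assumption based on the file names.

```python
import re
import urllib.request
from pathlib import Path

BASE_URL = "https://dl.fbaipublicfiles.com/cotracker"
CHECKPOINTS = [
    "cotracker_stride_4_wind_8.pth",
    "cotracker_stride_4_wind_12.pth",
    "cotracker_stride_8_wind_16.pth",
]
_NAME_PATTERN = re.compile(r"cotracker_stride_(\d+)_wind_(\d+)\.pth$")


def parse_checkpoint_name(name):
    """Return (stride, window) parsed from a checkpoint file name."""
    match = _NAME_PATTERN.search(Path(name).name)
    if match is None:
        raise ValueError(f"unrecognised checkpoint name: {name}")
    return int(match.group(1)), int(match.group(2))


def missing_checkpoints(directory="checkpoints"):
    """List the checkpoint files that are not yet present in `directory`."""
    root = Path(directory)
    return [name for name in CHECKPOINTS if not (root / name).is_file()]


def download_missing(directory="checkpoints"):
    """Download every missing checkpoint into `directory` (needs network access)."""
    root = Path(directory)
    root.mkdir(parents=True, exist_ok=True)
    for name in missing_checkpoints(directory):
        urllib.request.urlretrieve(f"{BASE_URL}/{name}", str(root / name))


if __name__ == "__main__":
    download_missing()
```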

Running the Demo:

Try our Colab demo, or run a local demo that tracks a 10×10 grid of points sampled on the first frame of a video:

python demo.py --grid_size 10
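
To make the grid mode concrete, here is a minimal sketch of sampling grid_size × grid_size points on one frame. It is an illustration only: the even spacing from edge to edge and the (frame index, x, y) layout are assumptions, and the sampling inside demo.py may differ (for example, it may leave a margin at the borders).

```python
def linspace(start, stop, n):
    """Return n evenly spaced values from start to stop, both included."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        return [(start + stop) / 2.0]
    return [start + (stop - start) * i / (n - 1) for i in range(n)]


def grid_queries(grid_size, width, height, frame=0):
    """Return grid_size * grid_size points as (frame, x, y), in row-major order."""
    xs = linspace(0.0, width - 1.0, grid_size)
    ys = linspace(0.0, height - 1.0, grid_size)
    return [(frame, x, y) for y in ys for x in xs]


if __name__ == "__main__":
    points = grid_queries(10, width=640, height=480)
    print(len(points))  # 100 points for a 10 x 10 grid
```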

Evaluation

To reproduce the results presented in the paper, download the following datasets:

Then install the necessary dependencies:

pip install hydra-core==1.1.0 mediapy tensorboard 

Then, execute the following command to evaluate on BADJA:

python ./cotracker/evaluation/evaluate.py --config-name eval_badja exp_dir=./eval_outputs dataset_root=your/badja/path
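
If you want to script the evaluation, the command above can be assembled and launched from Python. This is a sketch under stated assumptions: build_eval_command is a hypothetical helper, only the eval_badja config named above is used, and the dataset path is a placeholder you must replace.

```python
import subprocess
import sys


def build_eval_command(config_name, exp_dir, dataset_root):
    """Build the argv list for evaluate.py with key=value overrides."""
    return [
        sys.executable,
        "./cotracker/evaluation/evaluate.py",
        "--config-name",
        config_name,
        f"exp_dir={exp_dir}",
        f"dataset_root={dataset_root}",
    ]


if __name__ == "__main__":
    command = build_eval_command("eval_badja", "./eval_outputs", "your/badja/path")
    # Run from the repository root; check=True raises if the evaluation fails.
    subprocess.run(command, check=True)
```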

Training

To train the CoTracker as described in our paper, you first need to generate annotations for Google Kubric MOVI-f dataset. Instructions for annotation generation can be found here.

Once you have the annotated dataset, make sure you have completed the evaluation setup steps, then install the training dependencies:

pip install pytorch_lightning==1.6.0

Then launch training on Kubric. Our model was trained on 32 GPUs; adjust the parameters to suit your hardware setup.

python train.py --batch_size 1 --num_workers 28 \
--num_steps 50000 --ckpt_path ./ --model_name cotracker \
--save_freq 200 --sequence_len 24 --eval_datasets tapvid_davis_first badja \
--traj_per_sample 256 --sliding_window_len 8 --updateformer_space_depth 6 --updateformer_time_depth 6 \
--save_every_n_epoch 10 --evaluate_every_n_epoch 10 --model_stride 4
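
To see how --sequence_len and --sliding_window_len relate, the sketch below enumerates the windows that cover a training clip. It assumes that consecutive windows overlap by half a window; that scheme is an assumption made for illustration and is not stated in this README, so check the paper and train.py before relying on it.

```python
def sliding_windows(sequence_len, window_len):
    """Return (start, end) frame ranges, with end exclusive.

    Assumes half-overlapping windows: each window starts window_len // 2
    frames after the previous one. A window that would run past the end
    of the sequence is dropped.
    """
    if window_len < 2 or window_len % 2 != 0:
        raise ValueError("window_len must be a positive even number")
    if sequence_len < window_len:
        raise ValueError("sequence_len must be at least window_len")
    step = window_len // 2
    last_start = sequence_len - window_len
    return [(s, s + window_len) for s in range(0, last_start + 1, step)]


if __name__ == "__main__":
    # Values from the training command: --sequence_len 24 --sliding_window_len 8
    for start, end in sliding_windows(24, 8):
        print(f"frames {start}..{end - 1}")
```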

License

The majority of CoTracker is licensed under CC-BY-NC; however, portions of the project are available under separate license terms: Particle Video Revisited is licensed under the MIT license, and TAP-Vid is licensed under the Apache 2.0 license.

Citing CoTracker

If you find our repository useful, please consider giving it a star ⭐ and citing our paper in your work:

@article{karaev2023cotracker,
  title={CoTracker: It is Better to Track Together},
  author={Nikita Karaev and Ignacio Rocco and Benjamin Graham and Natalia Neverova and Andrea Vedaldi and Christian Rupprecht},
  journal={arxiv},
  year={2023}
}
