CiCo

The official code repository for our paper

  • One-stage Video Instance Segmentation: from Frame-in Frame-out to Clip-in Clip-out

  • Instance mask results on OVIS

News

  • [15/03/2022] Released code with pre-trained models on GitHub and the paper on arXiv

Installation

  • Clone this repository

    git clone https://github.com/MinghanLi/CiCo.git
    cd CiCo
  • Set up the environment using Anaconda:

    conda env create -f environment.yml
    conda activate cico-env
  • Install mmcv or mmcv-full from here, choosing the build that matches your CUDA and PyTorch versions.

    # Example: PyTorch 1.10 and CUDA 11.3 with mmcv-full 1.4.2
    pip install mmcv-full==1.4.2 -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.10.0/index.html
  • Compile the customized COCO API for the YouTubeVIS dataset from here:

    pip install "git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI"
    git clone https://github.com/youtubevos/cocoapi
    cd cocoapi/PythonAPI
    # To compile and install locally 
    python setup.py build_ext --inplace
    # To install library to Python site-packages 
    python setup.py build_ext install
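
For the mmcv step above, the `-f` index URL has to match the local PyTorch and CUDA versions. The sketch below shows one way to derive it. It assumes the index layout `dist/<cuda tag>/<torch tag>/index.html` seen in the example command, and `mmcv_find_links_url` is a hypothetical helper, not part of this repository or of mmcv.

```python
from typing import Optional


def mmcv_find_links_url(torch_version: str, cuda_version: Optional[str]) -> str:
    """Build the `-f` index URL for `pip install mmcv-full` (assumed layout)."""
    # Drop a local build tag such as "+cu113" from "1.10.0+cu113".
    base = torch_version.split("+")[0]
    major, minor = base.split(".")[:2]
    # Assumption: the index is organised per minor release, so patch is set to 0.
    torch_tag = f"torch{major}.{minor}.0"
    # "11.3" -> "cu113"; fall back to "cpu" when no CUDA version is given.
    cuda_tag = "cu" + cuda_version.replace(".", "") if cuda_version else "cpu"
    return f"https://download.openmmlab.com/mmcv/dist/{cuda_tag}/{torch_tag}/index.html"


if __name__ == "__main__":
    # Reproduces the URL used in the installation example above.
    print(mmcv_find_links_url("1.10.0", "11.3"))
```

With PyTorch installed, the two arguments can be taken from `torch.__version__` and `torch.version.cuda`. Check that the resulting index actually lists the mmcv-full version you need before installing.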

Run

Prepare datasets and models

  • Datasets: To train or test CiCo, download the datasets from their official websites (YTVIS2019, YTVIS2021 and OVIS), then update your data paths in configs/_base_/datasets/vis.py

    cd CiCo
    vim configs/_base_/datasets/vis.py
  • Download COCO-pretrained models with ResNet or Swin Transformer backbones from here. For training, put them in a directory such as outputs/pretrained_models/.

Inference

# add --display to display instance masks
python eval.py --trained_model=path/to/your/trained/models.pth --NUM_CLIP_FRAMES=3 --overlap_frames=1 
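
The two clip flags can be read as a sliding window over the video. The sketch below is illustrative only. It assumes that --NUM_CLIP_FRAMES is the clip length and --overlap_frames is the number of frames shared by consecutive clips. `split_into_clips` is a hypothetical helper, not a function from this repository, and the actual clip sampling in eval.py may differ.

```python
from typing import List


def split_into_clips(num_frames: int, clip_len: int = 3, overlap: int = 1) -> List[List[int]]:
    """Split frame indices 0..num_frames-1 into overlapping clips."""
    if not 0 <= overlap < clip_len:
        raise ValueError("overlap must satisfy 0 <= overlap < clip_len")
    if num_frames <= 0:
        return []
    stride = clip_len - overlap  # how far the window moves between clips
    clips = []
    start = 0
    while True:
        end = min(start + clip_len, num_frames)
        clips.append(list(range(start, end)))
        if end >= num_frames:  # the last clip reached the end of the video
            break
        start += stride
    return clips


if __name__ == "__main__":
    # A 7-frame video with 3-frame clips that share 1 frame.
    print(split_into_clips(7, clip_len=3, overlap=1))
```

Under these assumptions, each pair of consecutive clips shares one frame, which is what allows instances to be linked from one clip to the next. In this sketch the final clip may be shorter than clip_len when the video length does not divide evenly.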

Training

# Train CiCo with a 3-frame clip on 4 GPUs
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --config=configs/CiCo/cico_yolact_r50_yt19.py --NUM_CLIP_FRAMES=3 \
    --IMS_PER_BATCH=6 --LR=0.001 --backbone_dir=outputs/pretrained_models/

Main Results

Quantitative Results on YTVIS2019

Backbone   FCN       mAP   Trained models                   Results
R50        Yolact    37.1  cico_yolact_r50_yt19_f3.pth      stdout.txt
R50        CondInst  37.3  cico_CondInst_r50_yt19_f3.pth    stdout.txt
R101       Yolact    39.6  -                                -
R101       CondInst  39.6  cico_CondInst_r101_yt19_f3.pth   stdout.txt
Swin-tiny  Yolact    41.8  -                                stdout.txt
Swin-tiny  CondInst  41.4  cico_CondInst_swint_yt19_f3.pth  stdout.txt

Quantitative Results on YTVIS2021

Backbone   FCN       mAP   Weights                          Results
R50        Yolact    35.2  cico_yolact_r50_yt21_f3.pth      stdout.txt
R50        CondInst  35.4  cico_CondInst_r50_yt21_f3.pth    stdout.txt
R101       Yolact    36.5  cico_yolact_r101_yt21_f3.pth     stdout.txt
R101       CondInst  36.7  cico_CondInst_r101_yt21_f3.pth   stdout.txt
Swin-tiny  Yolact    38.0  -                                -
Swin-tiny  CondInst  39.1  cico_CondInst_swint_yt21_f3.pth  stdout.txt

Quantitative Results on OVIS

Backbone   FCN       mAP   Weights                          Results
R50        Yolact    17.4  cico_yolact_r50_ovis_f3.pth      stdout.txt
R50        CondInst  18.0  -                                -
R101       Yolact    19.1  cico_yolact_r101_ovis_f3.pth     stdout.txt
R101       CondInst  20.4  cico_condinst_r101_ovis_f3.pth   stdout.txt
Swin-tiny  Yolact    18.0  -                                -
Swin-tiny  CondInst  18.2  -                                -
