Introduction

This repository contains a Torch implementation of both the DeepMask and SharpMask object proposal algorithms.

DeepMask is trained with two objectives: given an image patch, one branch of the model outputs a class-agnostic segmentation mask, while the other branch outputs how likely the patch is to contain an object. At test time, DeepMask is applied densely to an image and generates a set of object masks, each with a corresponding objectness score. These masks densely cover the objects in an image and can be used as a first step for object detection and other tasks in computer vision.

SharpMask is an extension of DeepMask that generates higher-fidelity masks using an additional top-down refinement step. The idea is to first generate a coarse mask encoding in a feedforward pass, then refine this mask encoding in a top-down pass using features at successively lower layers. This results in masks that better adhere to object boundaries.

If you use DeepMask/SharpMask in your research, please cite the relevant papers:

@inproceedings{DeepMask,
   title = {Learning to Segment Object Candidates},
   author = {Pedro O. Pinheiro and Ronan Collobert and Piotr Dollár},
   booktitle = {NIPS},
   year = {2015}
}
@inproceedings{SharpMask,
   title = {Learning to Refine Object Segments},
   author = {Pedro O. Pinheiro and Tsung-Yi Lin and Ronan Collobert and Piotr Dollár},
   booktitle = {ECCV},
   year = {2016}
}

Note: the version of DeepMask implemented here is the updated version reported in the SharpMask paper. On average, DeepMask takes 0.5s per COCO image and SharpMask takes 0.8s. Runtime roughly doubles for the "zoom" versions of the models.

Requirements and Dependencies

Quick Start

To run pretrained DeepMask/SharpMask models to generate object proposals, follow these steps:

  1. Clone this repository into $DEEPMASK:

    DEEPMASK=/desired/absolute/path/to/deepmask/ # set absolute path as desired
    git clone git@github.com:facebookresearch/deepmask.git $DEEPMASK
  2. Download pre-trained DeepMask and SharpMask models:

    mkdir -p $DEEPMASK/pretrained/deepmask; cd $DEEPMASK/pretrained/deepmask
    wget https://s3.amazonaws.com/deepmask/models/deepmask/model.t7
    mkdir -p $DEEPMASK/pretrained/sharpmask; cd $DEEPMASK/pretrained/sharpmask
    wget https://s3.amazonaws.com/deepmask/models/sharpmask/model.t7
  3. Run computeProposals.lua with a given model and optional target image (specified via the -img option):

    # apply to a default sample image (data/testImage.jpg)
    cd $DEEPMASK
    th computeProposals.lua $DEEPMASK/pretrained/deepmask # run DeepMask
    th computeProposals.lua $DEEPMASK/pretrained/sharpmask # run SharpMask
    th computeProposals.lua $DEEPMASK/pretrained/sharpmask -img /path/to/image.jpg
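
To process a whole directory of images, the single-image command above can be wrapped in a loop. The following is a minimal sketch rather than part of the repository: the function name `run_proposals` and the `DRY_RUN` switch are hypothetical, and it assumes `$DEEPMASK` is set and the images have a `.jpg` extension. Only `computeProposals.lua` and its `-img` option come from the steps above.

```shell
# Hypothetical helper: run one pretrained model over every .jpg in a directory.
# Pass image_dir as an absolute path, because the command runs from $DEEPMASK.
run_proposals() (
  model_dir=$1
  image_dir=$2
  for img in "$image_dir"/*.jpg; do
    [ -e "$img" ] || continue          # glob matched nothing: skip
    if [ -n "${DRY_RUN:-}" ]; then
      # Print the command instead of running it.
      echo "th computeProposals.lua $model_dir -img $img"
    else
      cd "$DEEPMASK" && th computeProposals.lua "$model_dir" -img "$img"
    fi
  done
)

# Usage (paths are placeholders):
# DRY_RUN=1 run_proposals "$DEEPMASK/pretrained/sharpmask" /absolute/path/to/images
```

Setting `DRY_RUN` to a non-empty value prints the commands, which is a cheap way to check the paths before starting a long run.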

Training Your Own Model

To train your own DeepMask/SharpMask models, follow these steps:

Preparation

  1. If you have not done so already, clone this repository into $DEEPMASK:

    DEEPMASK=/desired/absolute/path/to/deepmask/ # set absolute path as desired
    git clone git@github.com:facebookresearch/deepmask.git $DEEPMASK
  2. Download the Torch ResNet-50 model pretrained on ImageNet:

    mkdir -p $DEEPMASK/pretrained; cd $DEEPMASK/pretrained
    wget https://s3.amazonaws.com/deepmask/models/resnet-50.t7
  3. Download and extract the COCO images and annotations:

    mkdir -p $DEEPMASK/data; cd $DEEPMASK/data
    wget http://msvocds.blob.core.windows.net/annotations-1-0-3/instances_train-val2014.zip
    wget http://msvocds.blob.core.windows.net/coco2014/train2014.zip
    wget http://msvocds.blob.core.windows.net/coco2014/val2014.zip
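
Step 3 mentions extracting the archives, but the commands above only download them. A minimal sketch of the extraction step follows, assuming the `unzip` utility is installed and the three archives are in `$DEEPMASK/data`. The helper name `extract_archives` is hypothetical. The directory layout the training scripts expect is not stated here, so verify it after extracting.

```shell
# Hypothetical helper: extract every .zip archive found in a directory, in place.
extract_archives() (
  cd "$1" || exit 1
  for archive in *.zip; do
    [ -e "$archive" ] || continue    # no archives present: nothing to do
    unzip -q -n "$archive"           # -q: quiet, -n: never overwrite existing files
  done
)

# Usage, assuming the downloads above completed:
# extract_archives "$DEEPMASK/data"
```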

Training

To train, launch the train.lua script. It accepts several options; to list them, use the --help flag.

  1. To train DeepMask:

    th train.lua
  2. To train SharpMask (requires pre-trained DeepMask model):

    th train.lua -dm /path/to/trained/deepmask/
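
Because SharpMask training depends on a finished DeepMask run, the two steps above can be chained so that the second starts only if the first succeeds. This is a minimal sketch, assuming `$DEEPMASK` is set. `DEEPMASK_EXP` is a hypothetical variable that must point at the directory holding the trained DeepMask model; this README does not say where `train.lua` writes its output, so set it by hand.

```shell
# Hypothetical wrapper, not part of the repository.
# DEEPMASK_EXP: directory of the trained DeepMask model (location is an assumption).
train_deepmask_then_sharpmask() (
  cd "$DEEPMASK" || exit 1
  th train.lua || exit 1              # stage 1: DeepMask
  th train.lua -dm "$DEEPMASK_EXP"    # stage 2: SharpMask, using the trained DeepMask model
)

# Usage:
# DEEPMASK_EXP=/path/to/trained/deepmask/ train_deepmask_then_sharpmask
```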

Evaluation

There are two ways to evaluate a model on the COCO dataset.

  1. evalPerPatch.lua evaluates only the mask generation step. The per-patch evaluation only uses image patches that contain roughly centered objects. Its usage is as follows:

    th evalPerPatch.lua /path/to/trained/deepmask-or-sharpmask/
  2. evalPerImage.lua evaluates the full model on COCO images, as reported in the papers. By default, it evaluates performance on the first 5K COCO validation images (run th evalPerImage.lua --help to see the options):

    th evalPerImage.lua /path/to/trained/deepmask-or-sharpmask/
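
Both evaluations take the same model directory, so they can be run back to back. This is a minimal sketch, assuming `$DEEPMASK` is set. The function name `evaluate_model` is hypothetical; only the two script names come from the list above.

```shell
# Hypothetical helper: run the per-patch and per-image evaluations on one model.
evaluate_model() (
  model_dir=$1
  cd "$DEEPMASK" || exit 1
  th evalPerPatch.lua "$model_dir" || exit 1   # mask generation step only
  th evalPerImage.lua "$model_dir"             # full model on COCO images
)

# Usage:
# evaluate_model /path/to/trained/deepmask-or-sharpmask/
```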

Precomputed Proposals

You can download pre-computed proposals (1000 per image) on the COCO and PASCAL VOC datasets, for both segmentation and bounding box proposals. We use the COCO JSON format for the proposals. The proposals are divided into chunks of 500 images each (that is, each JSON contains 1000 proposals per image for 500 images). All proposals correspond to the "zoom" setting in the paper (DeepMaskZoom and SharpMaskZoom), which tends to be the most effective for object detection.
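
The chunk layout described above implies a fixed number of entries per JSON file. Here is a small worked calculation, where `num_images` is a hypothetical dataset size chosen only for illustration:

```shell
images_per_chunk=500       # from the description above
proposals_per_image=1000   # from the description above
num_images=5000            # hypothetical dataset size, for illustration only

entries_per_chunk=$((images_per_chunk * proposals_per_image))
# Ceiling division: a partially filled last chunk still needs its own file.
num_chunks=$(( (num_images + images_per_chunk - 1) / images_per_chunk ))

echo "entries per chunk: $entries_per_chunk"   # 500000
echo "chunks needed: $num_chunks"              # 10
```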

DeepMask

SharpMask
