fasterrcnn's Introduction

Taken from https://github.com/tensorpack/tensorpack.git (tensorpack/examples/FasterRCNN/)

Faster-RCNN / Mask-RCNN on COCO

This example provides a minimal (<2k lines) and faithful implementation of the following papers:

with the support of:

Dependencies

  • Python 3; OpenCV.
  • TensorFlow >= 1.6 (1.4 or 1.5 can run but may crash due to a TF bug);
  • pycocotools: pip install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
  • Pre-trained ImageNet ResNet model from tensorpack model zoo.
  • COCO data. It needs to have the following directory structure:
COCO/DIR/
  annotations/
    instances_train2014.json
    instances_val2014.json
    instances_minival2014.json
    instances_valminusminival2014.json
  train2014/
    COCO_train2014_*.jpg
  val2014/
    COCO_val2014_*.jpg

minival and valminusminival are optional. You can download them here.
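
Before starting a long training run, it can help to confirm that the dataset directory matches the layout above. The following is a minimal sketch, not part of this example's code: the function name `check_coco_layout` is hypothetical, and the required and optional entries are copied from the listing above.

```python
from pathlib import Path

# Entries copied from the directory listing above.
REQUIRED = [
    "annotations/instances_train2014.json",
    "annotations/instances_val2014.json",
    "train2014",
    "val2014",
]
# The minival / valminusminival annotation files are optional.
OPTIONAL = [
    "annotations/instances_minival2014.json",
    "annotations/instances_valminusminival2014.json",
]


def check_coco_layout(basedir):
    """Return (missing_required, missing_optional) as lists of relative paths.

    Hypothetical helper: it only checks that each entry exists under basedir,
    not that the files are valid COCO annotations or images.
    """
    root = Path(basedir)
    missing_required = [p for p in REQUIRED if not (root / p).exists()]
    missing_optional = [p for p in OPTIONAL if not (root / p).exists()]
    return missing_required, missing_optional
```

Calling `check_coco_layout("/path/to/COCO/DIR")` and finding the first list empty suggests the directory can be passed as DATA.BASEDIR.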

Usage

Train:

On a single machine:

./train.py --config \
    MODE_MASK=True MODE_FPN=True \
    DATA.BASEDIR=/path/to/COCO/DIR \
    BACKBONE.WEIGHTS=/path/to/ImageNet-R50-Pad.npz

To run distributed training, set TRAINER=horovod and refer to HorovodTrainer docs.

Options can be changed by either the command line or the config.py file. Recommended configurations are listed in the table below.
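
To illustrate how dotted KEY=VALUE options such as DATA.BASEDIR=/path/to/COCO/DIR can map onto a nested configuration, here is a minimal sketch. It is an assumption about the general technique, not the parsing code in config.py: `apply_overrides` is a hypothetical name, and the nested dict stands in for the real config object.

```python
import ast


def apply_overrides(config, overrides):
    """Apply 'A.B=value' strings to a nested dict in place and return it.

    Hypothetical helper, not the logic used by train.py.
    """
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            node = node[name]  # KeyError if the section does not exist
        if leaf not in node:
            raise KeyError(f"unknown option: {dotted_key}")
        try:
            # Literals such as True, 64 or [3,4,23,3] become Python values.
            value = ast.literal_eval(raw_value)
        except (ValueError, SyntaxError):
            # Paths and names such as GN stay plain strings.
            value = raw_value
        node[leaf] = value
    return config


# Placeholder defaults for illustration only; the real ones live in config.py.
example = {
    "MODE_MASK": False,
    "DATA": {"BASEDIR": ""},
    "BACKBONE": {"NORM": "", "RESNET_NUM_BLOCK": []},
}
apply_overrides(example, [
    "MODE_MASK=True",
    "DATA.BASEDIR=/path/to/COCO/DIR",
    "BACKBONE.RESNET_NUM_BLOCK=[3,4,23,3]",
])
```

Rejecting unknown keys is a design choice in this sketch: a misspelled option fails loudly instead of being silently ignored.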

The code is only valid for training with 1, 2, 4 or >=8 GPUs. Training with a number of GPUs other than 8 may give results that differ from the table below.
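
The GPU-count rule can be written as a one-line check. This is a sketch only; `is_supported_gpu_count` is a hypothetical name and does not exist in this example's code.

```python
def is_supported_gpu_count(num_gpus):
    """Return True for the GPU counts stated as valid: 1, 2, 4, or 8 and above."""
    return num_gpus in (1, 2, 4) or num_gpus >= 8


# Counts such as 3, 5, 6 and 7 fall outside the stated rule.
unsupported = [n for n in range(1, 10) if not is_supported_gpu_count(n)]
```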

To predict on an image (and show output in a window):

./train.py --predict input.jpg --load /path/to/model --config SAME-AS-TRAINING

To evaluate the performance of a model on COCO (several trained models can be downloaded from the model zoo):

./train.py --evaluate output.json --load /path/to/COCO-R50C4-MaskRCNN-Standard.npz \
    --config MODE_MASK=True DATA.BASEDIR=/path/to/COCO/DIR

Evaluation or prediction will need the same --config used during training.
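
One way to keep the options consistent is to store them once and reuse them for every invocation. The sketch below only assembles command strings from the flags shown above (--predict, --evaluate, --load, --config); `build_command` is a hypothetical helper, not something train.py provides.

```python
import shlex


def build_command(action_args, config_overrides):
    """Return a ./train.py command line: action arguments, then --config options."""
    argv = ["./train.py", *action_args, "--config", *config_overrides]
    # shlex.join quotes any argument that the shell would otherwise interpret.
    return shlex.join(argv)


# The same list is shared by training, evaluation and prediction.
CONFIG = ["MODE_MASK=True", "DATA.BASEDIR=/path/to/COCO/DIR"]

train_cmd = build_command([], CONFIG)
eval_cmd = build_command(
    ["--evaluate", "output.json", "--load", "/path/to/model"], CONFIG
)
predict_cmd = build_command(
    ["--predict", "input.jpg", "--load", "/path/to/model"], CONFIG
)
```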

Results

These models are trained with different configurations on trainval35k and evaluated on minival using mAP@IoU=0.50:0.95. Detectron's performance can be roughly reproduced: some models score higher and some lower. MaskRCNN results contain both box and mask mAP.

Each row is followed by the options for its configuration.

Backbone   mAP (box;mask)   Detectron mAP [1] (box;mask)   Time on 8 V100s   Configuration
R50-C4     33.8             -                              18h               super quick
    MODE_MASK=False FRCNN.BATCH_PER_IM=64
    PREPROC.SHORT_EDGE_SIZE=600 PREPROC.MAX_SIZE=1024
    TRAIN.LR_SCHEDULE=[150000,230000,280000]
R50-C4     37.1             36.5                           44h               standard
    MODE_MASK=False
R50-FPN    37.5             37.9                           30h               standard
    MODE_MASK=False MODE_FPN=True
R50-C4     38.5;33.7        37.8;32.8                      49h               standard
    MODE_MASK=True
R50-FPN    38.8;35.4        38.6;34.5                      32h               standard
    MODE_MASK=True MODE_FPN=True
R50-FPN    39.8;35.5        39.5;34.4 [2]                  34h               standard+ConvGNHead
    MODE_MASK=True MODE_FPN=True
    FPN.FRCNN_HEAD_FUNC=fastrcnn_4conv1fc_gn_head
R50-FPN    40.3;36.4        40.3;35.7                      44h               standard+GN
    MODE_MASK=True MODE_FPN=True
    FPN.NORM=GN BACKBONE.NORM=GN
    FPN.FRCNN_HEAD_FUNC=fastrcnn_4conv1fc_gn_head
    FPN.MRCNN_HEAD_FUNC=maskrcnn_up4conv_gn_head
R101-C4    41.7;35.5        -                              63h               standard
    MODE_MASK=True
    BACKBONE.RESNET_NUM_BLOCK=[3,4,23,3]
R101-FPN   40.7;36.9        40.9;36.4                      40h               standard
    MODE_MASK=True MODE_FPN=True
    BACKBONE.RESNET_NUM_BLOCK=[3,4,23,3]

1: Here we compare models that have identical training & inference cost between the two implementations. Their numbers still differ because of many small implementation details.

2: Numbers taken from Group Normalization
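
The mAP@IoU=0.50:0.95 metric used for the results above averages average precision over ten IoU thresholds, from 0.50 to 0.95 in steps of 0.05. The sketch below shows only the box IoU computation and the threshold list. It is a simplified illustration with hypothetical names; the actual evaluation is done by pycocotools, which also handles mask IoU, crowd regions, per-category matching and the precision-recall integration.

```python
# The ten thresholds behind "IoU=0.50:0.95": 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x2 > x1 and y2 > y1."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def thresholds_met(iou):
    """Thresholds at which a detection with this IoU would count as a match."""
    return [t for t in IOU_THRESHOLDS if iou >= t]


# A 10x10 ground-truth box and a prediction shifted right by 2 pixels:
# intersection 80, union 120, so the IoU is 2/3.
iou = box_iou((0, 0, 10, 10), (2, 0, 12, 10))
met = thresholds_met(iou)  # [0.5, 0.55, 0.6, 0.65]
```

A detection that is only roughly aligned therefore contributes at the looser thresholds but not at the stricter ones, which is why this metric rewards precise localization.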

Notes

See Notes on This Implementation
