
Keras RetinaNet

Keras implementation of RetinaNet object detection as described in Focal Loss for Dense Object Detection by Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He and Piotr Dollár.

Installation

  1. Clone this repository.
  2. In the repository, execute python setup.py install --user. Note that due to inconsistencies in how tensorflow should be installed, this package does not define a dependency on tensorflow, as it would otherwise try to install it (which, at least on Arch Linux, results in an incorrect installation). Please make sure tensorflow is installed as per your system's requirements. Also, make sure Keras 2.1.2 is installed.
  3. As of writing, this repository requires the master branch of keras-resnet (run pip install --user --upgrade git+https://github.com/broadinstitute/keras-resnet).
  4. Optionally, install pycocotools if you want to train / test on the MS COCO dataset by running pip install --user git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI.

Training

keras-retinanet can be trained using this script. Note that the train script uses relative imports since it is inside the keras_retinanet package. If you want to adjust the script for your own use outside of this repository, you will need to switch it to use absolute imports.

If you installed keras-retinanet correctly, the train script will be installed as retinanet-train. However, if you make local modifications to the keras-retinanet repository, you should run the script directly from the repository to ensure that your local changes are used by the train script.

Usage

For training on Pascal VOC, run:

# Running directly from the repository:
keras_retinanet/bin/train.py pascal <path to VOCdevkit/VOC2007>

# Using the installed script:
retinanet-train pascal <path to VOCdevkit/VOC2007>

For training on MS COCO, run:

# Running directly from the repository:
keras_retinanet/bin/train.py coco <path to MS COCO>

# Using the installed script:
retinanet-train coco <path to MS COCO>

For training on a custom dataset, a CSV file can be used to pass the data. See below for more details on the format of these CSV files. To train using your CSV files, run:

# Running directly from the repository:
keras_retinanet/bin/train.py csv <path to csv file containing annotations> <path to csv file containing classes>

# Using the installed script:
retinanet-train csv <path to csv file containing annotations> <path to csv file containing classes>

In general, the steps to train on your own datasets are:

  1. Create a model by calling, for instance, keras_retinanet.models.ResNet50RetinaNet, and compile it. Empirically, the following compile arguments have been found to work well:
import keras
import keras_retinanet.losses

model.compile(
    loss={
        # loss on the bounding box regression outputs
        'regression'    : keras_retinanet.losses.regression_loss,
        # focal loss on the classification outputs
        'classification': keras_retinanet.losses.focal_loss()
    },
    optimizer=keras.optimizers.adam(lr=1e-5, clipnorm=0.001)
)
  2. Create generators for training and testing data (an example is shown in keras_retinanet.preprocessing.PascalVocGenerator).
  3. Use model.fit_generator to start training. A minimal sketch combining these steps is shown after this list.
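
The sketch below puts these three steps together for Pascal VOC. It is only an illustration: the constructor arguments for ResNet50RetinaNet and PascalVocGenerator, the dataset paths and set names, and the steps_per_epoch / epochs values are assumptions, so check the code in keras_retinanet for the exact signatures before using it.

import keras
import keras_retinanet.losses
from keras_retinanet.models import ResNet50RetinaNet
from keras_retinanet.preprocessing import PascalVocGenerator

# 1. Build the model on an input of arbitrary size (20 classes for Pascal VOC).
# NOTE: the ResNet50RetinaNet arguments are assumptions; see the model definition for details.
image = keras.layers.Input((None, None, 3))
model = ResNet50RetinaNet(image, num_classes=20)
model.compile(
    loss={
        'regression'    : keras_retinanet.losses.regression_loss,
        'classification': keras_retinanet.losses.focal_loss()
    },
    optimizer=keras.optimizers.adam(lr=1e-5, clipnorm=0.001)
)

# 2. Create generators for training and validation data.
# NOTE: the constructor arguments are assumptions; see keras_retinanet.preprocessing.PascalVocGenerator.
train_generator = PascalVocGenerator('/path/to/VOCdevkit/VOC2007', 'trainval')
val_generator   = PascalVocGenerator('/path/to/VOCdevkit/VOC2007', 'test')

# 3. Start training (steps_per_epoch / epochs / validation_steps are example values).
model.fit_generator(
    generator=train_generator,
    steps_per_epoch=10000,
    epochs=50,
    validation_data=val_generator,
    validation_steps=500
)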

Testing

An example of testing the network can be seen in this Notebook. In general, output can be retrieved from the network as follows:

_, _, detections = model.predict_on_batch(inputs)

Here, detections contains the resulting detections with shape (None, None, 4 + num_classes), where each detection is laid out as (x1, y1, x2, y2, cls1, cls2, ...).
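
As an illustration, the detections array can be turned into boxes, labels and scores with a little numpy. This is a sketch that assumes a model and a preprocessed batch inputs as above, looks at the first image in the batch only, and uses 0.5 purely as an example score threshold:

import numpy as np

# Run the network on a preprocessed batch of images (same call as above).
_, _, detections = model.predict_on_batch(inputs)

# detections has shape (batch_size, num_detections, 4 + num_classes):
# the first 4 values per detection are (x1, y1, x2, y2), the rest are per-class scores.
boxes  = detections[0, :, :4]
scores = detections[0, :, 4:]

# Keep the best-scoring class per detection and filter by an example threshold of 0.5.
labels      = np.argmax(scores, axis=1)
confidences = np.max(scores, axis=1)
selection   = confidences > 0.5

for box, label, confidence in zip(boxes[selection], labels[selection], confidences[selection]):
    print(label, confidence, box)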

Loading models can be done in the following manner:

import keras
from keras_retinanet.models.resnet import custom_objects

model = keras.models.load_model('/path/to/model.h5', custom_objects=custom_objects)

Execution time on an NVIDIA Pascal Titan X is roughly 55 ms for an image of shape 1000x600x3.

CSV datasets

The CSVGenerator provides an easy way to define your own datasets. It uses two CSV files: one file containing annotations and one file containing a class name to ID mapping.

Annotations format

The CSV file with annotations should contain one annotation per line. Images with multiple bounding boxes should use one row per bounding box. Note that indexing for pixel values starts at 0. The expected format of each line is:

path/to/image.jpg,x1,y1,x2,y2,class_name

Some images may not contain any labeled objects. To add these images to the dataset as negative examples, add an annotation where x1, y1, x2, y2 and class_name are all empty:

path/to/image.jpg,,,,,

A full example:

/data/imgs/img_001.jpg,837,346,981,456,cow
/data/imgs/img_002.jpg,215,312,279,391,cat
/data/imgs/img_002.jpg,22,5,89,84,bird
/data/imgs/img_003.jpg,,,,,

This defines a dataset with 3 images. img_001.jpg contains a cow. img_002.jpg contains a cat and a bird. img_003.jpg contains no interesting objects/animals.
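
If you generate annotations programmatically, a file in this format can be written with Python's standard csv module. The sketch below reproduces the example rows above; the filename annotations.csv is just an example:

import csv

# Example annotations from above; the all-empty row marks a negative example.
annotations = [
    ['/data/imgs/img_001.jpg', 837, 346, 981, 456, 'cow'],
    ['/data/imgs/img_002.jpg', 215, 312, 279, 391, 'cat'],
    ['/data/imgs/img_002.jpg',  22,   5,  89,  84, 'bird'],
    ['/data/imgs/img_003.jpg',  '',  '',  '',  '', ''],
]

with open('annotations.csv', 'w') as f:
    csv.writer(f).writerows(annotations)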

Class mapping format

The class name to ID mapping file should contain one mapping per line. Each line should use the following format:

class_name,id

Indexing for classes starts at 0. Do not include a background class as it is implicit.

For example:

cow,0
cat,1
bird,2
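
Such a mapping can be read back into a dictionary with standard-library Python, as in the following sketch (classes.csv is a hypothetical filename):

import csv

# Read the class mapping into {class_name: id}, skipping any blank lines.
with open('classes.csv') as f:
    class_ids = {row[0]: int(row[1]) for row in csv.reader(f) if row}

print(class_ids)  # {'cow': 0, 'cat': 1, 'bird': 2}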

Results

MS COCO

The MS COCO model can be downloaded here. Results using the cocoapi are shown below (note: according to the paper, this configuration should achieve a mAP of 0.343).

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.325
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.513
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.342
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.149
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.354
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.465
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.288
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.437
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.464
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.263
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.510
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.623

Status

Example output images using keras-retinanet are shown below.

(Example results of RetinaNet on MS COCO.)

Notes

  • This repository requires Keras 2.1.2.
  • This repository is tested using OpenCV 3.3.

Contributions to this project are welcome.

Discussions

Feel free to join the #keras-retinanet Keras Slack channel for discussions and questions.
