tianchengg / mask_rcnn-for-sun-rgb-d

This project forked from aegorfk/mask_rcnn-for-sun-rgb-d

Mask R-CNN for image segmentation of SUN RGB-D and NYU datasets

Mask R-CNN for Image Segmentation of SUN RGB-D and NYU datasets

Example detection

Description

The tools in this repository let a user retrain a Mask R-CNN model on the SUN RGB-D or NYU dataset for image segmentation, starting from pre-trained COCO weights. The repository continues a project created for my master's thesis (see here).

The library behind these tools is based on the Python implementation of Mask R-CNN by Waleed Abdulla, Matterport, Inc. (see here). The model generates bounding boxes and segmentation masks for each object instance in the image. It is built on a Feature Pyramid Network (FPN) and a ResNet101 backbone.

The repository includes:

  • Source code of Mask R-CNN built on FPN and ResNet101.
  • Instructions and training code for the SUN RGB-D and NYU datasets.
  • Pre-trained weights on MS COCO.
  • An example of training on these datasets, with emphasis on adapting the code to a dataset with multiple classes.
  • Jupyter notebooks to visualize the detection results.
  • Integration with Kinect v2 and streaming video from a webcam for image segmentation (very slow, but it works).

Trained model on Video

Usage

Requirements

Python 3.4, TensorFlow GPU 1.10.0, Keras 2.1.3, and other common packages listed in requirements.txt. Kinect v2 integration uses the pylibfreenect2 package and all of its dependencies, which can be installed from here.

To reproduce the results, download the pre-trained COCO weights (mask_rcnn_coco.h5) from the releases page. Training or testing the model requires the pycocotools package; an installation guide can be found here.

Installation

The code was developed under CentOS 7 with CUDA 9.0 and cuDNN v7.0.5, and was mostly tested on an Nvidia GeForce GTX 1080 Ti GPU.

git clone https://github.com/hateful-kate/Mask_RCNN.git
cd Mask_RCNN
mkdir datasets && cd datasets
wget http://rgbd.cs.princeton.edu/data/SUNRGBD.zip
unzip SUNRGBD.zip
rm SUNRGBD.zip
# to work with the NYU dataset separately, uncomment the next line
# mkdir NYU && cp -R SUNRGBD/kv1/NYUdata NYU
cd ..
pip3 install -r requirements.txt

Getting Started

  • demo.ipynb. This is the easiest way to start. It shows an example of using a model pre-trained on MS COCO to segment objects in your own images, and includes code to run object detection and instance segmentation on arbitrary images. It is the same as in the original Mask R-CNN repository.

  • model.py, utils.py, config.py. These files contain the main Mask R-CNN implementation. model.py has been modified to support multiclass classification.

  • Read_big_Matlab_file.ipynb. This notebook reads the mapping file from the SUNRGBD Toolbox (available from the SUNRGBD Toolbox page), parses the annotations to JSON format, and maps them to the rest of the data. Reproducing the results provided in this repository requires downloading the SUNRGBD Toolbox and 64 GB of RAM.

  • Preprocess_dataset.ipynb. This notebook creates the dataset structure, parses the annotations to VGG format, maps the labels to 13 or 37-40 classes, and splits the data into train/val/test sets accordingly.

  • inspect_sun_data.ipynb. This notebook visualizes the different pre-processing steps to prepare the training data.

  • inspect_sun_model.ipynb. This notebook goes in depth into the steps performed to detect and segment objects. It provides visualizations of every step of the pipeline as well as the mAP calculation for every IoU level.

  • inspect_sun_weights.ipynb. This notebook inspects the weights of a trained model and looks for anomalies and odd patterns.

  • Segmentation_of_video.ipynb. This notebook converts an mp4 video into a video with the image segmentation overlaid.

  • Kinect_streaming_segmentation.ipynb. This notebook uses the Kinect v2 color channel to stream video with the image segmentation overlaid.

  • Image_segmentation_video_stream.ipynb. This notebook uses a local camera to stream video with the image segmentation overlaid.
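
The IoU levels mentioned for inspect_sun_model.ipynb refer to the intersection-over-union overlap between a predicted region and a ground-truth region. As a minimal sketch (not the notebook's own code), the following computes IoU for two boxes given in (y1, x1, y2, x2) order and lists the thresholds from 0.50 to 0.95 at which a detection would count as a match. The function names and the example boxes are illustrative assumptions.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two boxes given as (y1, x1, y2, x2)."""
    ay1, ax1, ay2, ax2 = box_a
    by1, bx1, by2, bx2 = box_b
    # Height and width of the overlap; zero when the boxes do not intersect.
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    intersection = inter_h * inter_w
    area_a = (ay2 - ay1) * (ax2 - ax1)
    area_b = (by2 - by1) * (bx2 - bx1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def matched_thresholds(iou, thresholds=None):
    """Return the IoU thresholds at which this overlap counts as a match."""
    if thresholds is None:
        # 0.50, 0.55, ..., 0.95 (ten levels).
        thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]
    return [t for t in thresholds if iou >= t]


# Hypothetical example boxes, chosen only for illustration.
predicted = (10, 10, 50, 50)
ground_truth = (10, 10, 50, 40)
iou = box_iou(predicted, ground_truth)
print(iou, matched_thresholds(iou))
```

Averaging precision over such thresholds is what makes a per-IoU-level report stricter than a single score at 0.50: a detection that only loosely overlaps the ground truth stops counting as the threshold rises.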

Example streaming

Training using pre-trained MS COCO weights

MS COCO weights are used as the starting point for training the model on the SUN RGB-D and NYU datasets. Training and evaluation code is in samples/sun/sun.py. To reproduce the results, run the scripts in the samples/sun directory from the command line as follows:

# Train a new model starting from pre-trained COCO weights
python3 samples/sun/sun.py train --dataset=/path/to/sun/or/nyu --model=coco

# Continue training a model that you had trained earlier
python3 samples/sun/sun.py train --dataset=/path/to/sun/or/nyu --model=/path/to/weights.h5

# Continue training the last model you trained. This will find
# the last trained weights in the model directory.
python3 samples/sun/sun.py train --dataset=/path/to/sun/or/nyu --model=last

The training schedule, learning rate, and other parameters should be set in samples/sun/sun.py.
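
The three invocations above differ only in how the --model value is interpreted: the keyword coco, the keyword last, or a path to a weights file. The sketch below shows one way such arguments could be parsed and resolved with argparse. It is an assumption about the general shape of such a script, not the actual contents of samples/sun/sun.py; build_parser, resolve_weights, and the find_last callback are hypothetical names.

```python
import argparse


def build_parser():
    """Command-line interface mirroring the invocations shown above (sketch)."""
    parser = argparse.ArgumentParser(
        description="Train Mask R-CNN on SUN RGB-D or NYU (illustrative sketch).")
    parser.add_argument("command", help="Action to run, e.g. 'train'")
    parser.add_argument("--dataset", required=True,
                        help="Path to the SUN RGB-D or NYU dataset directory")
    parser.add_argument("--model", required=True,
                        help="'coco', 'last', or a path to a weights .h5 file")
    return parser


def resolve_weights(model_arg, coco_path="mask_rcnn_coco.h5", find_last=None):
    """Map the --model value to a weights path.

    find_last is a hypothetical callback that returns the most recent
    checkpoint; the real script may locate it differently.
    """
    choice = model_arg.lower()
    if choice == "coco":
        return coco_path
    if choice == "last":
        if find_last is None:
            raise ValueError("no function supplied to locate the last checkpoint")
        return find_last()
    # Anything else is treated as an explicit path to a weights file.
    return model_arg


args = build_parser().parse_args(
    ["train", "--dataset=/path/to/sun/or/nyu", "--model=coco"])
print(args.command, args.dataset, resolve_weights(args.model))
```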

Differences from the Official Implementation

  • Mini-masks: All mini-masks are removed.
  • Multi-threading: Not allowed, as it mostly slows down the whole system: more resources are spent on administrative overhead than on the execution itself.
  • Learning Rate: The paper uses a learning rate of 0.02; here it is changed to 0.001 for better convergence without a significant decrease in speed.
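
As a rough illustration, the differences listed above could be expressed as configuration values along these lines. This is a standalone sketch, not the repository's config. The attribute names NAME, NUM_CLASSES, USE_MINI_MASK, and LEARNING_RATE follow the naming used in the Matterport configuration class, while DATA_LOADER_WORKERS, the class name, and the choice of the 13-class mapping are assumptions made for the example.

```python
class SunConfigSketch:
    """Hypothetical settings mirroring the differences listed above."""

    NAME = "sun"                # hypothetical experiment name
    NUM_CLASSES = 1 + 13        # background + 13 classes (assumed mapping)
    USE_MINI_MASK = False       # mini-masks are removed
    LEARNING_RATE = 0.001       # lowered from the paper's 0.02
    DATA_LOADER_WORKERS = 1     # hypothetical name: no multi-threaded loading

    def display(self):
        """Print every upper-case setting, one per line."""
        for name in sorted(dir(self)):
            if name.isupper():
                print("{:<22} {}".format(name, getattr(self, name)))


config = SunConfigSketch()
config.display()
```

In the actual code base, values like these would normally live in a subclass of the configuration class defined in config.py and be edited in samples/sun/sun.py.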

License

Copyright (c) 2018 Ekaterina Lyapina. Contact me for commercial use (or rather any use that is not academic research) (email: ec16513 at qmul.ac.uk). Free for research use, as long as proper attribution is given and this copyright notice is retained.
