This project forked from kazuto1011/deeplab-pytorch

PyTorch implementation of DeepLab (ResNet-101) + COCO-Stuff

License: MIT License

Python 100.00%

DeepLab with PyTorch

Unofficial implementation to train DeepLab v2 (ResNet-101) on the COCO-Stuff 10k dataset. DeepLab is a CNN architecture for semantic image segmentation. COCO-Stuff 10k is a semantic segmentation dataset of 10,000 images annotated with 182 thing/stuff classes. This repository also contains DeepLab v3/v3+ model definitions, but not the training scripts for them.

Requirements

Usage

Preparation

Dataset

  1. Download COCO-Stuff 10k dataset and unzip it.
  2. Set the path to the dataset in config/cocostuff.yaml.
cocostuff-10k-v1.1
├── images
│   ├── COCO_train2014_000000000077.jpg
│   └── ...
├── annotations
│   ├── COCO_train2014_000000000077.mat
│   └── ...
└── imageLists
    ├── all.txt
    ├── test.txt
    └── train.txt
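
As a quick sanity check of the layout above, the following minimal sketch pairs each entry of an image list with its image and annotation file. It assumes that every line of `train.txt` holds one image ID without a file extension (for example `COCO_train2014_000000000077`); `list_samples` is a hypothetical helper and not part of this repository.

```python
from pathlib import Path


def list_samples(root, split="train"):
    """Return (image_path, annotation_path) pairs for one split.

    Assumption: imageLists/<split>.txt contains one image ID per line,
    without a file extension.
    """
    root = Path(root)
    list_file = root / "imageLists" / f"{split}.txt"
    samples = []
    for line in list_file.read_text().splitlines():
        image_id = line.strip()
        if not image_id:
            continue  # skip blank lines
        samples.append((
            root / "images" / f"{image_id}.jpg",
            root / "annotations" / f"{image_id}.mat",
        ))
    return samples
```

Calling `list_samples("cocostuff-10k-v1.1", "train")` and checking that every returned path exists is a cheap way to confirm that the dataset was unzipped correctly.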

Caffemodel

  1. Download init.caffemodel, pre-trained on MSCOCO, into the directory data/models/deeplab_resnet101/coco_init/.
  2. Convert the caffemodel to a PyTorch-compatible format. There is no need to build the official implementation.
# This generates deeplabv2_resnet101_COCO_init.pth
python convert.py --dataset coco_init

Training

# Training
python train.py --config config/cocostuff.yaml
# Monitoring
tensorboard --logdir runs

See --help for more details.

Default settings

config/cocostuff.yaml

  • All GPUs visible to the process are used. To restrict the set, specify CUDA_VISIBLE_DEVICES=.
  • Stochastic gradient descent (SGD) is used with a momentum of 0.9 and an initial learning rate of 2.5e-4. Polynomial learning rate decay is employed: every 10 iterations, the learning rate is set to the initial rate multiplied by (1-iter/max_iter)**power.
  • Weights are updated for 20,000 iterations with a mini-batch of 10. Because of GPU memory limits, the mini-batch is not processed at once; instead, gradients from smaller batches are accumulated and the weights are updated once at the end (batch_size * iter_size = 10).
  • Input images are randomly scaled by factors ranging from 0.5 to 1.5, zero-padded if needed, and randomly cropped so that the input size is fixed during training (see the example below).
  • The loss is the sum of the losses on the outputs for the multi-scale inputs (1x, 0.75x, 0.5x) and on their element-wise maximum across the scales. The "unlabeled" class (index -1) is ignored in the loss computation.
  • The moving-average loss (average_loss in Caffe) can be monitored in TensorBoard (specify a log directory, e.g., runs).
  • GPU memory usage is approximately 11.2 GB with the default settings (tested on a single Titan X). You can reduce it with a smaller batch_size.
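
The learning rate schedule and the gradient accumulation described above can be sketched in plain Python on a toy one-parameter problem. This is only an illustration under stated assumptions, not the code in train.py: `poly_lr`, `grad_mse`, `accumulated_grad`, and `train_toy` are hypothetical names, the toy data and `power=0.9` are assumed values, and the momentum update follows the common `velocity = momentum * velocity + grad` convention.

```python
def poly_lr(base_lr, iteration, max_iter, power, step=10):
    """Polynomially decayed learning rate, refreshed every `step` iterations."""
    # Round the iteration down so the rate stays constant between refreshes.
    effective_iter = (iteration // step) * step
    return base_lr * (1 - effective_iter / max_iter) ** power


def grad_mse(w, batch):
    """Gradient of mean((w * x - y) ** 2) over (x, y) pairs with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, batch_size, iter_size):
    """Average the gradients of `iter_size` small batches of `batch_size` samples."""
    total = 0.0
    for i in range(iter_size):
        small_batch = data[i * batch_size:(i + 1) * batch_size]
        # Scaling by 1 / iter_size makes the sum equal the full-batch gradient.
        total += grad_mse(w, small_batch) / iter_size
    return total


def train_toy(max_iter=200, base_lr=2.5e-4, momentum=0.9, power=0.9):
    """Fit y = w * x with momentum SGD, poly decay, and gradient accumulation."""
    data = [(float(x), 3.0 * x) for x in range(1, 11)]  # 10 samples, true w = 3
    w, velocity = 0.0, 0.0
    for iteration in range(max_iter):
        lr = poly_lr(base_lr, iteration, max_iter, power)
        # batch_size * iter_size = 10, mirroring the constraint stated above.
        grad = accumulated_grad(w, data, batch_size=5, iter_size=2)
        velocity = momentum * velocity + grad
        w -= lr * velocity  # one weight update per accumulated mini-batch
    return w
```

Because each small-batch gradient is scaled by `1 / iter_size`, the accumulated gradient matches the gradient of the full mini-batch of 10, which is why a smaller batch_size can lower memory use without changing the effective batch.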

Processed image vs. label examples

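
The preprocessing described in the default settings (random scaling by 0.5 to 1.5, padding, random cropping to a fixed size) can be sketched as follows. This is a simplified illustration, not the repository's data loader: it works on single-channel images stored as nested lists, uses nearest-neighbour resizing for both image and label, and assumes that padded label pixels receive the ignored index -1. All function names are hypothetical.

```python
import random


def resize_nearest(grid, scale):
    """Nearest-neighbour resize of a 2-D list by a scale factor."""
    h, w = len(grid), len(grid[0])
    new_h, new_w = max(1, round(h * scale)), max(1, round(w * scale))
    return [
        [grid[min(h - 1, int(y / scale))][min(w - 1, int(x / scale))]
         for x in range(new_w)]
        for y in range(new_h)
    ]


def pad_to(grid, size, fill):
    """Pad on the bottom and right so that both sides are at least `size`."""
    h, w = len(grid), len(grid[0])
    new_w = max(w, size)
    rows = [row + [fill] * (new_w - w) for row in grid]
    rows += [[fill] * new_w for _ in range(max(0, size - h))]
    return rows


def random_scale_crop(image, label, crop_size, rng, ignore_label=-1):
    """Apply the same random scale and crop to an image and its label map."""
    scale = rng.uniform(0.5, 1.5)
    # Zero-pad the image; pad the label with the ignored index (assumption).
    image = pad_to(resize_nearest(image, scale), crop_size, fill=0)
    label = pad_to(resize_nearest(label, scale), crop_size, fill=ignore_label)
    top = rng.randint(0, len(image) - crop_size)
    left = rng.randint(0, len(image[0]) - crop_size)

    def crop(grid):
        return [row[left:left + crop_size] for row in grid[top:top + crop_size]]

    return crop(image), crop(label)
```

Passing a seeded `random.Random` instance as `rng` makes the augmentation reproducible. The output is always `crop_size` by `crop_size`, whatever the input size or sampled scale.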

Evaluation

python eval.py --config config/cocostuff.yaml \
               --model-path <PATH TO MODEL>

You can enable CRF post-processing with the --crf option. See --help for more details.

Results

After 20k iterations with a mini-batch of 10, without CRF:

| | Pixel Accuracy | Mean Accuracy | Mean IoU | Frequency Weighted IoU |
|---|---|---|---|---|
| DeepLab v2 | 64.7% | 45.4% | 33.9% | 50.0% |
| Official report | 65.1% | 45.5% | 34.4% | 50.4% |
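
For reference, the four metrics in the table are commonly derived from a class confusion matrix. The sketch below shows one such derivation in plain Python. It is not taken from eval.py: `segmentation_scores` is a hypothetical name, and the choice to skip classes that never occur in the ground truth is an assumption that may differ from the evaluation script.

```python
def segmentation_scores(confusion):
    """Compute segmentation metrics from a square confusion matrix.

    confusion[i][j] counts pixels of ground-truth class i predicted as class j.
    Classes with no ground-truth pixels are skipped (assumption).
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    gt = [sum(confusion[i]) for i in range(n)]  # pixels per true class
    pred = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    tp = [confusion[i][i] for i in range(n)]  # correctly labelled pixels
    present = [i for i in range(n) if gt[i] > 0]

    iou = {i: tp[i] / (gt[i] + pred[i] - tp[i]) for i in present}
    return {
        "pixel_accuracy": sum(tp) / total,
        "mean_accuracy": sum(tp[i] / gt[i] for i in present) / len(present),
        "mean_iou": sum(iou.values()) / len(present),
        "frequency_weighted_iou": sum(gt[i] / total * iou[i] for i in present),
    }
```

For the two-class matrix `[[3, 1], [1, 5]]`, 8 of the 10 pixels lie on the diagonal, so the pixel accuracy is 0.8.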

Demo

From an image

python demo.py --config config/cocostuff.yaml \
               --model-path <PATH TO MODEL> \
               --image-path <PATH TO IMAGE>

From a web camera

python uvcdemo.py --config config/cocostuff.yaml \
                  --model-path <PATH TO MODEL>

Visualize the model on TensorBoard

python draw_model.py

References
