joeysu / pre-resnet-prototxt-for-caffe

This project forked from coldmooon/variants-resnet-prototxt-caffe

Create (Pre-)ResNet prototxt (including training and test) for Caffe

ResNet-Prototxt-for-Caffe

This script generates ResNet ("Deep Residual Learning for Image Recognition", https://arxiv.org/abs/1512.03385) and Pre-ResNet ("Identity Mappings in Deep Residual Networks", http://arxiv.org/abs/1603.05027) prototxt files for Caffe, targeting CIFAR-10/100 (60000 32x32 colour images in 10/100 classes, https://www.cs.toronto.edu/~kriz/cifar.html). Following the original papers, the parameter N must be given.

For ResNet, N =

  • 3 for 20-layer network
  • 5 for 32-layer network
  • 7 for 44-layer network
  • 9 for 56-layer network
  • 18 for 110-layer network

For Pre-ResNet, N =

  • 18 for 164-layer network
  • 111 for 1001-layer network

resnet_cifar.py is fully consistent with fb.resnet.torch/models/resnet.lua. preresnet_cifar.py is fully consistent with KaimingHe/resnet-1k-layers.

Usage:

python (pre)resnet_cifar.py training-data-path test-data-path mean-file-path N

where

  • training-data-path: the path of training data (LEVELDB or LMDB)
  • test-data-path: the path of test data (LEVELDB or LMDB)
  • mean-file-path: the path of mean file for training data
  • N: a parameter introduced by the original paper, giving the number of residual building blocks repeated for each feature map size (32, 16, 8). For example, N = 5 creates 5 residual building blocks for feature map size 32, 5 for feature map size 16, and 5 for feature map size 8. Each building block contains two weighted layers, so there are (5 + 5 + 5)*2 + 2 = 32 layers.
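The depth arithmetic above can be sketched in a few lines. The ResNet formula follows directly from the (5 + 5 + 5)*2 + 2 example. The Pre-ResNet formula is an assumption: it supposes three weighted layers per (bottleneck) block, which reproduces the 164- and 1001-layer figures listed earlier. The function names are hypothetical and are not part of the scripts in this repository.

```python
def resnet_depth(n: int) -> int:
    """Depth of a CIFAR ResNet: 3 feature map sizes x n blocks x 2 weighted layers, plus 2."""
    return 3 * n * 2 + 2


def preresnet_depth(n: int) -> int:
    """Depth of a CIFAR Pre-ResNet, ASSUMING 3 weighted layers per block (9n + 2)."""
    return 3 * n * 3 + 2


if __name__ == "__main__":
    # N values taken from the lists in this README.
    for n in (3, 5, 7, 9, 18):
        print(f"ResNet     N={n:<3} -> {resnet_depth(n)} layers")
    for n in (18, 111):
        print(f"Pre-ResNet N={n:<3} -> {preresnet_depth(n)} layers")
```

Note that N = 18 gives a 110-layer ResNet but a 164-layer Pre-ResNet, which is why the two lists use different depths for the same N.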

Examples:

# 32-layer ResNet prototxt file
python resnet_cifar.py ./training-data ./test-data ./mean.binaryproto 5

# 1001-layer Pre-ResNet prototxt file
python preresnet_cifar.py ./training-data ./test-data ./mean.binaryproto 111
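
To generate several depths in one go, the command line shown above can be assembled programmatically. This is only a sketch: `build_command` and `generate` are hypothetical helpers, the data and mean-file paths are the placeholder paths from the examples, and nothing here assumes where the scripts write their output.

```python
import subprocess


def build_command(script, train_path, test_path, mean_path, n):
    """Return the argument list for one invocation, in the order given under Usage."""
    if n < 1:
        raise ValueError("N must be a positive integer")
    return ["python", script, train_path, test_path, mean_path, str(n)]


def generate(script, train_path, test_path, mean_path, n):
    """Run the script once; raises CalledProcessError if it exits with an error."""
    subprocess.run(build_command(script, train_path, test_path, mean_path, n), check=True)


if __name__ == "__main__":
    # Print (rather than run) the commands for the ResNet depths listed in this README.
    for n in (3, 5, 7, 9, 18):
        cmd = build_command("resnet_cifar.py", "./training-data", "./test-data",
                            "./mean.binaryproto", n)
        print(" ".join(cmd))
```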

Validation error rate

model      my test error (%)   test error from Tab. 6 (%)
resnet20   8.5                 8.75
resnet32   7.57                7.51
resnet44   7.13                7.17
resnet56   7                   6.97

In these experiments, residual networks were trained on a preprocessed CIFAR-10 dataset (GCN and ZCA) for 200 epochs (80k iterations). The initial learning rate was 0.1 and was divided by 10 at 48k and 64k iterations, which differs from the original paper. The other settings (weight decay, momentum, batch size, and data augmentation) were the same as in the paper. I didn't use Nesterov momentum.
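
The step schedule described above can be written as a small function. This is a sketch of a generic multi-step decay, not the solver configuration actually used: the function name is hypothetical, and it assumes the drop takes effect at the milestone iteration itself.

```python
def multistep_lr(iteration, base_lr=0.1, milestones=(48000, 64000), gamma=0.1):
    """Learning rate at a given iteration: base_lr scaled by gamma once per milestone reached."""
    passed = sum(1 for m in milestones if iteration >= m)
    return base_lr * gamma ** passed


if __name__ == "__main__":
    # Sample the schedule across the 80k-iteration run described above.
    for it in (0, 47999, 48000, 63999, 64000, 79999):
        print(f"iter {it:>6}: lr = {multistep_lr(it):.4f}")
```

Passing `milestones=(32000, 48000)` gives the same kind of schedule with earlier drops.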

Nesterov momentum

I also tried Nesterov momentum on resnet44, with all other settings unchanged.

model      Nesterov momentum (%)   vanilla momentum (%)
resnet44   7.35                    7.13

Comparison of BN stats used in Test phase

In some repositories, BN stats is set to false in both the Train and Test phases. However, this harms accuracy.

model      BN stats False in Test (%)   BN stats True in Test (%)
resnet32   8.81                         8.11
resnet44   8.41                         7.74
resnet56   7.98                         7.36

All settings followed the original paper: training ran for 64k iterations, and the learning rate started at 0.1 and was divided by 10 at 32k and 48k iterations. Note that the final results with BN stats True in Test are slightly worse than those obtained when training for 200 epochs.
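
For readers unfamiliar with the setting being compared, the sketch below renders a Caffe BatchNorm layer definition for a given phase. It assumes that "BN stats" refers to the `use_global_stats` field of Caffe's `batch_norm_param`; the helper name and the layer and blob names are hypothetical, and the scripts in this repository may emit the layer differently.

```python
def batchnorm_layer(name, bottom, top, phase, use_global_stats):
    """Render a BatchNorm layer in prototxt text form for one phase ("TRAIN" or "TEST")."""
    if phase not in ("TRAIN", "TEST"):
        raise ValueError("phase must be 'TRAIN' or 'TEST'")
    flag = "true" if use_global_stats else "false"
    lines = [
        "layer {",
        f'  name: "{name}"',
        '  type: "BatchNorm"',
        f'  bottom: "{bottom}"',
        f'  top: "{top}"',
        # ASSUMPTION: "BN stats" in the tables above maps to this field.
        f"  batch_norm_param {{ use_global_stats: {flag} }}",
        f"  include {{ phase: {phase} }}",
        "}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Batch statistics during training, accumulated (global) statistics at test time.
    print(batchnorm_layer("bn1", "conv1", "conv1", "TRAIN", use_global_stats=False))
    print(batchnorm_layer("bn1", "conv1", "conv1", "TEST", use_global_stats=True))
```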

Contributors

coldmooon, h4ck3rm1k3
