CNN using swish

Swish was introduced in October 2017 as an alternative activation function to ReLU. It was found using a combination of exhaustive search and reinforcement learning. In the original paper [1], swish was shown to improve top-1 classification accuracy on ImageNet by 0.9% simply by replacing all ReLU activation functions with swish. Swish is also very easy to implement: a single line of code is enough in TensorFlow.

Example

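import tensorflow as tf
# X: input batch, W1 and B1: conv kernel and bias, beta1: trainable swish parameter
x1 = tf.nn.conv2d(X, W1, strides=[1, 1, 1, 1], padding='SAME') + B1
Y1 = x1 * tf.nn.sigmoid(beta1 * x1)  # swish activation; output is 28x28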
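
To make the one-liner concrete, here is a minimal pure-Python sketch of the same function, swish(x) = x * sigmoid(beta * x), evaluated on scalars. The helper names `sigmoid` and `swish` are illustrative and not part of this repository, and `beta` is a plain argument here rather than a trainable variable.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def swish(x, beta=1.0):
    """Swish activation: x * sigmoid(beta * x)."""
    return x * sigmoid(beta * x)


if __name__ == "__main__":
    # Compare beta = 1 (default), beta = 0 and a large beta on a few inputs.
    for x in (-2.0, 0.0, 2.0):
        print(x, swish(x), swish(x, beta=0.0), swish(x, beta=50.0))
```

With beta = 0 the function reduces to the linear map x / 2, and as beta grows it approaches ReLU, so a trainable beta lets each layer interpolate between the two.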

Results

During the initial phase of training, the loss remains, on average, the same. This suggests that swish suffers from poor initialisation during training, at least when the weights are initially drawn from a normal distribution with std_dev = 0.1.

We were unable to replicate the results reported in the Swish paper: for us, beta1 did not converge to a value near 1, possibly because we did not train our model long enough.
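
As background on how a trainable beta receives its updates, the sketch below computes the derivative of x * sigmoid(beta * x) with respect to beta and checks it against a finite difference. It is an illustration only: the function names are hypothetical, and it does not reproduce the training loop used for the results above.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def swish(x, beta):
    """Swish activation: x * sigmoid(beta * x)."""
    return x * sigmoid(beta * x)


def dswish_dbeta(x, beta):
    """Analytic derivative of swish with respect to beta."""
    s = sigmoid(beta * x)
    return x * x * s * (1.0 - s)


def numeric_dbeta(x, beta, h=1e-6):
    """Central finite difference, used only as a sanity check."""
    return (swish(x, beta + h) - swish(x, beta - h)) / (2.0 * h)


if __name__ == "__main__":
    for x in (-3.0, -0.5, 0.0, 0.7, 2.0):
        print(x, dswish_dbeta(x, 1.0), numeric_dbeta(x, 1.0))
```

The derivative is zero at x = 0 and shrinks when |beta * x| is large, so beta is mostly driven by pre-activations of moderate magnitude. Whether that contributed to the behaviour observed here is not something this sketch can establish.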

It seems that He initialisation doesn't really help with this problem.
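
For reference, He initialisation draws weights with standard deviation sqrt(2 / fan_in) instead of a fixed value such as 0.1. The sketch below compares the two for a hypothetical 5x5 convolution kernel with 1 input channel and 6 output channels; the kernel shape and the helper names are assumptions made for illustration, not values taken from this repository.

```python
import math
import random


def he_std(fan_in):
    """Standard deviation prescribed by He initialisation: sqrt(2 / fan_in)."""
    return math.sqrt(2.0 / fan_in)


def init_conv_kernel(height, width, in_channels, out_channels, std, seed=0):
    """Draw a flat list of kernel weights from a normal distribution N(0, std^2)."""
    rng = random.Random(seed)
    n = height * width * in_channels * out_channels
    return [rng.gauss(0.0, std) for _ in range(n)]


if __name__ == "__main__":
    # Hypothetical kernel shape: 5x5 window, 1 input channel, 6 output channels.
    fan_in = 5 * 5 * 1
    weights = init_conv_kernel(5, 5, 1, 6, he_std(fan_in))
    print("fixed std:", 0.1)
    print("He std:   ", he_std(fan_in))
    print("weights drawn:", len(weights))
```

He initialisation was derived for ReLU-type activations, so it is not guaranteed to be the best scale for swish.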

After changing from SGD to RMSprop, we immediately get better results.
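
One plausible reason for the difference is that RMSprop divides each step by a running root-mean-square of recent gradients, so it keeps moving even when the raw gradients are tiny. The sketch below contrasts the two update rules on a toy one-dimensional problem with very small gradients. The learning rate, decay, toy objective and function names are illustrative assumptions, not the settings used in these experiments.

```python
import math


def sgd_step(w, grad, lr=0.01):
    """Plain SGD: step proportional to the raw gradient."""
    return w - lr * grad


def rmsprop_step(w, grad, cache, lr=0.01, decay=0.9, eps=1e-8):
    """RMSprop: step scaled by a running RMS of past gradients."""
    cache = decay * cache + (1.0 - decay) * grad * grad
    w = w - lr * grad / (math.sqrt(cache) + eps)
    return w, cache


def minimise(steps=50, scale=1e-4):
    """Minimise f(w) = scale * w^2 from w = 1 with both update rules."""
    w_sgd, w_rms, cache = 1.0, 1.0, 0.0
    for _ in range(steps):
        w_sgd = sgd_step(w_sgd, 2.0 * scale * w_sgd)
        w_rms, cache = rmsprop_step(w_rms, 2.0 * scale * w_rms, cache)
    return w_sgd, w_rms


if __name__ == "__main__":
    w_sgd, w_rms = minimise()
    print("SGD    :", w_sgd)
    print("RMSprop:", w_rms)
```

On this toy problem SGD barely moves from its starting point while RMSprop makes steady progress towards the minimum at 0, which mirrors the flat initial loss curve described above.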

Reference

  1. Searching for Activation Functions https://arxiv.org/abs/1710.05941
