souravs17031999 / dog-breed-classifier-app

This repo contains a complete end-to-end algorithm for dog breed classification using deep learning.

Jupyter Notebook 46.65% Python 0.22% HTML 53.13%
deep-learning udacity-deep-learning neural-network face-recognition dog-breed-classifier convolutional-neural-networks histogram-of-oriented-gradients haar-cascade-classifier local-binary-patterns resnet-50

dog-breed-classifier-app's Introduction

Project Objective : Write an Algorithm for a Dog Identification App using Deep Learning

Dataset :

Click here for Dataset1 Click here for Dataset2.

Motivation :

Learning and understanding Convolutional Neural Networks.

Pipeline for Project :

  • The first step in this pipeline is a detection algorithm for both humans and dogs; the second step classifies the image by giving the exact name of the breed.
    p1.png
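
The two-stage pipeline can be sketched roughly as below. This is only a minimal illustration: `run_app` and the three callables it receives are hypothetical names, and the toy stand-ins at the bottom replace the real detectors and classifier so the snippet runs without any model files.

```python
from typing import Callable


def run_app(image_path: str,
            dog_detector: Callable[[str], bool],
            face_detector: Callable[[str], bool],
            breed_classifier: Callable[[str], str]) -> str:
    """Route an image through detection first, then breed classification."""
    if dog_detector(image_path):
        return f"Dog detected, predicted breed: {breed_classifier(image_path)}"
    if face_detector(image_path):
        # For a human, report the breed label the face resembles most.
        return f"Human detected, most resembling breed: {breed_classifier(image_path)}"
    return "Neither a dog nor a human face was detected"


# Toy stand-ins (assumptions for illustration only, not real detectors).
def is_dog(path: str) -> bool:
    return "dog" in path


def is_human(path: str) -> bool:
    return "human" in path


def predict(path: str) -> str:
    return "Beagle"


print(run_app("dog_01.jpg", is_dog, is_human, predict))
```

Checking for a dog before checking for a human face is a design choice of this sketch; the order could be swapped if the face detector turns out to be the more reliable of the two.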

Human Face detector :

We can try different pretrained detectors provided by OpenCV.
I have tried HOG, LBP, HAAR, etc.; any other deep-learning-based pretrained model would also work.
  • HAAR:
    This algorithm, also called the Viola-Jones algorithm, is based on HAAR-like wavelets. HAAR wavelets are a sequence of rescaled square-shaped functions, explained in detail here.

p3.png

HAAR like features for detection :

p2.png

A target window of a fixed size is moved over every possible location in the input image to calculate HAAR-like features. Because this is computationally very expensive, an alternative method using integral images was designed. The way it works is described briefly below :

Integral image calculation reduces the computations :

In the figure below, HAAR works by calculating the difference between the sum of the black shades and the sum of the white shades. Let's say it comes out to be something like :
             haar feature                                                               real images

p4.png
p5.jpg

The closer this difference is to "1", the more likely it is that a HAAR feature has been detected !
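
As a rough illustration of that difference, here is a minimal sketch of a two-rectangle (edge) HAAR-like feature. It assumes pixel values are scaled to [0, 1] with 1 being the darkest shade, and it compares region means rather than raw sums so the result stays between -1 and 1; the function name and the sample windows are made up for this example.

```python
def haar_edge_feature(window):
    """Two-rectangle HAAR-like edge feature on a 2D list of darkness values.

    Assumes values in [0, 1] where 1 is darkest. The left half of the
    window is the 'white' rectangle and the right half is the 'black' one.
    """
    half = len(window[0]) // 2
    white = [v for row in window for v in row[:half]]
    black = [v for row in window for v in row[half:]]
    return sum(black) / len(black) - sum(white) / len(white)


# Ideal feature: pure white on the left, pure black on the right.
ideal = [[0.0, 0.0, 1.0, 1.0]] * 4
# A more realistic patch with noisy shades.
real = [[0.1, 0.2, 0.8, 0.9],
        [0.2, 0.1, 0.7, 0.9]]

print(haar_edge_feature(ideal))  # 1.0
print(haar_edge_feature(real))
```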

Integral images :
p6.jpg
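
A minimal sketch of the integral image (summed-area table) idea in plain Python follows. The helper names are illustrative. Once the table is built, the sum over any rectangle needs only four lookups, no matter how large the rectangle is.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column.

    ii[y][x] holds the sum of img over all rows < y and columns < x.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, bottom, right):
    """Sum of rows top..bottom-1 and columns left..right-1 in four lookups."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 3, 3))  # 5 + 6 + 8 + 9 = 28
```

A HAAR-like feature is then just the difference of two or more such rectangle sums, which is why the table removes most of the per-window cost.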

  • HOG:
    Histogram of oriented gradients is calculated by taking the difference in pixel intensities for every block of pixels in a 64 * 64 window, similar to sliding a window over the entire image.
    This is based on the fact that certain regions of the face are slightly darker than others, so the gradient vectors in localized portions of the face have a characteristic orientation.

Like in this image, we can see the gradient magnitude and gradient direction:

p7.jpg

Now calculating for all pixel blocks:

p8.jpg

For more detailed explanation, click here.
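
To make the gradient magnitude and direction concrete, here is a simplified sketch in plain Python. It assumes centered differences, unsigned angles in [0, 180) and 9 orientation bins, which are common choices but not necessarily the ones used in the notebook. Full HOG implementations also interpolate between bins and normalize over blocks, which is left out here.

```python
import math


def gradient(img, y, x):
    """Centered-difference gradient at an interior pixel.

    Returns (magnitude, unsigned angle in degrees within [0, 180)).
    """
    gx = img[y][x + 1] - img[y][x - 1]
    gy = img[y + 1][x] - img[y - 1][x]
    magnitude = math.hypot(gx, gy)
    angle = math.degrees(math.atan2(gy, gx)) % 180.0
    return magnitude, angle


def cell_histogram(img, bins=9):
    """Magnitude-weighted orientation histogram over the interior pixels."""
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            magnitude, angle = gradient(img, y, x)
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


# Brightness increases from left to right, so every gradient is horizontal.
patch = [[0, 50, 100, 150]] * 4
print(gradient(patch, 1, 1))  # (100.0, 0.0)
print(cell_histogram(patch))
```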

  • LBP:
    Local binary patterns is an algorithm for feature detection based on a local representation of texture.
How is it calculated ? Let's see...
For every block (in grayscale), we select the center pixel and threshold its neighbours against it: a neighbour is marked 1 if the center value is greater than or equal to it, otherwise 0. We then construct a 1-D array of these bits by wrapping around the neighbours in either clockwise or anticlockwise direction; read as an 8-bit binary number, this gives a value between 0 and 255.
(Here I show it for one central pixel, "10", but it is done for every other pixel block)

p9.png

Then, a histogram with 256 bins is constructed from the final output LBP pattern image.

p10.png
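
The LBP steps above can be sketched in plain Python as follows. The thresholding convention follows the description in this README (1 when the center is greater than or equal to the neighbour). The clockwise order starting at the top-left pixel and the sample patch values are assumptions chosen for illustration.

```python
# Clockwise neighbour offsets (dy, dx), starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, y, x):
    """8-bit LBP code of the interior pixel at (y, x)."""
    center = img[y][x]
    code = 0
    for dy, dx in OFFSETS:
        bit = 1 if center >= img[y + dy][x + dx] else 0
        code = (code << 1) | bit  # append the bit to the binary pattern
    return code


def lbp_histogram(img):
    """256-bin histogram of the LBP codes of all interior pixels."""
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            hist[lbp_code(img, y, x)] += 1
    return hist


# Illustrative 3 x 3 patch with center value 10.
patch = [[5, 12, 9],
         [11, 10, 3],
         [10, 15, 7]]
print(lbp_code(patch, 1, 1))  # bits 10111010 -> 186
```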

Dog Face detector :

  • Here too, we can use detectors like the ones above, mainly deep-learning-based pretrained models such as VGG16, ResNet50, etc.

Some examples :

p11.JPG

CNN classification :

  • Now that we have recognized whether the image contains a dog face, a human face or neither,
    it's time to train our own neural network for classifying the breed of the dog if the image contains a dog (or the most resembling label for a human !)
    So, let's get started.....
We can do this using two different approaches :
* Constructing a CNN from scratch
* Using pretrained CNN models

CNN from scratch and its overview :

  (conv1): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))        
  (conv2): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))      
  (conv3): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))     
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)         
  (fc1): Linear(in_features=50176, out_features=500, bias=True)     
  (fc2): Linear(in_features=500, out_features=133, bias=True)      
  (dropout): Dropout(p=0.5, inplace=False)       
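
The value in_features=50176 for fc1 can be checked with a little shape arithmetic. The sketch below assumes a 224 x 224 RGB input and that the shared pooling layer is applied once after each of the three convolutions; the listing above does not show the forward pass, so both are assumptions.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Spatial output size of a Conv2d layer along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Spatial output size of a MaxPool2d layer along one dimension."""
    return (size - kernel) // stride + 1


size = 224      # assumed input height and width
channels = 3    # RGB input
for out_channels in (16, 32, 64):   # conv1, conv2, conv3
    size = pool_out(conv_out(size))  # conv keeps the size, pool halves it
    channels = out_channels

flat_features = channels * size * size
print(size, flat_features)  # 28 50176
```

With these assumptions the flattened feature vector has 64 * 28 * 28 = 50176 entries, which matches fc1.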

Pre trained ResNet50 model :

(The reasons for choosing this model are included in the notebook itself)

  • This architecture contains (conv1) as the first convolutional layer, with 3 input channels because of the RGB input tensor, then (bn1) as a batch normalization layer, followed by ReLU and max pooling. After that come 4 main layers named layer1, layer2, layer3 and layer4, each containing sub-layers of convolution followed by batch norm followed by ReLU, then an average pooling layer, and finally fc.
  • ReLU activation is used because it is a well-proven activation function for classification problems: it introduces non-linearity while being less prone to the vanishing gradient problem than saturating activations such as sigmoid or tanh !
  • Batch normalization helps make the network more stable and learn faster, which leads to faster convergence.
  • Max pooling downsamples the high-dimensional feature maps produced by the convolution operations, keeping only the strongest activations from each region.
  • Then I replaced the last layer of this architecture with a fully connected layer containing two linear sub-layers as follows : Linear(in_features=2048, out_features=512) Linear(in_features=512, out_features=133)
    with a ReLU activation between the linear layers.
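
A minimal PyTorch sketch of such a replacement head is shown below. Only the layer sizes come from the description above; the variable names and the random batch are illustrative. With a torchvision ResNet50 this module would typically be assigned to the model's `fc` attribute, which is not done here so that the snippet runs without downloading weights.

```python
import torch
from torch import nn

# Replacement classifier head: 2048 ResNet50 features -> 512 -> 133 breeds.
head = nn.Sequential(
    nn.Linear(in_features=2048, out_features=512),
    nn.ReLU(),
    nn.Linear(in_features=512, out_features=133),
)

# Random stand-in for a batch of 4 pooled feature vectors.
features = torch.randn(4, 2048)
logits = head(features)
print(logits.shape)  # torch.Size([4, 133])
```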

Optimizer and loss function :

* Used both CrossEntropyLoss() and NLLLoss()    
* Used SGD and Adam

p12.png
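
The two loss functions are closely related: in PyTorch, CrossEntropyLoss expects raw scores (logits), while NLLLoss expects log-probabilities, for example the output of a LogSoftmax layer. The plain-Python sketch below, with illustrative helper names and made-up scores, checks that both routes give the same value for a single sample.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax of a list of raw scores."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum for v in logits]


def nll_loss(log_probs, target):
    """Negative log-likelihood of the target class, given log-probabilities."""
    return -log_probs[target]


def cross_entropy(logits, target):
    """Cross-entropy computed directly from raw scores."""
    total = sum(math.exp(v) for v in logits)
    return -math.log(math.exp(logits[target]) / total)


logits = [2.0, 0.5, -1.0]  # made-up scores for 3 classes
target = 0
print(cross_entropy(logits, target))
print(nll_loss(log_softmax(logits), target))
```

So a model trained with NLLLoss needs a log-softmax as its final step, while a model trained with CrossEntropyLoss should output raw scores.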

Some graphics of data augmentation used :

  • Augmentation used :
transforms.RandomRotation(10),       
transforms.RandomResizedCrop(224),      
transforms.RandomHorizontalFlip()     

p13.JPG

Finally some examples/results :

p15.JPG p16.JPG
p17.JPG
p18.JPG

p14.JPG

Getting started :

  • For getting started locally on your own system, click here.

Navigating Project :

Links to references for more detailed learning :

⭐️ this Project if you liked it !
