Multi-Layer Perceptron

A multi-layer perceptron, i.e. a fully connected feed-forward deep neural network, is basically a bunch of linear regression models stuck together with non-linearities in between, connecting multiple layers in a directed graph, which means the signal passes through the nodes in one direction. Each layer has its own set of parameters and feeds into the layer above it, until we generate an output. We calculate the value of each layer by first applying a linear function to the activations of the layer below, and then applying an element-wise nonlinear activation function. Finally, from the last hidden layer, we generate the output.
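To make this concrete, here is a minimal sketch of a single hidden-layer computation; the sizes and values below are made up for illustration and are not the ones used in the model later:

# one MLP layer: an affine transform followed by an element-wise ReLU non-linearity
W = 0.1f0 * randn(Float32, 3, 4)   # weights mapping 4 inputs to 3 hidden units
b = zeros(Float32, 3, 1)           # one bias per hidden unit
x = randn(Float32, 4, 1)           # a single input column vector
h = max.(0, W * x .+ b)            # linear function, then ReLU activation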

Here we're going to use the Julia programming language and its Knet package to fit an MLP model that predicts the Forest Cover Type.

N.B. we'll be using Julia version 0.6.4
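If you don't already have the covtype.data.gz file locally, it can be downloaded first; the URL below is an assumption on my part and points to the UCI Machine Learning Repository copy of the Forest Cover Type dataset:

# fetch the Forest Cover Type data if it is not present locally (assumed URL)
isfile("covtype.data.gz") || download(
    "https://archive.ics.uci.edu/ml/machine-learning-databases/covtype/covtype.data.gz",
    "covtype.data.gz")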
# we first import the libraries to be used
using Knet, Plots, DataFrames, GZip

# then we read the gzip file and inspect its size
gz = GZip.open("covtype.data.gz");
data = readtable(gz, separator = ',', header = false); 
size(data)
# (581012, 55)

The data has 581012 rows and 55 columns. Let's check how many classes are in the predicted column.

unique(data[end]) , length(unique(data[end]))
# ([5, 2, 1, 7, 3, 6, 4], 7)

Seven classes, labeled 1 to 7. So we have over half a million inputs and 7 possible outputs.
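If you also want to see how the instances are distributed over those seven classes, a quick count is enough; this is a small sketch assuming the StatsBase package is installed (it isn't used anywhere else in this write-up):

# count how many instances belong to each cover type (assumes StatsBase is available)
using StatsBase
countmap(data[end])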

After inspecting the data with describe(data), we see that the first 10 columns need to be scaled.

# encode the features as Float32, min-max scaling the first 10 (continuous) columns
x = DataFrame();
for i in 1:size(data, 2)-1
    if i <= 10
        # scale continuous columns to the [0, 1] range
        x[i] = map(Float32, (data[:, i] - minimum(data[:, i])) / (maximum(data[:, i]) - minimum(data[:, i])))
    else
        x[i] = map(Float32, data[i])
    end
end
x[:y] = map(UInt8, data[end]); # convert the output column to UInt8 class labels

We then split the data randomly into train and test sets, each with its own x features and y outcomes. Then we build minibatches of the sets using the minibatch function, which rearranges the data into chunks of 100 instances each.

# split data randomly into train and test
splits = round(Int, 0.1 * size(x, 1));
shuffled = randperm(size(x, 1));
xtrain, ytrain = [Array(x[shuffled[splits + 1:end], 1:end-1])', Array(x[shuffled[splits + 1:end], end])];
xtest, ytest = [Array(x[shuffled[1:splits], 1:end-1])', Array(x[shuffled[1:splits], end])];

btest = minibatch(xtest, ytest, 100); # [(x1,y1), (x2,y2),...] where xi,yi are minibatches of 100
btrain = minibatch(xtrain, ytrain, 100); # xtype = KnetArray{Float32} to use gpu
length(btrain), length(btest)
# (5229, 581)
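As the comment above hints, passing xtype = KnetArray{Float32} to minibatch stores the batches on the GPU; the sketch below shows what that would look like, assuming a CUDA-capable GPU and a GPU-enabled Knet build (the weights w defined later would also have to be converted to KnetArray{Float32} for this to work):

# GPU variant: keep the minibatches as KnetArray{Float32} so Knet runs on the GPU
btrain_gpu = minibatch(xtrain, ytrain, 100; xtype = KnetArray{Float32});
btest_gpu  = minibatch(xtest, ytest, 100; xtype = KnetArray{Float32});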

We then define the predict, loss, and train functions to be used by the model ...

function predict(w, x)
    # w holds [W1, b1, W2, b2, ...]; apply each affine layer in turn
    for i in 1:2:length(w)
        x = w[i] * x .+ w[i+1]
        if i < length(w)-1
            x = max.(0, x)       # ReLU on every layer except the output
        end
    end
    return x
end;

# negative log likelihood loss computed by Knet's nll on the raw scores
loss(w, x, yreal) = nll(predict(w, x), yreal)
lossgradient = grad(loss) # Knet's grad returns a function computing the gradient w.r.t. w

function train(model, data, o)
    for (x, y) in data
        grads = lossgradient(model, x, y)
        update!(model, grads, o) # one optimizer step per minibatch
    end
end;

Now that the model is defined, we need to train it. First we randomly initialize the weights and the optimizer. The model has 3 hidden layers of 96, 64, and 32 units respectively.

# initial weights 
# (3 layers, 54 inputs => 96 units => 64 units => 32 units => 7 outputs)
w = map(Array{Float32},
        Any[ 0.1f0*randn(96, size(xtrain, 1)), zeros(96, 1),
             0.1f0*randn(64, 96), zeros(64, 1),
             0.1f0*randn(32, 64), zeros(32, 1),
             0.1f0*randn(7, 32),  zeros(7, 1) ]);

# model optimizer
o = optimizers(w, Adam); # o =  optimizers(w, Sgd;  lr=0.01);
# initialize loss and error lists to be used for plotting
trnloss = [];
tstloss = [];
trnerror = [];
tsterror = [];

Finally we run the model, printing the results.

This can take a minute or so because it does a lot of work in each epoch: it calculates accuracy, loss, and error for both the train and test sets, which together contain more than half a million records.
println((:epoch, 0, :train_accuracy, accuracy(w, btrain, predict), :test_accuracy, accuracy(w, btest, predict)))
# (:epoch, 0, :train_accuracy, 0.04286861732644865, :test_accuracy, 0.04225473321858864)

for epoch in 1:10
    train(w, btrain, o)
    append!(trnloss, nll(w, btrain, predict))
    append!(tstloss, nll(w, btest, predict))
    append!(trnerror, 1-accuracy(w, btrain, predict))
    append!(tsterror, 1-accuracy(w, btest, predict))
    println((:epoch, epoch, :train_accuracy, 1-trnerror[epoch], :test_accuracy, 1-tsterror[epoch]))
    # to calculate only accuracy 	
    # println((:epoch, epoch, :train_accuracy, accuracy(w, btrain, predict), :test_accuracy, accuracy(w, btest, predict)))
end
# (:epoch, 1, :train_accuracy, 0.7492350353796137, :test_accuracy, 0.7504302925989673)
# ...
# (:epoch, 5, :train_accuracy, 0.8048154522853318, :test_accuracy, 0.8048020654044751)
# ...
# (:epoch, 10, :train_accuracy, 0.845922738573341, :test_accuracy, 0.8450602409638554)

We can still improve the model further, but we already have nice results. Let's check the loss and error on the train and test datasets.

# plot train loss and test loss
plot([trnloss tstloss], ylim=(0.2, 0.7),
    labels = [:training_loss :test_loss], xlabel = "Epochs", ylabel = "Loss")

# plot train error and test error   
plot([trnerror tsterror], ylim = (0.0, 0.4),
    labels = [:training_error :test_error], xlabel = "Epochs", ylabel = "Error")
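If you want to keep a figure rather than only display it, Plots' savefig writes the most recent plot to disk; the file name below is just an example:

# save the current plot to a PNG file
savefig("train_test_error.png")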

The model is pretty fine. We now have an idea of how to build deep artificial neural networks using Julia.

Furthermore, we can get the predicted class labels, e.g. for plotting or further analysis:
yhat = predict(w, xtrain);                       # raw scores, 7 x N
maxval, maxind = findmax(yhat, 1);               # highest score per column (per instance)
yhat = map(x -> ind2sub(yhat, x)[1], maxind);    # row index of the max = predicted class
correct = (map(Float32, yhat[1, :]) .== ytrain); # compare predictions with true labels
mean(correct)
# 0.8359242777451612
yhat
# 2  2  2  2  2  1  2  2  4  6  2  2 ...
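The same pattern works for the held-out test set, e.g. to double-check the final test accuracy reported above; this is just a sketch reusing the functions and variables already defined:

# predicted classes and accuracy on the test set, using the same argmax trick
yhat_test = predict(w, xtest);
maxval, maxind = findmax(yhat_test, 1);
yhat_test = map(x -> ind2sub(yhat_test, x)[1], maxind);
mean(map(Float32, yhat_test[1, :]) .== ytest)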
