
stnamjef / simplenn


A simple neural network library in C++

License: MIT License

C++ 95.67% C 4.33%
deep-learning deep-learning-tutorial convolutional-neural-networks batch-normalization lenet-5 mnist-classification

simplenn's Introduction

SimpleNN

  • SimpleNN is a C++ implementation of convolutional neural networks (CNNs).
  • Provides GEMM (GEneral Matrix Multiplication), im2col, and col2im operations.
  • Reasonably fast, without GPU.
    • It takes about 13 seconds per epoch (MNIST 60,000 images).
    • Test env = model: lenet5; batch size: 32; CPU: i7-10750H.
  • [Update: 2021-06-24]
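To illustrate how im2col and GEMM combine to compute a convolution, here is a minimal standalone sketch. It does not use SimpleNN's own im2col.h or Eigen: the function names `im2col_single` and `gemm_naive`, the row-major layout, and the single-channel, stride-1, no-padding setup are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not SimpleNN's API): unfold every kh x kw patch of a
// single-channel, row-major image into one column of the output matrix.
// Output shape: (kh*kw) rows x (out_h*out_w) columns, row-major.
// Assumes stride 1 and no padding.
std::vector<float> im2col_single(const std::vector<float>& img,
                                 int h, int w, int kh, int kw)
{
    const int out_h = h - kh + 1;
    const int out_w = w - kw + 1;
    const int n_cols = out_h * out_w;
    std::vector<float> col(static_cast<std::size_t>(kh) * kw * n_cols);
    for (int ki = 0; ki < kh; ++ki)
        for (int kj = 0; kj < kw; ++kj)
            for (int oi = 0; oi < out_h; ++oi)
                for (int oj = 0; oj < out_w; ++oj) {
                    const int row = ki * kw + kj;   // position inside the patch
                    const int c = oi * out_w + oj;  // which output pixel
                    col[static_cast<std::size_t>(row) * n_cols + c] =
                        img[static_cast<std::size_t>(oi + ki) * w + (oj + kj)];
                }
    return col;
}

// Naive GEMM: C (m x n) = A (m x k) * B (k x n), all row-major.
std::vector<float> gemm_naive(const std::vector<float>& a,
                              const std::vector<float>& b,
                              int m, int k, int n)
{
    std::vector<float> c(static_cast<std::size_t>(m) * n, 0.f);
    for (int i = 0; i < m; ++i)
        for (int p = 0; p < k; ++p)
            for (int j = 0; j < n; ++j)
                c[i * n + j] += a[i * k + p] * b[p * n + j];
    return c;
}
```

With the kernel flattened into a 1 x (kh*kw) row, `gemm_naive(kernel, col, 1, kh*kw, out_h*out_w)` yields the convolution output as one row. Turning the convolution into a single matrix product is what lets an optimized GEMM routine do most of the work.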

1. Requirements

  • Eigen 3.3.9
  • g++ 9.0 or higher

2. Supported networks

layer types

  • fully connected
  • convolutional
  • average pooling
  • max pooling
  • batch normalization

activation functions

  • sigmoid
  • tanh
  • relu
  • softmax

loss functions

  • mean squared error
  • cross-entropy
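As a rough illustration of the two loss functions listed above, the sketch below computes them for a single sample. The function names, the one-hot target encoding, and the choice to average the squared error over the classes are assumptions for this example; SimpleNN's loss_layer.h may use different scaling or conventions.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable softmax: subtract the largest logit before exponentiating.
std::vector<float> softmax(const std::vector<float>& logits)
{
    const float max_logit = *std::max_element(logits.begin(), logits.end());
    std::vector<float> probs(logits.size());
    float sum = 0.f;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp(logits[i] - max_logit);
        sum += probs[i];
    }
    for (float& p : probs) p /= sum;
    return probs;
}

// Cross-entropy for one sample: -log of the probability given to the true class.
float cross_entropy(const std::vector<float>& probs, int label)
{
    const float eps = 1e-12f;  // guards against log(0)
    return -std::log(probs[label] + eps);
}

// Mean squared error against a one-hot target, averaged over the classes
// (assumed convention; some implementations use a factor of 1/2 instead).
float mean_squared_error(const std::vector<float>& probs, int label)
{
    float sum = 0.f;
    for (std::size_t i = 0; i < probs.size(); ++i) {
        const float target = (static_cast<int>(i) == label) ? 1.f : 0.f;
        const float diff = probs[i] - target;
        sum += diff * diff;
    }
    return sum / static_cast<float>(probs.size());
}
```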

weight decay

  • L2 regularization

optimization algorithms

  • stochastic gradient descent

3. Usage

3.1. Data preparation

  • Download the MNIST dataset files and extract them to a directory named "dataset".
  • The directory structure should look as follows. Please do not change the file names.
SimpleNN
    ├── dataset
    │   ├── t10k-images.idx3-ubyte		# testing images
    │   ├── t10k-labels.idx1-ubyte		# testing labels
    │   ├── train-images.idx3-ubyte		# training images
    │   └── train-labels.idx1-ubyte		# training labels
    ├── headers
    │   ├── activation_layer.h
    │   ├── average_pooling_layer.h
    │   ├── batch_normalization_1d_layer.h
    │   ├── batch_normalization_2d_layer.h
    │   ├── col2im.h
    │   ├── common.h
    │   ├── config.h
    │   ├── convolutional_layer.h
    │   ├── data_loader.h
    │   ├── file_manage.h
    │   ├── flatten_layer.h
    │   ├── fully_connected_layer.h
    │   ├── im2col.h
    │   ├── layer.h
    │   ├── loss_layer.h
    │   ├── max_pooling_layer.h
    │   ├── optimizers.h
    │   └── simple_nn.h
    └── main.cpp
  • If the dataset directory is different, --data_dir must be specified.

3.2. Compile and build

  • Pull the docker image.
# host shell
docker pull stnamjef/eigen_3.3.9:1.0
  • Run the docker image
# pwd -> the project directory (SimpleNN)
docker run -it -v $(pwd):/usr/build stnamjef/eigen_3.3.9:1.0
  • Compile
# container shell at /usr/build
g++ main.cpp --std=c++17 -I ../include -O2 -o simplenn

3.3. Train predefined models

  • SimpleNN provides two predefined models: lenet5 and linear.
  • Ex 1) model: lenet5, pool: max, hidden layer activation: relu, loss: cross entropy, weight initialization: lecun uniform
# the command below is the same as ./simplenn (the default setting)
./simplenn --mode=train --model=lenet5 --pool=max --activ=relu --loss=cross_entropy --init=lecun_uniform
  • Ex 2) The model above with batch normalization adopted
# the command below is the same as ./simplenn --use_batchnorm=1
./simplenn --mode=train --model=lenet5 --pool=max --activ=relu --loss=cross_entropy --init=lecun_uniform --use_batchnorm=1
  • Ex 3) model: lenet5, pool: average, hidden layer activation: tanh, loss: mean squared error, weight initialization: xavier uniform
./simplenn --mode=train --model=lenet5 --pool=avg --activ=tanh --loss=mse --init=xavier_uniform
  • Ex 4) model: linear, hidden layer activation: tanh, loss: cross entropy, weight initialization: xavier uniform
./simplenn --mode=train --model=linear --activ=tanh --loss=cross_entropy --init=xavier_uniform
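The `--init` values used in these examples name standard initialization schemes. As a sketch of what the uniform variants usually mean, the functions below compute the sampling bound under the commonly cited definitions (LeCun: sqrt(3 / fan_in); Xavier: sqrt(6 / (fan_in + fan_out)); Kaiming: sqrt(6 / fan_in)). These helpers are hypothetical and SimpleNN's exact formulas are not confirmed here.

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Bounds of the uniform range [-limit, limit] under the common definitions.
float lecun_uniform_limit(int fan_in)
{
    return std::sqrt(3.f / static_cast<float>(fan_in));
}

float xavier_uniform_limit(int fan_in, int fan_out)
{
    return std::sqrt(6.f / static_cast<float>(fan_in + fan_out));
}

float kaiming_uniform_limit(int fan_in)
{
    return std::sqrt(6.f / static_cast<float>(fan_in));
}

// Draw n weights uniformly from [-limit, limit] with a fixed seed.
std::vector<float> sample_uniform(std::size_t n, float limit, unsigned seed)
{
    std::mt19937 gen(seed);
    std::uniform_real_distribution<float> dist(-limit, limit);
    std::vector<float> weights(n);
    for (float& w : weights) w = dist(gen);
    return weights;
}
```

For a fully connected layer with 784 inputs and 500 outputs, `fan_in` would be 784 and `fan_out` 500 under these definitions.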

3.4. Test pretrained models

  • SimpleNN provides one set of pretrained weights: lenet5.
  • Ex) test default model (error rate: 1.07%)
# if the pretrained weights are not in ./model_zoo, --save_dir should be changed
./simplenn --mode=test --save_dir=./model_zoo --pretrained=lenet5.pth

4. Build custom models

  • If you want to build your own model, write it in main.cpp and follow the same process as in 3.1. CLI options are not available for custom models, so we strongly recommend setting the parameters (e.g. batch size, learning rate, decay) manually before compiling.
  • Ex 1) Train a simple three-layer DNN model. Note that this model is already defined in SimpleNN and named "linear".
#include "headers/simple_nn.h"
using namespace std;
using namespace simple_nn;
using namespace Eigen;

int main()
{
	int n_train = 60000, n_test = 10000;
	int batch = 32, channels = 1, height = 28, width = 28, n_label = 10;

	MatXf train_X = read_mnist("./dataset", "train-images.idx3-ubyte", n_train);
	VecXi train_Y = read_mnist_label("./dataset", "train-labels.idx1-ubyte", n_train);
	MatXf test_X = read_mnist("./dataset", "t10k-images.idx3-ubyte", n_test);
	VecXi test_Y = read_mnist_label("./dataset", "t10k-labels.idx1-ubyte", n_test);

	DataLoader train_loader(train_X, train_Y, batch, channels, height, width, true);
	DataLoader test_loader(test_X, test_Y, batch, channels, height, width, false);

	SimpleNN model;

	model.add(new Linear(784, 500, "lecun_uniform"));
	//model.add(new BatchNorm1d);
	model.add(new ReLU);
	model.add(new Linear(500, 150, "lecun_uniform"));
	//model.add(new BatchNorm1d);
	model.add(new ReLU);
	model.add(new Linear(150, 10, "lecun_uniform"));
	//model.add(new BatchNorm1d);
	model.add(new Softmax);

	int epochs = 30;
	float lr = 0.01f, decay = 0.f;

	model.compile({ batch, channels, height, width }, new SGD(lr, decay), new CrossEntropyLoss);
	model.fit(train_loader, epochs, test_loader);
	model.save("./model_zoo", "linear");

	return 0;
}
  • Ex 2) Test the above model.
#include "headers/simple_nn.h"
using namespace std;
using namespace simple_nn;
using namespace Eigen;

int main()
{
	int n_train = 60000, n_test = 10000;
	int batch = 32, channels = 1, height = 28, width = 28, n_label = 10;

	MatXf test_X = read_mnist("./dataset", "t10k-images.idx3-ubyte", n_test);
	VecXi test_Y = read_mnist_label("./dataset", "t10k-labels.idx1-ubyte", n_test);

	DataLoader test_loader(test_X, test_Y, batch, channels, height, width, false);

	SimpleNN model;

	model.add(new Linear(784, 500, "lecun_uniform"));
	//model.add(new BatchNorm1d);
	model.add(new ReLU);
	model.add(new Linear(500, 150, "lecun_uniform"));
	//model.add(new BatchNorm1d);
	model.add(new ReLU);
	model.add(new Linear(150, 10, "lecun_uniform"));
	//model.add(new BatchNorm1d);
	model.add(new Softmax);

	model.compile({ batch, channels, height, width });
	model.load("./model_zoo", "linear.pth");
	model.evaluate(test_loader);

	return 0;
}

5. CLI options

| Command | Data type | Description |
|---|---|---|
| --mode | string | Program mode (options: train, test; default: train) |
| --model | string | Model name (options: lenet5, linear; default: lenet5) |
| --data_dir | string | Dataset directory (default: ./dataset) |
| --save_dir | string | Saving directory (default: ./model_zoo) |
| --pretrained | string | Pretrained file name (default: None) |
| --pool | string | Pooling method (options: max, avg; default: max) |
| --activ | string | Activation function for hidden layer (options: tanh, relu; default: relu) |
| --init | string | Weight initialization (options: uniform, normal, lecun_uniform, lecun_normal, xavier_uniform, xavier_normal, kaiming_uniform, kaiming_normal; default: lecun_uniform) |
| --loss | string | Loss function for training (options: cross_entropy, mse; default: cross_entropy) |
| --batch | int | Batch size (default: 32) |
| --epoch | int | Total epochs (default: 30) |
| --lr | float | Learning rate (default: 0.01) |
| --decay | float | L2 regularization (default: 0) |
| --use_batchnorm | bool | Use batch normalization (options: 0, 1; default: 0) |
| --shuffle_train | bool | Shuffle training dataset (options: 0, 1; default: 1) |
| --shuffle_test | bool | Shuffle testing dataset (options: 0, 1; default: 0) |
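All of the options above use the `--key=value` form. A minimal sketch of how such arguments can be split into key/value pairs with fallback defaults is shown below. The names `parse_options` and `get_or` are hypothetical and this is not SimpleNN's actual parser (config.h is not shown here).

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical parser: split each "--key=value" argument into a map entry.
// Arguments that do not match this shape are ignored.
std::map<std::string, std::string>
parse_options(const std::vector<std::string>& args)
{
    std::map<std::string, std::string> opts;
    for (const std::string& arg : args) {
        if (arg.rfind("--", 0) != 0) continue;  // must start with "--"
        const std::size_t eq = arg.find('=');
        if (eq == std::string::npos) continue;  // must contain '='
        opts[arg.substr(2, eq - 2)] = arg.substr(eq + 1);
    }
    return opts;
}

// Return the value stored for key, or fallback if the option was not given.
std::string get_or(const std::map<std::string, std::string>& opts,
                   const std::string& key, const std::string& fallback)
{
    const auto it = opts.find(key);
    return it == opts.end() ? fallback : it->second;
}
```

In a `main(int argc, char** argv)`, the argument vector could be built as `std::vector<std::string>(argv + 1, argv + argc)`, and numeric options converted with `std::stoi` or `std::stof`.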

simplenn's Issues

ADAM CODE C++ EXAMPLE

Hello,
Thanks for your code; it helped me understand how a CNN operates and how to train one in C++. I want to use Adam instead of SGD, so please help me write Adam code in C++.
This is my code:
void Conv2d::update_weight(float lr, float decay, int t)
{
	// Update W and b with batch and decay
	float beta1 = 0.9f;
	float beta2 = 0.999f;
	float epsilon = 1e-8;
	mkernel = beta1 * mkernel.array() + (1 - beta1) * (dkernel / batch).array();
	vkernel = beta2 * vkernel.array() + (1 - beta2) * (dkernel / batch).array().square();
	mbias = beta1 * mbias.array() + (1 - beta1) * (dbias / batch).array();
	vbias = beta2 * vbias.array() + (1 - beta2) * (dbias / batch).array().square();

	// Compute bias-corrected first and second moments
	m_hat_kernel = mkernel / (1 - pow(beta1, t));
	v_hat_kernel = vkernel / (1 - pow(beta2, t));
	m_hat_bias = mbias / (1 - pow(beta1, t));
	v_hat_bias = vbias / (1 - pow(beta2, t));

	// Update kernel and bias
	/*kernel = kernel.array() - lr * mkernel.array() / (vkernel.array().sqrt() + epsilon);
	bias = bias.array() - lr * mbias.array() / (vbias.array().sqrt() + epsilon);*/
	kernel = kernel.array() - lr * m_hat_kernel.array() / (v_hat_kernel.array().sqrt() + epsilon);
	bias = bias.array() - lr * m_hat_bias.array() / (v_hat_bias.array().sqrt() + epsilon);
}

where t is the epoch loop counter. I trained your code with my Adam code above and your loss function, and the resulting loss is inf.
Thank you so much
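One possible cause, offered as an assumption because the surrounding training loop is not shown: if t starts at 0, then 1 - pow(beta1, 0) equals 0 and the bias correction divides by zero, which produces inf or NaN. Adam's step counter conventionally starts at 1 and increases once per parameter update rather than once per epoch. Another thing worth checking is that the moment buffers (mkernel, vkernel, mbias, vbias) are initialized to zero. The sketch below shows the update on plain `std::vector<float>` instead of Eigen types; `AdamState` and `adam_step` are hypothetical names, not part of SimpleNN.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-parameter Adam state.
struct AdamState {
    std::vector<float> m;  // first-moment (mean) estimate
    std::vector<float> v;  // second-moment (uncentered variance) estimate
    int t = 0;             // number of updates applied so far
};

// One Adam update. grad is assumed to be already averaged over the batch.
void adam_step(std::vector<float>& w, const std::vector<float>& grad,
               AdamState& s, float lr,
               float beta1 = 0.9f, float beta2 = 0.999f, float eps = 1e-8f)
{
    if (s.m.empty()) {  // lazily zero-initialize the moment buffers
        s.m.assign(w.size(), 0.f);
        s.v.assign(w.size(), 0.f);
    }
    ++s.t;  // the first call uses t = 1, so 1 - beta^t is never zero
    const float c1 = 1.f - std::pow(beta1, static_cast<float>(s.t));
    const float c2 = 1.f - std::pow(beta2, static_cast<float>(s.t));
    for (std::size_t i = 0; i < w.size(); ++i) {
        s.m[i] = beta1 * s.m[i] + (1.f - beta1) * grad[i];
        s.v[i] = beta2 * s.v[i] + (1.f - beta2) * grad[i] * grad[i];
        const float m_hat = s.m[i] / c1;  // bias-corrected first moment
        const float v_hat = s.v[i] / c2;  // bias-corrected second moment
        w[i] -= lr * m_hat / (std::sqrt(v_hat) + eps);
    }
}
```

On the first step the bias-corrected moments reduce to the gradient and its square, so each weight moves by roughly lr in the direction opposite to the sign of its gradient. Adam is also typically run with a smaller learning rate than SGD (0.001 is a common starting point), which may matter here if the SGD default of 0.01 is reused.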

MNIST data

Just to mention that the MNIST data - in fact the whole website directory - is behind some sort of login prompt now.
