
nip-convnet's Introduction

nip-convnet

execution of current version

Dependencies:

  • tensorflow (tested with 1.1.0)
  • python 2.7 (tested with 2.7.12)
  • matplotlib (tested with 1.5.1)
  • pandas (tested with 0.20.2)
  • Pillow (tested with 4.1.1)
  • scipy (tested with 0.19.0)
  • scikit-learn (tested with 0.18.1)

To train and test a simple single-layer autoencoder on the MNIST dataset, simply run 'python train_and_test_simple_mnist_autoencoder.py'.

project description

We want to train a neural network to classify images. Before we do that, an autoencoder is trained so that the network retains information about its input. The weights obtained from training the autoencoder are then used to initialize a neural network for image classification. It has been shown that this pre-training allows for higher generalization performance than starting from a random weight initialization. This project uses a convolutional architecture for the autoencoder, which is well suited to visual data, to obtain the improved weight initialization. Initially we will reproduce the experiment of the following paper:

Masci, J., Meier, U., Cireşan, D., & Schmidhuber, J. (2011). Stacked convolutional auto-encoders for hierarchical feature extraction. Artificial Neural Networks and Machine Learning–ICANN 2011, 52-59.

Other Relevant Papers:

  • Bengio et al. Representation Learning: A Review and New Perspectives

  • Bengio, Y., Lamblin, P., Popovici, D., & Larochelle, H. (2007). Greedy layer-wise training of deep networks. Advances in neural information processing systems, 19, 153.

  • Makhzani, A., & Frey, B. (2014, September). A winner-take-all method for training sparse convolutional autoencoders. In NIPS Deep Learning Workshop.

  • D. Hamester, P. Barros and S. Wermter, Face expression recognition with a 2-channel Convolutional Neural Network, 2015 International Joint Conference on Neural Networks (IJCNN), Killarney, 2015, pp. 1-8.

Datasets:

Tutorials:

nip-convnet's People

Contributors

bitraten, gangchill, maxkohlbrenner


nip-convnet's Issues

Find a Good Representation of Learned Convolutional Filters

Since the network learns many kernels that could be visualized, we need a good way to verify whether it learned something meaningful. In a convolutional layer that maps 32 feature maps to 64 feature maps with 'same' padding and strides [1,1,1,1], we get 32 * 64 kernels that we could potentially visualize.

  • Visualize a random subset?
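One cheap way to do that, sketched below with NumPy (the function name sample_kernels is hypothetical, not part of the project): flatten the input/output channel axes of a conv weight tensor and draw a random subset of 2D kernels for plotting.

```python
import numpy as np

def sample_kernels(weights, n_samples=16, seed=0):
    """Pick a random subset of 2D kernels from a conv weight tensor.

    weights has shape (kh, kw, in_channels, out_channels), the layout
    TensorFlow uses for conv2d filters. Returns (n_samples, kh, kw).
    """
    kh, kw, c_in, c_out = weights.shape
    flat = weights.reshape(kh, kw, c_in * c_out)
    rng = np.random.RandomState(seed)
    idx = rng.choice(c_in * c_out, size=n_samples, replace=False)
    return np.transpose(flat[:, :, idx], (2, 0, 1))

# example: a 7x7 conv layer mapping 32 feature maps to 64 (32 * 64 kernels)
kernels = sample_kernels(np.random.randn(7, 7, 32, 64), n_samples=16)
print(kernels.shape)  # (16, 7, 7)
```

The returned stack can then be shown in a matplotlib subplot grid; re-sampling with a different seed gives a fresh subset each time.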

Two-Layer Convolutional Autoencoder does not learn anything useful

When training a deeper convolutional autoencoder with max-pooling, the reconstructions as visualized by the visualize_ae_representation function are all zero / black. This might be a problem of the visualization itself, because the error still changes during training.

Example setup:

filter_dims = [(7,7), (3,3)]
hidden_channels = [4,6]
use_max_pooling = True
activation_function = 'relu'

Transfer Encoding Weights from Autoencoder to CNN

Implement a function in the convolutional autoencoder that stores the learned encoding weights, and a function in the simple CNN that loads these encoding weights and uses them to initialize its own convolutional layers.
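A framework-agnostic sketch of such a pair of functions, assuming the weights are exchanged as NumPy arrays via an .npz file (the function names and file layout are illustrative, not the project's actual API):

```python
import os
import tempfile
import numpy as np

def store_encoding_weights(weights, path):
    """Store a list of conv weight arrays, one per encoding layer.
    np.savez names them arr_0, arr_1, ... in layer order."""
    np.savez(path, *weights)

def load_encoding_weights(path):
    """Load the stored weight arrays in layer order, for use as the
    initial values of the CNN's convolutional layers."""
    with np.load(path) as data:
        return [data['arr_%d' % i] for i in range(len(data.files))]

# round-trip example with two encoding layers (7x7 and 3x3 filters)
enc = [np.random.randn(7, 7, 1, 4), np.random.randn(3, 3, 4, 6)]
path = os.path.join(tempfile.gettempdir(), 'encoding_weights.npz')
store_encoding_weights(enc, path)
restored = load_encoding_weights(path)
assert all(np.array_equal(a, b) for a, b in zip(enc, restored))
```

In the TensorFlow code itself, the restored arrays would be passed as initial values when the CNN's conv variables are created.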

Simple CNN accuracy too low

Possible factors:

  • the bias initialization mu=0.1 seems too high; going lower (e.g. 0.001) is recommended
  • dropout is not used during training: the keep probability is currently fixed at 1.0, which seems intended for testing, but it is also applied during training instead of a value like 0.5
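The intended train/test split of the keep probability can be illustrated with a small inverted-dropout sketch in NumPy (the project would feed the value through TensorFlow at runtime; this only shows the concept):

```python
import numpy as np

def dropout(activations, keep_prob, rng):
    """Inverted dropout: zero units with probability (1 - keep_prob)
    and rescale the survivors, so no change is needed at test time."""
    if keep_prob >= 1.0:          # test-time path: identity
        return activations
    mask = rng.uniform(size=activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.RandomState(0)
h = np.ones((4, 8))
train_out = dropout(h, keep_prob=0.5, rng=rng)  # roughly half the units zeroed
test_out = dropout(h, keep_prob=1.0, rng=rng)   # unchanged
```

The point of the issue is that training should use a value like 0.5 while only evaluation uses 1.0.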

Non-critical, but learning is a bit slow. A combination of the following may help:

  • increase the optimizer learning rate (e.g. 0.01), but don't go too high or it will oscillate
  • increase the batch size (e.g. 128) for more stable gradients; it also computes faster

Create demo files

Currently I am getting mixed results for the CAE training again because I was experimenting a lot with the parameter settings.
We already had a CNN with ReLUs and a CAE with sigmoid functions.
TODO: write demo scripts for both a CNN and a CAE that demonstrate working parameter settings

std::bad_alloc when using pooling_type other than 'strided_conv'

When setting pooling_type to None or max_pooling, we get a std::bad_alloc error in the first training iteration.
This occurred after adding the possibility to use strided convolutions instead of max pooling.
Training now works with strided convolutions, but the origin of the error is unknown.
Before the strided convolutions were added, max_pooling worked.

Config load / storing

Add config loading / storing in train_and_test_cnn / cae.

TODO for Sabbir; the list below marks the lines in train_and_test_cnn.py that need to be adjusted and describes what needs to be done:

  • add a flag that loads the config if set to true (behaviour as in train_and_test_cnn_from_config), or lets us enter the parameters ourselves (old behaviour)
  • store the config file in the folder (TODO Sabbir at the end)
  • do the same for the cae as well
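A minimal sketch of what the config storing and loading could look like, using JSON and parameter names borrowed from the example setup above (the project's actual config format and paths may differ):

```python
import json
import os
import tempfile

# illustrative parameters, mirroring the two-layer example setup
config = {
    'filter_dims': [[7, 7], [3, 3]],
    'hidden_channels': [4, 6],
    'use_max_pooling': True,
    'activation_function': 'relu',
}

def store_config(config, folder):
    """Write the parameter dict into the given training folder."""
    with open(os.path.join(folder, 'config.json'), 'w') as f:
        json.dump(config, f, indent=2)

def load_config(folder):
    """Load a previously stored parameter dict."""
    with open(os.path.join(folder, 'config.json')) as f:
        return json.load(f)

folder = tempfile.mkdtemp()
store_config(config, folder)
assert load_config(folder) == config
```

With this shape, the flag described above just decides whether load_config is called or the parameters are entered manually.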

Implement Layer-Wise Training of the Convolutional Autoencoder

A layer-wise training method for the convolutional autoencoder is used in Masci, Jonathan, et al. "Stacked convolutional auto-encoders for hierarchical feature extraction." Artificial Neural Networks and Machine Learning–ICANN 2011 (2011): 52-59. This might help to achieve good results for deeper autoencoders.

Is the autoencoder implemented in a greedy layer-wise fashion?

According to the paper "Greedy Layer-Wise Training of Deep Networks", 2006, each layer of the autoencoder should be trained greedily in a purely unsupervised way.

To put it simply:

  • train one layer at a time, from first to last, with an unsupervised criterion
  • update the parameters of the previous layers
  • add another layer and repeat (if any layers remain)
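The control flow of that schedule can be sketched as follows; train_layer is a stand-in for one unsupervised training phase (in the real code it would run optimizer steps on the current CAE stack):

```python
# A minimal sketch of the greedy layer-wise schedule described above.
def greedy_layerwise_schedule(num_layers, train_layer):
    stack = []                      # layers added so far
    for layer in range(num_layers):
        stack.append(layer)         # add the next layer to the stack
        train_layer(layer, stack)   # train with an unsupervised criterion
    return stack

log = []
greedy_layerwise_schedule(3, lambda layer, stack: log.append((layer, list(stack))))
print(log)  # [(0, [0]), (1, [0, 1]), (2, [0, 1, 2])]
```

This only shows the ordering of the phases; the contested point in the issue is whether each phase trains locally or the whole stack is trained globally.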

But in your convolutional AE implementation, I don't see each layer of the CAE being trained locally for a number of iterations before an additional layer is added (repeating the above procedure) to form a stacked CAE. Instead, you just construct a 3-layer CAE and train it globally.

Your pretraining part is fine, where you train the CAE in the first step and take the output of the CAE to initialize the CNN as a feature extractor.

Correct me if I'm wrong. I want to contribute to a correct implementation of stacked CAE pretraining. Thanks!

Exclude Individual Variables from Training

To evaluate the quality of a representation for classification, we need a fine-tune option for our convolutional neural network that only learns the classification weights. This would enable us to compare different encoding weights obtained from an autoencoder.
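In TensorFlow this is typically done by passing only the classification variables to the var_list argument of the optimizer's minimize call. The effect can be sketched framework-agnostically in NumPy (parameter names are illustrative):

```python
import numpy as np

def sgd_step(params, grads, trainable, lr=0.1):
    """Update only the parameters named in `trainable`; everything
    else (e.g. encoding weights from the autoencoder) stays fixed."""
    return {name: (p - lr * grads[name] if name in trainable else p)
            for name, p in params.items()}

params = {'conv_w': np.ones(3), 'fc_w': np.ones(3)}
grads = {'conv_w': np.ones(3), 'fc_w': np.ones(3)}

# fine-tune mode: only the classification weights are trainable
new_params = sgd_step(params, grads, trainable={'fc_w'})
print(new_params['conv_w'])  # [1. 1. 1.]  (unchanged)
print(new_params['fc_w'])    # [0.9 0.9 0.9]
```

Comparing runs that differ only in the frozen encoding weights then isolates the quality of each learned representation.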
