lisa-lab / deeplearningtutorials

Deep Learning Tutorial notes and code. See the wiki for more info.

Home Page: http://deeplearning.net/tutorial

License: Other

Python 93.51% Shell 2.62% Perl 3.87%

deeplearningtutorials's Introduction

Deep Learning Tutorials

Deep Learning is a new area of Machine Learning research, which has been introduced with the objective of moving Machine Learning closer to one of its original goals: Artificial Intelligence. Deep Learning is about learning multiple levels of representation and abstraction that help to make sense of data such as images, sound, and text. The tutorials presented here will introduce you to some of the most important deep learning algorithms and will also show you how to run them using Theano. Theano is a Python library that makes writing deep learning models easy, and gives the option of training them on a GPU.

The easiest way to follow the tutorials is to browse them online.

Main development of this project.


Project Layout

Subdirectories:

code - Python files corresponding to each tutorial
data - data and scripts to download data that is used by the tutorials
doc - restructured text used by Sphinx to build the tutorial website
html - built automatically by doc/Makefile, contains tutorial website
issues_closed - issue tracking
issues_open - issue tracking
misc - administrative scripts

Build instructions

To build the html version of the tutorials, run python doc/scripts/docgen.py

deeplearningtutorials's Issues

DBN tutorial code needs to be updated

While training the tutorial DBN model on MNIST data, I noticed that each pre-training epoch was taking about 7 seconds to complete, as compared to other DBN architectures which took around 0.5 seconds per epoch. After profiling a few of the compiled theano functions and doing some research online, I discovered that the issue is exactly what is described in this ticket: Theano/Theano#1233.
The tutorial code should be updated so that those who are trying out theano for the first time by running the tutorial models are not turned away by this excessive training time cost. A simple fix would be to update lines 11 and 61 in DBN.py to the following:

[11]: from theano.sandbox.rng_mrg import MRG_RandomStreams
[61]: theano_rng = MRG_RandomStreams(numpy_rng.randint(2 ** 30))

-R. Feinman

Issue regarding loading a dataset

I am training a CNN on the CIFAR10 dataset, and I am having trouble loading the dataset into dictionary variables. The CIFAR10 dataset contains 6 batches: 5 can be used for training and validation, and one for testing. I would like to split the 5 batches into 4 for training and 1 for validation. These batches are stored in a serialized format using pickle. My problem is loading the 6 batches into 3 dictionaries, one each for training, validation and testing, where each dictionary contains data and labels as described at the link below (the dataset can also be downloaded from there).
Link to download dataset: https://www.cs.toronto.edu/~kriz/cifar.html

Here is the code snippet which I have modified from convolutional_mlp.py file in load_data method

    dataset="path to the directory"
    file_train=dataset+"data_batch_1"
    f=open(file_train, 'rb')
    train_set= pickle.load(f, encoding='latin1')
    file_train=dataset+"data_batch_2"
    f=open(file_train, 'rb')
    train2_set= pickle.load(f, encoding='latin1')
    train_set.update(train2_set)
    # load 3rd batch
    file_train=dataset+"data_batch_3"
    f=open(file_train, 'rb')
    train3_set= pickle.load(f, encoding='latin1')
    train_set.update(train3_set)
    # load 4th batch
    file_train=dataset+"data_batch_4"
    f=open(file_train, 'rb')
    train4_set= pickle.load(f, encoding='latin1')
    train_set.update(train4_set)
    # load validation data
    file_validate=dataset+"data_batch_5"
    f=open(file_validate, 'rb')
    validation_set= pickle.load(f, encoding='latin1')
    # load test data
    file_test=dataset+"test_batch"
    f=open(file_test, 'rb')
    test_set= pickle.load(f, encoding='latin1')
    return train_set,validation_set,test_set

# modification in sgd_optimization_mnist method
def sgd_optimization_mnist(learning_rate=0.13, n_epochs=1000,
                           dataset='/home/ubuntu/Desktop/ramya/cifar-10-batches-py/',
                           batch_size=600):
    """
    Demonstrate stochastic gradient descent optimization of a log-linear
    model

    This is demonstrated on MNIST.

    :type learning_rate: float
    :param learning_rate: learning rate used (factor for the stochastic
                          gradient)

    :type n_epochs: int
    :param n_epochs: maximal number of epochs to run the optimizer

    :type dataset: string
    :param dataset: the path of the MNIST dataset file from
                 http://www.iro.umontreal.ca/~lisa/deep/data/mnist//home/ubuntu/Desktop/ramya/cifar-10-batches-py

    """
    datasets_0,datasets_1,datasets_2 = load_data(dataset)

    train_set_x, train_set_y = datasets_0
    valid_set_x, valid_set_y = datasets_1
    test_set_x, test_set_y = datasets_2
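
One thing worth noting about the snippet above: `dict.update` replaces the `data` and `labels` entries instead of appending to them, so `train_set` ends up holding only the last batch loaded. Unpacking a dict as `train_set_x, train_set_y = datasets_0` also yields its keys, not its arrays. A minimal sketch of concatenating the batches instead, assuming each batch unpickles to a dict with `data` and `labels` keys as described on the CIFAR-10 page; the helper names `load_batch` and `load_cifar10` are hypothetical and not part of the tutorial code.

```python
import os
import pickle

import numpy


def load_batch(path):
    """Unpickle one CIFAR-10 batch file and return (data, labels) arrays."""
    with open(path, 'rb') as f:
        batch = pickle.load(f, encoding='latin1')
    return numpy.asarray(batch['data']), numpy.asarray(batch['labels'])


def load_cifar10(directory):
    """Return (train, valid, test), each an (x, y) pair of arrays.

    Batches 1-4 are concatenated for training, batch 5 is kept for
    validation and test_batch is used for testing.
    """
    parts = [load_batch(os.path.join(directory, 'data_batch_%d' % i))
             for i in range(1, 5)]
    train_x = numpy.concatenate([x for x, _ in parts])
    train_y = numpy.concatenate([y for _, y in parts])
    valid = load_batch(os.path.join(directory, 'data_batch_5'))
    test = load_batch(os.path.join(directory, 'test_batch'))
    return (train_x, train_y), valid, test
```

Because each split is returned as an `(x, y)` pair, the tuple unpacking used in `sgd_optimization_mnist` would receive arrays rather than dictionary keys. The arrays would still need to be wrapped in shared variables before being used with the rest of the tutorial code.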

results of rnnslu.py not as expected

Hi,
I'm running your code rnnslu.py for ATIS (the 5 folds of data you provide)
(DeepLearningTutorials/code/rnnslu.py)

I expected to get F1=94.98% as mentioned in the paper:
'Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding'
(table 3: Elman/Single,context(9))

Instead, the best performance with this code is F1=94.058% for cont_win=9 and emb_dim=100.
Is that what I'm supposed to get?

Thank you in advance!

checking uidx == 0 in lstm.py

uidx is initialised to 0 on line 544
then increased by 1 on line 555
so I think uidx == 0 on line 598 should be replaced by uidx == 1

Convolutional Neural Networks Tutorial code (code/convolutional_mlp.py) raises error

Trying to run the code belonging to the tutorial: http://deeplearning.net/tutorial/lenet.html raises the following error:

Downloading data from http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz
... loading data
... building the model
Traceback (most recent call last):
  File "code/convolutional_mlp.py", line 345, in <module>
    evaluate_lenet5()
  File "code/convolutional_mlp.py", line 182, in evaluate_lenet5
    poolsize=(2, 2)
  File "code/convolutional_mlp.py", line 97, in __init__
    input_shape=image_shape
  File "/home/arthur/.local/lib/python2.7/site-packages/theano/tensor/nnet/conv.py", line 149, in conv2d
    imshp=imshp, kshp=kshp, nkern=nkern, bsize=bsize, **kargs)
TypeError: __init__() got an unexpected keyword argument 'input_shape'

All I did was cloning this repo and running the following command:
python code/convolutional_mlp.py

As I was not able to find any information about this on the web, I thought I might post it here.

lstm dimension match in function _step

Hello

I am learning deep learning models by following your code, and I have a question about the LSTM code. As I understand it, in the function lstm_layer, the input state_below is a 3D tensor with dimensions [time_stamp, n_samples, dim_proj], which is used in the function _step through theano.scan.

rval, updates = theano.scan(_step,
                                sequences=[mask, state_below],
                                outputs_info=[tensor.alloc(numpy_floatX(0.),
                                                           n_samples,
                                                           dim_proj),
                                              tensor.alloc(numpy_floatX(0.),
                                                           n_samples,
                                                           dim_proj)],
                                name=_p(prefix, '_layers'),
                                n_steps=nsteps)

But in the function _step, I saw

preact = tensor.dot(h_, tparams[_p(prefix, 'U')])
preact += x_
i = tensor.nnet.sigmoid(_slice(preact, 0, options['dim_proj']))
f = tensor.nnet.sigmoid(_slice(preact, 1, options['dim_proj']))
o = tensor.nnet.sigmoid(_slice(preact, 2, options['dim_proj']))
c = tensor.tanh(_slice(preact, 3, options['dim_proj']))
c = f * c_ + i * c
c = m_[:, None] * c + (1. - m_)[:, None] * c_
h = o * tensor.tanh(c)
h = m_[:, None] * h + (1. - m_)[:, None] * h_
return h, c

It seems that h_ has the dimension [n_samples, dim_proj] and x_ is state_below. How can the h returned by this function keep the same dimension and contain the values computed from x_k at step k?
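
For readers with the same question: `theano.scan` iterates over the leading axis of each entry in `sequences`, so every call to `_step` receives a single time slice rather than the whole tensor. The following is a NumPy analogue of the shapes involved, not the Theano code itself. It assumes `state_below` has already been projected to `4 * dim_proj` columns and that `U` has shape `(dim_proj, 4 * dim_proj)`; masking is left out, and the names `lstm_scan` and `sigmoid` are hypothetical.

```python
import numpy


def sigmoid(z):
    return 1.0 / (1.0 + numpy.exp(-z))


def lstm_scan(state_below, U, dim_proj):
    """Loop over the time axis the way scan would, without masking.

    state_below: (n_timesteps, n_samples, 4 * dim_proj), already projected.
    U:           (dim_proj, 4 * dim_proj) recurrent weights.
    Returns h for every step: (n_timesteps, n_samples, dim_proj).
    """
    n_samples = state_below.shape[1]
    h = numpy.zeros((n_samples, dim_proj))
    c = numpy.zeros((n_samples, dim_proj))
    outputs = []
    for x_t in state_below:          # x_t: (n_samples, 4 * dim_proj)
        preact = h.dot(U) + x_t      # (n_samples, 4 * dim_proj)
        i = sigmoid(preact[:, 0 * dim_proj:1 * dim_proj])
        f = sigmoid(preact[:, 1 * dim_proj:2 * dim_proj])
        o = sigmoid(preact[:, 2 * dim_proj:3 * dim_proj])
        g = numpy.tanh(preact[:, 3 * dim_proj:4 * dim_proj])
        c = f * c + i * g
        h = o * numpy.tanh(c)        # back to (n_samples, dim_proj)
        outputs.append(h)
    return numpy.stack(outputs)
```

Each gate takes one `dim_proj`-wide slice of the pre-activation, which is why `h` keeps the shape `(n_samples, dim_proj)` even though the per-step input is four times wider.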

"Duplicate name" warning while dumping a SdA

I get User Warning: Duplicate name W and User Warning: Duplicate name b while dumping my SdA model using theano.misc.pkl_utils.dump. I also found that the SdA model cannot be loaded correctly because of the duplicate names of the variables W and b.

So is it necessary to use a different name for each Theano tensor W and b in the model classes? In fact, that is how I solved the problem mentioned above.

typo in mlp.py

In line 119 of mlp.py. 'softamx' Should be 'softmax'.

how to use rnnrbm model to predict to which class of music one piano notes sequence belongs

thanks a lot..

MLP equations

In the equations of MLP

https://github.com/lisa-lab/DeepLearningTutorials/blob/master/doc/mlp.txt#L64

The equations seem to be missing transpose operations. Since the shape of W is n_input x n_output, i.e. D x D_hidden, there should be a transpose for W in the equation, like W^(1)T x.
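
The shape argument can be checked with a small NumPy sketch. The sizes below are made up for illustration; the only assumption taken from the report is that W is stored as n_input x n_output.

```python
import numpy

D, D_hidden = 5, 3
x = numpy.ones(D)              # one input vector of length D
W = numpy.ones((D, D_hidden))  # weights stored as n_input x n_output

h_col = W.T.dot(x)  # column-vector convention: W^T x
h_row = x.dot(W)    # row-vector convention: x W
```

Both expressions give a vector of length `D_hidden`, whereas `W.dot(x)` would fail with a shape mismatch, so an equation written as W x only makes sense if W is taken to be D_hidden x D.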

rnnslu, is there a bug when update h0?

in rnnslu, h0 = h0 - lr * gradient_h0
should it be: h0 = h[-1]?

Add License

Is this code released under a license? It would be good to have a license file in the root of the project, or have the license (or lack thereof) mentioned in the readme.

convolutional_mlp.py - python.exe has stopped working

I am using Windows 7 32 bit. When I run convolutional_mlp.py, Python crashes with exit code -1073741819 (0xC0000005).

It crashes just after
.... training

Issue regarding pickling a convolutional neural network

I have modified the "convolutional_mlp.py" file so that I can pickle the CNN class's object, which will let me predict the classes of images using the saved model. But I am getting an error on line 377, where I call the pickle.dump() method.
I am attaching the code for the file and a screenshot of the error. I guess the error is due to pickling the instance methods of logistic_sgd (though I am not sure about it).

self.layer3 = LogisticRegression(input=self.layer2.output, n_in=500, n_out=10)
self.negative_log_likelihood = (
    self.layer3.negative_log_likelihood
)
# predicting the class value and calculating errors
self.predict=self.layer3.y_pred
self.errors = self.layer3.errors
# the cost we minimize during training is the NLL of the model
#self.cost = self.layer3.negative_log_likelihood(y)
# create a list of all model parameters to be fit by gradient descent
self.params = self.layer3.params + self.layer2.params + self.layer1.params + self.layer0.params
self.input = input

covolutional.txt

Need help to get started

I am very new to this topic and to deep learning.
My questions may sound stupid, but I need to ask them because I have no one to ask and I could not find answers on Google! :)
I am using Ubuntu 14.04 and have installed Anaconda with Python 2.7 and Theano, and it works fine!

So here is my first question: do I need to install something else in order to use the DeepLearningTutorials? (If yes, please tell me how.)
Do I need to add anything to my PATH? (If yes, how?)

E.g. I was trying to follow the tutorial on SdA (http://deeplearning.net/tutorial/code/SdA.py).
I could run these commands fine:
import os
import sys
import timeit
import numpy
import theano
import theano.tensor as T
from theano.tensor.shared_randomstreams import RandomStreams
However, the next commands fail when I try:
from logistic_sgd import LogisticRegression, load_data
from mlp import HiddenLayer
from dA import dA

I get errors saying there is no module to import;
LogisticRegression and the others cannot be imported.

Can anyone please tell me what I need to do to be able to use the tutorials?
Thanks in advance.
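
According to the project layout, the Python files for each tutorial live in the repository's `code` directory, so `logistic_sgd`, `mlp` and `dA` are importable only when Python is started from that directory or when it is on the import path. A minimal sketch of the second option; `add_to_import_path` is a hypothetical helper and the path in the comment is a placeholder, not a real location.

```python
import os
import sys


def add_to_import_path(directory):
    """Put `directory` at the front of sys.path so its modules can be imported."""
    directory = os.path.abspath(directory)
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return directory


# Placeholder location of a local clone; adjust to wherever the repo lives.
# add_to_import_path('/path/to/DeepLearningTutorials/code')
# from logistic_sgd import LogisticRegression, load_data
```

Changing into the `code` directory before starting the interpreter should have the same effect without any code.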

Question about dA / SdA

I would like to retrieve the encoding of the input vector from the autoencoder architecture. I am able to train an autoencoder, but would like to obtain the lower-dimensional encodings given by the hidden layers. Is there a way to do this?

Not compatible with Python 3+, please start a development branch

The current version is not compatible with Python 3+ . A development branch, to initiate migration to Python 3, would be great. I will be happy to contribute.
Thanks.

Saving the Model and Predicting

Hi,
Is it possible to save the network after every, say, 100 training samples?
Once the training is completed, is it possible to load the network from the HDD and use it for predicting on a new dataset?
Regards
Varghese
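
One possible approach, sketched under assumptions: pickle the parameter values (not the compiled functions) at regular intervals and read them back later. The helpers `save_checkpoint` and `load_checkpoint` are hypothetical names and work on any picklable list of values.

```python
import pickle


def save_checkpoint(path, param_values):
    """Write a list of parameter values (e.g. NumPy arrays) to disk."""
    with open(path, 'wb') as f:
        pickle.dump(param_values, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_checkpoint(path):
    """Read back the list written by save_checkpoint."""
    with open(path, 'rb') as f:
        return pickle.load(f)
```

With Theano shared variables, the values could be collected with `get_value()` inside the training loop and restored with `set_value()` after rebuilding the same model; how the model's parameters are exposed depends on the tutorial class being used.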

Add comment to the LSTM example

During the summer school, a participant told me it would be useful to have more comments in the LSTM code.

TheanoConfigParser object has no attribute 'scan' when I run lstm.py

I ran the command: python lstm.py

It throws the following error. What's the problem?

Traceback (most recent call last):
File "lstm.py", line 577, in
theano.config.scan.allow_gc = False
AttributeError: 'TheanoConfigParser' object has no attribute 'scan'

batches counts error in logistic_sgd.py

n_train_batches = train_set_x.get_value(borrow=True).shape[0] / batch_size
n_valid_batches = valid_set_x.get_value(borrow=True).shape[0] / batch_size
n_test_batches = test_set_x.get_value(borrow=True).shape[0] / batch_size

should be:

n_train_batches = train_set_x.get_value(borrow=True).shape[0] / batch_size + 1
n_valid_batches = valid_set_x.get_value(borrow=True).shape[0] / batch_size + 1
n_test_batches = test_set_x.get_value(borrow=True).shape[0] / batch_size + 1
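
Note that adding 1 unconditionally counts one batch too many whenever the number of examples is an exact multiple of `batch_size`. A sketch of ceiling division, which counts a trailing partial batch only when there is one; `n_batches` is a hypothetical helper, and whether the tutorial should process a partial batch at all is a separate decision.

```python
def n_batches(n_examples, batch_size):
    """Number of minibatches needed to cover n_examples, rounding up."""
    return (n_examples + batch_size - 1) // batch_size
```

Using `//` also keeps the result an integer under Python 3, where `/` returns a float.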

More consistent code (self.x vs self.input)

Hi,

Some classes have a self.x and others use self.input. We should be consistent.

lstm example is not working on windows, the problem is : perl file use makes error

The LSTM example does not work on Windows because the preprocessing script calls Perl, which causes an error.
From
http://deeplearning.net/tutorial/lstm.html
In order to use your own data, please use a (preprocessing script) provided as a part of this tutorial.
This file
https://raw.githubusercontent.com/kyunghyuncho/DeepLearningTutorials/master/code/imdb_preprocess.py
uses Perl.

Formula (4.12) - inverse of a vector

In (4.12) one can find: (x^0)^{-1}. What does it mean? How do you find inverse of a vector?

plot error in lstm.py

Hello,
Can anyone give me code to plot the error in the LSTM example? I am having trouble with it.

how to run RBM.py

Can you please tell me how to run rbm.py and which dependencies it requires?

Model Saving

I am using the file SdA.py. I am wondering if there is a way to save the model which has been generated and trained; I would like to run vectors through the network and get back the "encoding" from the trained network.

Early Stop error in lstm.py

I think [:-patience, 0] should be replaced by [-patience:, 0] on line 609 in lstm.py, since it is the last patience results that should be checked, not the first ones in history_errs.

Update: On second thought, I was wrong. Checking the [:-patience] results does work.
So this issue can be closed.
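
The difference between the two slices can be seen on a plain list. The error values below are made up for illustration and are not taken from a real run.

```python
history_errs = [0.50, 0.40, 0.30, 0.31, 0.32]
patience = 2

older = history_errs[:-patience]   # everything except the last `patience` entries
recent = history_errs[-patience:]  # only the last `patience` entries

best_older = min(older)
```

Comparing the current error against `min(older)` asks whether the most recent `patience` checks have failed to beat the best earlier result, which is consistent with the conclusion in the update above.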

Is there this type of deep learning model?

Is there this type of deep learning model?
There are two labeled folders for binary classification,
e.g. men and women, cats and dogs, etc.
Images are placed in each folder as training data,
and then a single simple command is run to train.
That's all.
I need a simple training model like this. Is there one?

A couple of possible issues with cA code

Hi, there might be a couple of issues in the cA code.
First, when computing the loss due to the Jacobian (line 211), shouldn't the integer division (//) be replaced by a true division (/)?
Second, the result of the previous line of code is a scalar. Why, then, take its mean when computing the cost function on the next line of code (218)?
Thanks,
Gerard.

How can I get hidden layer representation of the given data

When a deep belief network is used for representation learning, I am confused about how to get the hidden-layer representation of the original data matrix.
Using sigmoid_layers[-1].output does not seem to work: the representation I get for the matrix is all zeros.
Has anybody encountered the same problem?

Strange Error

Hello,

When I set my configuration on the SdA code to the following:

sda = SdA(
    numpy_rng=numpy_rng,
    n_ins=2880,
    hidden_layers_sizes=[3000, 3000, 3000],
    n_outs=4
)

I am getting the following error:

Traceback (most recent call last):
File "/home/dan/PycharmProjects/DeepLearningTutorials/code/SdA.py", line 491, in
test_SdA()
File "/home/dan/PycharmProjects/DeepLearningTutorials/code/SdA.py", line 440, in test_SdA
minibatch_avg_cost = train_fn(minibatch_index)
File "/home/dan/.local/lib/python2.7/site-packages/theano/compile/function_module.py", line 912, in call
storage_map=getattr(self.fn, 'storage_map', None))
File "/home/dan/.local/lib/python2.7/site-packages/theano/gof/link.py", line 314, in raise_with_op
reraise(exc_type, exc_value, exc_trace)
File "/home/dan/.local/lib/python2.7/site-packages/theano/compile/function_module.py", line 899, in call
self.fn() if output_subset is None else
ValueError: y_i value out of bounds
Apply node that caused the error: CrossentropySoftmaxArgmax1HotWithBias(Dot22.0, b, Elemwise{Cast{int32}}.0)
Toposort index: 26
Inputs types: [TensorType(float64, matrix), TensorType(float64, vector), TensorType(int32, vector)]
Inputs shapes: [(1, 4), (4,), (1,)]
Inputs strides: [(32, 8), (8,), (4,)]
Inputs values: [array([[ 0., 0., 0., 0.]]), array([ 0., 0., 0., 0.]), array([4], dtype=int32)]
Outputs clients: [[Sum{acc_dtype=float64}(CrossentropySoftmaxArgmax1HotWithBias.0)], [CrossentropySoftmax1HotWithBiasDx(Elemwise{Inv}[(0, 0)].0, CrossentropySoftmaxArgmax1HotWithBias.1, Elemwise{Cast{int32}}.0)], []]

Backtrace when the node is created(use Theano flag traceback.limit=N to make it longer):
File "/home/dan/PycharmProjects/DeepLearningTutorials/code/SdA.py", line 491, in
test_SdA()
File "/home/dan/PycharmProjects/DeepLearningTutorials/code/SdA.py", line 374, in test_SdA
n_outs=4
File "/home/dan/PycharmProjects/DeepLearningTutorials/code/SdA.py", line 177, in init
self.finetune_cost = self.logLayer.negative_log_likelihood(self.y)
File "/home/dan/PycharmProjects/DeepLearningTutorials/code/logistic_sgd.py", line 147, in negative_log_likelihood
return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])

HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.

What might be causing this? This doesn't occur when the "n_outs" variable is set to 5 or 10, or so.
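
A possible explanation, based only on the trace above: the "Inputs values" line shows a label of 4 paired with a 4-column output, and with `n_outs=4` only the labels 0 to 3 index a valid column. That would also fit the observation that larger `n_outs` values do not fail. A sketch of a check that could be run on the labels before training; `check_labels` is a hypothetical helper.

```python
def check_labels(labels, n_outs):
    """Raise ValueError if any label falls outside 0 .. n_outs - 1."""
    bad = sorted(set(int(y) for y in labels if not 0 <= int(y) < n_outs))
    if bad:
        raise ValueError(
            'labels %s are outside the range 0..%d' % (bad, n_outs - 1))
    return True
```

If the labels turn out to run from 1 to 4, shifting them down by one before training would be one way to bring them into range.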

Multiprocessing performance of dA and SdA

Hi,

I was wondering if someone could explain the discrepancy between dA and SdA.

Running test_dA() from dA.py results in all 12 cores on my computer being used at 100%, whereas running test_SdA() from SdA.py results in only a single core being used in the pretraining step of a single layer.

This pretraining step trains using exactly the same dA class so I would expect performance to be the same. I can't find a reason why the one method uses multiprocessing and the other doesn't.

Any ideas?

Thanks.

readme

A readme added to the code section would be useful for everyone.

Using theano.In breaks SdA

Hi,

Commit b701733 breaks SdA.py and the tests with error:

Traceback (most recent call last):
File "", line 1, in
File "SdA.py", line 382, in test_SdA
batch_size=batch_size)
File "SdA.py", line 226, in pretraining_functions
self.x: train_set_x[batch_begin: batch_end]
File "/local/lib/python2.7/site-packages/theano/compile/function.py", line 248, in function
"In() instances and tuple inputs trigger the old "
NotImplementedError: In() instances and tuple inputs trigger the old semantics, which disallow using updates and givens

Perhaps consider making these optional function arguments mandatory by removing theano.In

I can submit a pull request if it helps.

Thanks.

Lstm and rnnslu

I am able to run lstm and rnnslu successfully. From searching, I learned that the same examples are provided for several deep learning libraries, such as Theano, Torch and TensorFlow, and I tried running them in all of these. But none of them explain how to make the examples work with your own dataset. For example, the IMDb LSTM example takes a pickle file as input, whereas I have a text dataset of positive and negative sentences. How can I give it as input to your IMDb LSTM example? Also, in the rnnslu example, I would like to use my own dataset instead of the ATIS dataset.
Could you please explain how to do this? It would be very helpful for beginners who want to try out new things.

LSTM: small finish up

Raise maxlen to something more standard. The low value of 100 was for an exercise.
Make it use floatX to support float64.
Quickly document the adadelta and rmsprop functions.

starting dA

This is a question, not an issue.
I don't know whether I can ask my question here or not.
I am very new to deep learning as well as to Python.
I apologize if my question is very simple or not smart.

I ran the dA.py code and it works fine.
I am trying to run the dA on a single image (my image is not binary and its size is 28*28), visualize the hidden layer (weights), and see the reconstructed image as well.
First I'd like to run it with dA and then with SdA (stacked autoencoder) and visualize all the layers.
I would really appreciate it if someone could help me. Please let me know if I need to ask my question somewhere else.

Here is the code that I have written so far:

import PIL.Image as Image
import dA
rng = dA.numpy.random.RandomState(123)
theano_rng = dA.RandomStreams(rng.randint(2 ** 30))
Img2 = dA.Image.open("fruits.jpg").convert('LA')
index = dA.T.lscalar()  # index to a [mini]batch
x = dA.T.matrix('x')

da = dA.dA(
    numpy_rng=rng,
    theano_rng=theano_rng,
    input=Img2,
    n_visible=28 * 28,
    n_hidden=500
)

It runs without any error, but I don't know what to do next.

In mlp.py, why are the parameters of the two models added together?

In mlp.py at line 194, the parameters of the hidden layer and the output layer are added together. Is this a typo? If not, then why are we adding the parameters?

Using SdA

Hello,

Sorry to have to post this question here. I am attempting to build an SdA with an input size of 2880 neurons, where each of the input values is in the range [0, 1]. But, when training, even for large hidden layer sizes (~3000 x 3), the pre-training cost for the first layer never falls below ~180. Should this cost be below 100? Is there something I am missing?

Add explanation of omission of bias vectors from regularization in doc/mlp.txt

Working through the Multilayer Perceptron tutorial (http://deeplearning.net/tutorial/mlp.html) I noticed the omission of the bias vectors from the L1, L2 regularization calculations. The linked explanation of regularization indicates that the regularization term is calculated over the entire parameter vector (http://deeplearning.net/tutorial/gettingstarted.html#l1-l2-regularization).

It's easy enough to Google for an explanation of this, but it would be helpful if the omission of the bias vectors were explained directly in the MLP tutorial.

Add a "predict" function for logistic regression model

It is a recurrent question, and has been suggested in https://groups.google.com/d/topic/theano-users/quGtLDGd_2Q/discussion

How to implement the real LeNet5 in theano?

Currently, LeNet5 is implemented using the code below to get the convolved output:

conv_out = conv.conv2d(input=input, filters=self.W,
                                        filter_shape=filter_shape, image_shape=image_shape)

where conv.conv2d is a fully connected convolution operation: each output feature map is obtained by convolving all the input feature maps with their kernels and summing the results.
However, LeNet5 does not actually work that way.
In LeNet5, each output feature map is obtained by convolving only several selected input feature maps with their kernels, as shown below (image from Zohra Saidane and Christophe Garcia, “Automatic scene text recognition using a convolutional neural network,” in Proceedings of the Second International Workshop):

This mechanism works as a form of regularization.
I think we could manually convolve some of the input maps and then combine the sub-outputs to achieve this.
My question is: is there a simple way to achieve this in Theano?
Thanks.
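
One alternative to convolving subsets of maps by hand, offered as a sketch and not as the method LeNet5 or the tutorial uses: keep the full filter bank and multiply it by a 0/1 connection table, so that kernels for unconnected (output map, input map) pairs contribute nothing. The NumPy function below only illustrates the masking step; `mask_filters` and `table` are hypothetical names.

```python
import numpy


def mask_filters(W, table):
    """Zero the kernels of unconnected (output map, input map) pairs.

    W:     (n_out_maps, n_in_maps, height, width) filter bank.
    table: (n_out_maps, n_in_maps) array of 0/1 connection flags.
    """
    table = numpy.asarray(table, dtype=W.dtype)
    return W * table[:, :, None, None]
```

In a Theano model the same elementwise product would presumably need to be part of the symbolic graph, so that the masked entries also receive zero gradient during training.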

Stacked Denoising Autoencoders pretraining input IN() error

The inputs (corruption and learning rate) are instances of In(), which appears to be part of the old semantics, where updates are not available.
I changed them to Param() and it worked for me.
Please make the required changes.

Download links do not work for ATIS dataset

I just found that the links in download.sh for ATIS do not work.
$DL_CMD http://www-etud.iro.umontreal.ca/~mesnilgr/atis/atis.fold0.pkl.gz
$DL_CMD http://www-etud.iro.umontreal.ca/~mesnilgr/atis/atis.fold1.pkl.gz
$DL_CMD http://www-etud.iro.umontreal.ca/~mesnilgr/atis/atis.fold2.pkl.gz
$DL_CMD http://www-etud.iro.umontreal.ca/~mesnilgr/atis/atis.fold3.pkl.gz
$DL_CMD http://www-etud.iro.umontreal.ca/~mesnilgr/atis/atis.fold4.pkl.gz

The server returned a 403 error.

Bias term is added twice in lstm.py

It seems that the bias term in lstm_layer is added twice. Is that a mistake?

The bias is first added here:
state_below = (tensor.dot(state_below, tparams[_p(prefix, 'W')]) +
tparams[_p(prefix, 'b')])

and then in the step function:
def _step(m_, x_, h_, c_):
    preact = tensor.dot(h_, tparams[_p(prefix, 'U')])
    preact += x_
    preact += tparams[_p(prefix, 'b')]
    ......

ValueError: total size of new array must be unchanged

I have made certain changes in the "convolutional_mlp.py" code so that it can train on the CIFAR-10 dataset (images are 3 * 32 * 32), and training runs without any error. However, when I try to load the pickled model to predict class labels, I get an error that I am unable to figure out. I am attaching a screenshot of the error and the modified files.
Thanks.
covolutional.txt
predict.txt
logistic_sgd (1).txt

Question about running logistic_cg.py

When I run logistic_cg.py, I get this:

logistic_cg
... lThe code for file logistic_cg.py ran for 33.5s
The code for file mlp.py ran for 0.92m
The code for file convolutional_mlp.py ran for 0.82m
The no corruption code for file dA.py ran for 0.72m
The 30% corruption code for file dA.py ran for 0.73m
The pretraining code for file SdA.py ran for 1.84m
oading data
... building the model
Optimizing using scipy.optimize.fmin_cg...
validation error 29.989583 %
validation error 24.437500 %
validation error 20.760417 %
validation error 16.937500 %
validation error 14.270833 %
validation error 14.156250 %
validation error 13.177083 %
validation error 12.270833 %
validation error 11.697917 %
validation error 11.531250 %
validation error 10.531250 %
validation error 10.385417 %
validation error 10.135417 %
validation error 10.260417 %
validation error 9.885417 %
validation error 9.791667 %
validation error 9.208333 %
validation error 9.010417 %
validation error 8.937500 %
validation error 8.833333 %
validation error 8.760417 %
validation error 8.510417 %
validation error 8.354167 %
validation error 8.229167 %
validation error 8.270833 %
validation error 8.062500 %
validation error 7.979167 %
validation error 7.895833 %
validation error 7.875000 %
validation error 8.052083 %
Optimization complete with best validation score of 7.875000 %, with test performance 7.822917 %
The code run for 30 epochs in 0.559m, with 0.894673 epochs/sec

My question is that I don't know what these outputs mean:
"
... lThe code for file logistic_cg.py ran for 33.5s
The code for file mlp.py ran for 0.92m
The code for file convolutional_mlp.py ran for 0.82m
The no corruption code for file dA.py ran for 0.72m
The 30% corruption code for file dA.py ran for 0.73m
The pretraining code for file SdA.py ran for 1.84m
oading data
"
Is there something wrong with this output?

Can rbm.py be used to extract features when the input data is not binary?

The RBM test loads the MNIST dataset as input, where each value is 0 or 1, and draws samples from a binomial distribution. If I want to use non-binary data as input, is it still right to use the binomial distribution? If not, what do I need to modify?

rnnslu.py takes too long for nh=200

rnnslu.py takes too long for nh=200.
Any ideas on how to speed it up?