lisa-lab / DeepLearningTutorials
Deep Learning Tutorial notes and code. See the wiki for more info.
Home Page: http://deeplearning.net/tutorial
License: Other
This is a question not an issue.
I don't know if I can write my question here or not.
I am very new to all deep learning as well as using python.
I apologize if my question is very simple or not smart.
So I ran the dA.py code and it works fine.
I am trying to run the dA on just a single image (my image is not binary and its size is 28*28), visualize my hidden layer (weights), and see the reconstructed image as well.
First I'd like to run it on dA and then on SdA (stacked autoencoder) and visualize all the layers.
I would really appreciate it if someone could help me. Please let me know if I need to ask my question somewhere else.
Here is the code that I have written so far:
import PIL.Image as Image
import dA
rng = dA.numpy.random.RandomState(123)
theano_rng = dA.RandomStreams(rng.randint(2 ** 30))
Img2 = dA.Image.open("fruits.jpg").convert('LA')
index = dA.T.lscalar() # index to a [mini]batch
x = dA.T.matrix('x')
da = dA.dA(
numpy_rng=rng,
theano_rng=theano_rng,
input=Img2,
n_visible=28 * 28,
n_hidden=500
)
It runs without any error, but I don't know what to do next.
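One possible next step, sketched under assumptions: the tutorial dA expects its input argument to be a symbolic matrix and is fed rows of 784 values in [0, 1], so the image would first be flattened and rescaled. The helper name to_da_row is hypothetical, and the random array stands in for a real 28x28 grayscale image (for instance one opened with PIL and converted with mode 'L'; mode 'LA' carries an extra alpha channel).

```python
import numpy

def to_da_row(gray_image):
    """Flatten a 28x28 grayscale array into one row scaled to [0, 1]."""
    arr = numpy.asarray(gray_image, dtype='float64')
    if arr.shape != (28, 28):
        raise ValueError("expected a 28x28 single-channel image, got %r" % (arr.shape,))
    return (arr / 255.0).reshape(1, 28 * 28)

# Stand-in for a real image: random 8-bit pixel values.
rng = numpy.random.RandomState(123)
fake_image = rng.randint(0, 256, size=(28, 28)).astype('uint8')
row = to_da_row(fake_image)
print(row.shape, row.min() >= 0.0, row.max() <= 1.0)
```

The resulting row could then be placed in a theano shared variable and supplied through givens, the way test_dA does with train_set_x, while the dA itself is built with input=x.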
I am able to run lstm and rnnslu successfully. From searching, I found that every deep learning library (Theano, Torch, TensorFlow) provides these same examples, and I tried running them in all of these libraries. But none of them explains how to make the examples work with your own dataset. For example, the IMDb lstm example takes a pickle file as input, whereas I have a text dataset of positive and negative sentences. How can I give it as input to your IMDb lstm example? Also, in the rnnslu example I would like to use my own dataset instead of the Paris dataset.
Could you please explain this? It would be very helpful for beginners who want to try out new things.
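A minimal sketch of how raw sentences could be turned into the kind of input the IMDb example consumes, assuming each review is represented as a list of integer word indices paired with a 0/1 label. The function names, the tiny corpus and the choice to keep indices 0 and 1 free are illustrative assumptions, not the tutorial's own preprocessing script.

```python
from collections import Counter

def build_vocab(sentences):
    """Map each word to an index; more frequent words get smaller indices."""
    counts = Counter(w for s in sentences for w in s.lower().split())
    # Indices 0 and 1 are kept free here (padding / unknown word): an assumption.
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {w: i + 2 for i, w in enumerate(ordered)}

def encode(sentences, vocab):
    """Replace every word by its index, using 1 for out-of-vocabulary words."""
    return [[vocab.get(w, 1) for w in s.lower().split()] for s in sentences]

positive = ["a great movie", "great acting"]
negative = ["a boring movie"]
sentences = positive + negative
labels = [1] * len(positive) + [0] * len(negative)

vocab = build_vocab(sentences)
train_x = encode(sentences, vocab)
train_set = (train_x, labels)   # could be written to disk with pickle.dump
print(train_set)
```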
I am using the file SdA.py. I am wondering if there is a way to save the model which has been generated and trained; I would like to run vectors through the network and get back the "encoding" from the trained network.
I get User Warning: Duplicate name W and User Warning: Duplicate name b while dumping my SdA model using theano.misc.pkl_utils.dump. I also found that an SdA model cannot be loaded correctly because of the duplicate names of the variables W and b.
So is it necessary to use a different name for each theano tensor W and b in the model classes? In fact, that's how I solved the problem mentioned above.
Can you please tell me how to run rbm.py and which dependencies it requires?
Hello,
Sorry to have to post this question here. I am attempting to build an SdA with an input size of 2880 neurons, where each input value is in the range [0, 1]. But when training, even for large hidden layer sizes (~3000 x 3), the pre-training cost for the first layer never falls below ~180. Should this cost be below 100? Is there something I am missing?
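A rough way to put that number in perspective, assuming the cost in question is the tutorial dA's cross-entropy, which is summed over the input units and then averaged over the minibatch: the value grows with the number of inputs, so it may be more telling per unit. The sketch also illustrates that for inputs strictly between 0 and 1 this cross-entropy stays above zero even for a perfect reconstruction. All arrays are synthetic.

```python
import numpy

def da_cost(x, z):
    """Cross-entropy summed over units, averaged over examples."""
    per_example = -numpy.sum(x * numpy.log(z) + (1 - x) * numpy.log(1 - z), axis=1)
    return per_example.mean()

n_visible = 2880
print(180.0 / n_visible)    # reported cost expressed per input unit

rng = numpy.random.RandomState(0)
x = rng.uniform(0.05, 0.95, size=(10, n_visible))
perfect = da_cost(x, x)     # reconstruction identical to the input
print(perfect / n_visible)  # still above zero for non-binary inputs
```

So a floor well above zero would not by itself indicate a bug; comparing against the cost of a constant reconstruction on the same data may be a more useful baseline.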
I am training on the CIFAR10 dataset using a CNN, and I am facing a problem while loading the dataset into dictionary variables. The CIFAR10 dataset contains 6 batches, 5 of which can be used for training and validation while one can be used for testing. I would like to split the 5 batches into 4 for training and 1 for validation. But these batches are stored in a serialized format using pickle. I am having trouble loading these 6 batches into 3 dictionaries, one each for training, validation and testing, where each dictionary contains data and labels as described at the link given below (the dataset can also be downloaded from there).
Link to download dataset: https://www.cs.toronto.edu/~kriz/cifar.html
Here is the code snippet, which I have modified from the load_data method in the convolutional_mlp.py file:
dataset="path to the directory"
file_train=dataset+"data_batch_1"
f=open(file_train, 'rb')
train_set= pickle.load(f, encoding='latin1')
file_train=dataset+"data_batch_2"
f=open(file_train, 'rb')
train2_set= pickle.load(f, encoding='latin1')
train_set.update(train2_set)
#load 3rd batch
file_train=dataset+"data_batch_3"
f=open(file_train, 'rb')
train3_set= pickle.load(f, encoding='latin1')
train_set.update(train3_set)
#load 4th batch
file_train=dataset+"data_batch_4"
f=open(file_train, 'rb')
train4_set= pickle.load(f, encoding='latin1')
train_set.update(train4_set)
#load validation data
file_validate=dataset+"data_batch_5"
f=open(file_validate, 'rb')
validation_set= pickle.load(f, encoding='latin1')
#load test data
file_test=dataset+"test_batch"
f=open(file_test, 'rb')
test_set= pickle.load(f, encoding='latin1')
return train_set,validation_set,test_set
#modification in sgd_optimization_mnist method
def sgd_optimization_mnist(learning_rate=0.13, n_epochs=1000,
dataset='/home/ubuntu/Desktop/ramya/cifar-10-batches-py/',
batch_size=600):
"""
Demonstrate stochastic gradient descent optimization of a log-linear
model
This is demonstrated on MNIST.
:type learning_rate: float
:param learning_rate: learning rate used (factor for the stochastic
gradient)
:type n_epochs: int
:param n_epochs: maximal number of epochs to run the optimizer
:type dataset: string
:param dataset: the path of the MNIST dataset file from
http://www.iro.umontreal.ca/~lisa/deep/data/mnist//home/ubuntu/Desktop/ramya/cifar-10-batches-py
"""
datasets_0,datasets_1,datasets_2 = load_data(dataset)
train_set_x, train_set_y = datasets_0
valid_set_x, valid_set_y = datasets_1
test_set_x, test_set_y = datasets_2
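One thing that may be contributing: dict.update replaces the values stored under the same keys instead of appending to them, so after the update calls above train_set would hold only the last batch loaded. A minimal sketch of concatenating the batches instead, assuming each unpickled batch is a dict with a 'data' array and a 'labels' list as described on the CIFAR-10 page. merge_batches is a hypothetical helper, and the batches below are synthetic stand-ins for the unpickled files.

```python
import numpy

def merge_batches(batches):
    """Stack the 'data' rows and 'labels' of several CIFAR-style batch dicts."""
    data = numpy.concatenate([b['data'] for b in batches], axis=0)
    labels = numpy.concatenate([numpy.asarray(b['labels']) for b in batches])
    return data, labels

def fake_batch(n, seed):
    """Synthetic stand-in for pickle.load(f, encoding='latin1')."""
    rng = numpy.random.RandomState(seed)
    return {'data': rng.randint(0, 256, size=(n, 3072)).astype('uint8'),
            'labels': rng.randint(0, 10, size=n).tolist()}

train_batches = [fake_batch(5, seed) for seed in range(4)]
train_x, train_y = merge_batches(train_batches)
print(train_x.shape, train_y.shape)
```

Returning (data, labels) pairs like this would match the two-value unpacking in the modified sgd_optimization_mnist, although the tutorial's load_data additionally wraps the arrays in shared variables.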
In line 119 of mlp.py, 'softamx' should be 'softmax'.
Hello,
When I set my configuration on the SdA code to the following:
sda = SdA(
numpy_rng=numpy_rng,
n_ins=2880,
hidden_layers_sizes=[3000, 3000, 3000],
n_outs=4
)
I am getting the following error:
Traceback (most recent call last):
File "/home/dan/PycharmProjects/DeepLearningTutorials/code/SdA.py", line 491, in
test_SdA()
File "/home/dan/PycharmProjects/DeepLearningTutorials/code/SdA.py", line 440, in test_SdA
minibatch_avg_cost = train_fn(minibatch_index)
File "/home/dan/.local/lib/python2.7/site-packages/theano/compile/function_module.py", line 912, in call
storage_map=getattr(self.fn, 'storage_map', None))
File "/home/dan/.local/lib/python2.7/site-packages/theano/gof/link.py", line 314, in raise_with_op
reraise(exc_type, exc_value, exc_trace)
File "/home/dan/.local/lib/python2.7/site-packages/theano/compile/function_module.py", line 899, in call
self.fn() if output_subset is None else
ValueError: y_i value out of bounds
Apply node that caused the error: CrossentropySoftmaxArgmax1HotWithBias(Dot22.0, b, Elemwise{Cast{int32}}.0)
Toposort index: 26
Inputs types: [TensorType(float64, matrix), TensorType(float64, vector), TensorType(int32, vector)]
Inputs shapes: [(1, 4), (4,), (1,)]
Inputs strides: [(32, 8), (8,), (4,)]
Inputs values: [array([[ 0., 0., 0., 0.]]), array([ 0., 0., 0., 0.]), array([4], dtype=int32)]
Outputs clients: [[Sum{acc_dtype=float64}(CrossentropySoftmaxArgmax1HotWithBias.0)], [CrossentropySoftmax1HotWithBiasDx(Elemwise{Inv}[(0, 0)].0, CrossentropySoftmaxArgmax1HotWithBias.1, Elemwise{Cast{int32}}.0)], []]
Backtrace when the node is created(use Theano flag traceback.limit=N to make it longer):
File "/home/dan/PycharmProjects/DeepLearningTutorials/code/SdA.py", line 491, in
test_SdA()
File "/home/dan/PycharmProjects/DeepLearningTutorials/code/SdA.py", line 374, in test_SdA
n_outs=4
File "/home/dan/PycharmProjects/DeepLearningTutorials/code/SdA.py", line 177, in init
self.finetune_cost = self.logLayer.negative_log_likelihood(self.y)
File "/home/dan/PycharmProjects/DeepLearningTutorials/code/logistic_sgd.py", line 147, in negative_log_likelihood
return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
What might be causing this? This doesn't occur when the "n_outs" variable is set to 5 or 10, or so.
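Judging only from the traceback, the label fed to the softmax layer is 4 (array([4], dtype=int32)) while the layer has 4 outputs (input shape (1, 4)), and class indices for an n-way softmax run from 0 to n_outs - 1. That would also fit the error disappearing for n_outs of 5 or 10. A small check along these lines, with a hypothetical helper name and made-up labels, could confirm it:

```python
import numpy

def check_labels(y, n_outs):
    """Return the label values that fall outside the valid range [0, n_outs)."""
    y = numpy.asarray(y)
    return numpy.unique(y[(y < 0) | (y >= n_outs)])

labels = numpy.array([1, 2, 3, 4, 4, 1])    # e.g. classes numbered 1..4
print(check_labels(labels, 4))      # labels that would be out of bounds
print(check_labels(labels - 1, 4))  # after shifting to 0..3
```

If the dataset's classes are numbered from 1, shifting them to start at 0 (or using n_outs=5 and leaving class 0 unused) would be two ways around it.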
Hello
I'm learning the deep learning models by following your code. I have a question from reading the code for lstm. As I understand it, in the function lstm_layer, your input state_below is a 3d tensor with dimension [time_stamp, n_samples, dim_proj]; this is used in function _step with theano.scan.
rval, updates = theano.scan(_step,
sequences=[mask, state_below],
outputs_info=[tensor.alloc(numpy_floatX(0.),
n_samples,
dim_proj),
tensor.alloc(numpy_floatX(0.),
n_samples,
dim_proj)],
name=_p(prefix, '_layers'),
n_steps=nsteps)
But in the function _step, I saw
preact = tensor.dot(h_, tparams[_p(prefix, 'U')])
preact += x_
i = tensor.nnet.sigmoid(_slice(preact, 0, options['dim_proj']))
f = tensor.nnet.sigmoid(_slice(preact, 1, options['dim_proj']))
o = tensor.nnet.sigmoid(_slice(preact, 2, options['dim_proj']))
c = tensor.tanh(_slice(preact, 3, options['dim_proj']))
c = f * c_ + i * c
c = m_[:, None] * c + (1. - m_)[:, None] * c_
h = o * tensor.tanh(c)
h = m_[:, None] * h + (1. - m_)[:, None] * h_
return h, c
It seems that h_ has the dimension [n_samples, dim_proj] and x_ is state_below. How can the h returned by this function keep the same dimension and contain the values computed based on x_k in step k?
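As far as I understand theano.scan, each entry of sequences is iterated over its first axis, so inside _step the argument x_ is not the whole state_below but the slice for one time step, with n_samples rows. A NumPy sketch of that loop under simplifying assumptions (a plain tanh recurrence instead of the LSTM gates, no mask, random weights):

```python
import numpy

n_steps, n_samples, dim_proj = 5, 3, 4
rng = numpy.random.RandomState(0)
state_below = rng.randn(n_steps, n_samples, dim_proj)
U = rng.randn(dim_proj, dim_proj)

def step(x_, h_):
    """One time step: x_ and h_ both have shape (n_samples, dim_proj)."""
    return numpy.tanh(numpy.dot(h_, U) + x_)

h = numpy.zeros((n_samples, dim_proj))   # plays the role of outputs_info
outputs = []
for x_k in state_below:                  # iterate over the time axis
    h = step(x_k, h)
    outputs.append(h)
rval = numpy.stack(outputs)              # (n_steps, n_samples, dim_proj)
print(rval.shape)
```

The per-step outputs are stacked along a new leading time axis, which is why the value returned by scan is three-dimensional again even though each step only sees two-dimensional arrays.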
Hi,
Commit b701733 breaks SdA.py and the tests with error:
Traceback (most recent call last):
File "", line 1, in
File "SdA.py", line 382, in test_SdA
batch_size=batch_size)
File "SdA.py", line 226, in pretraining_functions
self.x: train_set_x[batch_begin: batch_end]
File "/local/lib/python2.7/site-packages/theano/compile/function.py", line 248, in function
"In() instances and tuple inputs trigger the old "
NotImplementedError: In() instances and tuple inputs trigger the old semantics, which disallow using updates and givens
Perhaps consider making these optional function arguments mandatory by removing theano.In
I can submit a pull request if it helps.
Thanks.
rnnslu.py takes too long for nh=200.
Any ideas to speed it up?
Trying to run the code belonging to the tutorial: http://deeplearning.net/tutorial/lenet.html raises the following error:
Downloading data from http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz
... loading data
... building the model
Traceback (most recent call last):
File "code/convolutional_mlp.py", line 345, in <module>
evaluate_lenet5()
File "code/convolutional_mlp.py", line 182, in evaluate_lenet5
poolsize=(2, 2)
File "code/convolutional_mlp.py", line 97, in __init__
input_shape=image_shape
File "/home/arthur/.local/lib/python2.7/site-packages/theano/tensor/nnet/conv.py", line 149, in conv2d
imshp=imshp, kshp=kshp, nkern=nkern, bsize=bsize, **kargs)
TypeError: __init__() got an unexpected keyword argument 'input_shape'
All I did was cloning this repo and running the following command:
python code/convolutional_mlp.py
As I was not able to find any information about this on the web, I thought I might post it here.
A README added to the code section would be useful for all.
uidx is initialised to 0 on line 544
then increased by 1 on line 555
so I think uidx == 0 on line 598 should be replaced by uidx == 1
I have modified the "convolutional_mlp.py" file so that I can pickle the cnn class's object, which will later let me predict the classes of images using the saved model. But I am getting an error at line 377, where I call the pickle.dump() method.
I am attaching the code for the file and a screenshot of the error. I guess the error is due to pickling the instance methods of logistic_sgd (though I am not sure about it).
self.layer3 = LogisticRegression(input=self.layer2.output, n_in=500, n_out=10)
self.negative_log_likelihood = (
self.layer3.negative_log_likelihood
)
#predicting the class value and calculating errors
self.predict=self.layer3.y_pred
self.errors = self.layer3.errors
# the cost we minimize during training is the NLL of the model
#self.cost = self.layer3.negative_log_likelihood(y)
# create a list of all model parameters to be fit by gradient descent
self.params = self.layer3.params + self.layer2.params + self.layer1.params + self.layer0.params
self.input = input
It seems that the bias term in lstm_layer is added twice. Is that a mistake?
The bias is first added here:
state_below = (tensor.dot(state_below, tparams[_p(prefix, 'W')]) +
tparams[_p(prefix, 'b')])
and then in the step function:
def _step(m_, x_, h_, c_):
    preact = tensor.dot(h_, tparams[_p(prefix, 'U')])
    preact += x_
    preact += tparams[_p(prefix, 'b')]
    ......
I am very new to this topic and to deep learning.
I have questions that may sound stupid, but I need to ask them because I have no one to ask and I could not find answers on Google! :)
I am using Ubuntu 14.04; I installed Anaconda with Python 2.7, installed Theano, and made it work fine.
So here is my first question: do I need to install something else in order to use these DeepLearningTutorials? (If yes, please tell me how.)
Do I need to add anything more to PATH? (Hmm, if yes, how? :/ )
E.g. I was trying to do this tutorial regarding SdA (http://deeplearning.net/tutorial/code/SdA.py).
I could run these commands fine:
import os
import sys
import timeit
import numpy
import theano
import theano.tensor as T
from theano.tensor.shared_randomstreams import RandomStreams
However, the next commands do not work when I try:
from logistic_sgd import LogisticRegression, load_data
from mlp import HiddenLayer
from dA import dA
I get errors saying that there is no module to import, or
cannot import LogisticRegression
and so on.
Can anyone please tell me what I need to do to be able to use the tutorials?
Thanks in advance.
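A guess at the cause: logistic_sgd.py, mlp.py and dA.py are files in the repository's code directory rather than installed packages, so Python only finds them when that directory is the working directory or is on sys.path. A sketch, where the path is a placeholder for wherever the repository was cloned:

```python
import os
import sys

# Placeholder: adjust to the location of your own clone.
code_dir = os.path.expanduser("~/DeepLearningTutorials/code")

if code_dir not in sys.path:
    sys.path.insert(0, code_dir)

# With the directory on the path, the tutorial imports should resolve:
# from logistic_sgd import LogisticRegression, load_data
print(code_dir in sys.path)
```

Alternatively, starting Python from inside the code directory should have the same effect without touching sys.path.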
I would like to retrieve the encoding of the input vector from the autoencoder architecture. I am able to train an autoencoder, but would like to obtain the lower-dimensional encodings given by the hidden layers. Is there a way to do this?
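In the tutorial's dA class the hidden representation is computed by get_hidden_values, i.e. a sigmoid of the input times W plus the hidden bias b. A NumPy sketch of the same computation, assuming the trained values have been pulled out with something like da.W.get_value() and da.b.get_value(); random arrays stand in for them here.

```python
import numpy

def encode(x, W, b):
    """Hidden code sigmoid(x W + b) for a batch of input rows."""
    return 1.0 / (1.0 + numpy.exp(-(numpy.dot(x, W) + b)))

rng = numpy.random.RandomState(1)
n_visible, n_hidden = 784, 500
W = rng.uniform(-0.1, 0.1, size=(n_visible, n_hidden))  # stand-in for da.W.get_value()
b = numpy.zeros(n_hidden)                               # stand-in for da.b.get_value()
x = rng.uniform(0, 1, size=(3, n_visible))

code = encode(x, W, b)
print(code.shape)
```

For a stacked model the same step would be repeated layer by layer, feeding each code into the next layer's weights.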
Hi,
Some classes have a self.x and others use self.input. We should be consistent.
When a deep belief network is used for representation learning, I'm confused about how to get the hidden-layer representation of the original data matrix.
Using sigmoid_layers[-1].output does not seem to work: the only representation I get for the matrix is 0.
Has anybody encountered this?
The inputs (corruption and learning rate) are an instance of In() which appears to be a part of old semantics where updates are not available.
I changed it to Param() and it worked for me.
Please make the required changes.
Hello:
Can anyone give me code to plot the error in lstm? I am having trouble with it.
During the summer school, a participant told me it would be useful to have more comments in the LSTM code.
I ran the command: python lstm.py
and it threw the following error. What's the problem?
Traceback (most recent call last):
File "lstm.py", line 577, in
theano.config.scan.allow_gc = False
AttributeError: 'TheanoConfigParser' object has no attribute 'scan'
n_train_batches = train_set_x.get_value(borrow=True).shape[0] / batch_size
n_valid_batches = valid_set_x.get_value(borrow=True).shape[0] / batch_size
n_test_batches = test_set_x.get_value(borrow=True).shape[0] / batch_size
should be:
n_train_batches = train_set_x.get_value(borrow=True).shape[0] / batch_size + 1
n_valid_batches = valid_set_x.get_value(borrow=True).shape[0] / batch_size + 1
n_test_batches = test_set_x.get_value(borrow=True).shape[0] / batch_size + 1
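A note on the arithmetic, as a sketch: adding 1 unconditionally yields one batch too many when the set size is an exact multiple of batch_size, whereas a ceiling division covers leftover examples without that side effect. The function names and sizes below are made up for illustration.

```python
def n_batches_floor(n_examples, batch_size):
    """Number of full batches; leftover examples are dropped."""
    return n_examples // batch_size

def n_batches_ceil(n_examples, batch_size):
    """Number of batches when a final partial batch is kept."""
    return (n_examples + batch_size - 1) // batch_size

for n in (50000, 50010):
    print(n, n_batches_floor(n, 20) + 1, n_batches_ceil(n, 20))
```

With 50000 examples and a batch size of 20, the "+ 1" version would index an empty 2501st batch, while the ceiling version stays at 2500.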
Just found the links in the download.sh for ATIS do not work.
$DL_CMD http://www-etud.iro.umontreal.ca/~mesnilgr/atis/atis.fold0.pkl.gz
$DL_CMD http://www-etud.iro.umontreal.ca/~mesnilgr/atis/atis.fold1.pkl.gz
$DL_CMD http://www-etud.iro.umontreal.ca/~mesnilgr/atis/atis.fold2.pkl.gz
$DL_CMD http://www-etud.iro.umontreal.ca/~mesnilgr/atis/atis.fold3.pkl.gz
$DL_CMD http://www-etud.iro.umontreal.ca/~mesnilgr/atis/atis.fold4.pkl.gz
It reported 403 error.
I think [:-patience, 0] should be replaced by [-patience:, 0] on line 609 in lstm.py, since it is the last patience results that should be checked, not the first ones in history_errs.
Update: I thought about it again and found that I was wrong. Checking the first "-patience" results really does work.
So I am closing this issue.
Is this code released under a license? It would be good to have a license file in the root of the project, or have the license (or lack thereof) mentioned in the readme.
Hi;
Is it possible to save the network after every say 100 training samples.
Once the training is completed, is it possible to load the network from the HDD and use for predicting on a new dataset?
Regards
Varghese
It is a recurrent question, and has been suggested in https://groups.google.com/d/topic/theano-users/quGtLDGd_2Q/discussion
thanks a lot..
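A minimal sketch of one common approach, assuming the parameters are available as NumPy arrays (for theano shared variables, something like [p.get_value() for p in model.params], where model is hypothetical): pickle only the arrays, then load them back later. The helper names and the checkpoint file name are illustrative.

```python
import os
import pickle
import tempfile

import numpy

def save_params(params, path):
    """Write a list of NumPy arrays to disk."""
    with open(path, 'wb') as f:
        pickle.dump(params, f, protocol=pickle.HIGHEST_PROTOCOL)

def load_params(path):
    """Read the list of arrays back."""
    with open(path, 'rb') as f:
        return pickle.load(f)

rng = numpy.random.RandomState(0)
params = [rng.randn(4, 3), numpy.zeros(3)]        # stand-ins for W and b
path = os.path.join(tempfile.mkdtemp(), 'checkpoint.pkl')

save_params(params, path)
restored = load_params(path)
print([p.shape for p in restored])
```

Calling save_params every N minibatches inside the training loop would give periodic checkpoints; restoring would then mean copying each array back with set_value on the matching shared variable.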
The current version is not compatible with Python 3+ . A development branch, to initiate migration to Python 3, would be great. I will be happy to contribute.
Thanks.
Hi,
I'm running your code rnnslu.py for ATIS (the 5 folds of data you provide us with)
(DeepLearningTutorials/code/rnnslu.py)
I expected to get F1=94.98% as mentioned in the paper:
'Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding'
(table 3: Elman/Single,context(9))
Instead, the best performance with this code is F1=94.058% for cont_win=9 and emb_dim=100.
Is that what I'm supposed to get?
Thank you in advance!
I am using Windows 7 32 bit. When I run convolutional_mlp.py, Python crashes with exit code -1073741819 (0xC0000005).
It crashes just after
.... training
While training the tutorial DBN model on MNIST data, I noticed that each pre-training epoch was taking about 7 seconds to complete, as compared to other DBN architectures which took around 0.5 seconds per epoch. After profiling a few of the compiled theano functions and doing some research online, I discovered that the issue is exactly what is described in this ticket: Theano/Theano#1233.
The tutorial code should be updated so that those who are trying out theano for the first time by running the tutorial models are not turned away by this excessive training time cost. A simple fix would be to update lines 11 and 61 in DBN.py to the following:
[11]: from theano.sandbox.rng_mrg import MRG_RandomStreams
[61]: theano_rng = MRG_RandomStreams(numpy_rng.randint(2 ** 30))
-R. Feinman
The rbm test loads the MNIST database as input data, where each value is 0 or 1, and draws samples using a binomial distribution. But when I want to use non-binary data as the input, is it still right to use the binomial distribution? If not, what do I need to modify?
In the equations of MLP
https://github.com/lisa-lab/DeepLearningTutorials/blob/master/doc/mlp.txt#L64
The equations seem to be missing transpose operations. Since the shape of W is n_input x n_output, i.e. D x D_hidden, there should be a transpose for W in the equation, like W^(1)T x.
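A shape check may help here. With W stored as D x D_hidden, the row-vector form dot(x, W) and the column-vector form W^T x give the same numbers, so the question is mainly one of notation. A sketch with arbitrary sizes:

```python
import numpy

D, D_hidden = 6, 4
rng = numpy.random.RandomState(0)
W = rng.randn(D, D_hidden)        # shape used in the code: n_in x n_out
b = rng.randn(D_hidden)
x = rng.randn(D)                  # one input vector

row_form = numpy.dot(x, W) + b    # x treated as a row vector
col_form = numpy.dot(W.T, x) + b  # W^T x with x as a column vector
print(numpy.allclose(row_form, col_form))
```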
Hi,
I was wondering if someone could explain the discrepancy between dA and SdA.
Running test_dA() from dA.py results in all 12 cores on my computer being used at 100%, whereas running test_SdA() from SdA.py results in only a single core being used in the pretraining step of a single layer.
This pretraining step trains using exactly the same dA class so I would expect performance to be the same. I can't find a reason why the one method uses multiprocessing and the other doesn't.
Any ideas?
Thanks.
I have made certain changes in the "convolutional_mlp.py" code so that it can train on the CIFAR-10 dataset (images are 3 * 32 * 32 in size), and it executes without any error. But when I try to load the pickled model to predict the class labels, I get an error that I am unable to figure out. I am attaching a screenshot of the error and the modified file.
Thanks.
covolutional.txt
predict.txt
logistic_sgd (1).txt
Working through the Multilayer Perceptron tutorial (http://deeplearning.net/tutorial/mlp.html) I noticed the omission of the bias vectors from the L1, L2 regularization calculations. The linked explanation of regularization indicates that the regularization term is calculated over the entire parameter vector (http://deeplearning.net/tutorial/gettingstarted.html#l1-l2-regularization).
It's easy enough to Google for an explanation of this, but it would be helpful if the omission of the bias vectors were explained directly in the MLP tutorial.
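For reference, a NumPy sketch of the penalty as I read it in mlp.py: the L1 and squared-L2 terms are summed over the two weight matrices only. The arrays are random stand-ins and the coefficient values are just examples.

```python
import numpy

rng = numpy.random.RandomState(0)
W_hidden, b_hidden = rng.randn(5, 3), rng.randn(3)
W_out, b_out = rng.randn(3, 2), rng.randn(2)

# Weight matrices only; b_hidden and b_out are deliberately left out.
L1 = numpy.abs(W_hidden).sum() + numpy.abs(W_out).sum()
L2_sqr = (W_hidden ** 2).sum() + (W_out ** 2).sum()

L1_reg, L2_reg = 0.00, 0.0001     # example coefficients
penalty = L1_reg * L1 + L2_reg * L2_sqr
print(L1, L2_sqr, penalty)
```

Including the biases would only mean adding their terms to the two sums, so the tutorial text could state the choice in one sentence next to the formula.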
The lstm example is not working on Windows. The problem is that the use of a perl file causes an error.
from
http://deeplearning.net/tutorial/lstm.html
In order to use your own data, please use a (preprocessing script) provided as a part of this tutorial.
this file
https://raw.githubusercontent.com/kyunghyuncho/DeepLearningTutorials/master/code/imdb_preprocess.py
uses perl
In mlp.py at line 194, the parameters of the hidden layer and the output layer are added together. Is this a typo? If not, then why are we adding the parameters?
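If the line in question is the one that builds self.params, the + there is Python list concatenation rather than arithmetic: each layer exposes a list [W, b], and joining the lists gives one flat list of parameters to take gradients against. A sketch with strings standing in for the shared variables:

```python
hidden_params = ['W_hidden', 'b_hidden']   # stand-in for hiddenLayer.params
output_params = ['W_out', 'b_out']         # stand-in for logRegressionLayer.params

params = hidden_params + output_params     # list concatenation
print(params)
print(len(params))
```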
Hi, there might be a couple of issues in the cA code.
First, when computing the loss due to the jacobian (line 211), shouldn't the integer division (//) be replaced by a true division (/)?
Second, the result of the previous line of code is a scalar. Then why take its mean when computing the cost function in the next line of code (218)?
Thanks,
Gerard.
Is there a deep learning model of this type?
There are two labeled folders for binary classification,
e.g. men and women, cats and dogs, etc.
You put the images into each folder as training data,
and then just run a simple command to train.
That's all.
I need this kind of simple training model. Is there one?
When I run the logistic_cg.py, I got these:
logistic_cg
... lThe code for file logistic_cg.py ran for 33.5s
The code for file mlp.py ran for 0.92m
The code for file convolutional_mlp.py ran for 0.82m
The no corruption code for file dA.py ran for 0.72m
The 30% corruption code for file dA.py ran for 0.73m
The pretraining code for file SdA.py ran for 1.84m
oading data
... building the model
Optimizing using scipy.optimize.fmin_cg...
validation error 29.989583 %
validation error 24.437500 %
validation error 20.760417 %
validation error 16.937500 %
validation error 14.270833 %
validation error 14.156250 %
validation error 13.177083 %
validation error 12.270833 %
validation error 11.697917 %
validation error 11.531250 %
validation error 10.531250 %
validation error 10.385417 %
validation error 10.135417 %
validation error 10.260417 %
validation error 9.885417 %
validation error 9.791667 %
validation error 9.208333 %
validation error 9.010417 %
validation error 8.937500 %
validation error 8.833333 %
validation error 8.760417 %
validation error 8.510417 %
validation error 8.354167 %
validation error 8.229167 %
validation error 8.270833 %
validation error 8.062500 %
validation error 7.979167 %
validation error 7.895833 %
validation error 7.875000 %
validation error 8.052083 %
Optimization complete with best validation score of 7.875000 %, with test performance 7.822917 %
The code run for 30 epochs in 0.559m, with 0.894673 epochs/sec
My question is that I don't know what these outputs mean:
"
... lThe code for file logistic_cg.py ran for 33.5s
The code for file mlp.py ran for 0.92m
The code for file convolutional_mlp.py ran for 0.82m
The no corruption code for file dA.py ran for 0.72m
The 30% corruption code for file dA.py ran for 0.73m
The pretraining code for file SdA.py ran for 1.84m
oading data
"
Is there something wrong with these output?
Currently, LeNet5 is implemented using the code below to get the convolved output:
conv_out = conv.conv2d(input=input, filters=self.W,
filter_shape=filter_shape, image_shape=image_shape)
where conv.conv2d is a fully connected convolution operation: each output feature map is obtained by convolving all the input feature maps with its kernels and summing them.
However, LeNet5 does not actually work that way.
In LeNet5, each output feature map is obtained by convolving several selected feature maps with its kernel, as shown below (image from Zohra Saidane and Christophe Garcia, “Automatic scene text recognition using a convolutional neural network,” in Proceedings of the Second International Workshop):
This mechanism works as a regularization process.
I think we could manually convolve some maps of the input and finally combine all the sub-outputs to achieve this.
Here is my question: is there a simple way to achieve this in theano?
Thanks.
In (4.12) one can find: (x^0)^{-1}. What does it mean? How do you find inverse of a vector?
in rnnslu, h0 = h0 - lr * gradient_h0
should it be: h0 = h[-1]?