aaron-xichen / cnn-lstm-ctc


An implementation of LSTM and CTC for recognizing images of simple English sentences

Python 71.47% Shell 2.00% Jupyter Notebook 26.53%

cnn-lstm-ctc's Issues

how to add a uni-LSTM layer after the biLSTM layer?

how to generate dataset?

I want to generate my own dataset. How do I get the bounding boxes?
Thanks.

Training process

When I train the model from scratch, the loss becomes nan in the first epoch.

no folder 'split_tiny_images'

In file 'utee.py', line 122:
I don't have the folder 'split_tiny_images'. Where can I find it?

ImportError: No module named layers.net

Traceback (most recent call last):
File "train/train.py", line 9, in <module>
from layers.net import Net
ImportError: No module named layers.net
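
A likely cause, assuming the `layers` package sits at the repository root next to `train/` (as the paths in other tracebacks on this page suggest): when Python runs `train/train.py`, it puts the `train/` directory on `sys.path`, not the repository root, so `layers` cannot be found. Below is a minimal sketch of one workaround. The helper name `add_repo_root_to_path` is hypothetical and not part of the repository, and under Python 2 the `layers/` directory would also need an `__init__.py` file.

```python
import os
import sys


def add_repo_root_to_path(script_path):
    """Prepend the parent of the script's directory to sys.path.

    Assumes the layout <repo>/train/train.py and <repo>/layers/net.py;
    this helper is illustrative and not part of the repository.
    """
    script_dir = os.path.dirname(os.path.abspath(script_path))
    repo_root = os.path.dirname(script_dir)
    if repo_root not in sys.path:
        sys.path.insert(0, repo_root)
    return repo_root
```

Calling `add_repo_root_to_path(__file__)` at the top of `train/train.py`, before `from layers.net import Net`, should make the import resolvable. Running `PYTHONPATH=. python train/train.py` from the repository root is an equivalent option that needs no code change.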

Can you share '99.pkl' file to us?

Thank you for taking the time to help with my problem.
Could you share the trained model file '99.pkl', or give me a link to download it? That way I could build some test data and try the model without having to train it again.
Thank you very much.

How would you implement this with Keras?

Hello, Aaron

Recently I wanted to run some OCR tests and found your code;
I think it could be a good starting point. Could you share some data and image samples?
BTW: would you like to implement this with [Keras](https://github.com/fchollet/keras)?

Best Regards!

loss: nan, iter:1/455(1, 1.076s)

Hello, why is the CTC loss nan from the very beginning when I run your program? I would appreciate your guidance, thank you very much.
Below is the output:
Using gpu device 0: GeForce GTX 980 Ti (CNMeM is disabled, cuDNN not available)
C:\Anaconda\lib\site-packages\theano\tensor\signal\downsample.py:6: UserWarning: downsample module has been moved to the theano.tensor.signal.pool module.
"downsample module has been moved to the theano.tensor.signal.pool module.")
loaded 29143 samples from D:\xugang\OCR\cnn-lstm-ctc-master\dataset\english_sentence\train_img_list.txt
loaded 2914 samples from D:\xugang\OCR\cnn-lstm-ctc-master\dataset\english_sentence\val_img_list.txt
building symbolic tensors(0.0799999237061)
setting parameters(0.0799999237061)
('n_classes: ', 95)
('multi-step: ', set([79625, 68250, 45500]))
building the model(0.0799999237061)
computing updates and function(0.240000009537)
using normal sgd and learning_rate:0.00999999977648
('bw_lstm_b', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
('fw_lstm_W', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
('fw_lstm_U', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
('fw_lstm_b', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
('bw_lstm_W', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
('bw_lstm_U', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
('hidden_b', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
('hidden_W', <class 'theano.sandbox.cuda.var.CudaNdarraySharedVariable'>)
building training function(1.78999996185)
building validating function(29.6099998951)
begin to train(32.8609998226)
.epoch 1/200 begin(32.861)
[prefetch]height: 28, x_max_step:141.0, y_max_width:50
D:\xugang\OCR\cnn-lstm-ctc-master - 1.0\layers\utee.py:137: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
x = np.zeros((batch_size, 1, height, x_max_len)). astype(config.floatX)
D:\xugang\OCR\cnn-lstm-ctc-master - 1.0\layers\utee.py:138: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
x_mask = np.zeros((batch_size, x_max_len)).astype(config.floatX)
..loss: nan, iter:1/455(1, 1.076s)
..detect nan
..loss: nan, iter:1/455(1.076)
Traceback (most recent call last):
File "D:\xugang\OCR\cnn-lstm-ctc-master - 1.0\train.py", line 150, in <module>
sys.exit()
SystemExit
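
The log above does not show what caused the nan. One generic check worth making (an assumption, not a confirmed diagnosis for this repository): a CTC loss has no valid alignment, and so typically evaluates to inf or nan, when a label sequence needs more time steps than the network emits. A label of length L with R pairs of identical adjacent characters needs at least L + R frames, because a blank must separate each repeat. The sketch below uses hypothetical names (`min_ctc_frames`, `find_infeasible_samples`) and plain Python lists.

```python
def min_ctc_frames(label):
    """Smallest number of time steps for which CTC can align `label`.

    Each symbol needs one frame, and each pair of identical adjacent
    symbols needs one extra blank frame between them.
    """
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats


def find_infeasible_samples(labels, n_frames):
    """Return indices of samples whose label cannot fit in n_frames[i] steps."""
    return [i for i, (lab, t) in enumerate(zip(labels, n_frames))
            if min_ctc_frames(lab) > t]
```

Here `n_frames` would hold the per-sample sequence length that actually reaches the CTC layer, that is, after any downsampling in the network.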

Where can I find data set?

Could you give me a link or some other way to get the dataset?
Thanks very much.

Getting IndexError

I am using this model on similar data, except that our images contain Sanskrit words. I created the train, val, and test files in the same format as the ones used in this model (i.e. image name followed by the ordinals of its characters).
But in our case, the number of characters (i.e. n_classes) is 118 (instead of 95 in the original) and y_max_len=200 (instead of 50 in the original).
When I train the model, I get the following error:

loaded 25996 samples from ./dataset/train_img_list.txt
loaded 756 samples from ./dataset/val_img_list.txt
building symbolic tensors(0.84720993042)
('#Train samples: ', 25996)
('#Val samples: ', 756)
('#Train Iterations: ', 406)
('#Val Iterations: ', 11)
setting parameters(0.848186016083)
('n_classes: ', 118)
('multi-step: ', set([40600, 71050, 60900]))
building the model(0.848335027695)
Subtensor{int64}.0
Shape.0
computing updates and function(1.2518889904)
using normal sgd and learning_rate:0.00999999977648
('bw_lstm_b', <class 'theano.tensor.sharedvar.TensorSharedVariable'>)
('fw_lstm_W', <class 'theano.tensor.sharedvar.TensorSharedVariable'>)
('fw_lstm_U', <class 'theano.tensor.sharedvar.TensorSharedVariable'>)
('fw_lstm_b', <class 'theano.tensor.sharedvar.TensorSharedVariable'>)
('bw_lstm_W', <class 'theano.tensor.sharedvar.TensorSharedVariable'>)
('bw_lstm_U', <class 'theano.tensor.sharedvar.TensorSharedVariable'>)
('hidden_b', <class 'theano.tensor.sharedvar.TensorSharedVariable'>)
('hidden_W', <class 'theano.tensor.sharedvar.TensorSharedVariable'>)
building training function(2.72188806534)
building validating function(25.7086689472)
begin to train(27.9824080467)
.epoch 1/200 begin(27.982)
[prefetch]height: 150, x_max_step:900.0, y_max_width:200
Traceback (most recent call last):
File "train/train.py", line 148, in <module>
loss = train()
File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 898, in __call__
storage_map=getattr(self.fn, 'storage_map', None))
File "/usr/local/lib/python2.7/dist-packages/theano/gof/link.py", line 325, in raise_with_op
reraise(exc_type, exc_value, exc_trace)
File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 884, in __call__
self.fn() if output_subset is None else
File "/usr/local/lib/python2.7/dist-packages/theano/gof/op.py", line 872, in rval
r = p(n, [x[0] for x in i], o)
File "/usr/local/lib/python2.7/dist-packages/theano/tensor/subtensor.py", line 2173, in perform
out[0] = inputs[0].__getitem__(inputs[1:])
IndexError: index 121 is out of bounds for axis 2 with size 119
Apply node that caused the error: AdvancedSubtensor(Reshape{3}.0, SliceConstant{None, None, None}, InplaceDimShuffle{0,x}.0, <TensorType(int32, matrix)>)
Toposort index: 463
Inputs types: [TensorType(float32, 3D), <theano.tensor.type_other.SliceType object at 0x7f6a4d6d9510>, TensorType(int64, col), TensorType(int32, matrix)]
Inputs shapes: [(900, 64, 119), 'No shapes', (64, 1), (64, 200)]
Inputs strides: [(30464, 476, 4), 'No strides', (8, 8), (800, 4)]
Inputs values: ['not shown', slice(None, None, None), 'not shown', 'not shown']
Outputs clients: [[Reshape{2}(AdvancedSubtensor.0, MakeVector{dtype='int64'}.0), Shape_i{2}(AdvancedSubtensor.0), Shape_i{1}(AdvancedSubtensor.0), Shape_i{0}(AdvancedSubtensor.0)]]

Backtrace when the node is created(use Theano flag traceback.limit=N to make it longer):
File "train/train.py", line 84, in <module>
mid_layer_type = BLSTMLayer, forget=False)
File "/misc/me/rohits/aaron-cnn-lstm-ctc/layers/net.py", line 40, in __init__
blank = options['blank'], log_space = True)
File "/misc/me/rohits/aaron-cnn-lstm-ctc/layers/ctc_layer.py", line 25, in __init__
self.log_ctc(labels_len_const = labels_len_const)
File "/misc/me/rohits/aaron-cnn-lstm-ctc/layers/ctc_layer.py", line 94, in log_ctc
x1 = self.x[:, T.arange(n_samples)[:, None], self.y]
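
Reading the error, the indexed tensor has shape (900, 64, 119), so its last axis has 119 entries (presumably the 118 classes plus one CTC blank), while the label matrix contains the value 121. That suggests the label file stores raw character ordinals instead of contiguous class indices. Below is a minimal sketch of a check and a remapping, assuming each line of the list file is an image name followed by whitespace-separated integer labels (the format described in the issue). The function names are hypothetical, and which index the repository reserves for the blank is not visible in the log.

```python
def parse_label_line(line):
    """Split 'image_name l1 l2 ...' into (name, [int labels])."""
    parts = line.split()
    return parts[0], [int(p) for p in parts[1:]]


def find_out_of_range(lines, n_classes):
    """Return (image_name, bad_labels) for labels outside [0, n_classes)."""
    bad = []
    for line in lines:
        if not line.strip():
            continue  # skip empty lines
        name, labels = parse_label_line(line)
        offending = [l for l in labels if not 0 <= l < n_classes]
        if offending:
            bad.append((name, offending))
    return bad


def build_contiguous_mapping(lines):
    """Map each distinct label value to an index 0..K-1 (sorted order)."""
    seen = set()
    for line in lines:
        if line.strip():
            seen.update(parse_label_line(line)[1])
    return {value: idx for idx, value in enumerate(sorted(seen))}
```

If `find_out_of_range` reports anything, rewriting the list files through the mapping from `build_contiguous_mapping` (built over the train, val, and test files together) would keep every label below n_classes.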
