watsonyanghx / cnn_lstm_ctc_tensorflow

CNN+LSTM+CTC based OCR implemented using tensorflow.

License: MIT License

Python 100.00%
cnn ctc lstm ocr tensorflow

cnn_lstm_ctc_tensorflow's Issues

About the data

Hi,
Could you provide your data (or part of it) for those who want to try the code? A link would be enough. Thanks.

Some problems about this code

In fact, the "max_stepsize" in this code shouldn't be 64. It should be 12, because the original "image_width" (180) is halved four times: 180/2/2/2/2, which rounds up to 12. Remember that the core idea of CRNN+CTC is to split the image vertically into many slices, predict a class for each slice, and then use CTC to decode the predicted sequence into the corresponding result. For example, "aaa_bb_c_" and "a__b_ccc" both correspond to the same label "abc". You can read the paper for more details.

However, when I ran the wrong code on the author's dataset I got 98% accuracy, while I got a bad result on the VGGWord dataset. After changing the code, I finally got a good result.

So why does this code work in your situation? I am very curious about this. Thank you.
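
To make the arithmetic and the decoding rule in this issue concrete, here is a minimal sketch in plain Python. It assumes four stride-2 poolings with 'SAME' padding (so each halving rounds up) and uses `_` as the blank symbol, as in the examples above. The function names are illustrative and are not taken from the repository.

```python
import math
from itertools import groupby


def downsampled_width(width, num_pools=4, stride=2):
    """Width of the feature map after repeated stride-2 pooling.

    Assumption: 'SAME' padding, so each pooling rounds the width up.
    """
    for _ in range(num_pools):
        width = math.ceil(width / stride)
    return width


def ctc_collapse(path, blank="_"):
    """Collapse a per-slice prediction: merge repeats, then drop blanks."""
    merged = [symbol for symbol, _ in groupby(path)]
    return "".join(symbol for symbol in merged if symbol != blank)


print(downsampled_width(180))     # 12
print(ctc_collapse("aaa_bb_c_"))  # abc
print(ctc_collapse("a__b_ccc"))   # abc
```

With a 180-pixel-wide input this gives 12 slices, so under these assumptions a label can never be longer than 12 characters.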

Problem with frozen pb

I trained the model with a custom dataset and got the checkpoint files. I froze the model using this script:

import tensorflow as tf
def freeze_graph(model_dir, output_node_names, frozen_graph_name):
    if not tf.gfile.Exists(model_dir):
        raise AssertionError(
            "Export directory doesn't exist. Please specify an export "
            "directory: %s" % model_dir)

    if not output_node_names:
        print("You need to supply the name of a node to --output_node_names.")
        return -1

    # We retrieve our checkpoint fullpath
    checkpoint = tf.train.get_checkpoint_state(model_dir)
    input_checkpoint = checkpoint.model_checkpoint_path

    # Build the full path of the frozen graph file
    absolute_model_dir = "/".join(input_checkpoint.split('/')[:-1])
    output_graph = absolute_model_dir + "/" + frozen_graph_name + ".pb"

    # We clear devices to allow TensorFlow to control on which device it will load operations
    clear_devices = True

    # We start a session using a temporary fresh Graph
    with tf.Session(graph=tf.Graph()) as sess:
        # We import the meta graph in the current default Graph
        saver = tf.train.import_meta_graph(input_checkpoint + '.meta', clear_devices=clear_devices)

        # We restore the weights
        saver.restore(sess, input_checkpoint)
        gd = sess.graph.as_graph_def()
        # We use a built-in TF helper to export variables to constants
        output_graph_def = tf.graph_util.convert_variables_to_constants(
            sess,  # The session is used to retrieve the weights
            gd,  # The graph_def is used to retrieve the nodes
            output_node_names.split(",")  # The output node names select which nodes to keep
        )

        # Finally we serialize and dump the output graph to the filesystem
        with tf.gfile.GFile(output_graph, "wb") as f:
            f.write(output_graph_def.SerializeToString())
        print("%d ops in the final graph." % len(output_graph_def.node))

    return output_graph_def

freeze_graph('./checkpoint', 'SparseToDense', 'ocr')  # ".pb" is appended inside freeze_graph

But when I'm loading the graph from the protobuf file, I'm getting this error:

ValueError: Input 0 of node import/cnn/unit-4/bn4/BatchNorm/AssignMovingAvg/cnn/unit-4/bn4/BatchNorm/moving_mean/AssignAdd was passed float from import/cnn/unit-4/bn4/BatchNorm/cnn/unit-4/bn4/BatchNorm/moving_mean/local_step:0 incompatible with expected float_ref.

I know this is a little off topic but any help is appreciated.

Feature extraction using CNN

Hi,
I would like to extract the feature sequence of a text-line image using the CNN.
How can I do this?
Thank you in advance for your help.

How to run inference one image at a time?

Sorry to bother you. The batch size is used in the LSTM structure, so at inference time I have to send a full batch: one real image and zeros for the rest.
How can I run inference on one image at a time?
Thanks so much!

Natural scene

Can this recognize text in a natural scene, such as letters on a billboard?

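Test set accuracy problem

Hello, with my own data the accuracy on the test set is fairly good, but when I checked the errors, they are all missed detections of overlapping (repeated) characters. Here are some of the test set results: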
38387_1077.jpg 107
38388_100,005.jpg 100,05
38389_1077.jpg 107
38393_100,005.jpg 10,05
38394_1077.jpg 107
38640_131,005.jpg 131,05
38797_61,051.jpg 61,0651
39128_44,438.jpg 4,438
39545_157,333.jpg 157,33
4876_314,554.jpg 314,54
4878_268,866.jpg 268,86
5223_111,055.jpg 111,05
5276_211,904.jpg 21,904
546_32,772.jpg 32,72
571_128,883.jpg 128,83
664_148,733.jpg 148,73
672_144,150.jpg 14,150
7218_102,332.jpg 102,32
7221_100,267.jpg 10,2657
7654_77,132.jpg 7,132
7731_215,223.jpg 215,23
7787_111,702.jpg 11,702
7791_104,773.jpg 104,73
Is there a good solution for this problem? What should I adjust? Thanks.
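
One thing worth checking for errors like these: CTC can only emit the same character twice in a row if a blank is predicted between the two copies, so labels with repeated characters need extra time steps. The sketch below is plain Python with hypothetical helper names. It computes the minimum number of time steps a label needs and parses two of the result lines above, assuming the ground truth is the part of the filename between `_` and `.jpg`. Whether this is the actual cause here is not confirmed in the thread.

```python
def min_ctc_steps(label):
    """Minimum sequence length CTC needs to emit `label`.

    One step per character, plus one blank between each pair of
    identical neighbours.
    """
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats


def parse_result(line):
    """Split '38387_1077.jpg 107' into (ground truth, prediction)."""
    filename, predicted = line.split()
    truth = filename.rsplit(".", 1)[0].split("_", 1)[1]
    return truth, predicted


for line in ["38387_1077.jpg 107", "38388_100,005.jpg 100,05"]:
    truth, predicted = parse_result(line)
    print(truth, predicted, min_ctc_steps(truth))
```

If the required number of steps gets close to the width of the feature sequence fed to the LSTM, the network has little room to place those blanks, and a wider input or less downsampling may help.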

Change the image width and height

Hello, I changed the image size (height, width) from (60, 180) to (80, 500), and then I get an error:

InvalidArgumentError (see above for traceback): Matrix size-incompatible: In[0]: [40,288], In[1]: [176,512]
[[Node: lstm/rnn/while/multi_rnn_cell/cell_0/lstm_cell/lstm_cell/MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](lstm/rnn/while/multi_rnn_cell/cell_0/lstm_cell/lstm_cell/concat, lstm/rnn/while/multi_rnn_cell/cell_0/lstm_cell/lstm_cell/MatMul/Enter)]]
[[Node: Mean/_37 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_950_Mean", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

Is there anything else I should change to fix this error?
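
The two shapes in the error can be reproduced with a little arithmetic, under these assumptions: four stride-2 poolings with 'SAME' padding, `num_hidden=128`, and an LSTM input feature size of `feature_h * feature_w` (which would also explain a hard-coded 48 for 60x180 images). The helper names are illustrative, not the repository's.

```python
import math


def feature_size(height, width, num_pools=4):
    """Feature-map height and width after `num_pools` stride-2 poolings.

    Assumption: 'SAME' padding, so each pooling rounds up.
    """
    for _ in range(num_pools):
        height = math.ceil(height / 2)
        width = math.ceil(width / 2)
    return height, width


def lstm_matmul_rows(height, width, num_hidden=128):
    """Rows of the LSTM kernel: input features plus hidden state size."""
    feature_h, feature_w = feature_size(height, width)
    return feature_h * feature_w + num_hidden


print(lstm_matmul_rows(60, 180))  # 176, the kernel shape in the error
print(lstm_matmul_rows(80, 500))  # 288, what the new images produce
```

If these assumptions hold, any value in the model that is fixed at 48 would have to become 160 (5 x 32) for 80x500 inputs.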

Training does not begin

Hi guys,

I prepared a small dataset just to try out the network and see how it works. It seems to load the dataset fine and prints "begin training", but after that it just stops and does nothing. Here is what I see on screen:
CUDA_VISIBLE_DEVICES=0 python ./main.py --train_dir=./imgs/train/ --val_dir=./imgs/val/ --image_height=60 --image_width=180 --image_channel=1 --out_channels=64 --num_hidden=128 --batch_size=128 --log_dir=./log/train --num_gpus=1 --mode=train

feature_h: 4, feature_w: 12
lstm input shape: [128, 12, 256]
loading train data
('size: ', 11)
loading validation data
size: 6

2018-05-29 11:47:19.300427: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-05-29 11:47:19.954690: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-05-29 11:47:19.955398: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: GeForce GTX 960M major: 5 minor: 0 memoryClockRate(GHz): 1.176
pciBusID: 0000:01:00.0
totalMemory: 3.95GiB freeMemory: 3.50GiB
2018-05-29 11:47:19.955416: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-05-29 11:47:20.485722: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-05-29 11:47:20.485760: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
2018-05-29 11:47:20.485768: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
2018-05-29 11:47:20.485968: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3237 MB memory) -> physical GPU (device: 0, name: GeForce GTX 960M, pci bus id: 0000:01:00.0, compute capability: 5.0)
=============================begin training=============================
As you can see, training does not begin, and I don't get any errors either.

How to run inference on test images?

Hi all, thanks a lot for this great project!
I have successfully trained the model on 32x128 OCR images. My question is: how do we run the model on new test images? With a sliding window? Generally speaking, the images coming from a preceding text detection stage have variable lengths, so how do we feed them into the model to get predictions? I thought about a sliding window, but could you please give some advice or point to reference papers on this? Thanks.
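
One common alternative to a sliding window, offered here only as a sketch and not as the author's method, is to scale each detected crop to the training height while keeping its aspect ratio and then pad it on the right up to the training width. The helper below is hypothetical and only computes the sizes; the 32x128 defaults come from the question above.

```python
def fit_to_model(height, width, target_h=32, target_w=128):
    """Scale to `target_h` keeping aspect ratio; report the padding needed.

    Returns (new_width, pad_right). If the scaled image is wider than
    `target_w`, it is squeezed to `target_w` and no padding is added.
    """
    new_width = max(1, round(width * target_h / height))
    new_width = min(new_width, target_w)
    return new_width, target_w - new_width


print(fit_to_model(64, 200))  # (100, 28)
print(fit_to_model(32, 400))  # (128, 0)
```

Crops that are much wider than the training width lose detail when squeezed, so very long text lines may still need to be split or handled by a model trained on wider inputs.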

Labels in infer mode

Hi @watsonyanghx, I found your code and I think I'm going to use it for license plate OCR, but I want to ask first:
In infer mode, do the images I want to test need to have labels?

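About the batch normalization layer

If you want moving_mean and moving_variance to be updated and saved, it seems you need to write the following code:
extra_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(extra_update_ops):
    train_op = optimizer.minimize(loss)
But I could not find anything like this in the code. Did the author forget to add it?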

Helper.py error

Hey guys,

Recently I have been trying to work with this network, but when I prepare the data using helper.py I run into an error. I have not modified the file apart from the image and label paths. Here is the error I get after running the script. I would appreciate it if anyone could help me with this:
Traceback (most recent call last):
  File "helper.py", line 116, in <module>
    image_path_list = load_img_path(images_path)
  File "helper.py", line 68, in load_img_path
    tmp.sort(key=lambda x: int(x.split('.')[0]))
  File "helper.py", line 68, in <lambda>
    tmp.sort(key=lambda x: int(x.split('.')[0]))
ValueError: invalid literal for int() with base 10: 'labels'
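
The `ValueError` comes from `int('labels')`: the sort key assumes every entry in the directory is named like `123.png`, and an entry whose name starts with `labels` (for example a `labels.txt` file) breaks that. A minimal sketch of a more tolerant sort follows, where `sort_numeric_images` is a hypothetical helper and not part of `helper.py`:

```python
def sort_numeric_images(names):
    """Keep only names whose stem is an integer and sort them numerically."""
    numeric = [name for name in names if name.split(".")[0].isdigit()]
    return sorted(numeric, key=lambda name: int(name.split(".")[0]))


print(sort_numeric_images(["10.png", "2.png", "labels.txt", "1.png"]))
# ['1.png', '2.png', '10.png']
```

Pointing the image path at a directory that contains only the images would avoid the error as well.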

IndexError: list index out of range

Hi, Thanks for sharing the great work!

I downloaded the data based on the suggestion of this link.

Then I tried running the training script, but hit the error below:

    train_feeder = utils.DataIterator(data_dir=train_dir)
  File "/home/levin/workspace/snrprj/CNN_LSTM_CTC_Tensorflow/utils.py", line 73, in __init__
    code = image_name.split('/')[-1].split('_')[1].split('.')[0]
IndexError: list index out of range

It looks to me like the script expects to get the label for each image from its filename. So to run the code properly and train the model, we first have to rename the image files based on the labels.txt file. Is this correct?
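
The failing line takes the label from the part of the filename after the first `_`, so a file named like `0.png` has no second element to index. A sketch of that naming convention in plain Python follows. Both helpers are hypothetical, and pairing image `i.png` with the i-th label is an assumption about `labels.txt`, not something confirmed in this thread.

```python
import os


def labelled_name(index, label, ext=".png"):
    """Build the '<index>_<label><ext>' name the data loader expects."""
    return "{}_{}{}".format(index, label, ext)


def label_from_filename(path):
    """Recover the label from '<index>_<label><ext>'; None if there is no '_'."""
    stem, _ext = os.path.splitext(os.path.basename(path))
    _index, sep, label = stem.partition("_")
    return label if sep else None


print(labelled_name(73091, "(8+9)*4"))              # 73091_(8+9)*4.png
print(label_from_filename("imgs/train/7_3+5.png"))  # 3+5
print(label_from_filename("imgs/train/0.png"))      # None
```

Returning `None` instead of raising makes it easy to report which files still need renaming before training starts.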

ValueError: need more than 2 values to unpack

I trained with 60*180 images and num_classes = 12, and then got this error:

loading train data, please wait---------------------
('get image: ', 15000)
loading validation data, please wait---------------------
('get image: ', 4999)
2017-11-06 13:16:08.547013: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2017-11-06 13:16:08.641450: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-11-06 13:16:08.641675: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1031] Found device 0 with properties:
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.86
pciBusID: 0000:01:00.0
totalMemory: 7.92GiB freeMemory: 7.29GiB
2017-11-06 13:16:08.641691: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
=============================begin training=============================
No handlers could be found for logger "Traing for OCR using CNN+LSTM+CTC"
('batch', 99, ': time', 0.1522228717803955)
('batch', 199, ': time', 0.1701350212097168)
('batch', 299, ': time', 0.14639997482299805)
('batch', 99, ': time', 0.1512739658355713)
Traceback (most recent call last):
File "main.py", line 215, in

File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "main.py", line 207, in main

File "main.py", line 111, in train

ValueError: need more than 2 values to unpack

The value of num_classes

Hi, thank you for your code. I am confused by num_classes:
+-* + () + 10 digit + blank + space
num_classes = 3 + 2 + 10 + 1 + 1
I understand that the CTC loss needs a special ctc_blank class, but there is no space in the labels, so why is 1 added twice?
I noticed that in the training phase the code runs the part below to generate labels:
charset = '0123456789+-*()'
encode_maps = {}
decode_maps = {}
for i, char in enumerate(charset, 1):
    encode_maps[char] = i
    decode_maps[i] = char

SPACE_INDEX = 0
SPACE_TOKEN = ''
encode_maps[SPACE_TOKEN] = SPACE_INDEX
decode_maps[SPACE_INDEX] = SPACE_TOKEN
Since there is no space in your labels, if encode_maps[SPACE_TOKEN] = SPACE_INDEX is removed, would num_classes no longer need the extra 1?
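
The class count in the question can be checked with a short sketch. Because `enumerate(charset, 1)` starts at 1, index 0 stays reserved whether or not `encode_maps[SPACE_TOKEN]` is set, so removing that line alone would not shrink the class count; the characters would also have to be renumbered from 0. The sketch assumes the TensorFlow 1.x `tf.nn.ctc_loss` convention that the blank is the last class.

```python
charset = "0123456789+-*()"

# Characters are numbered from 1, so index 0 is never used by a character.
encode_maps = {char: i for i, char in enumerate(charset, 1)}
decode_maps = {i: char for char, i in encode_maps.items()}

SPACE_INDEX = 0  # reserved slot, mapped to the empty string in the issue

# Assumption: the CTC blank is the last class (index num_classes - 1),
# which is the TensorFlow 1.x `tf.nn.ctc_loss` convention.
num_classes = len(charset) + 1 + 1  # characters + index 0 + CTC blank
blank_index = num_classes - 1

print(num_classes, blank_index, max(encode_maps.values()))  # 17 16 15
```

This matches the 3 + 2 + 10 + 1 + 1 = 17 breakdown quoted above.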

Feed Images with Variable length

Hey guys, I am trying to train the model on images of variable width. The dataset here has a fixed image size, so the model runs without problems, but with other datasets such as IAM we get errors because the images are not a fixed size. One technique that has been mentioned is zero padding. My question is: am I supposed to zero-pad the images themselves before feeding them to the network, and are there other ways to handle variable sizes?

Thanks
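
For reference, zero padding applied to the images themselves could look like the sketch below. It uses plain nested lists so that it runs without extra libraries, assumes all images in a batch already share the same height, and the helper names are hypothetical rather than taken from the repository.

```python
def pad_to_width(image, target_width, fill=0):
    """Right-pad every row of a 2-D image (list of rows) with `fill`."""
    return [row + [fill] * (target_width - len(row)) for row in image]


def pad_batch(images, fill=0):
    """Pad all images to the widest one; also return the original widths."""
    widths = [len(image[0]) for image in images]
    target = max(widths)
    return [pad_to_width(image, target, fill) for image in images], widths


batch, widths = pad_batch([[[1, 2, 3], [4, 5, 6]], [[7], [8]]])
print(batch)   # [[[1, 2, 3], [4, 5, 6]], [[7, 0, 0], [8, 0, 0]]]
print(widths)  # [3, 1]
```

Keeping the original widths is useful because they can be turned into per-image sequence lengths, so that the padded columns are ignored by the CTC loss.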

Cost keeps decreasing while accuracy is always zero

Hello everyone,
I can run this script on the author's dataset without problems, but I run into the trouble described in the title when I train the model on my own dataset.
Some samples from my dataset:
1000072_13 169 121 122 123 10 11 12 149 150 53 84 151 152 66 67 68 69 50 40 39 43 45 51 46
1000060_168 169 13 14 15 21 25 169 170 20 13 169 171 54 172 173 22 52 53 54 55 36 20 13 13
1000018_61 62 63 29 64 65 53 66 67 68 69 121 122 123 10 11 12 176 177 22 112 13 20 56 115
1000016_172 173 22 52 53 54 55 36 20 13 13 20 174 174 70 56 18 153 154 155 156 175 158 65 53
These images are 30x500, with 25 characters each. I used about 260k of them for training and 65k for validation.
The words in the images are randomly selected from drug information text, like this:
with open('thistxt', 'r', encoding='utf-8') as f:
    # read the file and split it into a list of lines
    all_lines = f.read().strip().split('\n')
# join the lines into a single string
data_str = ''.join(all_lines)
# take a word starting at a random index
rand_word = data_str[a_rand_num:a_rand_num + word_length]
There are 196 unique characters in this text, so num_classes in my model is 196. Is my dataset not large enough, or is it something else? I'd appreciate it if anyone can help. Replies in Chinese are fine too.

How does the inference work?

I started training the model and stopped it manually with a keyboard interrupt to test inference, but when I run the inference command I get no errors and nothing happens.

raise _exceptions.DuplicateFlagError.from_flag

Hi, I tried to train using this command:

main.py --train_dir=../imgs/train/ --val_dir=../imgs/val/ --image_height=60 --image_width=180 --image_channel=1 --out_channels=64 --num_hidden=128 --batch_size=128 --log_dir=./log/train --num_gpus=1 -mode=train

But got this error:

Traceback (most recent call last):
  File "C:\Projects\CNN_LSTM_CTC_Tensorflow\main.py", line 14, in <module>
    import cnn_lstm_otc_ocr
  File "C:\Projects\CNN_LSTM_CTC_Tensorflow\cnn_lstm_otc_ocr.py", line 6, in <module>
    import utils
  File "C:\Projects\CNN_LSTM_CTC_Tensorflow\utils.py", line 43, in <module>
    tf.app.flags.DEFINE_string('log_dir', './log', 'the logging dir')
  File "C:\Users\N1throServer\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow\python\platform\flags.py", line 58, in wrapper
    return original_function(*args, **kwargs)
  File "C:\Users\N1throServer\AppData\Local\Programs\Python\Python37\lib\site-packages\absl\flags\_defines.py", line 241, in DEFINE_string
    DEFINE(parser, name, default, help, flag_values, serializer, **args)
  File "C:\Users\N1throServer\AppData\Local\Programs\Python\Python37\lib\site-packages\absl\flags\_defines.py", line 82, in DEFINE
    flag_values, module_name)
  File "C:\Users\N1throServer\AppData\Local\Programs\Python\Python37\lib\site-packages\absl\flags\_defines.py", line 104, in DEFINE_flag
    fv[flag.name] = flag
  File "C:\Users\N1throServer\AppData\Local\Programs\Python\Python37\lib\site-packages\absl\flags\_flagvalues.py", line 430, in __setitem__
    raise _exceptions.DuplicateFlagError.from_flag(name, self)
absl.flags._exceptions.DuplicateFlagError: The flag 'log_dir' is defined twice. First from absl.logging, Second from utils.  Description from first occurrence: directory to write logfiles into

How can it be fixed?

Can this algorithm deal with a variable number of characters?

I ran this code successfully on both the training set and the validation set. Then I edited one of the validation images to add 2 characters: previously it was '7+0 * 9', and I changed it to '7+0 * 9+7'. But it was recognized as '7+(0 * 9)'. The '+7' has the same font style as the rest of the image because I copied it from there, so it is not a font issue. I attached the image I made. Please take a look. Can you tell me why?

A small question

Where does the 48 in x.set_shape([shp[0], filters[3], 48]) come from? I tried shp = x.get_shape().as_list() and then used shp[1] or shp[2] instead, but both raise errors. Why? Does anyone know?

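The filename 73091_(8+9)*4.png contains special characters and cannot be created. How did you handle this?

I read your CNN_LSTM_CTC_Tensorflow source code online and downloaded the dataset. I want to reproduce your results and have a few questions. Thanks!
1. I downloaded the source code from https://github.com/watsonyanghx/CNN_LSTM_CTC_Tensorflow and also downloaded and unpacked the dataset.
D:\Tensorflow\CNN_LSTM_CTC_Tensorflow-master\imgs
Directory structure after unpacking:
imgs\labels.txt
imgs\image_contest_level_1\

2. After running helper.py, four files are generated in the imgs directory as expected: X_train.txt, X_val.txt, y_train.txt and y_val.txt.

X_train.txt: training filenames
X_val.txt: test filenames

y_train.txt: training answers
y_val.txt: test answers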

cp_file(X_train, y_train, './imgs/train/')
cp_file(X_val, y_val, './imgs/val/')
"D:\Program Files\Python365\python36.exe" D:/Tensorflow/CNN_LSTM_CTC_Tensorflow-master/helper.py
['./imgs/image_contest_level_1/0.png' './imgs/image_contest_level_1/1.png'
'./imgs/image_contest_level_1/2.png' './imgs/image_contest_level_1/3.png'
'./imgs/image_contest_level_1/4.png' './imgs/image_contest_level_1/5.png'
'./imgs/image_contest_level_1/6.png' './imgs/image_contest_level_1/7.png'
'./imgs/image_contest_level_1/8.png' './imgs/image_contest_level_1/9.png']
Traceback (most recent call last):
File "D:/Tensorflow/CNN_LSTM_CTC_Tensorflow-master/helper.py", line 129, in
cp_file(X_train, y_train, './imgs/train/')
File "D:/Tensorflow/CNN_LSTM_CTC_Tensorflow-master/helper.py", line 102, in cp_file
shutil.copyfile(file_path, dest_filename)
File "D:\Program Files\Python365\lib\shutil.py", line 121, in copyfile
with open(dst, 'wb') as fdst:
OSError: [Errno 22] Invalid argument: './imgs/train/73091_(8+9)*4.png'

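Process finished with exit code 1
The filename 73091_(8+9)*4.png contains special characters and cannot be created. How did you handle this?

3. Running cnn_lstm_otc_ocr.py reports an error: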

"D:\Program Files\Python365\python36.exe" D:/Tensorflow/CNN_LSTM_CTC_Tensorflow-master/cnn_lstm_otc_ocr.py
D:/Tensorflow/CNN_LSTM_CTC_Tensorflow-master/cnn_lstm_otc_ocr.py:42: SyntaxWarning: assertion is always true, perhaps remove parentheses?
assert (FLAGS.cnn_count <= count_, "FLAGS.cnn_count should be <= {}!".format(count_))

4. Running main.py

"D:\Program Files\Python365\python36.exe" D:/Tensorflow/CNN_LSTM_CTC_Tensorflow-master/main.py

feature_h: 4, feature_w: 12
lstm input shape: [40, 12, 256]
loading train data
size: 0
loading validation data
size: 0

2018-07-12 11:02:14.624545: I c:\users\user\source\repos\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2018-07-12 11:02:14.844809: I c:\users\user\source\repos\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1356] Found device 0 with properties:
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.683
pciBusID: 0000:01:00.0
totalMemory: 8.00GiB freeMemory: 6.63GiB
2018-07-12 11:02:14.845239: I c:\users\user\source\repos\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0
2018-07-12 11:02:16.119318: I c:\users\user\source\repos\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-07-12 11:02:16.119683: I c:\users\user\source\repos\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929] 0
2018-07-12 11:02:16.119937: I c:\users\user\source\repos\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0: N
2018-07-12 11:02:16.137500: I c:\users\user\source\repos\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6410 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)
=============================begin training=============================

Process finished with exit code 0

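5. Running utils.py
    "D:\Program Files\Python365\python36.exe" D:/Tensorflow/CNN_LSTM_CTC_Tensorflow-master/utils.py

Process finished with exit code 0

Thanks for your guidance. Do you have WeChat or QQ so that I can contact you to ask questions and learn? Thanks.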
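
The `OSError` for `73091_(8+9)*4.png` above is consistent with Windows file naming rules, which reserve the characters `< > : " / \ | ? *`. A small sketch for spotting affected names before copying follows; `windows_unsafe_chars` is a hypothetical helper, not part of `helper.py`.

```python
WINDOWS_RESERVED = set('<>:"/\\|?*')


def windows_unsafe_chars(filename):
    """Return the characters in `filename` that Windows does not allow."""
    return {char for char in filename if char in WINDOWS_RESERVED}


print(windows_unsafe_chars("73091_(8+9)*4.png"))  # {'*'}
print(windows_unsafe_chars("73091_(8+9)-4.png"))  # set()
```

Labels containing such characters would have to be kept somewhere other than the filename (for example in a text file next to the images) or encoded before being written into it. Which option fits the repository is left open here.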

Why does seq_len equal batch_size?

According to cnn_lstm_otc_ocr.py:
self.seq_len = tf.fill([x.get_shape().as_list()[0]], feature_w)

seq_len equals batch_size.
But why?
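
For context, `tf.fill(dims, value)` creates a tensor of shape `dims` filled with `value`, so in the line quoted above the first argument is the shape and the second is the content. `seq_len` therefore has one entry per image in the batch, and each entry equals `feature_w`: its length is the batch size, its values are not. A plain-Python stand-in, where the `fill` helper is illustrative only:

```python
def fill(dims, value):
    """Plain-Python stand-in for a 1-D `tf.fill(dims, value)`."""
    return [value] * dims[0]


batch_size, feature_w = 4, 12
seq_len = fill([batch_size], feature_w)
print(seq_len)       # [12, 12, 12, 12]
print(len(seq_len))  # 4
```

CTC ops expect exactly this layout: a vector with one sequence length per batch element.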

Sharing the model

Hello,

Can you share the model with which you got 99% accuracy, for example by uploading it to Google Drive or Box?

Thanks!

Mode: Infer does not work

I have trained on 100K images with an 80:20 training-to-validation ratio. My model has saved 9 checkpoints. My test set consists of 40 images taken from the same validation set, just for testing the code. The test set is labeled 1 to 40. But when I run this command:

python ./main.py --infer_dir=./imgs/infer/
--checkpoint_dir=./checkpoint/
--num_gpus=0
--mode=infer
the following error is produced:

2018-01-31 15:25:43.350523: W tensorflow/core/framework/op_kernel.cc:1198] Failed precondition: sequence_length(0) <= 12
Traceback (most recent call last):
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1350, in _do_call
return fn(*args)
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1329, in _run_fn
status, run_metadata)
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in exit
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.FailedPreconditionError: sequence_length(0) <= 12
[[Node: CTCBeamSearchDecoder = CTCBeamSearchDecoder[beam_width=100, merge_repeated=false, top_paths=1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](lstm/transpose_1, _arg_lstm/Fill_0_1)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "./main.py", line 185, in
tf.app.run()
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 124, in run
_sys.exit(main(argv))
File "./main.py", line 180, in main
infer(FLAGS.infer_dir, FLAGS.mode)
File "./main.py", line 155, in infer
dense_decoded_code = sess.run(model.dense_decoded, feed)
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1128, in _run
feed_dict_tensor, options, run_metadata)
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1344, in _do_run
options, run_metadata)
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1363, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.FailedPreconditionError: sequence_length(0) <= 12
[[Node: CTCBeamSearchDecoder = CTCBeamSearchDecoder[beam_width=100, merge_repeated=false, top_paths=1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](lstm/transpose_1, _arg_lstm/Fill_0_1)]]

Caused by op 'CTCBeamSearchDecoder', defined at:
File "./main.py", line 185, in
tf.app.run()
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 124, in run
_sys.exit(main(argv))
File "./main.py", line 180, in main
infer(FLAGS.infer_dir, FLAGS.mode)
File "./main.py", line 115, in infer
model.build_graph()
File "/home/anubhav/Downloads/Manish Sir/CNN_LSTM_CTC_Tensorflow-master (2)/cnn_lstm_otc_ocr.py", line 24, in build_graph
self._build_train_op()
File "/home/anubhav/Downloads/Manish Sir/CNN_LSTM_CTC_Tensorflow-master (2)/cnn_lstm_otc_ocr.py", line 158, in _build_train_op
merge_repeated=False)
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/ops/ctc_ops.py", line 273, in ctc_beam_search_decoder
merge_repeated=merge_repeated))
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/ops/gen_ctc_ops.py", line 77, in _ctc_beam_search_decoder
top_paths=top_paths, merge_repeated=merge_repeated, name=name)
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
op_def=op_def)
File "/home/anubhav/.virtualenvs/cv/local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1625, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

FailedPreconditionError (see above for traceback): sequence_length(0) <= 12
[[Node: CTCBeamSearchDecoder = CTCBeamSearchDecoder[beam_width=100, merge_repeated=false, top_paths=1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](lstm/transpose_1, _arg_lstm/Fill_0_1)]]

How can I deal with this error and run the program correctly?

For further details, check out issue #8.
