
flming / crnn.tf2

149 stars · 9 watchers · 54 forks · 1.34 MB

Convolutional Recurrent Neural Network(CRNN) for End-to-End Text Recognition - TensorFlow 2

License: MIT License

Python 99.34% Dockerfile 0.66%
crnn ocr scene-text-recognition tensorflow2 keras ctc tf2 tensorflow-lite

crnn.tf2's Introduction

Convolutional Recurrent Neural Network for End-to-End Text Recognition - TensorFlow 2


This is a re-implementation of the CRNN network, built with TensorFlow 2. This repository may help you understand how to build an end-to-end text recognition network easily. Here is the official repo implemented by bgshih.

Abstract

This repo aims to build a simple, efficient text recognition network using the various components of TensorFlow 2. The model is built with the Keras API, the data pipeline with tf.data, and training is done with model.fit, so we can use most of the functionality provided by TensorFlow 2, such as TensorBoard, distribution strategies, the TensorFlow Profiler, etc.
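
To make that composition concrete, here is a minimal, hypothetical sketch of a CRNN-style model assembled from standard Keras layers. This is not the repo's exact architecture; the layer sizes and NUM_CLASSES are illustrative only.

import tensorflow as tf

NUM_CLASSES = 38  # assumption: character table size + CTC blank

def build_crnn(img_height=32, img_channels=1):
    # Backbone: convolutional layers extract a feature map from the image.
    images = tf.keras.Input(shape=(img_height, None, img_channels))
    x = tf.keras.layers.Conv2D(64, 3, padding='same', activation='relu')(images)
    x = tf.keras.layers.MaxPool2D(2)(x)                  # (B, H/2, W/2, 64)
    # Treat each horizontal position as one timestep for the recurrent part.
    x = tf.keras.layers.Permute((2, 1, 3))(x)            # (B, W/2, H/2, 64)
    x = tf.keras.layers.Reshape((-1, (img_height // 2) * 64))(x)
    x = tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(128, return_sequences=True))(x)
    # Per-timestep logits over the character table; a model like this would be
    # trained with a CTC loss via model.fit on a tf.data pipeline.
    logits = tf.keras.layers.Dense(NUM_CLASSES, name='logits')(x)
    return tf.keras.Model(images, logits)

build_crnn().summary()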

Installation

$ pip install -r requirements.txt

Demo

Here I provide an example model trained on the MJSynth dataset; this model can only predict 0-9 and a-z (ignoring case).

$ wget https://github.com/FLming/CRNN.tf2/releases/download/v0.2.0/SavedModel.tgz
$ tar xzvf SavedModel.tgz
$ python tools/demo.py --images example/images/ --config configs/mjsynth.yml --model SavedModel

Then you will see output like this:

Path: example/images/word_1.png, y_pred: [b'tiredness'], probability: [0.9998626]
Path: example/images/word_3.png, y_pred: [b'a'], probability: [0.67493004]
Path: example/images/2_Reimbursing_64165.jpg, y_pred: [b'reimbursing'], probability: [0.990946]
Path: example/images/word_2.png, y_pred: [b'kills'], probability: [0.9994573]
Path: example/images/1_Paintbrushes_55044.jpg, y_pred: [b'paintbrushes'], probability: [0.9984008]
Path: example/images/3_Creationisms_17934.jpg, y_pred: [b'creationisms'], probability: [0.99792457]

Regarding decode methods: beam search can sometimes give better results than the greedy method, but it is more costly.
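
For illustration, here is a hedged sketch using TensorFlow's built-in CTC decoders; the shapes and the logits tensor are made up, standing in for real model output.

import tensorflow as tf

batch, time_steps, num_classes = 1, 25, 38
logits = tf.random.normal((batch, time_steps, num_classes))
inputs = tf.transpose(logits, (1, 0, 2))   # both decoders expect time-major input
seq_len = tf.fill([batch], time_steps)

# Greedy: fast, simply takes the argmax at every timestep.
decoded, _ = tf.nn.ctc_greedy_decoder(inputs, seq_len)
# Beam search: explores many paths, often more accurate but much slower.
decoded_bs, _ = tf.nn.ctc_beam_search_decoder(inputs, seq_len, beam_width=100)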

Train

Before you start training, you should prepare the data first. All predictable characters are defined in the table.txt file. The training process is configured in the yml file.

The training script uses all GPUs by default; if you want to use a specific GPU, set the CUDA_VISIBLE_DEVICES environment variable.

$ python crnn/train.py --config configs/mjsynth.yml --save_dir PATH/TO/SAVE
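
For example, to restrict training to the first GPU (CUDA_VISIBLE_DEVICES is a standard CUDA environment variable, not a flag of this script):

$ CUDA_VISIBLE_DEVICES=0 python crnn/train.py --config configs/mjsynth.yml --save_dir PATH/TO/SAVE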

The training process can be visualized in TensorBoard.

$ tensorboard --logdir PATH/TO/MODEL_DIR

For more instructions, please refer to the config file.

Data preparation

To train this network, you should prepare a lookup table, images, and the corresponding labels. The example data is copied from the MJSynth and ICDAR2013 datasets.

The table file contains all characters plus the blank label (the blank can go last or anywhere, but the TensorFlow decoders currently can't change its position, so put it last). By the way, you can use any token as the blank label.
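
For illustration, a hypothetical digits-only table could look like this, one entry per line with the blank token last (the <BLK> spelling is illustrative; any token works):

0
1
2
3
4
5
6
7
8
9
<BLK>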

Image data

Since this is an end-to-end method, we don't need to annotate the position of each character in the image.

[Sample word images: Paintbrushes, Creationisms, Reimbursing]

The labels corresponding to these three pictures are Paintbrushes, Creationisms, Reimbursing.

Annotation file

We should write the image path and its corresponding label to a text file in a certain format, as in the example data. The data input pipeline will automatically detect the supported formats. Customization is also very simple; please check out the dataset factory.
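
For illustration only (the example folder in the repo is authoritative): an MJSynth-style annotation line embeds the label in the filename followed by a lexicon index, while an ICDAR2013-style line pairs the filename with a quoted label. Both example lines below are made up.

./2697/6/466_MONIKER_49537.jpg 49537
word_1.png, "Tiredness"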


Eval

$ python crnn/eval.py --config PATH/TO/CONFIG_FILE --weight PATH/TO/MODEL_WEIGHT

Convert & Ecosystem

There are many components here to help us do other things, for example deploying with TensorFlow Serving. Before you deploy, you can pick a good weight and convert the model to SavedModel format with this command; it appends the post-processing layer at the end and culls the optimizer:

$ python tools/export.py --config PATH/TO/CONFIG_FILE --weight PATH/TO/MODEL_WEIGHT --pre rescale --post greedy --output PATH/TO/OUTPUT

TensorFlow Lite can now also convert this model, which means you can deploy it to Android, iOS, etc.

Note: the decoders can't be converted to TensorFlow Lite because of their assets. Use the softmax layer or None as the post-processing option.
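
A minimal conversion sketch, assuming a SavedModel exported with the softmax or no post-processing layer as the note above requires; the paths are placeholders.

import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model('PATH/TO/SavedModel')
tflite_model = converter.convert()
with open('exported_model.tflite', 'wb') as f:
    f.write(tflite_model)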

crnn.tf2's People

Contributors

flming


crnn.tf2's Issues

Expected concatenating dimensions in the range [-1, 1), but got 1 [Op:ConcatV2] name: concat

Hello, when I run eval_full_tflite.py from B106Roger's CRNN.tf2 (URL: https://github.com/B106Roger/CRNN.tf2) with the beam search decoder, I get the following error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Expected concatenating dimensions in the range [-1, 1), but got 1 [Op:ConcatV2] name: concat


Since that author's code is derived from yours and the decoder.py part was not modified, I'd like to ask whether this error has appeared before, and how to solve it. Thanks!

Empty preds when trained on my own data

Hello! I am trying to train on my own data, starting from exported_model.h5. After 100 epochs I get empty predictions:

[screenshot]

My config:

train:
    dataset_builder: &ds_builder
        table_path: 'data/table.txt'
        # 1: Grayscale image, 3: RGB image
        img_channels: 3
        # The image that width greater than max img_width will be dropped.
        # Only work with image width is null.
        max_img_width: 400
        ignore_case: false
        # If it is not null, the image will be distorted.
        img_width: null
        # If change height, change the net.
        img_height: 32
    train_ann_paths:
        - 'data/dataset/annotation_train.txt'
        - 'data/dataset/annotation_val.txt'
    val_ann_paths:
        - 'data/dataset/annotation_test.txt'
    batch_size_per_replica: 256
    # The model for restore, even if the number of characters is different
    restore: 'exported_model.h5'
    learning_rate: 0.001
    # Number of epochs to train.
    epochs: 100
    # Reduce learning rate when a metric has stopped improving.
    reduce_lr:
        factor: 0.5
        patience: 5
        min_lr: 0.0001
    # Tensorboard
    tensorboard:
        histogram_freq: 1
        profile_batch: 0

eval:
    dataset_builder:
        <<: *ds_builder
    ann_paths:
        - '/datasets/ICDAR/2013/Challenge2_Test_Task3_gt.txt'
    batch_size: 1

TensorBoard: [screenshot]

If I run the demo with exported_model.h5 I get predictions. What am I doing wrong?

Invalid argument error

I trained with this code for Korean and I got this error:

Traceback (most recent call last):
  File "crnn/train.py", line 58, in <module>
    model.fit(train_ds, epochs=config['epochs'], callbacks=callbacks,
  File "/home/***/miniconda3/envs/gpu38/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 1183, in fit
    tmp_logs = self.train_function(iterator)
  File "/home/***/miniconda3/envs/gpu38/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 889, in __call__
    result = self._call(*args, **kwds)
  File "/home/***/miniconda3/envs/gpu38/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 917, in _call
    return self._stateless_fn(*args, **kwds)  # pylint: disable=not-callable
  File "/home/***/miniconda3/envs/gpu38/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 3023, in __call__
    return graph_function._call_flat(
  File "/home/***/miniconda3/envs/gpu38/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1960, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "/home/***/miniconda3/envs/gpu38/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 591, in call
    outputs = execute.execute(
  File "/home/***/miniconda3/envs/gpu38/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: 3 root error(s) found.
  (0) Invalid argument:  data/im\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00pg; Invalid argument
	 [[{{node ReadFile}}]]
	 [[MultiDeviceIteratorGetNextFromShard]]
	 [[RemoteCall]]
	 [[IteratorGetNextAsOptional]]
	 [[ReadVariableOp/_290]]
  (1) Invalid argument:  data/im\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00pg; Invalid argument
	 [[{{node ReadFile}}]]
	 [[MultiDeviceIteratorGetNextFromShard]]
	 [[RemoteCall]]
	 [[IteratorGetNextAsOptional]]
  (2) Invalid argument:  data/im\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00pg; Invalid argument
	 [[{{node ReadFile}}]]
	 [[MultiDeviceIteratorGetNextFromShard]]
	 [[RemoteCall]]
	 [[IteratorGetNextAsOptional]]
	 [[replica_1/assert_equal_1/Assert/Assert/data_3/_218]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_23254]

Function call stack:
train_function -> train_function -> train_function

I think that if this code can recognize Chinese without problems it should handle other UTF-8 text too, but it can't.

What is the cause of this error?
The difference between my data and the example is the file names (they are Korean, e.g. 가나다_453.jpg), so could that be the cause?

Prediction accuracy in ONNX Runtime

I converted a model trained in CRNN.tf2 to ONNX format (tensorflow-onnx). Thanks to @FLming I know that it is impossible to run it in OpenCV, so I tried ONNX Runtime. It works, but I don't know how to get the prediction accuracy.

import onnxruntime as rt
import cv2
import numpy as np

img = cv2.imread("file.jpg")
# Assumption: the exported model expects a batched float32 image of height 32
# (the repo's default input shape); resize and add a batch dimension first.
img = cv2.resize(img, (100, 32)).astype(np.float32)
img = np.expand_dims(img, axis=0)

# Pass the model path directly; get_example only resolves onnxruntime's
# bundled sample files, not user models.
sess = rt.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
input_name = sess.get_inputs()[0].name
output_name = sess.get_outputs()[0].name
result = sess.run([output_name], {input_name: img})
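
One hedged way to approximate a confidence score, assuming the model was exported with a softmax head so that result[0] holds per-timestep class probabilities (this is an assumption, not the repo's documented ONNX output):

probs = result[0]                # assumed shape: (1, timesteps, num_classes)
best = probs.max(axis=-1)        # per-timestep probability of the chosen class
confidence = best.prod(axis=-1)  # naive product over timesteps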

Can you help me?

Epoch 1/20
2022-02-16 08:42:18.888733: W tensorflow/core/framework/op_kernel.cc:1692] OP_REQUIRES failed at strided_slice_op.cc:108 : Invalid argument: slice index 1 of dimension 0 out of bounds.
2022-02-16 08:42:18.888759: W tensorflow/core/framework/op_kernel.cc:1692] OP_REQUIRES failed at strided_slice_op.cc:108 : Invalid argument: slice index 1 of dimension 0 out of bounds.
2022-02-16 08:42:18.888775: W tensorflow/core/framework/op_kernel.cc:1692] OP_REQUIRES failed at strided_slice_op.cc:108 : Invalid argument: slice index 1 of dimension 0 out of bounds.
Traceback (most recent call last):
File "crnn/train.py", line 59, in
validation_data=val_ds)
File "/root/anaconda3/lib/python3.6/site-packages/keras/engine/training.py", line 1184, in fit
tmp_logs = self.train_function(iterator)
File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 885, in call
result = self._call(*args, **kwds)
File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 950, in _call
return self._stateless_fn(*args, **kwds)
File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 3040, in call
filtered_flat_args, captured_inputs=graph_function.captured_inputs) # pylint: disable=protected-access
File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1964, in _call_flat
ctx, args, cancellation_manager=cancellation_manager))
File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 596, in call
ctx=ctx)
File "/root/anaconda3/lib/python3.6/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: slice index 1 of dimension 0 out of bounds.
[[node strided_slice_1 (defined at crnn/train.py:59) ]]
[[MultiDeviceIteratorGetNextFromShard]]
[[RemoteCall]]
[[IteratorGetNextAsOptional]]
[[OptionalHasValue_1/_12]]
(1) Invalid argument: slice index 1 of dimension 0 out of bounds.
[[node strided_slice_1 (defined at crnn/train.py:59) ]]
[[MultiDeviceIteratorGetNextFromShard]]
[[RemoteCall]]
[[IteratorGetNextAsOptional]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_12630]

Function call stack:
train_function -> train_function

Error during training

Traceback (most recent call last):
  File "train.py", line 71, in <module>
    validation_data=val_ds)
  File "/mnt/e/ubuntu/anaconda3/envs/tflite/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 108, in _method_wrapper
    return method(self, *args, **kwargs)
  File "/mnt/e/ubuntu/anaconda3/envs/tflite/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 1102, in fit
    tmp_logs = self.train_function(iterator)
  File "/mnt/e/ubuntu/anaconda3/envs/tflite/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 796, in __call__
    result = self._call(*args, **kwds)
  File "/mnt/e/ubuntu/anaconda3/envs/tflite/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 839, in _call
    self._initialize(args, kwds, add_initializers_to=initializers)
  File "/mnt/e/ubuntu/anaconda3/envs/tflite/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 712, in _initialize
    *args, **kwds))
  File "/mnt/e/ubuntu/anaconda3/envs/tflite/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2948, in _get_concrete_function_internal_garbage_collected
    graph_function, _, _ = self._maybe_define_function(args, kwargs)
  File "/mnt/e/ubuntu/anaconda3/envs/tflite/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 3319, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/mnt/e/ubuntu/anaconda3/envs/tflite/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 3181, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/mnt/e/ubuntu/anaconda3/envs/tflite/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py", line 986, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/mnt/e/ubuntu/anaconda3/envs/tflite/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 614, in wrapped_fn
    return weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/mnt/e/ubuntu/anaconda3/envs/tflite/lib/python3.6/site-packages/tensorflow/python/framework/func_graph.py", line 973, in wrapper
    raise e.ag_error_metadata.to_exception(e)
TypeError: in user code:

    /mnt/e/ubuntu/anaconda3/envs/tflite/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py:809 train_function  *
        return step_function(self, iterator)
    /mnt/d/github/CRNN.tf2/metrics.py:26 update_state  *
        values = tf.math.reduce_any(tf.math.not_equal(y_true, y_pred), axis=1)
    /mnt/e/ubuntu/anaconda3/envs/tflite/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py:201 wrapper  **
        return target(*args, **kwargs)
    /mnt/e/ubuntu/anaconda3/envs/tflite/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py:1674 not_equal
        return gen_math_ops.not_equal(x, y, name=name)
    /mnt/e/ubuntu/anaconda3/envs/tflite/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py:6517 not_equal
        name=name)
    /mnt/e/ubuntu/anaconda3/envs/tflite/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py:537 _apply_op_helper
        inferred_from[input_arg.type_attr]))

    TypeError: Input 'y' of 'NotEqual' Op has type float32 that does not match type int64 of argument 'x'.

Could you please help me take a look at this error? After debugging I can see that y_true is None. Thanks!

Is 'Sequence Accuracy' referring only to exact matches?

If I'm understanding correctly, the 'sequence accuracy' metric, for example val_sequence_accuracy: 0.52, means that 52% of the images in the validation set are read perfectly. Is there a way to ignore differences in spaces? I'm trying to read entire lines of images at once, and would like a prediction of 'hello there' to be considered equivalent to a ground truth that differs only in spacing (even if the CTCLoss will be different, sequence_accuracy should be considered the same).
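
A possible workaround sketch (a hypothetical helper, not part of the repo's metrics): strip the spaces from both strings before comparing, so predictions differing only in whitespace count as equal.

import tensorflow as tf

def space_insensitive_match(y_true, y_pred):
    # Remove all spaces from both string tensors before the equality check.
    y_true = tf.strings.regex_replace(y_true, ' ', '')
    y_pred = tf.strings.regex_replace(y_pred, ' ', '')
    return tf.math.equal(y_true, y_pred)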

About MJSynth

Thanks for sharing! The MJSynth dataset download link cannot be opened. Could you share the dataset another way, such as a Baidu disk link or by email? [email protected]

the problem of Tibetan text recognition

Hi, it seems that you use chars = tf.strings.unicode_split(labels, 'UTF-8') to split label strings in this project. I want to use a grapheme-splitting function for Tibetan (Tibetan characters are variable-length Unicode), but I get the error 'OperatorNotAllowedInGraphError: iterating over tf.Tensor is not allowed: AutoGraph did not convert this function. Try decorating it directly with @tf.function.' What should I do? Do you have any good suggestions?
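
One possible workaround sketch: run a Python grapheme splitter eagerly through tf.py_function so AutoGraph does not trace it. The grapheme package and its suitability for Tibetan are assumptions, untested here.

import tensorflow as tf
import grapheme  # third-party package, assumed installed: pip install grapheme

def split_graphemes(label):
    # Runs eagerly inside tf.py_function, so plain Python iteration is fine.
    text = label.numpy().decode('utf-8')
    return tf.constant(list(grapheme.graphemes(text)))

chars = tf.py_function(split_graphemes, [tf.constant('example label')], Tout=tf.string)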

Handling invalid image path or corrupted image files.

The problem:
In the current implementation, if a path in the annotation file provided to DatasetBuilder does not exist or one of the files is somehow corrupted, the entire training comes to a halt and you lose all your progress.
Considering it could take hours to iterate through all images, this becomes very frustrating.

I came across this problem because a few images (around 50) got somehow corrupted while downloading the MJSynth dataset. I did try to clean them up as this solution suggested, but I'm still encountering weird nonsensical errors:

     [30] try:
---> [31]    model.fit(train_ds,
     [32]              epochs=EPOCHS,
     [33]              callbacks=callbacks,
     [34]              validation_data=val_ds,
     [35]              use_multiprocessing=True)
     [36] except KeyboardInterrupt:
     [37]    pass

File c:\Users\somso\AppData\Local\Programs\Python\Python38\lib\site-packages\keras\utils\traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     67     filtered_tb = _process_traceback_frames(e.__traceback__)
     68     # To get the full stack trace, call:
     69     # `tf.debugging.disable_traceback_filtering()`
---> 70     raise e.with_traceback(filtered_tb) from None
     71 finally:
     72     del filtered_tb

File c:\Users\somso\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\eager\execute.py:54, in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     52 try:
     53   ctx.ensure_initialized()
---> 54   tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
     55                                       inputs, attrs, num_outputs)
     56 except core._NotOkStatusException as e:
     57   if name is not None:

InvalidArgumentError: Graph execution error:

2 root error(s) found.
  (0) INVALID_ARGUMENT:  jpeg::Uncompress failed. Invalid JPEG data or crop window.
	 [[{{node DecodeJpeg}}]]
	 [[IteratorGetNext]]
	 [[assert_equal_3/Assert/Assert/data_0/_4]]
  (1) INVALID_ARGUMENT:  jpeg::Uncompress failed. Invalid JPEG data or crop window.
	 [[{{node DecodeJpeg}}]]
	 [[IteratorGetNext]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_11983]

What could be done:
Catch the exceptions thrown while trying to load images, log the caught exception in the terminal, and skip the offending file.

I honestly tried to come up with a solution myself but I still cannot understand how DatasetBuilder works. lol
I'd be happy to make a PR myself if you have an idea of how to fix this.
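
As a sketch of that idea (an assumption, not the repo's DatasetBuilder; the glob path is a placeholder), tf.data can drop elements that fail to decode instead of aborting training:

import tensorflow as tf

paths = tf.data.Dataset.list_files('data/images/*.jpg')

def load(path):
    raw = tf.io.read_file(path)
    return tf.io.decode_jpeg(raw, channels=3)  # raises on corrupt JPEG data

ds = paths.map(load, num_parallel_calls=tf.data.AUTOTUNE)
ds = ds.apply(tf.data.experimental.ignore_errors())  # silently skip bad files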

Prediction is not correct


I use the demo file to predict; the config file is shown below.
dataset_builder: &ds_builder
    table_path: 'example/table.txt'
    img_shape: [32, null, 3]
    max_img_width: 400
    ignore_case: true

train:
    dataset_builder:
        <<: *ds_builder
    train_ann_paths:
        - '/content/gdrive/MyDrive/CRNN/CRNN.tf2/example/annotation_train.txt'
        - '/content/gdrive/MyDrive/CRNN/CRNN.tf2/example/annotation_val.txt'
    val_ann_paths:
        - '/content/gdrive/MyDrive/CRNN/CRNN.tf2/example/annotation_test.txt'
    batch_size_per_replica: 32
    # Number of epochs to train.
    epochs: 2000
    lr_schedule:
        initial_learning_rate: 0.0001
        decay_steps: 600000
        alpha: 0.01
    # TensorBoard Arguments
    # https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/TensorBoard#arguments_1
    tensorboard:
        histogram_freq: 1
        profile_batch: 0

eval:
    dataset_builder:
        <<: *ds_builder
    ann_paths:
        - '/content/gdrive/MyDrive/CRNN/CRNN.tf2/example/annotation_eval.txt'
    batch_size: 1

The probability is [0.9999411], but the result shown is empty. It only displays numbers, not characters.
The result it returns is:
test_images/IMG_20211212_091803.jpg, y_pred: [b''], probability: [0.9999411]
while what I'm looking forward to is "HSD".
Hope you can answer.

Chinese predicting problems

python tools/demo.py --images demo/ --config configs/mjsynth.yml --model save/
2022-03-07 13:51:11.325152: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-03-07 13:51:11.334636: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-03-07 13:51:11.335111: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-03-07 13:51:11.335757: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-03-07 13:51:11.336192: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-03-07 13:51:11.336660: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-03-07 13:51:11.337103: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-03-07 13:51:11.673181: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-03-07 13:51:11.673651: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-03-07 13:51:11.674058: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-03-07 13:51:11.674440: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9596 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3060, pci bus id: 0000:01:00.0, compute capability: 8.6
2022-03-07 13:51:17.374815: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
2022-03-07 13:51:18.719922: I tensorflow/stream_executor/cuda/cuda_dnn.cc:369] Loaded cuDNN version 8100
2022-03-07 13:51:20.915022: I tensorflow/stream_executor/cuda/cuda_blas.cc:1760] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
Path: demo/2.jpg, y_pred: [b''], probability: [1.]
Path: demo/3.jpg, y_pred: [b'\xe9\x80\xa2\xe5\x9c\xba\xe7\xab\xbf\xe6\x9c\xa8'], probability: [0.9996128]
Path: demo/5.jpg, y_pred: [b'\xe5\x8f\xa4\xe4\xbb\x8a\xe5\xbf\x83\xe4\xba\xba\xe6\xb4\x81\xe6\x96\xb9\xe5\x8d\xab'], probability: [0.11679293]
Path: demo/1.jpg, y_pred: [b''], probability: [1.]
Path: demo/0.jpg, y_pred: [b'\xe8\x81\x94\xe4\xba\xa7\xe5\x93\x81'], probability: [0.99998957]
Path: demo/6.jpg, y_pred: [b'\xe7\x91\x9e\xe6\x99\xaf\xe8\x8b\x91\xe9\xa4\x90'], probability: [0.98446]
Path: demo/4.jpg, y_pred: [b'\xe5\xa4\xa9\xe7\xa7\x8b\xe6\x9c\x89\xe9\x9b\x81\xe7\xbe\xa4'], probability: [0.99987626]

Getting Nan loss on MJSynth Dataset

I downloaded the MJSynth dataset and followed the instructions, but during the first epoch of training the loss suddenly changes to NaN. I tried adding regularization and norm clipping, but neither fixed the problem.

Also, the sequence accuracy is 0.0 most of the time.

Chinese version

Thanks to the author for open-sourcing such a good project. I rebuilt a Chinese text recognition project based on yours and used tensorflow-serving to build the server side and improve its processing capacity, to share with everyone. Link: https://www.jianshu.com/p/e0d9efaadb0f

Hello, I have a question about tokenize in DatasetBuilder

Hello, I noticed that you use the tf.ragged.map_flat_values method in tokenize. It uses 0 as the padding index, but index 0 also corresponds to a real character in the table, so will training run into problems this way? Also, what is the final BLK entry in the table used for?

How many epochs did you train before sequence_accuracy started to have a value?

I switched the data to Chinese and loaded your pretrained weight, and after training for 10 epochs sequence_accuracy is still 0. The loss starts off dropping quite nicely, but after two epochs it falls to around 30 and then stops changing. Of course, I also used ReduceLROnPlateau. The dataset is definitely fine, because with the same dataset a PyTorch version reaches 70% sequence accuracy after just a few epochs. May I ask how many epochs you trained before sequence_accuracy started to have a value?

How does the CTC decoder know which index is my <blk>?

decoded, neg_sum_logits = tf.nn.ctc_greedy_decoder(

In this project, your <BLK> corresponds to index 37, which can also be understood as max sequence length - 1, or as index -1. During training we can explicitly pass any value as blank_index to the CTC loss, but the decoder function has no blank_index parameter. Suppose that during training I assigned blank_index an arbitrary value such as 666; at decode time, the BLK character corresponding to 666 would then be shown in the output string. How should I understand this problem, and is there a good solution?

Does CRNN support text line images?

I tried to run the demo of a trained model on a number of images with text separated by spaces, but only the first word was predicted. I'm a little confused by this: there are multiple words in the images, and I have checked that the space character is in the character table.

tflite-converter.py problem

I'm learning a lot from this project. Thank you.

I'm testing with the h5 file you put on Google Drive.
However, the following error occurs.

python: 3.7.0
tensorflow version: 2.4.0

1. export.py ==> Success

/opt/miniconda3/envs/crnn/bin/python /Users/nosun10005/PycharmProjects/CRNN.tf2-master/tools/export.py --model ../example/model/exported_model.h5 --output ../example/model/saved --config ../configs/mjsynth.yml --post greedy

2. tflite-converter.py ==> Fail

/opt/miniconda3/envs/crnn/bin/python /Users/nosun10005/PycharmProjects/CRNN.tf2-master/tools/tflite_converter.py -m ../example/model/saved -o ../example/model/exported_model.tflite
.......
function_optimizer: Graph size after: 752 nodes (709), 959 edges (915), time = 44.655ms.
function_optimizer: Graph size after: 752 nodes (0), 959 edges (0), time = 18.138ms.
Optimization results for grappler item: __inference_while_cond_5605_592
function_optimizer: function_optimizer did nothing. time = 0.001ms.
function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: __inference_model_ctc_greedy_decoder_RaggedFromSparse_Assert_AssertGuard_true_6516_30714
function_optimizer: function_optimizer did nothing. time = 0ms.
function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: __inference_while_cond_4719_5412
function_optimizer: function_optimizer did nothing. time = 0.001ms.
function_optimizer: function_optimizer did nothing. time = 0ms.
Optimization results for grappler item: __inference_while_body_5606_37121
function_optimizer: function_optimizer did nothing. time = 0ms.
function_optimizer: function_optimizer did nothing. time = 0ms.
Optimization results for grappler item: __inference_while_body_6046_31456
function_optimizer: function_optimizer did nothing. time = 0.001ms.
function_optimizer: function_optimizer did nothing. time = 0ms.
Optimization results for grappler item: __inference_while_body_5160_7831
function_optimizer: function_optimizer did nothing. time = 0.001ms.
function_optimizer: function_optimizer did nothing. time = 0ms.
Optimization results for grappler item: __inference_while_cond_6045_40507
function_optimizer: function_optimizer did nothing. time = 0ms.
function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: __inference_while_cond_5159_3281
function_optimizer: function_optimizer did nothing. time = 0.001ms.
function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: __inference_model_ctc_greedy_decoder_RaggedFromSparse_Assert_AssertGuard_false_6517_4821
function_optimizer: function_optimizer did nothing. time = 0.001ms.
function_optimizer: function_optimizer did nothing. time = 0.001ms.
Optimization results for grappler item: __inference_while_body_4720_763
function_optimizer: function_optimizer did nothing. time = 0.001ms.
function_optimizer: function_optimizer did nothing. time = 0ms.

Traceback (most recent call last):
File "/Users/nosun10005/PycharmProjects/CRNN.tf2-master/tools/tflite_converter.py", line 22, in
tflite_model = converter.convert()
File "/opt/miniconda3/envs/crnn/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 1117, in convert
return super(TFLiteConverterV2, self).convert()
File "/opt/miniconda3/envs/crnn/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 920, in convert
_convert_to_constants.convert_variables_to_constants_v2_as_graph(
File "/opt/miniconda3/envs/crnn/lib/python3.8/site-packages/tensorflow/python/framework/convert_to_constants.py", line 1102, in convert_variables_to_constants_v2_as_graph
converter_data = _FunctionConverterData(
File "/opt/miniconda3/envs/crnn/lib/python3.8/site-packages/tensorflow/python/framework/convert_to_constants.py", line 806, in init
self._build_tensor_data()
File "/opt/miniconda3/envs/crnn/lib/python3.8/site-packages/tensorflow/python/framework/convert_to_constants.py", line 825, in _build_tensor_data
data = val_tensor.numpy()
File "/opt/miniconda3/envs/crnn/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 1071, in numpy
maybe_arr = self._numpy() # pylint: disable=protected-access
File "/opt/miniconda3/envs/crnn/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 1039, in _numpy
six.raise_from(core._status_to_exception(e.code, e.message), None) # pylint: disable=protected-access
File "", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot convert a Tensor of dtype resource to a NumPy array.

I want to train a Chinese recognition model

I want to train a model that can recognize Chinese. Can I train it the same way as the English training? Which steps should I modify?

How to get the Last layer (ctc_greedy_decoder)?

Hi, nice work!!
The model provided here for the demo (SavedModel) has ctc_greedy_decoder as its last layer, while in the model in train.py this layer is not present. How do we get it? Is this layer added after training? If so, could you please explain how to do it?

logits (Dense)                 (None, None, 38)           19494
ctc_greedy_decoder (CTCGreed   ((None,), (None,))         0

Load model in OpenCV

I trained a model and tried to load it in OpenCV (Python).

When I convert it to frozen_graph and load:

Traceback (most recent call last):
  File "frozen_test.py", line 4, in <module>
    net = cv.dnn.readNet('frozen_graph.pb')
cv2.error: OpenCV(4.5.1) /tmp/pip-req-build-ms668fyv/opencv/modules/dnn/src/tensorflow/tf_importer.cpp:1061: error: (-2:Unspecified error) Input layer not found: StatefulPartitionedCall/StatefulPartitionedCall/model/logits/Tensordot in function 'populateNet'

When I convert it to ONNX(tensorflow-onnx) and load:

  File "text_recognition.py", line 32, in loadRecognitionModel
    self.recognizer = cv.dnn.readNet(modelRecognition)
cv2.error: OpenCV(4.5.1) /tmp/pip-req-build-ms668fyv/opencv/modules/dnn/src/onnx/onnx_importer.cpp:1887: error: (-2:Unspecified error) in function 'handleNode'
> Node [Gather]:(StatefulPartitionedCall/model/reshape7/Shape:0) parse error: OpenCV(4.5.1) /tmp/pip-req-build-ms668fyv/opencv/modules/dnn/src/onnx/onnx_importer.cpp:1648: error: (-215:Assertion failed) indexMat.total() == 1 in function 'handleNode'

Python 3.8.5
Tensorflow 2.5.0
OpenCV 4.5.1
Ubuntu 20.04.2 LTS

What is the best way to use CRNN.tf2 in OpenCV?
