sanster / text_renderer

Generate text images for training deep learning ocr model

License: MIT License

Python 87.17% C++ 10.04% Cython 2.79%
synthtext crnn ocr

text_renderer's Introduction

Text Renderer

Generate text images for training deep learning OCR models (e.g. CRNN). Supports both Latin and non-Latin text.

Setup

  • Ubuntu 16.04
  • python 3.5+

Install dependencies:

pip3 install -r requirements.txt

Demo

By default, running python3 main.py generates 20 text images and a labels.txt file in output/default/.
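
The labels file can then be read back for training. The sketch below is not part of text_renderer; it assumes each line has the form "<image name> <text>" with a single space after the name, and the function name parse_labels is hypothetical.

```python
from typing import Dict, Iterable


def parse_labels(lines: Iterable[str]) -> Dict[str, str]:
    """Map image name -> label text, assuming '<name> <text>' per line."""
    labels = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the first space only, so spaces inside the text survive.
        name, _, text = line.partition(" ")
        labels[name] = text
    return labels


# Made-up sample lines standing in for the contents of labels.txt.
sample = ["00000000 hello world\n", "00000001 OCR 123\n", "\n"]
print(parse_labels(sample))
```

With a real file, pass the open file object instead of the sample list, for example parse_labels(open("output/default/labels.txt", encoding="utf-8")).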

example1.jpg example2.jpg

example3.jpg example4.jpg

Use your own data to generate image

  1. Run python3 main.py --help to see all optional arguments and their meanings, and put your own data in the corresponding folders.

  2. Configure the text effects and their fractions in configs/default.yaml (or create a new config file and pass it with the --config_file option). Here are some examples:

Effect name Image
Origin(Font size 25) origin
Perspective Transform perspective
Random Crop rand_crop
Curve curve
Light border light border
Dark border dark border
Random char space big random char space big
Random char space small random char space small
Middle line middle line
Table line table line
Under line under line
Emboss emboss
Reverse color reverse color
Blur blur
Text color font_color
Line color line_color
  3. Run main.py.
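
One way to read a per-effect fraction is as the probability that the effect is applied to a given image. The sketch below illustrates that reading only; it is not the project's implementation. The keys enable and fraction, the effect names in example_cfg, and the choice to treat a missing enable as true are all assumptions.

```python
import random
from typing import Dict, List, Optional


def pick_effects(cfg: Dict[str, dict], rng: Optional[random.Random] = None) -> List[str]:
    """Return the names of the effects to apply to one image."""
    rng = rng or random.Random()
    chosen = []
    for name, opts in cfg.items():
        # Assumption: an effect without an 'enable' key counts as enabled.
        if not opts.get("enable", True):
            continue
        # Apply the effect with probability 'fraction' (default 1.0, also an assumption).
        if rng.random() < opts.get("fraction", 1.0):
            chosen.append(name)
    return chosen


# Hypothetical config, shaped like a parsed YAML file.
example_cfg = {
    "blur": {"enable": True, "fraction": 1.0},
    "emboss": {"enable": False, "fraction": 1.0},
    "curve": {"enable": True, "fraction": 0.0},
}
print(pick_effects(example_cfg, random.Random(0)))
```

Passing a seeded random.Random makes a run reproducible, which helps when comparing configs.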

Strict mode

For non-Latin languages (e.g. Chinese), it is very common for a font to support only a limited set of characters. In that case you will get bad results like these:

bad_example1

bad_example2

bad_example3

Selecting fonts that support every character in --chars_file is tedious. If you run main.py with the --strict option, the renderer will keep drawing new text from the corpus during generation until all of its characters are supported by the font.
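
The retry idea behind strict mode can be sketched as follows. This is an illustration under stated assumptions, not the renderer's actual code: the font's coverage is modelled as a plain set of characters, and the names unsupported_chars, sample_until_supported and max_retries are hypothetical.

```python
from typing import Callable, Optional, Set


def unsupported_chars(text: str, supported: Set[str]) -> Set[str]:
    """Characters of text that the font cannot draw."""
    return {ch for ch in text if ch not in supported}


def sample_until_supported(
    get_text: Callable[[], str],
    supported: Set[str],
    max_retries: int = 100,
) -> Optional[str]:
    """Draw text until the font supports every character, or give up."""
    for _ in range(max_retries):
        text = get_text()
        if not unsupported_chars(text, supported):
            return text
    return None  # no fully supported text found within max_retries


# Toy corpus: only the last candidate is fully covered by the toy font.
candidates = iter(["广州", "abc广", "abc"])
font_chars = set("abcdefg")
print(sample_until_supported(lambda: next(candidates), font_chars))
```

Capping the number of retries keeps generation from looping forever when a font covers very little of the corpus; callers then have to handle the None result.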

Tools

You can use the check_font.py script to check how many characters in --chars_file your fonts do not support:

python3 tools/check_font.py

checking font ./data/fonts/eng/Hack-Regular.ttf
chars not supported(4971):
['', '', '广', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '','', '', '', ''...]
0 fonts support all chars(5071) in ./data/chars/chn.txt:
[]

Generate image using GPU

If you want to use the GPU to generate images faster, first compile OpenCV with CUDA (see "Compiling OpenCV with CUDA support").

Then build the Cython part, and add the --gpu option when running main.py:

cd libs/gpu
python3 setup.py build_ext --inplace

Debug mode

Running python3 main.py --debug saves images with extra information drawn on them. You can see how perspectiveTransform works, along with all bounding and rotated boxes.

debug_demo

Todo

See https://github.com/Sanster/text_renderer/projects/1

Citing text_renderer

If you use text_renderer in your research, please consider using the following BibTeX entry.

@misc{text_renderer,
  author =       {weiqing.chu},
  title =        {text_renderer},
  howpublished = {\url{https://github.com/Sanster/text_renderer}},
  year =         {2021}
}

text_renderer's People

Contributors

light201212, luvata, parsa-ra, sanster, wyg1997, xiaomaxiao

text_renderer's Issues

About white text on a black background

Hello. Can your code generate samples with white text on a black background?

python main.py --strict

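I have a question.
Some generated images contain square boxes instead of characters, as shown:
image
I then added --strict to see which characters the font does not support, but it raised an error:
TypeError: 'NoneType' object is not subscriptable
Has anyone run into the same problem? Is there a fix?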

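Generated images are missing characters

Has anyone else seen this? Some generated images are missing characters:
_20181029144108
_20181029145048

For example, the label is 就是1块钱的线路, but the image shows 就是1 的 路.
There are quite a few such images. What is the cause?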

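Problem generating ÷, ×, ( and )

Hello. When I use your code to generate formulas, characters such as ×, ÷, ( and ) are skipped automatically. For example, 2÷3 is rendered as 23. What is the cause?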

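Font fails to render in generated images

The font OCRB10N.TTF displays correctly in Word, but the code cannot generate images with it; every character comes out as a square box. How should I handle this?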

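Question about default.yaml configuration

Hello. In default.yaml, some settings have an enable parameter set to true or false, while others do not. Are the settings without it enabled by default? Also, about font_size: I deliberately set min=49 and max=50 and changed prydown to false, so the text in the generated samples should all be fairly large. Why do some generated images still contain very small text?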

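Replace the text in the label file with character indices

By default, the generated tmp_labels.txt contains lines of the form '00000001 命形式原始得令人吃惊'. I would like to change them to the form '44955828_2248996261.jpg 29 403 2 172 586 167 10 172 110 121', where the 10 integers are the indices of the characters in char_std_5990.txt (5,990 Chinese characters, downloaded from the web). With a large dataset, writing a separate conversion script would probably be slow. Can the author's code be modified to do the conversion directly? If so, please advise. Thanks!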
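
A conversion of this kind can be sketched with a dictionary lookup, which keeps the cost linear in the number of characters. This is a hedged illustration, not the author's code: it assumes the input lines look like "<name> <text>", that indices are 0-based positions in the character list, and that the image extension is .jpg. The function names are hypothetical.

```python
from typing import Dict, List


def build_char_index(chars: List[str]) -> Dict[str, int]:
    """Map each character to its position in the list (0-based; an assumption)."""
    return {ch: i for i, ch in enumerate(chars)}


def convert_label_line(line: str, char_index: Dict[str, int], ext: str = ".jpg") -> str:
    """Turn '<name> <text>' into '<name><ext> <idx> <idx> ...'."""
    name, _, text = line.rstrip("\n").partition(" ")
    # Raises KeyError if the text contains a character missing from the index.
    indices = [str(char_index[ch]) for ch in text]
    return " ".join([name + ext] + indices)


# Toy character list standing in for the contents of a chars file.
index = build_char_index(list("abc命形"))
print(convert_label_line("00000001 命形a\n", index))
```

In practice the character list would be read from the chars file, one character per line (another assumption about its layout).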

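Adding background images

I moved the original background images away and added my own, but the generated data does not use the backgrounds I added. Also, are background images converted to grayscale rather than kept in their original color?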

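How can one generated image contain multiple fonts?

Hello. How can a generated image use more than one font? For example, my images contain digits and Chinese characters, and I want the digits in one font and the Chinese characters in another. Is this currently possible? Thanks.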

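About backgrounds and white text on a black background

3
I want to generate white-text images on the background shown above. After setting reverse_color, the background of the generated images still differs from the one I want:
00000200
How should I configure this?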

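Hello, a question about font size

Hello. I want to generate images of roughly 70×350 with larger text, with the character height close to 70. I changed font_size to min: 78 and max: 80, set dst_height = self.out_height in renderer.py, and also modified default.yaml:
perspective_transform:
  max_x: 1
  max_y: 1
  max_z: 1
However, the text in the generated samples is still fairly small, as shown:
QQ截图20190527104347

I would be grateful for your advice.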

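Where does the chn.txt character set come from?

Hello. Where does the chn.txt character set (5,990 characters) come from? I see that the randomly generated images take their text from the corpus file, but the text must also fall within these 5,990 characters, so at most these 5,990 characters can be recognized. I have two questions:
1. How was chn.txt produced? It appears to contain characters from both the level-1 and level-2 character sets.
2. The code has only one corpus file. If I want to add corpus files, is it enough to put them in the corpus folder?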

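Corpus

Thank you for this convenient tool!
Are there any requirements on how the corpus is organized? How can I cover recognition of Chinese, English and Arabic numerals together? Is there a reference link?
Thanks!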

template<> PyObject* pyopencv_from(const Mat& m): is something wrong here?

template<>
PyObject* pyopencv_from(const Mat& m)
{
    if (!m.data)
        Py_RETURN_NONE;
    Mat temp, *p = (Mat*)&m;
    if (!p->u || p->allocator != &g_numpyAllocator)
    {
        temp.allocator = &g_numpyAllocator;
        ERRWRAP2(m.copyTo(temp));
        p = &temp;
    }
    PyObject* o = (PyObject*)p->u->userdata;
    Py_INCREF(o);
    return o;
}

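Why can the template parameter be omitted? Also, when defining a template you cannot use a concrete argument; Mat should be T, otherwise it will not compile.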

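Add erosion, faded-stroke and similar effects

First, I want to thank the contributors of this project. It is the most powerful and effective text image generator I have seen. I previously modified light border to reproduce the effect in photographs where dark edges turn white.
In real-world testing, however, recognition is still poor on eroded glyphs and faded strokes, and I do not know how to simulate them. Is there a good approach?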

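About the font size limit

Hello. I want to generate images of roughly 70×350 with larger text, with the character height close to 70. I changed font_size to min: 78 and max: 80, set dst_height = self.out_height in renderer.py, and also modified default.yaml:
perspective_transform:
  max_x: 1
  max_y: 1
  max_z: 1
However, the text in the generated samples is still fairly small, as shown:
00000058
It looks as if the maximum character height is being capped. How should I change this?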

Optimize

  • add corpus for get_sample line by line from txt file: 9f8b256
  • add helper runner to generate images by different configs : 02147a6
  • image width depending on font size, not fixed: fd38e93
  • make generate image faster, do not use x8 for better effect, we still need x8
  • use image with ROI label to crop background, make it possible to generate images target specific scene
  • change image light by gradient

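How should samples like these be generated?

How should samples like these be generated? Thanks!
I cannot configure your program to produce glyphs like these.
In the first two images, the character skeletons are a single pixel wide.
In the last image, the characters are stuck together.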

24_
23_1532
23_782868

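About generating samples with an ID card background

Hello. I want to generate images on a specific background, such as the textured background of an ID card. I manually cropped some real textured ID card backgrounds and set the bg parameter to 1, but the generated background is not the real textured one; it is still an ordinary grayscale background. How can I generate samples with the real background? Many thanks.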

python main.py

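Hello, sorry to bother you. Running python main.py directly fails at glob.glob(fonts_dir + '//*',recursive=True) with an error about the extra recursive argument. After removing the recursive argument it runs, but then fails when loading the font and txt files. After removing the " / " the files load correctly (the lines in question are text_renderer/libs/font_utils.py line 18 and text_renderer/textrenderer/corpus.py line 20). Running python main.py again then keeps reporting an exception and prints Retry gen_image (the exception is at line 71 of main.py, so the gen_image() function in render.py is probably failing). I ran this on Ubuntu 14 and Ubuntu 18. Does it only work on Ubuntu 16?
Thanks.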
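
For context on the recursive argument: glob.glob accepts recursive=True only on Python 3.5 and later, which matches the python 3.5+ requirement in Setup. The sketch below shows a recursive file listing with the standard library; it is not the project's code, and the helper name list_files_recursive and the temporary directory layout are made up for the example.

```python
import glob
import os
import tempfile


def list_files_recursive(root: str) -> list:
    """Return all regular files under root, at any depth."""
    pattern = os.path.join(root, "**", "*")
    # recursive=True makes '**' match zero or more directory levels (Python 3.5+).
    return sorted(p for p in glob.glob(pattern, recursive=True) if os.path.isfile(p))


# Build a small throwaway tree: one file at the top level, one in a subfolder.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "chn"))
    for rel in ("a.ttf", os.path.join("chn", "b.ttf")):
        open(os.path.join(tmp, rel), "w").close()
    found = [os.path.relpath(p, tmp) for p in list_files_recursive(tmp)]
    print(found)
```

On an older interpreter the same call raises a TypeError about the unexpected recursive keyword, so checking the Python version is a reasonable first step.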
  
   

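About the branches

Hello. What are the differences between your branches? For example, in the make_draw_text_on_transparent_bg_work branch, default.yaml has im_bg = 1, while on master it is 0.5. How do the dev and effects branches differ? Also, is the background of a generated image composed of several images picked at random from the bg folder, or is just one image picked at random?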

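Adding backgrounds

After adding my own color background images to the bg folder, the generated data does not use them; the output is just a plain white background. Why is that?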

Error on CentOS

Traceback (most recent call last):
  File "main.py", line 25, in <module>
    counter = mp.Value('i', 0)
  File "/opt/conda/lib/python3.6/multiprocessing/context.py", line 135, in Value
    ctx=self.get_context())
  File "/opt/conda/lib/python3.6/multiprocessing/sharedctypes.py", line 73, in Value
    obj = RawValue(typecode_or_type, *args)
  File "/opt/conda/lib/python3.6/multiprocessing/sharedctypes.py", line 48, in RawValue
    obj = _new_value(type_)
  File "/opt/conda/lib/python3.6/multiprocessing/sharedctypes.py", line 40, in _new_value
    wrapper = heap.BufferWrapper(size)
  File "/opt/conda/lib/python3.6/multiprocessing/heap.py", line 248, in __init__
    block = BufferWrapper._heap.malloc(size)
  File "/opt/conda/lib/python3.6/multiprocessing/heap.py", line 230, in malloc
    (arena, start, stop) = self._malloc(size)
  File "/opt/conda/lib/python3.6/multiprocessing/heap.py", line 128, in _malloc
    arena = Arena(length)
  File "/opt/conda/lib/python3.6/multiprocessing/heap.py", line 81, in __init__
    self.buffer = mmap.mmap(self.fd, self.size)
FileNotFoundError: [Errno 2] No such file or directory

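Adding parameters for random crop

The README mentions a random crop feature, but I did not see a random crop option in default.yaml. After looking at your code I added one myself, but it does not seem to take effect. Could you take a look? Many thanks!
crop:
  top:
    min: 20
    max: 50
  bottom:
    min: 20
    max: 50
I added this to the config file, but it has no effect on the output.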

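Sample generation hangs on an Ubuntu 16 virtual machine

I wanted to test this in an Ubuntu 16.04 virtual machine. Running python3 main.py directly hangs, and no images are generated.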

laoma@ubuntu:~/text_renderer-master$ python3 main.py
Load fonts from /home/laoma/text_renderer-master/data/fonts/chn
Total fonts num: 1
Background num: 1
Loading corpus from: ./data/corpus
Loading chn corpus: 1/1
Generate text images in ./output/default
It hangs here. Is my environment misconfigured?

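Character frequency balancing

Thanks for sharing the data generation code!
Do you have any ideas for balancing character frequency? Is there a program I could refer to?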
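
One possible direction, offered only as a sketch and not as something text_renderer provides, is to sample corpus lines with weights that favor lines containing rare characters. Here each line is weighted by the inverse corpus frequency of its rarest character. The function names are hypothetical, and random.choices needs Python 3.6 or later.

```python
import random
from collections import Counter
from typing import List, Optional


def balance_weights(lines: List[str]) -> List[float]:
    """Weight each line by the inverse frequency of its rarest character."""
    freq = Counter(ch for line in lines for ch in line)
    # Empty lines get weight 0 so they are never sampled.
    return [1.0 / min(freq[ch] for ch in line) if line else 0.0 for line in lines]


def sample_balanced(lines: List[str], k: int, rng: Optional[random.Random] = None) -> List[str]:
    """Draw k lines with replacement, favoring lines with rare characters."""
    rng = rng or random.Random()
    return rng.choices(lines, weights=balance_weights(lines), k=k)


corpus = ["aaa", "aab", "c"]
print(balance_weights(corpus))
print(sample_balanced(corpus, 5, random.Random(0)))
```

This only nudges the distribution: common characters that share a line with rare ones still get sampled more often, so the resulting character counts are worth measuring after generation.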

.

Problem solved.
