jinpengli / deep_ocr
Make a better Chinese character recognition OCR than Tesseract
After running the command "cat deep_ocr_workspace.z* > unsplit_deep_ocr_workspace.zip", my Ubuntu machine froze.
Maybe the file is too big.
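One possible cause (not confirmed anywhere in this thread) is that the disk fills up while the parts are being joined. The following is a minimal sketch, assuming the parts sit in one directory and match the same `deep_ocr_workspace.z*` pattern as the command above. `join_parts` is a hypothetical helper, not part of deep_ocr. It checks free space first and then copies in fixed-size chunks:

```python
import glob
import os
import shutil


def join_parts(pattern, out_path, chunk_size=1024 * 1024):
    """Concatenate the files matching `pattern` into `out_path`, in sorted order."""
    # Skip the output file in case it happens to match the pattern.
    parts = sorted(p for p in glob.glob(pattern)
                   if os.path.abspath(p) != os.path.abspath(out_path))
    if not parts:
        raise FileNotFoundError("no parts match %r" % pattern)

    # Refuse to start if the target directory cannot hold the joined file.
    needed = sum(os.path.getsize(p) for p in parts)
    out_dir = os.path.dirname(os.path.abspath(out_path))
    free = shutil.disk_usage(out_dir).free
    if free < needed:
        raise OSError("need %d bytes, only %d free in %s" % (needed, free, out_dir))

    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out, chunk_size)
    return needed
```

A call such as `join_parts("deep_ocr_workspace.z*", "unsplit_deep_ocr_workspace.zip")` would mirror the shell command, since sorted order places `.z01`, `.z02`, ... before `.zip`, as the shell glob does.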
An error occurs when running the test program:
! image to reco: data/holiday_notification.jpg
Traceback (most recent call last):
File "./bin/deep_ocr_reco", line 137, in
show_img(raw, title="raw image")
File "./bin/deep_ocr_reco", line 27, in show_img
plt.gray()
File "/mnt/OCR/deep_ocr_env/local/lib/python2.7/site-packages/matplotlib/pyplot.py", line 3829, in gray
set_cmap("gray")
File "/mnt/OCR/deep_ocr_env/local/lib/python2.7/site-packages/matplotlib/pyplot.py", line 2269, in set_cmap
im = gci()
File "/mnt/OCR/deep_ocr_env/local/lib/python2.7/site-packages/matplotlib/pyplot.py", line 335, in gci
return gcf()._gci()
File "/mnt/OCR/deep_ocr_env/local/lib/python2.7/site-packages/matplotlib/pyplot.py", line 601, in gcf
return figure()
File "/mnt/OCR/deep_ocr_env/local/lib/python2.7/site-packages/matplotlib/pyplot.py", line 548, in figure
**kwargs)
File "/mnt/OCR/deep_ocr_env/local/lib/python2.7/site-packages/matplotlib/backend_bases.py", line 161, in new_figure_manager
return cls.new_figure_manager_given_figure(num, fig)
File "/mnt/OCR/deep_ocr_env/local/lib/python2.7/site-packages/matplotlib/backends/_backend_tk.py", line 1044, in new_figure_manager_given_figure
window = Tk.Tk(className="matplotlib")
File "/usr/lib/python2.7/lib-tk/Tkinter.py", line 1767, in init
self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
_tkinter.TclError: no display name and no $DISPLAY environment variable
Environment:
Ubuntu 14.04
Python 2.7
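The traceback ends in `no display name and no $DISPLAY environment variable`, which is what matplotlib's Tk backend reports when it tries to open a window on a machine without an X display. A common workaround is to select the non-interactive `Agg` backend and write figures to files instead. This is a minimal sketch under that assumption. `pick_backend` and `save_gray_image` are hypothetical helpers and are not part of `deep_ocr_reco`:

```python
import os


def pick_backend(environ=None):
    """Return "Agg" (file output only) when no X display is configured."""
    environ = os.environ if environ is None else environ
    return "TkAgg" if environ.get("DISPLAY") else "Agg"


def save_gray_image(img, path, title=""):
    """Write a 2-D array to `path` as a grayscale image instead of showing it."""
    import matplotlib
    matplotlib.use(pick_backend())  # select the backend before importing pyplot
    import matplotlib.pyplot as plt

    fig = plt.figure()
    plt.imshow(img, cmap="gray")
    plt.title(title)
    fig.savefig(path)
    plt.close(fig)
```

Setting the environment variable `MPLBACKEND=Agg` before launching the script has a similar effect without code changes, although any call that expects a window (such as `plt.show()`) will then display nothing.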
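Can deep_ocr run on Windows? Can it be called from Python, or can it only be run from the command line?
Also: I have never used Docker before, and the stable Windows release I downloaded from the official Docker website fails to install. Downloading the two files from Baidu Cloud is also very slow. Is there any way to solve this?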
Thanks for sharing this great project! I ran your code successfully with your test image.
I then replaced the test image with data/id_card_img.jpg, which is a joke ID card, but I got no output.
Next I used the script ./bin/deep_ocr_id_card_segmentation to produce gray_texts.jpg, and ran
reco_chars.py on gray_texts.jpg. The result is not reasonable at all. I think the reason is that these characters are not contained in the training data. Is that right?
Maybe I need to fine-tune the model on a larger character dataset. Could you tell me how to fine-tune it?
Thanks for your help and nice work!
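Recognition of text in natural scenes is currently still not as good as Tesseract. Why not consider using a deep network to detect the text?
The following error occurs: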
libdc1394 error: Failed to initialize libdc1394
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0507 12:27:13.683022 47 _caffe.cpp:139] DEPRECATION WARNING - deprecated use of Python interface
W0507 12:27:13.683099 47 _caffe.cpp:140] Use this instead (with the named "weights" parameter):
W0507 12:27:13.683104 47 _caffe.cpp:142] Net('/workspace/data/chongdata_caffe_cn_sim_digits_64_64/deploy_lenet_train_test.prototxt', 1, weights='/workspace/data/chongdata_caffe_cn_sim_digits_64_64/lenet_iter_50000.caffemodel')
Traceback (most recent call last):
File "/opt/deep_ocr/reco_chars.py", line 294, in
caffe_cls = CaffeCls(model_def, model_weights, y_tag_json_path)
File "/opt/deep_ocr/reco_chars.py", line 20, in init
caffe.TEST)
RuntimeError: Could not open file /workspace/data/chongdata_caffe_cn_sim_digits_64_64/deploy_lenet_train_test.prototxt
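Could someone experienced please advise?
I downloaded deep_ocr_workspace.zip and reco_chars.py. Running the script produces the error below, and your archive also fails to extract on Windows. I suspect the archive itself is broken.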
root@orange-VirtualBox:~/caffe/python# python reco_chars.py
WARNING: Logging before InitGoogleLogging() is written to STDERR
W1223 17:24:19.496032 3764 _caffe.cpp:122] DEPRECATION WARNING - deprecated use of Python interface
W1223 17:24:19.496183 3764 _caffe.cpp:123] Use this instead (with the named "weights" parameter):
W1223 17:24:19.496206 3764 _caffe.cpp:125] Net('/workspace/data/chongdata_caffe_cn_sim_digits_64_64/deploy_lenet_train_test.prototxt', 1, weights='/workspace/data/chongdata_caffe_cn_sim_digits_64_64/lenet_iter_50000.caffemodel')
Traceback (most recent call last):
File "test.py", line 294, in
caffe_cls = CaffeCls(model_def, model_weights, y_tag_json_path)
File "test.py", line 20, in init
caffe.TEST)
RuntimeError: Could not open file /workspace/data/chongdata_caffe_cn_sim_digits_64_64/lenet_iter_50000.caffemodel
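Caffe's `Could not open file` usually means the path does not exist, or is not readable, in the environment where the script runs (for example, if the workspace was not extracted to or mounted at `/workspace`). A minimal pre-flight check is sketched below, using only the directory and file names that appear in the logs above. `missing_model_files` is a hypothetical helper, not part of `reco_chars.py`:

```python
import os

MODEL_DIR = "/workspace/data/chongdata_caffe_cn_sim_digits_64_64"
# File names taken from the log output above.
MODEL_FILES = ("deploy_lenet_train_test.prototxt", "lenet_iter_50000.caffemodel")


def missing_model_files(model_dir=MODEL_DIR, names=MODEL_FILES):
    """Return the full paths of expected model files that are absent or unreadable."""
    missing = []
    for name in names:
        path = os.path.join(model_dir, name)
        if not (os.path.isfile(path) and os.access(path, os.R_OK)):
            missing.append(path)
    return missing


if __name__ == "__main__":
    for path in missing_model_files():
        print("cannot read:", path)
```

Running this before `reco_chars.py` shows whether the problem is the file layout rather than Caffe itself.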
I get the error below:
python3 reco_chars.py
[libprotobuf ERROR google/protobuf/text_format.cc:274] Error parsing text-format caffe.NetParameter: 6:15: Message type "caffe.LayerParameter" has no field named "input_param".
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0103 17:15:15.282599 7488 upgrade_proto.cpp:928] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: workspace/data/chongdata_caffe_cn_sim_digits_64_64/deploy_lenet_train_test.prototxt
*** Check failure stack trace: ***
Aborted (core dumped)
What is wrong with this file?
best regards!
A Python bug
㧟 䏝 㤘 䥽 䁖 䦃 㸆
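These characters cannot be drawn with PIL. Try it yourself if you don't believe me.
I don't normally use Python. Where should this kind of bug be reported?
Simple test code:
from PIL import Image, ImageDraw, ImageFont
# STXIHEI.TTF must be findable by PIL (see the note below the code).
font = ImageFont.truetype("STXIHEI.TTF", 300)
img = Image.new("L", (300, 300), "black")  # 8-bit grayscale canvas
draw = ImageDraw.Draw(img)
ch = u'㸆'
draw.text((0, -75), ch, 255, font=font)  # draw the character in white
img.show()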
You can try it, haha.
Find STXIHEI.TTF; it ships with the system. It is the STXihei (华文细黑) font!
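One thing worth checking before reporting this as a Python bug is where the characters sit in Unicode. Rare ideographs outside the main CJK block are covered incompletely by many fonts, and a missing glyph would be a font limitation rather than a PIL defect. Whether STXIHEI.TTF actually lacks these glyphs is not established here. The sketch below only reports each character's code point and whether it falls in CJK Unified Ideographs Extension A (U+3400 to U+4DBF). The function names are illustrative:

```python
def in_cjk_extension_a(ch):
    """True if `ch` lies in CJK Unified Ideographs Extension A (U+3400..U+4DBF)."""
    return 0x3400 <= ord(ch) <= 0x4DBF


def describe(chars):
    """Return (character, code point, in Extension A?) for each character."""
    return [(ch, "U+%04X" % ord(ch), in_cjk_extension_a(ch)) for ch in chars]


if __name__ == "__main__":
    for row in describe(u"㧟䏝㤘䥽䁖䦃㸆"):
        print(row)
```

If the characters turn out to be in Extension A, trying the same drawing code with a font known to cover that block would help separate a font problem from a library problem.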
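First of all, thank you for your open-source project.
My question: I saw the training set mentioned in earlier issues, and I also downloaded the data from Baidu Netdisk. I understand that the training set can be generated from font files. Does the training set really contain only one image per class? If not, how is the additional training data generated automatically?
I carefully checked every dependency, the model file paths, the extraction step, and so on, but I still get this error. So is the archive you uploaded corrupted?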
Traceback (most recent call last):
File "reco_chars.py", line 294, in
caffe_cls = CaffeCls(model_def, model_weights, y_tag_json_path)
File "reco_chars.py", line 20, in init
caffe.TEST)
RuntimeError: Could not open file /workspace/data/chongdata_caffe_cn_sim_digits_64_64/deploy_lenet_train_test.prototx
best regards!
Running deep_ocr_reco fails with: ImportError: No module named ocrolib
The remaining executables fail with: ImportError: No module named cv2
I checked requirement.txt and cv2 is indeed not listed, so I don't know how this ever worked.
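A quick way to see which imports are unavailable in the active environment, before running any of the executables, is to ask Python's import machinery directly. This is a minimal sketch. `find_missing` is a hypothetical helper, and the module list contains only names that appear in error messages in this thread:

```python
import importlib.util


def find_missing(modules):
    """Return the names in `modules` that cannot be located for import."""
    missing = []
    for name in modules:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # find_spec raises instead of returning None in some cases, e.g.
            # ModuleNotFoundError when the parent of a dotted name is missing.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in find_missing(["cv2", "caffe", "deep_ocr.ocrolib"]):
        print("missing module:", name)
```

Note that `cv2` is the import name of OpenCV's Python bindings, so it would not appear in a requirements file under that name. On PyPI the package is called `opencv-python`.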
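I don't have a training dataset at the moment. I hope you can send me a copy, thanks.
My email: [email protected]
...
The datasets available online are all handwritten Chinese character datasets.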
Traceback (most recent call last):
File "./bin/deep_ocr_reco", line 19, in
import deep_ocr.ocrolib as ocrolib
ModuleNotFoundError: No module named 'deep_ocr.ocrolib'
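Hello! Your blog's code is very elegant and concise. May I fork it? I could not find it on GitHub.
Hello, after setting up the virtual environment following the steps, I ran the following: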
haiyun@dell-Precision-Tower-5810:~$ source ~/deep_ocr_env/bin/activate && cd ~/deep_ocr && ./bin/deep_ocr_reco data/holiday_notification.jpg -v -d
! image to reco: data/holiday_notification.jpg
Traceback (most recent call last):
File "./bin/deep_ocr_reco", line 137, in
show_img(raw, title="raw image")
File "./bin/deep_ocr_reco", line 27, in show_img
plt.gray()
File "/home/haiyun/deep_ocr_env/lib/python2.7/site-packages/matplotlib/pyplot.py", line 3932, in gray
set_cmap("gray")
File "/home/haiyun/deep_ocr_env/lib/python2.7/site-packages/matplotlib/pyplot.py", line 2372, in set_cmap
im = gci()
File "/home/haiyun/deep_ocr_env/lib/python2.7/site-packages/matplotlib/pyplot.py", line 335, in gci
return gcf()._gci()
File "/home/haiyun/deep_ocr_env/lib/python2.7/site-packages/matplotlib/pyplot.py", line 601, in gcf
return figure()
File "/home/haiyun/deep_ocr_env/lib/python2.7/site-packages/matplotlib/pyplot.py", line 548, in figure
**kwargs)
File "/home/haiyun/deep_ocr_env/lib/python2.7/site-packages/matplotlib/backend_bases.py", line 161, in new_figure_manager
return cls.new_figure_manager_given_figure(num, fig)
File "/home/haiyun/deep_ocr_env/lib/python2.7/site-packages/matplotlib/backends/_backend_tk.py", line 1044, in new_figure_manager_given_figure
window = Tk.Tk(className="matplotlib")
File "/home/haiyun/install/python-2.7.11/lib/python2.7/lib-tk/Tkinter.py", line 1814, in init
self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
_tkinter.TclError: couldn't connect to display ":0"
I searched online for "_tkinter.TclError: couldn't connect to display ":0"" and tried to fix it, without success. Please advise! Many thanks!
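Here `DISPLAY` is set (to `:0`) but the X server cannot be reached, which often happens in an SSH session or under a different user account. Checking the environment variable alone is therefore not enough. One option is to probe whether a Tk window can really be created and fall back to the non-interactive `Agg` backend otherwise. This is a sketch under that assumption. `display_is_usable` and `choose_backend` are hypothetical names, not part of deep_ocr:

```python
def display_is_usable():
    """Return True only if a Tk root window can actually be created."""
    try:
        import tkinter  # the module is named Tkinter on Python 2
    except ImportError:
        return False
    try:
        root = tkinter.Tk()
    except tkinter.TclError:
        # Raised for both "no $DISPLAY" and "couldn't connect to display".
        return False
    root.destroy()
    return True


def choose_backend(probe=display_is_usable):
    """Pick an interactive backend only when the display probe succeeds."""
    return "TkAgg" if probe() else "Agg"
```

The result would be passed to `matplotlib.use(choose_backend())` before `matplotlib.pyplot` is imported. With `Agg`, figures have to be saved with `savefig` because no window can be shown.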
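Hello,
I really appreciate your work.
Could you tell me which network your trained model is based on? Is it this one?: https://github.com/JinpengLI/deep_ocr/blob/master/data/caffe_nets/lower_eng/lenet_train_test.prototxt
How do I add my own images for testing when using Docker? It seems quite cumbersome.
I haven't looked closely, but I heard you built this with deep learning.
Where did you find training samples for so many Chinese characters?
Please don't tell me you took screenshots one by one; I can't accept that.
Tesseract can output the coordinates of each recognized character. How should deep_ocr be used to output character coordinates? Thanks.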
Many thanks for the models and code.
I noticed that the API at http://chongdata.com/ocr recognizes images containing text very well. But with the open-source code here the results are not as good, and the output contains some spurious letters.
I think your online API does some text detection. Do you plan to share that code here?
Thanks again for sharing the code.
Thank you for your code.
I have an Ubuntu desktop computer with an NVIDIA GPU, and I have installed Caffe with GPU support.
How do I use this OCR program?
It would be great if you could write a short guide to the Caffe training process!
Hello:
When I run the deep_ocr_make_caffe_dataset command from lesson 4 on Chongdata (虫数据), the images folder is created but no images are generated. The error is:
File "/opt/deep_ocr/bin/deep_ocr_make_caffe_dataset", line 83, in
lang_chars = lang_chars_gen.do()
File "build/bdist.linux-x86_64/egg/deep_ocr/lang_aux.py", line 27, in do
ImportError: No module named langs.lower_eng
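langs has already been added as a Python module. What could be causing this?
I asked this last time, but your answer was not clear.
I don't see how, once you have the ID card font, you iterate over all the characters to produce images.
Could you explain in more detail?
How do you go from text to images?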
Line 296 in 450148c
test_image = "/opt/deep_ocr/data/test_data.png"
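Why does the latest version still have this problem, and how should it be fixed?
I installed it on CentOS and ran this command: ./bin/deep_ocr_reco data/holiday_notification.jpg -v
I only get the following output: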
! image to reco: data/holiday_notification.jpg
! no-normalization
estimate skew angle and rotate
estimate_thresholds lo 0.964706 and hi 1.000000
data/holiday_notification.jpg lo-hi (0.96 1.00) angle 0.0 ! no-normalization
scale= 5.744562646538029
computing column separators
considering at most 3 whitespace column separators
computing lines
propagating labels
spreading labels
Hi @JinpengLI,
I really appreciate your great work.
I am very interested in your character segmentation for ID cards.
Is deep_ocr_id_card_segmentation your code for segmentation?
I tested it on new ID cards, and it seems to perform poorly on them.
Do you have a newer version of the code?
Thanks so much.
Looking forward to your reply.
1. Download deep_ocr_workspace.zip
2. docker pull jinpengli/deep_ocr_cpu_docker:latest
3. docker run -ti --volume=${HOME}/deep_ocr_workspace:/workspace jinpengli/deep_ocr_cpu_docker:latest /bin/bash
4. python /opt/deep_ocr/reco_chars.py
The following error occurs:
root@dd66a9208e12:/workspace# python /opt/deep_ocr/reco_chars.py
libdc1394 error: Failed to initialize libdc1394
WARNING: Logging before InitGoogleLogging() is written to STDERR
... (some ordinary log lines omitted) ...
Traceback (most recent call last):
File "/opt/deep_ocr/reco_chars.py", line 364, in
output_tag_to_max_proba = caffe_cls.predict_cv2_imgs(np_char_imgs)
File "/opt/deep_ocr/reco_chars.py", line 66, in predict_cv2_imgs
self._predict_cv2_imgs_sub(cv2_imgs, i, pos_end)
File "/opt/deep_ocr/reco_chars.py", line 53, in _predict_cv2_imgs_sub
item = (self.y_tag_json[str(index)],
KeyError: '5613'
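The `KeyError: '5613'` says the network produced class index 5613 but the label map (`y_tag_json`) has no entry under that key. This may indicate that the label file and the model weights do not belong together, though that is an inference from the traceback only. A small diagnostic sketch follows. `label_for` and `unmapped_indices` are hypothetical helpers that assume the label map is a dict keyed by the string form of the index, as the traceback suggests:

```python
def label_for(index, y_tag_json, unknown=u"?"):
    """Look up the label for a class index, returning `unknown` if it is absent."""
    return y_tag_json.get(str(index), unknown)


def unmapped_indices(num_outputs, y_tag_json):
    """List the class indices in range(num_outputs) that have no label entry."""
    return [i for i in range(num_outputs) if str(i) not in y_tag_json]
```

Calling `unmapped_indices` with the size of the network's output layer shows at a glance whether the label map is shorter than the model expects.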
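"The volume is mounted into the container so that the recognition results above can be retrieved."
I don't understand this command, and I don't know how to receive the recognition results from Docker on the host machine.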
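With `--volume=HOST_DIR:/workspace`, Docker makes the host directory visible inside the container at `/workspace`. Files placed in the host directory can be read by the container, and anything a script writes under `/workspace` appears in the host directory. Whether `reco_chars.py` writes its results there is not stated in this thread. The sketch below only assembles the command line using the image name and script path quoted above. `docker_run_args` is a hypothetical helper:

```python
import os
import shlex


def docker_run_args(host_dir, image="jinpengli/deep_ocr_cpu_docker:latest",
                    command=("python", "/opt/deep_ocr/reco_chars.py")):
    """Build a `docker run` argv that mounts `host_dir` at /workspace."""
    host_dir = os.path.abspath(os.path.expanduser(host_dir))
    return ["docker", "run", "--rm",
            "--volume=%s:/workspace" % host_dir,
            image] + list(command)


if __name__ == "__main__":
    # Print the command for inspection rather than running it.
    print(" ".join(shlex.quote(a) for a in docker_run_args("~/deep_ocr_workspace")))
```

The returned list can be passed to `subprocess.run` once the printed command looks right.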
How was the model trained? Did you make any changes to your font data files?
Traceback (most recent call last):
File "reco_chars.py", line 2, in
import caffe
ModuleNotFoundError: No module named 'caffe'
Hello, how large is your training set, and how were the training samples obtained?
Can it run on Windows?