deep_ocr's Issues

Test program fails to run

Running the test program produces the following error:
! image to reco: data/holiday_notification.jpg
Traceback (most recent call last):
File "./bin/deep_ocr_reco", line 137, in
show_img(raw, title="raw image")
File "./bin/deep_ocr_reco", line 27, in show_img
plt.gray()
File "/mnt/OCR/deep_ocr_env/local/lib/python2.7/site-packages/matplotlib/pyplot.py", line 3829, in gray
set_cmap("gray")
File "/mnt/OCR/deep_ocr_env/local/lib/python2.7/site-packages/matplotlib/pyplot.py", line 2269, in set_cmap
im = gci()
File "/mnt/OCR/deep_ocr_env/local/lib/python2.7/site-packages/matplotlib/pyplot.py", line 335, in gci
return gcf()._gci()
File "/mnt/OCR/deep_ocr_env/local/lib/python2.7/site-packages/matplotlib/pyplot.py", line 601, in gcf
return figure()
File "/mnt/OCR/deep_ocr_env/local/lib/python2.7/site-packages/matplotlib/pyplot.py", line 548, in figure
**kwargs)
File "/mnt/OCR/deep_ocr_env/local/lib/python2.7/site-packages/matplotlib/backend_bases.py", line 161, in new_figure_manager
return cls.new_figure_manager_given_figure(num, fig)
File "/mnt/OCR/deep_ocr_env/local/lib/python2.7/site-packages/matplotlib/backends/_backend_tk.py", line 1044, in new_figure_manager_given_figure
window = Tk.Tk(className="matplotlib")
File "/usr/lib/python2.7/lib-tk/Tkinter.py", line 1767, in init
self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
_tkinter.TclError: no display name and no $DISPLAY environment variable

Environment:
Ubuntu 14.04
Python 2.7
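
The traceback above ends in Tkinter failing to open a window because no X display is available. One common workaround, offered here as an assumption rather than the project's own fix, is to select matplotlib's non-interactive `Agg` backend and write figures to disk instead of showing them. The helper names `pick_backend` and `save_gray_image` are hypothetical and do not come from deep_ocr.

```python
import os


def pick_backend(environ=None):
    """Return "Agg" when no X display is configured, else None (keep default)."""
    environ = os.environ if environ is None else environ
    return None if environ.get("DISPLAY") else "Agg"


def save_gray_image(img, path, title=""):
    """Render a 2-D array in grayscale and write it to `path` without a window."""
    import matplotlib

    backend = pick_backend()
    if backend is not None:
        # Call before pyplot is first imported; older matplotlib requires this.
        matplotlib.use(backend)
    import matplotlib.pyplot as plt

    fig = plt.figure()
    plt.imshow(img, cmap="gray")
    plt.title(title)
    fig.savefig(path)
    plt.close(fig)
```

Setting the environment variable `MPLBACKEND=Agg` before launching the script is an alternative that needs no code change, provided the script does not call `plt.show()` expecting a window.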

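Hello, can deep_ocr run on Windows?

Can deep_ocr run on Windows? Can it be called from Python, or can it only be run from the command line?
Also: I have never used Docker before, and the stable Windows release I downloaded from the Docker website fails to install. Downloading the two files from Baidu Cloud is very slow. Is there any way to work around this?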

Get_nothing_about_another_image

Thanks for sharing this great project! I ran your code successfully with your test image.
I then replaced the test image with data/id_card_img.jpg, which is a funny ID card, but I got no output.

Next I used the script ./bin/deep_ocr_id_card_segmentation to produce gray_texts.jpg, and then ran
reco_chars.py on gray_texts.jpg. The result is not reasonable at all. I think the reason is that these characters are not in the training data. Is that right?
Maybe I need to fine-tune the model on a larger character dataset. Could you tell me how to fine-tune it?

Thanks for your help and nice work!

Cannot run in the Docker environment

The following error occurs:
libdc1394 error: Failed to initialize libdc1394
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0507 12:27:13.683022 47 _caffe.cpp:139] DEPRECATION WARNING - deprecated use of Python interface
W0507 12:27:13.683099 47 _caffe.cpp:140] Use this instead (with the named "weights" parameter):
W0507 12:27:13.683104 47 _caffe.cpp:142] Net('/workspace/data/chongdata_caffe_cn_sim_digits_64_64/deploy_lenet_train_test.prototxt', 1, weights='/workspace/data/chongdata_caffe_cn_sim_digits_64_64/lenet_iter_50000.caffemodel')
Traceback (most recent call last):
File "/opt/deep_ocr/reco_chars.py", line 294, in
caffe_cls = CaffeCls(model_def, model_weights, y_tag_json_path)
File "/opt/deep_ocr/reco_chars.py", line 20, in init
caffe.TEST)
RuntimeError: Could not open file /workspace/data/chongdata_caffe_cn_sim_digits_64_64/deploy_lenet_train_test.prototxt

Any guidance would be appreciated.

RuntimeError: Could not open file

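I downloaded deep_ocr_workspace.zip and reco_chars.py. Running the script produces the error below, and your archive also fails to extract on Windows. I suspect the archive itself is corrupted.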

root@orange-VirtualBox:~/caffe/python# python reco_chars.py
WARNING: Logging before InitGoogleLogging() is written to STDERR
W1223 17:24:19.496032 3764 _caffe.cpp:122] DEPRECATION WARNING - deprecated use of Python interface
W1223 17:24:19.496183 3764 _caffe.cpp:123] Use this instead (with the named "weights" parameter):
W1223 17:24:19.496206 3764 _caffe.cpp:125] Net('/workspace/data/chongdata_caffe_cn_sim_digits_64_64/deploy_lenet_train_test.prototxt', 1, weights='/workspace/data/chongdata_caffe_cn_sim_digits_64_64/lenet_iter_50000.caffemodel')
Traceback (most recent call last):
File "test.py", line 294, in
caffe_cls = CaffeCls(model_def, model_weights, y_tag_json_path)
File "test.py", line 20, in init
caffe.TEST)
RuntimeError: Could not open file /workspace/data/chongdata_caffe_cn_sim_digits_64_64/lenet_iter_50000.caffemodel
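
Before suspecting the archive, it can help to confirm that the two files named in the log exist at the expected paths and are readable by the current user. The sketch below is one way to check this with the standard library only. The two paths are copied from the traceback above; the helper name `find_unreadable` is hypothetical.

```python
import os


def find_unreadable(paths):
    """Return the subset of `paths` that are missing or not readable."""
    return [p for p in paths
            if not (os.path.isfile(p) and os.access(p, os.R_OK))]


if __name__ == "__main__":
    base = "/workspace/data/chongdata_caffe_cn_sim_digits_64_64"
    required = [
        os.path.join(base, "deploy_lenet_train_test.prototxt"),
        os.path.join(base, "lenet_iter_50000.caffemodel"),
    ]
    for path in find_unreadable(required):
        print("missing or unreadable: " + path)
```

If a path is reported here, the workspace is probably not extracted or mounted at /workspace, which would be a different problem from a corrupted download.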

Failed to parse NetParameter file

I get the error below:

python3 reco_chars.py
[libprotobuf ERROR google/protobuf/text_format.cc:274] Error parsing text-format caffe.NetParameter: 6:15: Message type "caffe.LayerParameter" has no field named "input_param".
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0103 17:15:15.282599 7488 upgrade_proto.cpp:928] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: workspace/data/chongdata_caffe_cn_sim_digits_64_64/deploy_lenet_train_test.prototxt
*** Check failure stack trace: ***
Aborted (core dumped)

What is wrong with this file?

Best regards!

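Found a bug!!

A Python bug
㧟 䏝 㤘 䥽 䁖 䦃 㸆
These characters cannot be drawn with PIL. Try it yourself if you don't believe me.
I don't normally use Python. Where should a bug like this be reported?
Simple test code: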

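from PIL import Image, ImageDraw, ImageFont

# Render one character in white on a 300x300 black grayscale canvas.
font = ImageFont.truetype("STXIHEI.TTF", 300)
img = Image.new("L", (300, 300), "black")
draw = ImageDraw.Draw(img)
ch = u'㸆'
draw.text((0, -75), ch, 255, font=font)
img.show()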

Give it a try, haha.
Find STXIHEI.TTF. It already ships with the system; it is the STXihei (华文细黑) font!!
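
One possible explanation, which the post does not confirm, is that the font simply has no glyphs for these characters rather than PIL itself failing. A quick first check is whether the characters fall in the Unicode block CJK Unified Ideographs Extension A (U+3400 to U+4DBF), which many older CJK fonts cover only partially. The helper `in_cjk_ext_a` below is a hypothetical name used for illustration.

```python
# -*- coding: utf-8 -*-
CJK_EXT_A = (0x3400, 0x4DBF)  # Unicode block "CJK Unified Ideographs Extension A"


def in_cjk_ext_a(ch):
    """True if the single character `ch` lies in CJK Extension A."""
    return CJK_EXT_A[0] <= ord(ch) <= CJK_EXT_A[1]


if __name__ == "__main__":
    # The seven characters listed in the report above.
    for ch in u"㧟䏝㤘䥽䁖䦃㸆":
        print("U+%04X ext_a=%s" % (ord(ch), in_cjk_ext_a(ch)))
```

If the characters are in this block, rendering the same test with a font known to cover Extension A would show whether the font or PIL is responsible.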

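Recognition accuracy is very low

  • After a lot of effort I finally got it to compile. The recognition rate on the provided images is fairly good, but text I photographed from a book is not recognized at all. Text typed into a text editor and then captured as a screenshot is recognized, but with many errors: accuracy is probably below 50%.
  • ID card recognition behaves the same way. The sample image is recognized, but clear ID card images downloaded from the internet have a very low recognition rate with many errors, and photos of ID cards I took myself are likewise almost unrecognizable.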

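Question about the training set

First of all, thank you for your open-source project.
My question: I saw the training set mentioned in earlier issues, and I downloaded the data from Baidu Netdisk. I understand that the training set can be generated from font files. Does the training set really contain only one image per class? If not, how is the additional training data generated automatically?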

Could not open file /workspace/data/chongdata_caffe_cn_sim_digits_64_64/deploy_lenet_train_test.prototxt

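I carefully checked every dependency, the model file paths, the archive extraction, and so on, and I still get this error. So is the archive you uploaded corrupted?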
Traceback (most recent call last):
File "reco_chars.py", line 294, in
caffe_cls = CaffeCls(model_def, model_weights, y_tag_json_path)
File "reco_chars.py", line 20, in init
caffe.TEST)
RuntimeError: Could not open file /workspace/data/chongdata_caffe_cn_sim_digits_64_64/deploy_lenet_train_test.prototx

Best regards!

Binary image gives a result but RGB image does not

[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 597502990
I0110 09:56:18.573720 37 net.cpp:761] Ignoring source layer mnist
I0110 09:56:18.760609 37 net.cpp:761] Ignoring source layer loss
[image]
The following image does give a result, but it is not as good as raw OCR:
[image]

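Nov 2: it did not work for me

Running deep_ocr_reco fails with: ImportError: No module named ocrolib
The remaining executables fail with: ImportError: No module named cv2
I checked requirement.txt and cv2 is indeed not listed there, so I don't know how this is supposed to work.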
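
To see at a glance which dependencies are absent from the active environment, a small stdlib-only check can try importing each module in turn. This is a minimal sketch; `missing_modules` is a hypothetical helper name, and the module list contains only the two names from the errors above.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # ocrolib and cv2 are the two modules named in the errors above.
    for name in missing_modules(["ocrolib", "cv2"]):
        print("not importable: " + name)
```

As far as I know, `cv2` comes from OpenCV's Python bindings and `ocrolib` from the ocropy project, so both would need to be installed separately if requirement.txt does not pull them in.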

Blog code

Hello! The code for your blog is very elegant and concise. Would it be possible to fork it? I could not find it on GitHub.

Asking for help

Hello, after setting up the virtual environment following the steps, I ran the following:
haiyun@dell-Precision-Tower-5810:~$ source ~/deep_ocr_env/bin/activate && cd ~/deep_ocr && ./bin/deep_ocr_reco data/holiday_notification.jpg -v -d
! image to reco: data/holiday_notification.jpg
Traceback (most recent call last):
File "./bin/deep_ocr_reco", line 137, in
show_img(raw, title="raw image")
File "./bin/deep_ocr_reco", line 27, in show_img
plt.gray()
File "/home/haiyun/deep_ocr_env/lib/python2.7/site-packages/matplotlib/pyplot.py", line 3932, in gray
set_cmap("gray")
File "/home/haiyun/deep_ocr_env/lib/python2.7/site-packages/matplotlib/pyplot.py", line 2372, in set_cmap
im = gci()
File "/home/haiyun/deep_ocr_env/lib/python2.7/site-packages/matplotlib/pyplot.py", line 335, in gci
return gcf()._gci()
File "/home/haiyun/deep_ocr_env/lib/python2.7/site-packages/matplotlib/pyplot.py", line 601, in gcf
return figure()
File "/home/haiyun/deep_ocr_env/lib/python2.7/site-packages/matplotlib/pyplot.py", line 548, in figure
**kwargs)
File "/home/haiyun/deep_ocr_env/lib/python2.7/site-packages/matplotlib/backend_bases.py", line 161, in new_figure_manager
return cls.new_figure_manager_given_figure(num, fig)
File "/home/haiyun/deep_ocr_env/lib/python2.7/site-packages/matplotlib/backends/_backend_tk.py", line 1044, in new_figure_manager_given_figure
window = Tk.Tk(className="matplotlib")
File "/home/haiyun/install/python-2.7.11/lib/python2.7/lib-tk/Tkinter.py", line 1814, in init
self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
_tkinter.TclError: couldn't connect to display ":0"
I searched online for "_tkinter.TclError: couldn't connect to display ":0"" and tried to fix it, without success. Any guidance would be appreciated. Thanks a lot!

Code for detecting characters in non-text images

Many thanks for the models and code.

I noticed that your API at http://chongdata.com/ocr recognizes images containing some text very well. With the open-source code here, however, the results are not as good and include some spurious letters.

I assume your online API performs some text detection first. Would you consider sharing that code here?

Thanks again for sharing the code.

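Error when running deep_ocr_make_caffe_dataset

Hello,
When I run the deep_ocr_make_caffe_dataset command from lesson 4 of the Chongdata (虫数据) tutorial, the images folder is created but no images are generated. The error is: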
File "/opt/deep_ocr/bin/deep_ocr_make_caffe_dataset", line 83, in
lang_chars = lang_chars_gen.do()
File "build/bdist.linux-x86_64/egg/deep_ocr/lang_aux.py", line 27, in do
ImportError: No module named langs.lower_eng
langs has already been added to the Python modules. What is causing this?

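Question about converting text to images

I asked this before, but your answer was not clear.
I cannot see how, once you have the ID card font, you iterate over all the characters to produce images.
Could you explain it more clearly?
How do you go from text to images?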

How do I get the output?

I installed it on CentOS and ran this command: ./bin/deep_ocr_reco data/holiday_notification.jpg -v
I only got the following output:

! image to reco: data/holiday_notification.jpg
! no-normalization
estimate skew angle and rotate
estimate_thresholds lo 0.964706 and hi 1.000000
data/holiday_notification.jpg lo-hi (0.96 1.00) angle  0.0 ! no-normalization
scale= 5.744562646538029
computing column separators
considering at most 3 whitespace column separators
computing lines
propagating labels
spreading labels

Chinese character segmentation in ID cards

Hi @JinpengLI,
I really appreciate your great work.
I am very interested in your character segmentation for ID cards.
Is deep_ocr_id_card_segmentation your code for segmentation?
I tested it on new ID cards, and it seems to perform rather poorly on them.
Do you have a newer version of the code?
Thanks so much.
I look forward to your reply.

Error running reco_chars.py in Docker on Ubuntu 16.04

1. Download deep_ocr_workspace.zip
2. docker pull jinpengli/deep_ocr_cpu_docker:latest
3. docker run -ti --volume=${HOME}/deep_ocr_workspace:/workspace jinpengli/deep_ocr_cpu_docker:latest /bin/bash
4. python /opt/deep_ocr/reco_chars.py
The following error occurs:
root@dd66a9208e12:/workspace# python /opt/deep_ocr/reco_chars.py
libdc1394 error: Failed to initialize libdc1394
WARNING: Logging before InitGoogleLogging() is written to STDERR

... (some ordinary log output omitted) ...

Traceback (most recent call last):
File "/opt/deep_ocr/reco_chars.py", line 364, in
output_tag_to_max_proba = caffe_cls.predict_cv2_imgs(np_char_imgs)
File "/opt/deep_ocr/reco_chars.py", line 66, in predict_cv2_imgs
self._predict_cv2_imgs_sub(cv2_imgs, i, pos_end)
File "/opt/deep_ocr/reco_chars.py", line 53, in _predict_cv2_imgs_sub
item = (self.y_tag_json[str(index)],
KeyError: '5613'
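
The traceback shows `self.y_tag_json[str(index)]` raising `KeyError` for `'5613'`, which means the label map has no entry for that predicted class index. A plausible cause, though the report does not confirm it, is a label JSON that does not match the model's number of output classes. The sketch below shows a defensive lookup and a way to list the unmapped indices. Both helper names are hypothetical and are not part of reco_chars.py.

```python
def lookup_tag(y_tag_json, index, default=u"?"):
    """Map a predicted class index to its label, or `default` if unmapped."""
    return y_tag_json.get(str(index), default)


def unmapped_indices(y_tag_json, indices):
    """Return the predicted indices that have no entry in the label map."""
    return [i for i in indices if str(i) not in y_tag_json]


if __name__ == "__main__":
    # Toy label map for illustration only; the real one is loaded from JSON.
    labels = {"0": u"a", "1": u"b"}
    print(lookup_tag(labels, 1))
    print(unmapped_indices(labels, [0, 1, 5613]))
```

Falling back to a placeholder hides the crash but not the mismatch, so listing the unmapped indices first is the more informative step.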

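How to receive the recognition results

"The volume is mounted into the container so that the recognition results above can be retrieved."

I did not understand this command, and I don't know how to receive the recognition results from Docker on the host machine.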

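Installed via Docker: recognition problem

Hello, I set up the environment following the steps. Running python /opt/deep_ocr/reco_chars.py produces an error, as shown in the screenshot below:
[image: 1]

I then ran the recognition command. For the provided example image, only the numbers were recognized and no text was shown, as in the screenshot:
[image: 2017-11-18_16-11-50]
Please advise. Thank you!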
