
rapidai / rapidocr


Awesome OCR toolkits for multiple programming languages, based on ONNXRuntime, OpenVINO and PaddlePaddle. (PaddleOCR models converted to ONNX and run with ONNXRuntime for fast inference.)

Home Page: https://rapidai.github.io/RapidOCRDocs

License: Apache License 2.0

C 0.49% Python 89.75% HTML 6.17% CSS 1.60% Jupyter Notebook 1.99%
ocr onnxruntime crnn dbnet openvino rapidocr chineseocr easyocr paddleocr onnxocr

rapidocr's Introduction

 
Open source OCR for the security of the digital world
 



Introduction

💖 Introducing the foremost multi-platform, multi-lingual OCR tool that boasts unparalleled speed, expansive support, and complete openness. This exceptional software is entirely free and renowned for facilitating swift offline deployments. Core to its efficiency is the ONNXRuntime inference engine, offering 4 to 5 times the speed of PaddlePaddle's engine while ensuring no memory leaks.

🦜 Supported Languages: It inherently supports Chinese and English, with self-service conversion required for additional languages. Please refer here for specific language support details.

🔎 Rationale: Acknowledging the limitations in PaddleOCR's architecture, we embarked on a mission to simplify OCR inference across diverse platforms. This endeavor culminated in converting PaddleOCR's model to the versatile ONNX format and seamlessly integrating it into Python, C++, Java, and C# environments.

🎓 Etymology: Derived from its essence, RapidOCR embodies lightness, velocity, affordability, and intelligence. Rooted in deep learning, this OCR technology underscores AI's prowess and emphasizes compact models, prioritizing swiftness without compromising efficacy.

😉 Usage Scenarios:

  • Instant Deployment: If the pre-existing models within our repository suffice, simply leverage RapidOCR for swift deployment.
  • Customization: In case of specific requirements, refine PaddleOCR with your data and proceed with RapidOCR deployment, ensuring tailored results.

If our repository proves beneficial to your endeavors, kindly consider leaving a star ⭐ on GitHub to show your appreciation. It means the world to us!

Visualization (more)

Demo

Installation

pip install rapidocr_onnxruntime

Usage

from rapidocr_onnxruntime import RapidOCR

engine = RapidOCR()

img_path = 'tests/test_files/ch_en_num.jpg'
result, elapse = engine(img_path)
print(result)
print(elapse)
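
The snippet above prints the raw result. As a rough sketch of how that result might be consumed, the helper below keeps only the text of confident lines. It assumes each entry has the layout `[box, text, score]` and that the result is `None` or empty when no text is found; that layout, the `summarize` name, the `min_score` parameter, and the sample data are illustrative assumptions, so check the documentation for the version you install.

```python
def summarize(result, min_score=0.5):
    """Return the recognized text of entries whose confidence is at least min_score."""
    if not result:  # assumed: None (or an empty list) when no text is found
        return []
    lines = []
    for box, text, score in result:  # assumed entry layout: [box, text, score]
        if float(score) >= min_score:  # float() also accepts scores given as strings
            lines.append(text)
    return lines


# Hypothetical sample in the assumed layout (not real OCR output)
sample = [
    [[[10, 10], [120, 10], [120, 40], [10, 40]], "hello", 0.98],
    [[[10, 50], [120, 50], [120, 80], [10, 80]], "w0rld", "0.31"],
]
print(summarize(sample))
```

With a real engine, `summarize(result)` would be called on the first value returned by `engine(img_path)`.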

Documentation

Full documentation can be found on docs, in Chinese.

Acknowledgements

  • Many thanks to DeliciaLaniD for fixing the misplaced start position of the scan animation in ocrweb.
  • Many thanks to zhsunlight for suggesting that GPU inference be selectable through a parameter, and for careful and thoughtful testing.
  • Many thanks to lzh111222334 for fixing several bugs in the recognition preprocessing of the Python version.
  • Many thanks to AutumnSun1996 for the suggestion in #42.
  • Many thanks to DeadWood8 for providing the document on packaging rapidocr_web into an exe with Nuitka.
  • Many thanks to Loovelj for fixing the text box sorting bug. For details, see issue 75.

🎖 Code Contributors

Important

If you want to sponsor the project, click the Buy me a coffee image and leave a note (e.g. your GitHub account name) so that you can be added to the sponsorship list below.

Sponsor Applied Products
-

Citation

If you find this project useful in your research, please consider citing it:

@misc{RapidOCR2021,
    title={{Rapid OCR}: OCR Toolbox},
    author={RapidAI Team},
    howpublished = {\url{https://github.com/RapidAI/RapidOCR}},
    year={2021}
}


License

The copyright of the OCR model is held by Baidu, while the copyrights of all other engineering scripts are retained by the repository's owner.

This project is released under the Apache 2.0 license.

rapidocr's People

Contributors

aurorawright, autumnsun1996, benben17, benjaminwan, ccddos, debanjum, dependabot[bot], dogevenci, hal9000com, hidolen, lwq2edu, myq-c, rensir, skeathytomas, swhl, theikkila, znsoftm


rapidocr's Issues

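WeChat mini program

I tried the ocrweb you built, and it works very well!
Would it be possible to make a simple photo-recognition mini program? It would make demos on a phone easy and give others a base for developing their own mini programs.
I think this would be a good way to promote the project, so please consider this feature.
Thank you to the team for contributing such an excellent project!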

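Memory keeps growing, then stabilizes at a certain level

This library is very easy to use. However, after deploying it on Function Compute I noticed a small problem: memory usage is low for the first recognitions, keeps growing over repeated recognitions, and then stabilizes at around 3 GB. Function Compute bills by memory usage. Is there a way to optimize this, so that the memory used by a call is released once that call finishes?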
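
One mitigation that is sometimes tried for this kind of growth is turning off onnxruntime's CPU memory arena and memory-pattern planning, which trades some speed for a smaller resident footprint. The sketch below only builds such a `SessionOptions` object. Whether this resolves the behaviour reported here, and whether RapidOCR exposes a way to pass these options in, are assumptions not confirmed by the issue; the function name `make_low_memory_options` is illustrative.

```python
def make_low_memory_options():
    """Build onnxruntime SessionOptions with arena and memory-pattern planning disabled.

    Returns None when onnxruntime is not installed, so the sketch stays importable.
    """
    try:
        import onnxruntime as ort
    except ImportError:
        return None
    opts = ort.SessionOptions()
    opts.enable_cpu_mem_arena = False  # do not keep a growing arena of CPU memory
    opts.enable_mem_pattern = False    # do not cache allocation patterns per input shape
    return opts


opts = make_low_memory_options()
print("onnxruntime available:", opts is not None)
```

When creating a session directly, the options would be passed as `onnxruntime.InferenceSession(model_path, sess_options=opts, providers=["CPUExecutionProvider"])`.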

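Exploring a more suitable GPU inference engine

  • Because onnxruntime inference on GPU does not perform well, consider trying the TensorRT inference engine for fast GPU inference.
  • If you know of other lightweight engines that can run model inference on GPU, recommendations are welcome.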

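Question about which onnxruntime package gets installed

When I install this library in Docker, I write a requirements.txt. However, because the library's setup depends directly on onnxruntime, even if I specify onnxruntime-gpu in requirements.txt, installing this library installs onnxruntime as well and overrides onnxruntime-gpu. After entering the container, running get_device() in Python only returns CPU. How can this be solved?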
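
To confirm the situation described above inside a container, it can help to list which onnxruntime distributions pip has actually installed. The sketch below uses only the standard library; the function name `installed_ort_packages` is illustrative.

```python
from importlib import metadata


def installed_ort_packages(candidates=("onnxruntime", "onnxruntime-gpu")):
    """Return {distribution name: version} for each candidate that is installed."""
    found = {}
    for name in candidates:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # this distribution is not installed
    return found


print(installed_ort_packages())
```

If both names appear, the CPU and GPU wheels are installed side by side, which matches the override described in the issue. One workaround sometimes used is installing the library with pip's `--no-deps` flag and listing its remaining dependencies yourself; whether that suits this project is an assumption.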

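Suggestion: in the Python version, choose automatically between the CPU and GPU onnx code paths

The README shows the code for switching to GPU as in the image below:
image

Similar code that needs changing is spread across several files. Asking users to edit every place makes the spots hard to find, and the edits may not be correct. I suggest handling this in the code.

Because only one of the CPU and GPU builds of onnxruntime can be installed in a given environment, not both, letting the user choose GPU or CPU at run time is pointless; the choice is already made when the runtime library is installed. onnxruntime can report at run time which build is in use, as shown below:
image

So, by handling the return value of ort.get_device() in the code, users would no longer need to modify the code themselves as the README describes.
One possible problem with this approach is a system that has a GPU but where the user deliberately wants to run on CPU; that case probably needs a parameter to assist the decision.
Or perhaps the author can think of a better approach that avoids users editing code as far as possible, which would also help later RapidOCR upgrades.
Alternatively, since most parameters are already in config.yaml, the CPU/GPU choice could be added there too, with the related code changed according to that parameter. Users would then only need to change one place in config.yaml.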
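
The proposal above can be sketched as a small helper that combines what the installed onnxruntime build offers with a single user-facing switch. The function name `choose_providers` and the `use_cuda` flag (imagined as one entry in config.yaml) are hypothetical and are not the project's actual API.

```python
def choose_providers(available, use_cuda=True):
    """Pick execution providers from those the installed onnxruntime build offers.

    available: provider names, e.g. the list from onnxruntime.get_available_providers()
    use_cuda:  hypothetical config switch; False forces CPU even when a GPU build is present
    """
    if use_cuda and "CUDAExecutionProvider" in available:
        # Keep CPU as a fallback for operators the CUDA provider cannot run
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]


# Literal lists stand in for onnxruntime.get_available_providers()
print(choose_providers(["CUDAExecutionProvider", "CPUExecutionProvider"]))
print(choose_providers(["CUDAExecutionProvider", "CPUExecutionProvider"], use_cuda=False))
print(choose_providers(["CPUExecutionProvider"]))
```

The returned list would be passed as the `providers` argument when each `onnxruntime.InferenceSession` is created, so detection, classification, and recognition sessions all follow the same switch.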

onnxruntime error on arm64

1 Environment
Hardware: RK3399
onnxruntime: built from the latest GitHub code (tested successfully in chineseocr_lite)

2 Execution (the same logic runs without problems on a PC)
inaro@linaro-alip:~/RapidOCR/python$ sh rapidOCR.sh
dt_boxes num : 17, elapse : 1.2671310901641846
cls num : 17, elapse : 0.2634892463684082
2021-02-25 03:23:09.794914083 [E:onnxruntime:, sequential_executor.cc:339 Execute] Non-zero status code returned while running ScatterND node. Name:'ScatterND@1' Status Message: updates tensor should have shape equal to indices.shape[:-1] + data.shape[indices.shape[-1]:]. updates shape: {1}, indices shape: {1}, data shape: {3}
Traceback (most recent call last):
File "RapidOCR.py", line 272, in
dt_boxes, rec_res = text_sys(args.image_path)
File "RapidOCR.py", line 196, in call
rec_res, elapse = self.text_recognizer(img_crop_list)
File "/home/linaro/RapidOCR/python/ch_ppocr_mobile_v2_rec/text_recognize.py", line 119, in call
preds = self.session.run(None, onnx_inputs)[0]
File "/usr/local/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 188, in run
return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running ScatterND node. Name:'ScatterND@1' Status Message: updates tensor should have shape equal to indices.shape[:-1] + data.shape[indices.shape[-1]:]. updates shape: {1}, indices shape: {1}, data shape: {3}

E:onnxruntime:, sequential_executor.cc:339 onnxruntime::SequentialExecutor::Execute] Non-zero status code returned while running ScatterND node. Name:'ScatterND@1' Status Me

Environment:
Windows
Tools:
Anaconda3-2020.11-Windows-x86_64
In Anaconda:
conda create -n base37 python=3.7
Then I installed requirements.txt in base37.
Then, on Windows, I ran rapidOCR.py using base37.

Error:
C:\ProgramData\Anaconda3\python.exe E:/comm_Item/Item_doing/ocr_recog_py/RapidOCR/python/rapidOCR.py
dt_boxes num : 17, elapse : 0.11702466011047363
cls num : 17, elapse : 0.016003131866455078
2021-06-06 17:06:33.2157753 [E:onnxruntime:, sequential_executor.cc:339 onnxruntime::SequentialExecutor::Execute] Non-zero status code returned while running ScatterND node. Name:'ScatterND@1' Status Message: updates tensor should have shape equal to indices.shape[:-1] + data.shape[indices.shape[-1]:]. updates shape: {1}, indices shape: {1}, data shape: {3}
Traceback (most recent call last):
File "E:/comm_Item/Item_doing/ocr_recog_py/RapidOCR/python/rapidOCR.py", line 271, in
dt_boxes, rec_res = text_sys(args.image_path)
File "E:/comm_Item/Item_doing/ocr_recog_py/RapidOCR/python/rapidOCR.py", line 195, in call
rec_res, elapse = self.text_recognizer(img_crop_list)
File "E:\comm_Item\Item_doing\ocr_recog_py\RapidOCR\python\ch_ppocr_mobile_v2_rec\text_recognize.py", line 115, in call
preds = self.session.run(None, onnx_inputs)[0]
File "C:\ProgramData\Anaconda3\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 188, in run
return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running ScatterND node. Name:'ScatterND@1' Status Message: updates tensor should have shape equal to indices.shape[:-1] + data.shape[indices.shape[-1]:]. updates shape: {1}, indices shape: {1}, data shape: {3}

Process finished with exit code 1

Loading .onnx models by opencv

Discussed in #58

Originally posted by senstar-hsoleimani December 6, 2022
I downloaded the onnx models provided on Google Drive, but I could not read them with OpenCV (cv::dnn::readNet()).
Can anyone help, please?

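In the Python version, PPOCR-v3 recognizes much more slowly than v2 under the same conditions

With the same software and hardware, and only config.yaml changed (exactly as the README requires, see the attached image), PPOCR-v3 recognizes much more slowly than v2, prints warnings, and gets some results wrong, as shown below:
PPOCR-v2:
image

PPOCR-v3:
image

Attached image: the modified part of config.yaml
image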
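Question about python+onnx+onnxRuntime inference time

Hello. In my tests, python+onnx+onnxRuntime inference is slower than python+paddle+mkl. Is there some setting I have not enabled? I made the preprocessing parameters identical in both programs.
My CPU is an Intel(R) Core(TM) i5-1035G1 CPU @ 1.00GHz 1.19 GHz.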
ocrweb -> ValueError: not enough values to unpack (expected 4, got 2)

Issue

  • working with RapidOCR - ocrweb
  • when submitting an image without text, it will cause the following issue
    • testing sample: test2.png (a plain white picture)

test2

╰─○ python main.py
 * Serving Flask app 'main' (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:9003 (Press CTRL+C to quit)
127.0.0.1 - - [29/Jun/2022 09:40:38] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [29/Jun/2022 09:40:38] "GET /static/css/main.css HTTP/1.1" 200 -
127.0.0.1 - - [29/Jun/2022 09:40:38] "GET /static/js/jquery-3.0.0.min.js HTTP/1.1" 200 -
127.0.0.1 - - [29/Jun/2022 09:40:38] "GET /static/css/favicon.ico HTTP/1.1" 200 -
dt_boxes num: 0, elapse: 0.14057278633117676
[2022-06-29 09:40:48,101] ERROR in app: Exception on /ocr [POST]
Traceback (most recent call last):
  File "/Users/userXXX/anaconda3/envs/rapid/lib/python3.7/site-packages/flask/app.py", line 2077, in wsgi_app
    response = self.full_dispatch_request()
  File "/Users/userXXX/anaconda3/envs/rapid/lib/python3.7/site-packages/flask/app.py", line 1525, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/Users/userXXX/anaconda3/envs/rapid/lib/python3.7/site-packages/flask/app.py", line 1523, in full_dispatch_request
    rv = self.dispatch_request()
  File "/Users/userXXX/anaconda3/envs/rapid/lib/python3.7/site-packages/flask/app.py", line 1509, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "main.py", line 35, in ocr
    return detect_recognize(image)
  File "/Users/userXXX/Downloads/ocrweb_debug/task.py", line 29, in detect_recognize
    dt_boxes, rec_res, img, elapse_part = text_sys(image)
ValueError: not enough values to unpack (expected 4, got 2)
127.0.0.1 - - [29/Jun/2022 09:40:48] "POST /ocr HTTP/1.1" 500 -

Cause & Proposed solution

  • root cause: rapid_ocr_api.py, return logic within if dt_boxes is None or len(dt_boxes) < 1:
class TextSystem(object):
    ... ...
    def __call__(self, img: np.ndarray):
        dt_boxes, det_elapse = self.text_detector(img)
        if self.print_verbose:
            print(f'dt_boxes num: { len(dt_boxes)}, elapse: {det_elapse}')

        if dt_boxes is None or len(dt_boxes) < 1:
            return None, None
            # fixed with: return None, None, img, None
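
The proposed fix amounts to making every return path hand back the same number of values, so that the caller's four-way unpacking in task.py cannot fail. A minimal sketch of that idea follows, with stub callables standing in for the real detector and recognizer; the name `run_text_system` and the stubs are illustrative and are not the project's code.

```python
def run_text_system(detect, recognize, img):
    """Always return a 4-tuple: (dt_boxes, rec_res, img, elapse_part)."""
    dt_boxes, det_elapse = detect(img)
    if dt_boxes is None or len(dt_boxes) < 1:
        # Same arity as the success path, so callers can always unpack four values
        return None, None, img, None
    rec_res, rec_elapse = recognize(img, dt_boxes)
    return dt_boxes, rec_res, img, {"det": det_elapse, "rec": rec_elapse}


# Stub detector that finds nothing, as with the plain white test image
dt_boxes, rec_res, img, elapse_part = run_text_system(
    lambda image: ([], 0.14),
    lambda image, boxes: ([], 0.0),
    "blank.png",
)
print(dt_boxes, rec_res, img, elapse_part)
```

The caller still has to check `dt_boxes is None` before using the other values.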

Cannot run the JVM version in Eclipse

The JVM version runs normally in IDEA:
String modelsDir = "";
String detName = "ch_ppocr_mobile_v2.0_det_infer.onnx";
String clsName = "ch_ppocr_mobile_v2.0_cls_infer.onnx";
String recName = "ch_ppocr_mobile_v2.0_rec_infer.onnx";
String keysName = "ppocr_keys_v1.txt";
int padding = 0;
float boxScoreThresh = 0.5f;
float boxThresh = 0.3f;
float unClipRatio = 1.6f;
boolean doAngle = true;
boolean mostAngle = true;
String imagePath="D:\\ocr\\images\\cn.png";
OcrEngine ocrEngine = new OcrEngine();
ocrEngine.initEngine("D:\\ocr\\win-lib-x64\\BaiPiaoOcrOnnx.dll");
String version = ocrEngine.getVersion();
System.out.println("version=" + version);

    //------- setNumThread -------
    ocrEngine.setNumThread(2);

    //------- init Logger -------
    ocrEngine.initLogger(   true,true, true );
    ocrEngine.enableResultText(imagePath);

    //------- init Models -------
    boolean initModelsRet = ocrEngine.initModels(modelsDir, detName, clsName, recName, keysName);
    if (!initModelsRet) {
        System.out.println("Error in models initialization, please check the models/keys path!");
        return;
    }

    //------- set param -------
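    ocrEngine.setPadding(padding); // white border added around the image to improve recognition; increase this value when text boxes do not fully enclose the text
    ocrEngine.setBoxScoreThresh(boxScoreThresh); // text box confidence threshold; decrease this value when text boxes do not fully enclose the text
    ocrEngine.setBoxThresh(boxThresh); // experiment to find a suitable value
    ocrEngine.setUnClipRatio(unClipRatio); // scale factor for each text box; larger values give larger boxes
    ocrEngine.setDoAngle(doAngle); // enable (1) / disable (0) text direction detection; only needed when the image is upside down (rotated 90 to 270 degrees)
    ocrEngine.setMostAngle(mostAngle); // enable (1) / disable (0) angle voting (recognize the whole image using the most likely text direction); has no effect when text direction detection is disabled
    //------- start detect -------
    Long b=System.currentTimeMillis();
    OcrResult ocrResult = ocrEngine.detect(imagePath, 1024); // scale the whole image by its long side; enlarging costs more time but improves accuracy, shrinking saves time but lowers accuracy; maxSideLen=0 means no scaling
    Long e=System.currentTimeMillis();
    System.out.println("Detection took " + (e-b) + " ms");
    System.out.println("-----------------------");
    // Using the native method lets OcrEngine be a singleton
    //OcrResult ocrResult = ocrEngine.detect(imagePath, padding, maxSideLen, boxScoreThresh, boxThresh, unClipRatio, doAngle, mostAngle);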

    //------- print result -------
    System.out.println(ocrResult.toString());

After packaging as a jar, running it in Eclipse fails with the following error:

A fatal error has been detected by the Java Runtime Environment:

EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x00007ff99d31f6d6, pid=39464, tid=0x000000000000be98

JRE version: Java(TM) SE Runtime Environment (8.0_221-b11) (build 1.8.0_221-b11)

Java VM: Java HotSpot(TM) 64-Bit Server VM (25.221-b11 mixed mode windows-amd64 compressed oops)

Problematic frame:

C [BaiPiaoOcrOnnx.dll+0x117f6d6]

Failed to write core dump. Minidumps are not enabled by default on client versions of Windows

An error report file with more information is saved as:

D:\jianyan\simpleness\simpleness-service\hs_err_pid39464.log

If you would like to submit a bug report, please visit:

http://bugreport.java.com/bugreport/crash.jsp

The crash happened outside the Java Virtual Machine in native code.

See problematic frame for where to report the bug.

Bug: Cannot process sample image with python onnx model loading

Please provide the following information to quickly locate the problem

I am trying to build a simple RapidOCR solution from the onnx models and I am stuck on a problem with the inference data. I found a guide that advises loading the model with onnxruntime, preparing the image in NCHW format, and then passing the numpy array to the model. When I run the inference code block, I get the error shown below. When I scale the image down to very small dimensions, the error disappears, but I doubt the results would be correct for that image.

  • System Environment: Linux 20.04
  • Which programming language: Python
  • Version: rapidocr-onnxruntime==1.2.3
  • OnnxRuntime Version: onnxruntime==1.14.1
  • Demo of reproducible problems: https://colab.research.google.com/drive/11NBn3RRiqZrEu9kKZuBAXrF7EcG_0cWy?usp=sharing
  • Complete Error Message: Fail: [ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running Concat node. Name:'Concat_1' Status Message: concat.cc:157 PrepareForCompute Non concat axis dimensions must match: Axis 2 has mismatched dimensions of 1 and 20
  • Possible solutions: When I reduce the image dimensions to 32x32, the error disappears, but I am confident that scaling from 500x500 (as an example) down to 32x32 won't give real results.

A suggestion about the results returned by the API

According to the documentation, the current result looks like this:

[['0', '香港深圳抽血', '0.93583983'], ['1', '专业查性别', '0.89865875'], ['2', '专业鉴定B超单', '0.9955703'], ['3', 'b超仪器查性别', '0.99489486'], ['4', '加微信eee', '0.99073666'], ['5', '可邮寄', '0.99923944']]

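Coordinates are actually quite important; they can be used later to split the content into sections.

The data format below is an example of an API result I built in my project.

lines: a merged text result, convenient for direct display.
regions: the recognized regions, each with its text and coordinates.

The code is here: https://github.com/cuiliang/RapidOCR/blob/Quicker/ocrweb/api_task.py
I don't know much Python, so the code was pieced together by feel 😂; it is offered for reference.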

{
  "result": {
    "lines": "Filters Is:issueis:open\n\n10pen 35Closed\n\n建议:将ppocr_keys等信息直接存储到onnx模型\n#42 opened 4daysagoby AutumnSun1996",
    "regions": [
      {
        "text": "Filters",
        "confidence": 0.9966548,
        "rect": {
          "left": 57,
          "top": 0,
          "right": 116,
          "bottom": 2
        }
      },
      {
        "text": "Is:issueis:open",
        "confidence": 0.84303313,
        "rect": {
          "left": 210,
          "top": 2,
          "right": 347,
          "bottom": 3
        }
      },
      {
        "text": "10pen",
        "confidence": 0.976416,
        "rect": {
          "left": 89,
          "top": 88,
          "right": 160,
          "bottom": 88
        }
      },
      {
        "text": "35Closed",
        "confidence": 0.9819431,
        "rect": {
          "left": 213,
          "top": 89,
          "right": 305,
          "bottom": 89
        }
      },
      {
        "text": "建议:将ppocr_keys等信息直接存储到onnx模型",
        "confidence": 0.97398514,
        "rect": {
          "left": 90,
          "top": 158,
          "right": 594,
          "bottom": 158
        }
      },
      {
        "text": "#42 opened 4daysagoby AutumnSun1996",
        "confidence": 0.9657532,
        "rect": {
          "left": 91,
          "top": 199,
          "right": 442,
          "bottom": 198
        }
      }
    ]
  },
  "info": {
    "total_elapse": 0.45919999999999994,
    "elapse_part": {
      "det_elapse": "0.3858",
      "cls_elapse": "0.0011",
      "rec_elapse": "0.0723"
    }
  }
}
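
As a rough illustration of how the `lines` and `regions` fields above could be produced, the sketch below converts OCR entries into that shape. It assumes each entry is `[box, text, score]` with `box` given as four `[x, y]` corner points, and it takes the axis-aligned bounding rectangle of those points. That layout and the `to_api_result` name are assumptions; this is not the code at the linked api_task.py, and it joins lines with a single newline rather than grouping them into paragraphs.

```python
import json


def to_api_result(ocr_result):
    """Convert [box, text, score] entries into {'lines': ..., 'regions': [...]}."""
    regions = []
    for box, text, score in ocr_result or []:
        xs = [point[0] for point in box]
        ys = [point[1] for point in box]
        regions.append({
            "text": text,
            "confidence": float(score),
            # axis-aligned rectangle enclosing the four corner points
            "rect": {"left": min(xs), "top": min(ys), "right": max(xs), "bottom": max(ys)},
        })
    return {"lines": "\n".join(region["text"] for region in regions), "regions": regions}


# Hypothetical input in the assumed layout
sample = [
    [[[57, 0], [116, 0], [116, 20], [57, 20]], "Filters", 0.9966548],
    [[[89, 70], [160, 70], [160, 88], [89, 88]], "10pen", "0.976416"],
]
print(json.dumps(to_api_result(sample), ensure_ascii=False, indent=2))
```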

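Sorting of some text detection boxes is incorrect

Temporary files created at run time stay locked

I integrated the OCR into a Swing application. After image recognition finishes and while the Swing program is still running, some *.jpg-result.txt files are created that cannot be deleted; they can only be deleted after the Swing program exits.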

No results from inference when using onnxruntime with TensorRT

I built onnxruntime with TensorRT to see if there could be any performance improvements with RapidOCR but unfortunately, the inference returned an empty array. Here's the log:

C:\Users\samay\Documents\RapidOCR\python>python rapidOCR.py
2021-07-08 00:09:15.7074765 [E:onnxruntime:Default, tensorrt_execution_provider.h:51 onnxruntime::TensorrtLogger::log] [2021-07-08 05:09:15   ERROR] Parameter check failed at: engine.cpp::nvinfer1::rt::ExecutionContext::setBindingDimensions::1136, condition: profileMaxDims.d[i] >= dimensions.d[i]
Traceback (most recent call last):
  File "C:\Users\samay\Documents\RapidOCR\python\rapidOCR.py", line 257, in <module>
    dt_boxes, rec_res = text_sys(args.image_path)
  File "C:\Users\samay\Documents\RapidOCR\python\rapidOCR.py", line 177, in __call__
    dt_boxes, elapse = self.text_detector(img)
  File "C:\Users\samay\Documents\RapidOCR\python\ch_ppocr_mobile_v2_det\text_detect.py", line 136, in __call__
    dt_boxes = post_result[0]['points']
IndexError: list index out of range

I'm by no means an expert in model conversion so I'm guessing tensorrt simply doesn't support the converted onnx model ? Is there a way to make it work ?

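Which model does the online demo use?

Hello, which model does your online demo use? I tried several of the models on the network drive, and combinations of them, but the results all differ from your online example and are much worse in some details. 😂 Could you share that model?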

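Recognition errors occur with high probability

1T7I4JXSMXDKMS_G QCSEI5
For the same image, when the Java code runs recognition several times, only the first result is correct; all later results are garbled.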

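No text detected in some images

image

Original image:
20220710_110751_110

For comparison, the Paddle online demo handles it, and its detection is fast too.
I have also noticed that smaller images, apart from those small enough to skip detection, take longer to detect than larger ones. Is there any way to optimize this?
image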

Error converting onnx to OpenVINO

The onnx model is ch_ppocr_mobile_v2.0_rec_infer.onnx, downloaded from the network drive you provided.
The conversion command is:
python "C:\Program Files (x86)\Intel\openvino_2021\deployment_tools\model_optimizer\mo.py" --input_model=ch_ppocr_mobile_v2.0_rec_infer.onnx --output_dir=. --model_name=model_rec --data_type=FP32
The following error occurs:
image
What does this error mean, and how can it be solved?

Trouble with installation

pip install https://github.com/RapidAI/RapidOCR/raw/main/release/python_sdk/sdk_rapidocr_v1.0.0/rapidocr-1.0.0-py3-none-any.whl -i https://pypi.douban.com/simple/ Looking in indexes: https://pypi.douban.com/simple/ Collecting rapidocr==1.0.0 Using cached https://github.com/RapidAI/RapidOCR/raw/main/release/python_sdk/sdk_rapidocr_v1.0.0/rapidocr-1.0.0-py3-none-any.whl (18 kB) Collecting six>=1.15.0 Downloading https://pypi.doubanio.com/packages/d9/5a/e7c31adbe875f2abbb91bd84cf2dc52d792b5a01506781dbcf25c91daf11/six-1.16.0-py2.py3-none-any.whl (11 kB) Requirement already satisfied: numpy>=1.19.3 in /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages (from rapidocr==1.0.0) (1.21.4) Collecting pyclipper>=1.2.1 Downloading https://pypi.doubanio.com/packages/24/6e/b7b4d05383cb654560d63247ddeaf8b4847b69b68d8bc6c832cd7678dab1/pyclipper-1.3.0.zip (142 kB) |████████████████████████████████| 142 kB 2.7 MB/s Installing build dependencies ... done Getting requirements to build wheel ... done Preparing metadata (pyproject.toml) ... done Requirement already satisfied: Shapely>=1.7.1 in /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages (from rapidocr==1.0.0) (1.8.0) ERROR: Could not find a version that satisfies the requirement onnxruntime>=1.7.0 (from rapidocr) (from versions: none) ERROR: No matching distribution found for onnxruntime>=1.7.0

But I do have onnxruntime on my Mac.

🍺 /opt/homebrew/Cellar/onnxruntime/1.9.1: 77 files, 11.9MB

cv2.dnn

Is there a PaddleOCR deployment that uses cv2.dnn?

ImportError: cannot import name 'escape' from 'jinja2'


└─(16:17:30)──> python main.py                                                                                                                                                                         ──(Fri,Jun24)─┘
Traceback (most recent call last):
  File "main.py", line 9, in <module>
    from flask import Flask, render_template, request
  File "/Users/XXXXXX/anaconda3/envs/rapid/lib/python3.7/site-packages/flask/__init__.py", line 14, in <module>
    from jinja2 import escape
ImportError: cannot import name 'escape' from 'jinja2' (/Users/XXXXXX/anaconda3/envs/rapid/lib/python3.7/site-packages/jinja2/__init__.py)
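
This traceback is typical of pairing a Flask release older than 2.0, which imports `escape` from `jinja2`, with Jinja2 3.1 or newer, where that name was removed. The usual remedies are upgrading Flask or pinning Jinja2 below 3.1. The sketch below only encodes that version rule; the function names are illustrative, and the installed versions would need to be read separately (for example with `importlib.metadata.version`).

```python
import re


def parse_version(text):
    """Return the leading (major, minor) numbers of a version string such as '3.1.2'."""
    numbers = re.findall(r"\d+", text)
    return tuple(int(n) for n in numbers[:2])


def has_escape_conflict(flask_version, jinja2_version):
    """True when Flask < 2.0 is combined with Jinja2 >= 3.1."""
    return parse_version(flask_version) < (2, 0) and parse_version(jinja2_version) >= (3, 1)


# Example version strings, not taken from the issue
print(has_escape_conflict("1.1.4", "3.1.2"))
print(has_escape_conflict("2.1.2", "3.1.2"))
```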

CUDA usage problem in the Python version

onnxruntime-gpu==1.10
cuda==11.40
cudnn==8.2.4
ubuntu 1804
With the environment above, I use CUDA when loading the model in the detection stage. Code:
.....
self.preprocess_op = create_operators(pre_process_list)
self.postprocess_op = DBPostProcess(thresh=0.3,
                                    box_thresh=0.5,
                                    max_candidates=1000,
                                    unclip_ratio=1.6,
                                    use_dilation=True)
providers = [
    ('CUDAExecutionProvider', {
        'device_id': 0,
        'arena_extend_strategy': 'kNextPowerOfTwo',
        'gpu_mem_limit': 2 * 1024 * 1024 * 1024,
        'cudnn_conv_algo_search': 'EXHAUSTIVE',
        'do_copy_in_default_stream': True,
    }),
    'CPUExecutionProvider',
]
self.session = onnxruntime.InferenceSession(det_model_path, providers=providers)
......

The error is as follows:
ValueError: This ORT build has ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider'] enabled. Since ORT 1.9, you are required to explicitly set the providers parameter when instantiating InferenceSession. For example, onnxruntime.InferenceSession(..., providers=['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider'], ...)

How can this be solved? Thank you.

Python recognition inference result is weird

When performing inference on a screenshot from wikipedia in english, the ocr results are... weird. Some lines are perfectly recognised while others are completely wrong. Is there a parameter I need to change ?

Original image:
wiki

Inference result:
infer_wiki

dotnet test: InitModel fails

public void InitModel(string path, int numThread)
{
    try
    {
        SessionOptions op = new SessionOptions(); // fails on this line
        op.GraphOptimizationLevel = GraphOptimizationLevel.ORT_ENABLE_EXTENDED;
        op.InterOpNumThreads = numThread;
        op.IntraOpNumThreads = numThread;
        dbNet = new InferenceSession(path, op);
        inputNames = dbNet.InputMetadata.Keys.ToList();
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message + ex.StackTrace);
        throw ex;
    }
}

Execution reaches:
public SessionOptions()
    : base(IntPtr.Zero, true)
{
    NativeApiStatus.VerifySuccess(NativeMethods.OrtCreateSessionOptions(out handle));
}
Error:
at OcrLiteLib.DbNet.InitModel(String path, Int32 numThread) in F:\RapidOCR\dotnet\BaiPiaoOcrOnnxCs\OcrLib\DbNet.cs:line 45
at OcrLiteLib.OcrLite.InitModels(String detPath, String clsPath, String recPath, String keysPath, Int32 numThread) in F:\RapidOCR\dotnet\BaiPiaoOcrOnnxCs\OcrLib\OcrLite.cs:line 29

I tested the Python version on Win7 and it works well; image detection speed is good too.
Please take a look: the referenced packages have not been upgraded, and I see the dotnet version has not been updated for half a year.

Model files without Baidu?

Since I don't have a Baidu account, I can't download them.
Is it possible to have them hosted on another platform?

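Suggestion: store ppocr_keys and similar information directly in the onnx model

I suggest storing ppocr_keys, rec_img_shape, and similar information directly in the onnx model.

Currently, ppocr_keys is kept in a separate txt file whose path is configured in config.yaml, and rec_img_shape is configured in config.yaml.
These two parameters are tightly coupled to the onnx model and could be stored inside it as metadata, reducing the amount of configuration needed.
ppocr_keys in particular is currently distributed as a separate file, which makes it easy for the two to get out of sync.
ONNX itself supports storing custom metadata. With this approach, deployment-related configuration should be simpler.

Reference code: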

# Add metadata
import onnx

model = onnx.load_model('/path/to/model.onnx')
meta = model.metadata_props.add()
meta.key = 'dictionary'
# Read the character dictionary as UTF-8 and close the file afterwards
with open('/path/to/ppocr_keys_v1.txt', 'r', encoding='utf-8') as f:
    meta.value = f.read()

meta = model.metadata_props.add()
meta.key = 'shape'
meta.value = '[3,48,320]'

onnx.save_model(model, '/path/to/model.onnx')

# Read metadata
import json
import onnxruntime as ort

# Since ORT 1.9, builds with several providers require providers to be passed explicitly
sess = ort.InferenceSession('/path/to/model.onnx', providers=['CPUExecutionProvider'])
metamap = sess.get_modelmeta().custom_metadata_map
chars = metamap['dictionary'].splitlines()
input_shape = json.loads(metamap['shape'])
