rapidai / rapidocr

An OCR toolkit for multiple programming languages, based on ONNXRuntime, OpenVINO and PaddlePaddle. (The PaddleOCR models are converted and run with ONNXRuntime for fast inference.)

Home Page: https://rapidai.github.io/RapidOCRDocs

License: Apache License 2.0

C 0.49% Python 89.75% HTML 6.17% CSS 1.60% Jupyter Notebook 1.99%

ocr onnxruntime crnn dbnet openvino rapidocr chineseocr easyocr paddleocr onnxocr

rapidocr's Introduction

Open source OCR for the security of the digital world

Simplified Chinese | English

Introduction

💖 Introducing the foremost multi-platform, multi-lingual OCR tool that boasts unparalleled speed, expansive support, and complete openness. This exceptional software is entirely free and renowned for facilitating swift offline deployments. Core to its efficiency is the ONNXRuntime inference engine, offering 4 to 5 times the speed of PaddlePaddle's engine while ensuring no memory leaks.

🦜 Supported Languages: It inherently supports Chinese and English, with self-service conversion required for additional languages. Please refer here for specific language support details.

🔎 Rationale: Acknowledging the limitations in PaddleOCR's architecture, we embarked on a mission to simplify OCR inference across diverse platforms. This endeavor culminated in converting PaddleOCR's model to the versatile ONNX format and seamlessly integrating it into Python, C++, Java, and C# environments.

🎓 Etymology: Derived from its essence, RapidOCR embodies lightness, velocity, affordability, and intelligence. Rooted in deep learning, this OCR technology underscores AI's prowess and emphasizes compact models, prioritizing swiftness without compromising efficacy.

😉 Usage Scenarios:

Instant Deployment: If the pre-existing models within our repository suffice, simply leverage RapidOCR for swift deployment.
Customization: In case of specific requirements, refine PaddleOCR with your data and proceed with RapidOCR deployment, ensuring tailored results.

If our repository proves beneficial to your endeavors, kindly consider leaving a star ⭐ on GitHub to show your appreciation. It means the world to us!

Visualization (more)

Installation

pip install rapidocr_onnxruntime

Usage

from rapidocr_onnxruntime import RapidOCR

engine = RapidOCR()

img_path = 'tests/test_files/ch_en_num.jpg'
result, elapse = engine(img_path)
print(result)
print(elapse)

Documentation

Full documentation can be found on docs, in Chinese.

Acknowledgements

Many thanks to DeliciaLaniD for fixing the misplaced start position of the scan animation in ocrweb.
Many thanks to zhsunlight for suggesting a parameterized call for GPU inference, and for the careful and thoughtful testing.
Many thanks to lzh111222334 for fixing several bugs in the rec preprocessing of the Python version.
Many thanks to AutumnSun1996 for the suggestion in #42.
Many thanks to DeadWood8 for providing the document on packaging rapidocr_web into an exe with Nuitka.
Many thanks to Loovelj for fixing the bug in sorting the text boxes. For details, see issue 75.

🎖 Code Contributors

Sponsor

Important

If you want to sponsor the project, click the Buy me a coffee image and add a note (e.g. your GitHub account name) so that you can be added to the sponsorship list below.

Sponsor	Applied Products

	-

Citation

If you find this project useful in your research, please consider citing it:

@misc{RapidOCR2021,
    title={{Rapid OCR}: OCR Toolbox},
    author={RapidAI Team},
    howpublished = {\url{https://github.com/RapidAI/RapidOCR}},
    year={2021}
}

⭐️ Stargazers over time

License

The copyright of the OCR model is held by Baidu, while the copyrights of all other engineering scripts are retained by the repository's owner.

This project is released under the Apache 2.0 license.

rapidocr's People

blyema benjaminwan channingss zhilangtaosha l976308589 lsy1770 ygexe huoran559 wuxiaolianggit trinsanity hzjai0624 xinsuinizhuan demooooooo yaozn kuustudio aliushn leftorright001 jingmouren benjamesbabala 471417367 zhuewizz huhaibo bianchh liushuchun zlszhonglongshen xiaohuihuichao hnwlxywns zineos dandelion111 weizhonghai dongphuongman huabao97 chros425 i-spark m-liu1987 lower-than-absolute-zero rayzhb jiaocq1972 zhuofalin maxpark aqqwvfbukn ocrorg gait1314 zhangshabao lukairui visimulator hongkuncc chuawei buptsb weiqq317 rfsn cuiliang crackercat iwaitu baxiprince xinyujituan ponderfly yxpandjay lucs-c kuyoeku kaylio lyrhy rdaim jacke121 forack loulousky antonizdp orangeoi realcorebb deftruth ronglejing carto1111 beijinggao jinhill light201212 wudigepimao cowsmiles pzlpy99 dmtan90 xcc313 lwzbuaa aslily1234 15236626983 jeff-cn zhys513 snolkmg zszczh xjl-le chen-shixin daanye tyronebj aureliuspatiens adityavarmauddaraju crzayjay youmengwuyan lukstc tailless-monkey delicialanid dogevenci wanggs950730

rapidocr's Issues

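WeChat Mini Program

I tried the ocrweb you built, and it works very well!
Could you make a simple mini program that recognizes text from photos? It would make demos on a phone easy and give everyone a base for developing their own mini programs.
I think this would be a good way to promote the project, so please consider adding it.
Thanks to the team for contributing such an excellent project!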

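Memory keeps growing and then stabilizes at a certain value

This library is very easy to use. However, after deploying it on Function Compute I found a small problem: memory usage is low for the first recognitions, but it keeps growing over repeated recognitions and stabilizes at around 3 GB. Function Compute bills by memory usage. Is there currently a way to optimize this, so that the memory used by a call is released after that call finishes?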

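Exploring a more suitable GPU inference engine

Because onnxruntime does not perform very well for GPU inference, we are considering trying the TensorRT inference engine to support fast inference on GPU.
If you know of other lightweight engines that can run model inference on GPU, recommendations are welcome.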

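A question about installing the onnxruntime version

When I install this library in Docker, I create a requirements.txt. However, because this library's setup depends directly on onnxruntime, even if I specify onnxruntime-gpu in requirements.txt, installing this library installs onnxruntime again and overwrites onnxruntime-gpu. After entering the container, running get_device() in Python only returns CPU. How can I solve this?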
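
Both onnxruntime and onnxruntime-gpu install the same onnxruntime import package, which is why the later install overwrites the earlier one. As a minimal sketch for spotting this inside a container, the hypothetical helper installed_ort_variants below (not part of RapidOCR) lists which of the two distributions pip has recorded. It uses only the standard library.

```python
from importlib import metadata

# The two distribution names discussed in this issue.
ORT_VARIANTS = ("onnxruntime", "onnxruntime-gpu")


def installed_ort_variants(names=None):
    """Return which onnxruntime distributions appear in `names`.

    names: iterable of distribution names; defaults to everything that
    importlib.metadata can see in the current environment.
    """
    if names is None:
        names = [dist.metadata.get("Name") for dist in metadata.distributions()]
    # Treat "onnxruntime_gpu" and "onnxruntime-gpu" as the same name.
    normalized = {name.lower().replace("_", "-") for name in names if name}
    return sorted(variant for variant in ORT_VARIANTS if variant in normalized)


if __name__ == "__main__":
    found = installed_ort_variants()
    if len(found) > 1:
        print("Both variants are installed and share one import package:", found)
    else:
        print("Installed onnxruntime variants:", found)
```

If both names are reported, one possible remedy is to uninstall both and reinstall only the wanted one. Another option is to install this library with pip's --no-deps flag and list its remaining dependencies yourself.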

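Suggestion: in the Python version, automatically choose between the CPU and GPU onnx code paths

The README shows the code for switching to GPU in the image below:

Similar code that needs to be changed is spread across multiple files. Asking users to edit each place makes it hard to find them all, and the edits may not be correct. I suggest the author handle this in the code.

Because only one of the CPU and GPU builds of onnxrt can be installed in the same environment, not both, letting the user choose GPU or CPU at run time is pointless: the choice is already made when the runtime library is installed. onnxrt can report the build in use at run time, as shown in the image below:

So if the code branches on the return value of ort.get_device(), users no longer need to modify the code themselves by following the README.
A possible problem with this approach is a system that has a GPU but where the user deliberately wants to run on the CPU; that case probably needs a parameter to help decide.
Or the author may think of a better approach that avoids having users modify the code as far as possible, which would also make future RapidOCR upgrades easier.
Alternatively, since most parameters are already in config.yaml, a parameter for choosing CPU or GPU could be added there too, with the relevant code changed to follow it. Users would then only need to change one place in config.yaml.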
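
The suggestion above can be sketched as one small function that combines what the installed runtime offers with a single user switch. This is an illustration only, not RapidOCR's implementation: the function name choose_providers and the use_cuda flag (standing in for a config.yaml entry) are assumptions.

```python
def choose_providers(available, use_cuda=True):
    """Return an ordered provider list for an onnxruntime session.

    available: provider names reported by the installed runtime, for
        example the result of onnxruntime.get_available_providers().
    use_cuda: hypothetical config.yaml switch; False forces CPU even
        when a GPU build is installed.
    """
    providers = []
    if use_cuda and "CUDAExecutionProvider" in available:
        providers.append("CUDAExecutionProvider")
    # Always keep the CPU provider as the fallback.
    providers.append("CPUExecutionProvider")
    return providers


# GPU build, GPU wanted.
gpu_choice = choose_providers(["CUDAExecutionProvider", "CPUExecutionProvider"])
# GPU build, but the user asks for CPU.
forced_cpu = choose_providers(["CUDAExecutionProvider", "CPUExecutionProvider"], use_cuda=False)
# CPU-only build.
cpu_only = choose_providers(["CPUExecutionProvider"])
```

In real code the returned list would be passed to every session, for example onnxruntime.InferenceSession(model_path, providers=choose_providers(onnxruntime.get_available_providers(), use_cuda)), so that the choice lives in one place.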

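Hoping RapidOCR's C++ deployment can support a GPU version

I hope RapidOCR's C++ deployment can support a GPU version.

BaiPiaoOcrOnnx [Help]

System.loadLibrary("BaiPiaoOcrOnnx")
Where does this BaiPiaoOcrOnnx dynamic library come from?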

onnxruntime error on arm64

1 Environment
Hardware: RK3399
onnxruntime: built from the latest GitHub code (tested successfully with chineseocr_lite)

2 Run (the same logic runs without problems on a PC)
inaro@linaro-alip:~/RapidOCR/python$ sh rapidOCR.sh
dt_boxes num : 17, elapse : 1.2671310901641846
cls num : 17, elapse : 0.2634892463684082
2021-02-25 03:23:09.794914083 [E:onnxruntime:, sequential_executor.cc:339 Execute] Non-zero status code returned while running ScatterND node. Name:'ScatterND@1' Status Message: updates tensor should have shape equal to indices.shape[:-1] + data.shape[indices.shape[-1]:]. updates shape: {1}, indices shape: {1}, data shape: {3}
Traceback (most recent call last):
  File "RapidOCR.py", line 272, in <module>
    dt_boxes, rec_res = text_sys(args.image_path)
  File "RapidOCR.py", line 196, in __call__
    rec_res, elapse = self.text_recognizer(img_crop_list)
  File "/home/linaro/RapidOCR/python/ch_ppocr_mobile_v2_rec/text_recognize.py", line 119, in __call__
    preds = self.session.run(None, onnx_inputs)[0]
  File "/usr/local/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 188, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running ScatterND node. Name:'ScatterND@1' Status Message: updates tensor should have shape equal to indices.shape[:-1] + data.shape[indices.shape[-1]:]. updates shape: {1}, indices shape: {1}, data shape: {3}

E:onnxruntime:, sequential_executor.cc:339 onnxruntime::SequentialExecutor::Execute] Non-zero status code returned while running ScatterND node. Name:'ScatterND@1' Status Me

Environment:
Windows
Tools:
Anaconda3-2020.11-Windows-x86_64
In Anaconda:
conda create -n base37 python=3.7
Then I installed requirements.txt in base37.
Then, on Windows, I ran rapidOCR.py from base37.

Error:
C:\ProgramData\Anaconda3\python.exe E:/comm_Item/Item_doing/ocr_recog_py/RapidOCR/python/rapidOCR.py
dt_boxes num : 17, elapse : 0.11702466011047363
cls num : 17, elapse : 0.016003131866455078
2021-06-06 17:06:33.2157753 [E:onnxruntime:, sequential_executor.cc:339 onnxruntime::SequentialExecutor::Execute] Non-zero status code returned while running ScatterND node. Name:'ScatterND@1' Status Message: updates tensor should have shape equal to indices.shape[:-1] + data.shape[indices.shape[-1]:]. updates shape: {1}, indices shape: {1}, data shape: {3}
Traceback (most recent call last):
  File "E:/comm_Item/Item_doing/ocr_recog_py/RapidOCR/python/rapidOCR.py", line 271, in <module>
    dt_boxes, rec_res = text_sys(args.image_path)
  File "E:/comm_Item/Item_doing/ocr_recog_py/RapidOCR/python/rapidOCR.py", line 195, in __call__
    rec_res, elapse = self.text_recognizer(img_crop_list)
  File "E:\comm_Item\Item_doing\ocr_recog_py\RapidOCR\python\ch_ppocr_mobile_v2_rec\text_recognize.py", line 115, in __call__
    preds = self.session.run(None, onnx_inputs)[0]
  File "C:\ProgramData\Anaconda3\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 188, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running ScatterND node. Name:'ScatterND@1' Status Message: updates tensor should have shape equal to indices.shape[:-1] + data.shape[indices.shape[-1]:]. updates shape: {1}, indices shape: {1}, data shape: {3}

Process finished with exit code 1

Loading .onnx models by opencv

Discussed in #58

Originally posted by senstar-hsoleimani on December 6, 2022
I downloaded the onnx models provided on Google Drive, but I could not read them with OpenCV (cv::dnn::readNet()).
Can anyone help please?

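In the Python version, PPOCR-v3 recognizes much more slowly than v2 under the same conditions

With the same software and hardware, and only config.yaml changed (the changes follow the README exactly; see the attached image), PPOCR-v3 recognizes much more slowly than v2, prints warnings, and gets part of the recognition results wrong, as shown below:
PPOCR-v2:

PPOCR-v3:

Attached image: the modified part of config.yaml

Does the Python test program support GPU?

Is installing onnxruntime-gpu enough to enable GPU inference?

A question about python+onnx+onnxRuntime inference time

Hello. In my tests, inference with python+onnx+onnxRuntime is slower than with python+paddle+mkl. Is there some setting I have not enabled? I used the same preprocessing parameters in both programs.
My CPU is an Intel(R) Core(TM) i5-1035G1 CPU @ 1.00GHz 1.19 GHz.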

ocrweb -> ValueError: not enough values to unpack (expected 4, got 2)

Issue

working with RapidOCR - ocrweb
when submitting an image without text, it will cause the following issue
- testing sample: test2.png (a plain white picture)

╰─○ python main.py
 * Serving Flask app 'main' (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:9003 (Press CTRL+C to quit)
127.0.0.1 - - [29/Jun/2022 09:40:38] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [29/Jun/2022 09:40:38] "GET /static/css/main.css HTTP/1.1" 200 -
127.0.0.1 - - [29/Jun/2022 09:40:38] "GET /static/js/jquery-3.0.0.min.js HTTP/1.1" 200 -
127.0.0.1 - - [29/Jun/2022 09:40:38] "GET /static/css/favicon.ico HTTP/1.1" 200 -
dt_boxes num: 0, elapse: 0.14057278633117676
[2022-06-29 09:40:48,101] ERROR in app: Exception on /ocr [POST]
Traceback (most recent call last):
  File "/Users/userXXX/anaconda3/envs/rapid/lib/python3.7/site-packages/flask/app.py", line 2077, in wsgi_app
    response = self.full_dispatch_request()
  File "/Users/userXXX/anaconda3/envs/rapid/lib/python3.7/site-packages/flask/app.py", line 1525, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/Users/userXXX/anaconda3/envs/rapid/lib/python3.7/site-packages/flask/app.py", line 1523, in full_dispatch_request
    rv = self.dispatch_request()
  File "/Users/userXXX/anaconda3/envs/rapid/lib/python3.7/site-packages/flask/app.py", line 1509, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "main.py", line 35, in ocr
    return detect_recognize(image)
  File "/Users/userXXX/Downloads/ocrweb_debug/task.py", line 29, in detect_recognize
    dt_boxes, rec_res, img, elapse_part = text_sys(image)
ValueError: not enough values to unpack (expected 4, got 2)
127.0.0.1 - - [29/Jun/2022 09:40:48] "POST /ocr HTTP/1.1" 500 -

Cause & Proposed solution

root cause: rapid_ocr_api.py, return logic within if dt_boxes is None or len(dt_boxes) < 1:

class TextSystem(object):
    ... ...
    def __call__(self, img: np.ndarray):
        dt_boxes, det_elapse = self.text_detector(img)
        if self.print_verbose:
            print(f'dt_boxes num: { len(dt_boxes)}, elapse: {det_elapse}')

        if dt_boxes is None or len(dt_boxes) < 1:
            return None, None
            # fixed with: return None, None, img, None
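
The fix proposed above amounts to returning the same number of values on every path, so that the four-name unpacking in task.py never fails. A minimal self-contained sketch of that idea follows; run_ocr, empty_detector and never_called are hypothetical stand-ins, not RapidOCR code.

```python
def run_ocr(img, detector, recognizer):
    """Always return (dt_boxes, rec_res, img, elapse_part), even with no text."""
    dt_boxes, det_elapse = detector(img)
    if dt_boxes is None or len(dt_boxes) < 1:
        # Same arity as the normal path, so callers can always unpack four values.
        return None, None, img, None
    rec_res, rec_elapse = recognizer(img, dt_boxes)
    elapse_part = {"det_elapse": det_elapse, "rec_elapse": rec_elapse}
    return dt_boxes, rec_res, img, elapse_part


def empty_detector(img):
    """Stand-in detector that finds no text, as with a plain white image."""
    return [], 0.01


def never_called(img, boxes):
    """Stand-in recognizer; it must not run when no boxes were found."""
    raise AssertionError("recognizer should be skipped when there are no boxes")


# The unpacking that raised ValueError in the issue now succeeds.
dt_boxes, rec_res, img, elapse_part = run_ocr("blank.png", empty_detector, never_called)
```

The caller still has to handle the None values, for example by returning an empty result to the browser instead of a 500 response.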

Eclipse cannot run the JVM version

The JVM version runs fine in IDEA:
String modelsDir = "";
String detName = "ch_ppocr_mobile_v2.0_det_infer.onnx";
String clsName = "ch_ppocr_mobile_v2.0_cls_infer.onnx";
String recName = "ch_ppocr_mobile_v2.0_rec_infer.onnx";
String keysName = "ppocr_keys_v1.txt";
int padding = 0;
float boxScoreThresh = 0.5f;
float boxThresh = 0.3f;
float unClipRatio = 1.6f;
boolean doAngle = true;
boolean mostAngle = true;
String imagePath="D:\\ocr\\images\\cn.png";
OcrEngine ocrEngine = new OcrEngine();
ocrEngine.initEngine("D:\\ocr\\win-lib-x64\\BaiPiaoOcrOnnx.dll");
String version = ocrEngine.getVersion();
System.out.println("version=" + version);

    //------- setNumThread -------
    ocrEngine.setNumThread(2);

    //------- init Logger -------
    ocrEngine.initLogger(   true,true, true );
    ocrEngine.enableResultText(imagePath);

    //------- init Models -------
    boolean initModelsRet = ocrEngine.initModels(modelsDir, detName, clsName, recName, keysName);
    if (!initModelsRet) {
        System.out.println("Error in models initialization, please check the models/keys path!");
        return;
    }

    //------- set param -------
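    ocrEngine.setPadding(padding); // white border added around the image to improve recognition; increase this value when text boxes do not enclose all the text
    ocrEngine.setBoxScoreThresh(boxScoreThresh); // text box confidence threshold; decrease this value when text boxes do not enclose all the text
    ocrEngine.setBoxThresh(boxThresh); // tune by experiment
    ocrEngine.setUnClipRatio(unClipRatio); // scale ratio for each text box; larger values give larger boxes
    ocrEngine.setDoAngle(doAngle); // enable (1) / disable (0) text direction detection; only needed when the image is upside down (rotated by 90 to 270 degrees)
    ocrEngine.setMostAngle(mostAngle); // enable (1) / disable (0) angle voting (the whole image is recognized using the most likely text direction); has no effect when text direction detection is disabled
    //------- start detect -------
    Long b=System.currentTimeMillis();
    OcrResult ocrResult = ocrEngine.detect(imagePath, 1024); // scale the whole image by its long side; scaling up costs more time but improves accuracy, scaling down is faster but less accurate; maxSideLen=0 means no scaling
    Long e=System.currentTimeMillis();
    System.out.println("Detection took "+(e-b)+" ms");
    System.out.println("-----------------------");
    // with the native method, OcrEngine can be used as a singleton
    //OcrResult ocrResult = ocrEngine.detect(imagePath, padding, maxSideLen, boxScoreThresh, boxThresh, unClipRatio, doAngle, mostAngle);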

    //------- print result -------
    System.out.println(ocrResult.toString());

After packaging as a jar, running it in Eclipse fails with the following error:

A fatal error has been detected by the Java Runtime Environment:

EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x00007ff99d31f6d6, pid=39464, tid=0x000000000000be98

JRE version: Java(TM) SE Runtime Environment (8.0_221-b11) (build 1.8.0_221-b11)

Java VM: Java HotSpot(TM) 64-Bit Server VM (25.221-b11 mixed mode windows-amd64 compressed oops)

Problematic frame:

C [BaiPiaoOcrOnnx.dll+0x117f6d6]

Failed to write core dump. Minidumps are not enabled by default on client versions of Windows

An error report file with more information is saved as:

D:\jianyan\simpleness\simpleness-service\hs_err_pid39464.log

If you would like to submit a bug report, please visit:

http://bugreport.java.com/bugreport/crash.jsp

The crash happened outside the Java Virtual Machine in native code.

See problematic frame for where to report the bug.

Bug: Cannot process sample image with python onnx model loading

Please provide the following information to quickly locate the problem

I am trying to build a simple RapidOCR solution from the ONNX models and I am stuck on preparing the inference data. I found a guide that advises loading the model with onnxruntime, preparing the image in NCHW format, and then passing the NumPy array to the model's run call. When I run the inference code block, I get the error shown below. When I scale the image down to very small dimensions the error disappears, but I doubt the results would be correct for the image.

System Environment: Linux 20.04
Which programming language: Python
Version: rapidocr-onnxruntime==1.2.3
OnnxRuntime Version: onnxruntime==1.14.1
Demo of reproducible problems: https://colab.research.google.com/drive/11NBn3RRiqZrEu9kKZuBAXrF7EcG_0cWy?usp=sharing
Complete Error Message: Fail: [ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running Concat node. Name:'Concat_1' Status Message: concat.cc:157 PrepareForCompute Non concat axis dimensions must match: Axis 2 has mismatched dimensions of 1 and 20
Possible solutions: When I reduce the image dimensions to 32x32 the error disappears, but I am confident that scaling from 500x500 (for example) down to 32x32 won't give real results
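
A Concat shape mismatch of this kind is consistent with a detection network whose feature maps only line up when both input sides are multiples of the network stride. Whether that is the cause in this report is an assumption. Under that assumption, a minimal sketch of choosing a safe input size without shrinking to 32x32 follows; the function name det_input_size and the defaults of 960 and 32 are illustrative, not values confirmed by the issue.

```python
def det_input_size(height, width, limit_side_len=960, stride=32):
    """Pick a resize target whose sides are multiples of `stride`.

    The longest side is first capped at `limit_side_len` (keeping the
    aspect ratio), then each side is rounded to the nearest multiple of
    `stride`, with `stride` as the minimum.
    """
    longest = max(height, width)
    if longest > limit_side_len:
        # Multiply before dividing to keep the arithmetic exact where possible.
        height = height * limit_side_len / longest
        width = width * limit_side_len / longest
    new_h = max(stride, round(height / stride) * stride)
    new_w = max(stride, round(width / stride) * stride)
    return int(new_h), int(new_w)


# A 500x500 image would be resized slightly, not shrunk to 32x32.
size_500 = det_input_size(500, 500)
# A large image is capped on its long side first.
size_large = det_input_size(1000, 2000)
```

The image would then be resized to the returned size before the usual normalization and HWC to NCHW transpose.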

Nice job!!

A suggestion about the results returned by the API

According to the documentation, the current return value looks like this:

[['0', '香港深圳抽血', '0.93583983'], ['1', '专业查性别', '0.89865875'], ['2', '专业鉴定B超单', '0.9955703'], ['3', 'b超仪器查性别', '0.99489486'], ['4', '加微信eee', '0.99073666'], ['5', '可邮寄', '0.99923944']]

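The coordinates are actually quite important; they can be used later to split the content into sections.

The data format below is an example API result from my own project.

lines: a merged text result, convenient for direct display.
regions: the recognized regions, with the text and coordinates of each region.

The code is here: https://github.com/cuiliang/RapidOCR/blob/Quicker/ocrweb/api_task.py
I don't know Python well, so the code was pieced together by feel 😂; it is offered for reference.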

{
  "result": {
    "lines": "Filters Is:issueis:open\n\n10pen 35Closed\n\n建议：将ppocr_keys等信息直接存储到onnx模型\n#42 opened 4daysagoby AutumnSun1996",
    "regions": [
      {
        "text": "Filters",
        "confidence": 0.9966548,
        "rect": {
          "left": 57,
          "top": 0,
          "right": 116,
          "bottom": 2
        }
      },
      {
        "text": "Is:issueis:open",
        "confidence": 0.84303313,
        "rect": {
          "left": 210,
          "top": 2,
          "right": 347,
          "bottom": 3
        }
      },
      {
        "text": "10pen",
        "confidence": 0.976416,
        "rect": {
          "left": 89,
          "top": 88,
          "right": 160,
          "bottom": 88
        }
      },
      {
        "text": "35Closed",
        "confidence": 0.9819431,
        "rect": {
          "left": 213,
          "top": 89,
          "right": 305,
          "bottom": 89
        }
      },
      {
        "text": "建议：将ppocr_keys等信息直接存储到onnx模型",
        "confidence": 0.97398514,
        "rect": {
          "left": 90,
          "top": 158,
          "right": 594,
          "bottom": 158
        }
      },
      {
        "text": "#42 opened 4daysagoby AutumnSun1996",
        "confidence": 0.9657532,
        "rect": {
          "left": 91,
          "top": 199,
          "right": 442,
          "bottom": 198
        }
      }
    ]
  },
  "info": {
    "total_elapse": 0.45919999999999994,
    "elapse_part": {
      "det_elapse": "0.3858",
      "cls_elapse": "0.0011",
      "rec_elapse": "0.0723"
    }
  }
}
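
The lines/regions structure shown above can be assembled from per-box results in a few lines. The sketch below is an illustration, not the code from the linked api_task.py: the function name build_api_result, the input shape (four-point box, text, confidence) and the single-newline join for lines are all assumptions.

```python
def build_api_result(items):
    """Build a {"result": {"lines", "regions"}} dict.

    items: iterable of (box, text, confidence), where box is a list of
    four [x, y] corner points and confidence may be a float or a string.
    """
    regions = []
    for box, text, confidence in items:
        xs = [point[0] for point in box]
        ys = [point[1] for point in box]
        regions.append({
            "text": text,
            "confidence": float(confidence),
            # Axis-aligned rectangle that encloses the four corner points.
            "rect": {"left": min(xs), "top": min(ys),
                     "right": max(xs), "bottom": max(ys)},
        })
    lines = "\n".join(region["text"] for region in regions)
    return {"result": {"lines": lines, "regions": regions}}


sample = [
    ([[57, 0], [116, 0], [116, 20], [57, 20]], "Filters", "0.9966548"),
    ([[89, 88], [160, 88], [160, 108], [89, 108]], "10pen", 0.976416),
]
api_result = build_api_result(sample)
```

Keeping the rectangle next to each text lets a client group regions into paragraphs or columns later, which is the use case the issue describes.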

Text detection boxes are sorted in the wrong order

RapidOCR's current approach:

RapidOCR/python/rapidocr_onnxruntime/rapid_ocr_api.py

Lines 153 to 171 in 0dbac44

def sorted_boxes(dt_boxes):
    """
    Sort text boxes in order from top to bottom, left to right
    args:
        dt_boxes(array):detected text boxes with shape [4, 2]
    return:
        sorted boxes(array) with shape [4, 2]
    """
    num_boxes = dt_boxes.shape[0]
    sorted_boxes = sorted(dt_boxes, key=lambda x: (x[0][1], x[0][0]))
    _boxes = list(sorted_boxes)

    for i in range(num_boxes - 1):
        if abs(_boxes[i + 1][0][1] - _boxes[i][0][1]) < 10 and \
                (_boxes[i + 1][0][0] < _boxes[i][0][0]):
            tmp = _boxes[i]
            _boxes[i] = _boxes[i + 1]
            _boxes[i + 1] = tmp
    return _boxes

PaddleOCR's current approach:
https://github.com/PaddlePaddle/PaddleOCR/blob/013870d9bc963f7508d5a76f81da872b84471a71/tools/infer/predict_system.py#L113-L134
Thanks to the contributor for pointing this out; the code now uses PaddleOCR's approach.
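
The single pass in sorted_boxes swaps each neighboring pair only once, so three or more boxes on one text line can stay out of order. A minimal sketch of a fix is to let each box keep moving left until it is in place. The code below is written in the spirit of that idea and is not copied from the linked PaddleOCR file; the names sort_boxes, make_box and line_tol are hypothetical, and plain lists are used instead of NumPy arrays.

```python
def sort_boxes(dt_boxes, line_tol=10):
    """Sort boxes top to bottom, then left to right within a text line.

    dt_boxes: list of boxes, each a list of four [x, y] points with the
    top-left corner first. Boxes whose top-left y values differ by less
    than `line_tol` pixels are treated as being on the same line.
    """
    boxes = sorted(dt_boxes, key=lambda box: (box[0][1], box[0][0]))
    for i in range(len(boxes) - 1):
        # Walk box i + 1 backwards until it is no longer out of order.
        for j in range(i, -1, -1):
            same_line = abs(boxes[j + 1][0][1] - boxes[j][0][1]) < line_tol
            if same_line and boxes[j + 1][0][0] < boxes[j][0][0]:
                boxes[j], boxes[j + 1] = boxes[j + 1], boxes[j]
            else:
                break
    return boxes


def make_box(x, y, size=5):
    """Build an axis-aligned four-point box with its top-left corner at (x, y)."""
    return [[x, y], [x + size, y], [x + size, y + size], [x, y + size]]


# Three boxes on one slightly slanted line, plus one box on a later line.
ordered = sort_boxes([make_box(30, 0), make_box(20, 1), make_box(10, 2), make_box(5, 50)])
left_edges = [box[0][0] for box in ordered]
```

With the single-pass version the first three boxes would come out as 20, 10, 30; the inner loop lets the box at x=10 move past both of its neighbors.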

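Temporary files created at run time stay locked

I integrated the OCR into a Swing application. After image recognition finishes, and while the Swing program is still running, some *.jpg-result.txt files are created that cannot be deleted; they can only be deleted after Swing exits.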

No results from inference when using onnxruntime with TensorRT

I built onnxruntime with TensorRT to see if there could be any performance improvements with RapidOCR but unfortunately, the inference returned an empty array. Here's the log:

C:\Users\samay\Documents\RapidOCR\python>python rapidOCR.py
2021-07-08 00:09:15.7074765 [E:onnxruntime:Default, tensorrt_execution_provider.h:51 onnxruntime::TensorrtLogger::log] [2021-07-08 05:09:15   ERROR] Parameter check failed at: engine.cpp::nvinfer1::rt::ExecutionContext::setBindingDimensions::1136, condition: profileMaxDims.d[i] >= dimensions.d[i]
Traceback (most recent call last):
  File "C:\Users\samay\Documents\RapidOCR\python\rapidOCR.py", line 257, in <module>
    dt_boxes, rec_res = text_sys(args.image_path)
  File "C:\Users\samay\Documents\RapidOCR\python\rapidOCR.py", line 177, in __call__
    dt_boxes, elapse = self.text_detector(img)
  File "C:\Users\samay\Documents\RapidOCR\python\ch_ppocr_mobile_v2_det\text_detect.py", line 136, in __call__
    dt_boxes = post_result[0]['points']
IndexError: list index out of range

I'm by no means an expert in model conversion so I'm guessing tensorrt simply doesn't support the converted onnx model ? Is there a way to make it work ?

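.NET project runs fine on the development machine but fails when copied to other machines

The program runs fine on the development machine, but reports an error when copied to and run on another machine. Could anyone give me some guidance? Thanks.

Wrong group number

I can't find the QQ group by searching.

Which model does the online demo use?

Hello, which model does your online demo use? I tried several of the models on the network drive, and combinations of them, but the results all differ from your online example and are much worse in some details. 😂 Could you share that model?

Occurs with high probability during recognition

When the Java code recognizes the same image multiple times, only the first result is correct; all later results are garbled.

ONNXRuntime 1.8.1 inference crash

'rapidocrtester.exe' is not recognized as an internal or external command or operable program. It fails on both win10 x64 and x86. WeChat: nlanguage

No text detected in some images

Original image:

For comparison, the online Paddle demo handles it, and its detection is fast too.
I have noticed that smaller images, apart from those small enough to skip detection, take longer to detect than larger ones. Is there any way to optimize this?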

Error converting onnx to openvino

The onnx model is ch_ppocr_mobile_v2.0_rec_infer.onnx, downloaded from the network drive you provided.
The conversion command is
python "C:\Program Files (x86)\Intel\openvino_2021\deployment_tools\model_optimizer\mo.py" --input_model=ch_ppocr_mobile_v2.0_rec_infer.onnx --output_dir=. --model_name=model_rec --data_type=FP32
The following error occurs

What does this error mean, and how can I fix it?

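Model conversion link returns 404

Can it be compiled with MinGW?

I'd like to compile and use it with MinGW on Windows. Is that feasible?

Spaces are not recognized in English text

Spaces are not recognized; all the words are stuck together.

Can this project run offline after being downloaded locally, or does it send images to a server for recognition?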

Trouble with installation

pip install https://github.com/RapidAI/RapidOCR/raw/main/release/python_sdk/sdk_rapidocr_v1.0.0/rapidocr-1.0.0-py3-none-any.whl -i https://pypi.douban.com/simple/
Looking in indexes: https://pypi.douban.com/simple/
Collecting rapidocr==1.0.0
  Using cached https://github.com/RapidAI/RapidOCR/raw/main/release/python_sdk/sdk_rapidocr_v1.0.0/rapidocr-1.0.0-py3-none-any.whl (18 kB)
Collecting six>=1.15.0
  Downloading https://pypi.doubanio.com/packages/d9/5a/e7c31adbe875f2abbb91bd84cf2dc52d792b5a01506781dbcf25c91daf11/six-1.16.0-py2.py3-none-any.whl (11 kB)
Requirement already satisfied: numpy>=1.19.3 in /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages (from rapidocr==1.0.0) (1.21.4)
Collecting pyclipper>=1.2.1
  Downloading https://pypi.doubanio.com/packages/24/6e/b7b4d05383cb654560d63247ddeaf8b4847b69b68d8bc6c832cd7678dab1/pyclipper-1.3.0.zip (142 kB)
     |████████████████████████████████| 142 kB 2.7 MB/s
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: Shapely>=1.7.1 in /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages (from rapidocr==1.0.0) (1.8.0)
ERROR: Could not find a version that satisfies the requirement onnxruntime>=1.7.0 (from rapidocr) (from versions: none)
ERROR: No matching distribution found for onnxruntime>=1.7.0

But I do have onnxruntime on my Mac.

🍺 /opt/homebrew/Cellar/onnxruntime/1.9.1: 77 files, 11.9MB

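Do the models not support Chinese paths?

The original paddleocr can use Chinese paths, but RapidOCR reports an error with Chinese paths 😂

CMake build problem when using only onnxruntime-1.7.0-shared.7z and opencv-3.4.13-sharedLib.7z

Operating system: WIN10 x64
Language: C++, building RapidOCR
If I use only onnxruntime-1.7.0-shared.7z and opencv-3.4.13-sharedLib.7z, running build.bat directly fails with a message that cmake cannot find onnxruntime, so I can only use onnxruntime-1.6.0-sharedLib.7z.

cv2.dnn

Is there a paddleocr deployment that uses cv2.dnn?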

ImportError: cannot import name 'escape' from 'jinja2'

working with RapidOCR/ocrweb
https://github.com/RapidAI/RapidOCR/tree/main/ocrweb
- requirements.txt
  - Flask==1.1.2
    - Jinja2==3.1.2

└─(16:17:30)──> python main.py                                                                                                                                                                         ──(Fri,Jun24)─┘
Traceback (most recent call last):
  File "main.py", line 9, in <module>
    from flask import Flask, render_template, request
  File "/Users/XXXXXX/anaconda3/envs/rapid/lib/python3.7/site-packages/flask/__init__.py", line 14, in <module>
    from jinja2 import escape
ImportError: cannot import name 'escape' from 'jinja2' (/Users/XXXXXX/anaconda3/envs/rapid/lib/python3.7/site-packages/jinja2/__init__.py)

Fixed by upgrading flask to Version: 2.1.2
ref: https://stackoverflow.com/questions/71718167/importerror-cannot-import-name-escape-from-jinja2

CUDA usage problem in the Python version.

onnxruntime-gpu==1.10
cuda==11.40
cudnn==8.2.4
ubuntu 1804
The environment is as above. In the detection step, CUDA is used at the model-loading stage. Code:
.....
self.preprocess_op = create_operators(pre_process_list)
self.postprocess_op = DBPostProcess(thresh=0.3,
                                    box_thresh=0.5,
                                    max_candidates=1000,
                                    unclip_ratio=1.6,
                                    use_dilation=True)
providers = [
    ('CUDAExecutionProvider', {
        'device_id': 0,
        'arena_extend_strategy': 'kNextPowerOfTwo',
        'gpu_mem_limit': 2 * 1024 * 1024 * 1024,
        'cudnn_conv_algo_search': 'EXHAUSTIVE',
        'do_copy_in_default_stream': True,
    }),
    'CPUExecutionProvider',
]
self.session = onnxruntime.InferenceSession(det_model_path, providers=providers)
......

The error is:
ValueError: This ORT build has ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider'] enabled. Since ORT 1.9, you are required to explicitly set the providers parameter when instantiating InferenceSession. For example, onnxruntime.InferenceSession(..., providers=['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider'], ...)

How can I solve this? Thanks.

Offline deployment on mobile

Hello, how can this project be deployed offline on mobile devices? Thanks!

Python recognition inference result is weird

When performing inference on a screenshot from wikipedia in english, the ocr results are... weird. Some lines are perfectly recognised while others are completely wrong. Is there a parameter I need to change ?

Original image:

Inference result:

Can someone share demo tool (cpp)?

Hi,

I want to use the .NET tool to compare models, but I couldn't download the files because they are shared on QQ and I couldn't register there.
Could someone share the files with me, please, via Google Drive, WeTransfer, Telegram, etc.?
Thanks to all.

Links:

https://github.com/RapidAI/RapidOCR/blob/main/docs/README_en.md#demo
https://github.com/RapidAI/RapidOCR/tree/main/cpp#demo%E4%B8%8B%E8%BD%BDwinmaclinux

PaddleOCR parameter settings

How do I set PaddleOCR's run-time parameters in the Web Demo, for example det_db_score_mode:="slow"?

dotnet test fails at InitModel

public void InitModel(string path, int numThread)
{
    try
    {
        SessionOptions op = new SessionOptions(); // this line
        op.GraphOptimizationLevel = GraphOptimizationLevel.ORT_ENABLE_EXTENDED;
        op.InterOpNumThreads = numThread;
        op.IntraOpNumThreads = numThread;
        dbNet = new InferenceSession(path, op);
        inputNames = dbNet.InputMetadata.Keys.ToList();
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message + ex.StackTrace);
        throw ex;
    }
}

Execution reaches:
public SessionOptions()
    : base(IntPtr.Zero, true)
{
    NativeApiStatus.VerifySuccess(NativeMethods.OrtCreateSessionOptions(out handle));
}
Error:
at OcrLiteLib.DbNet.InitModel(String path, Int32 numThread) in F:\RapidOCR\dotnet\BaiPiaoOcrOnnxCs\OcrLib\DbNet.cs:line 45
at OcrLiteLib.OcrLite.InitModels(String detPath, String clsPath, String recPath, String keysPath, Int32 numThread) in F:\RapidOCR\dotnet\BaiPiaoOcrOnnxCs\OcrLib\OcrLite.cs:line 29

I tested the Python version on Win7; it works well and image detection is fast too.
Could you check whether the referenced packages need upgrading? I noticed that the dotnet code has not been updated for half a year.

Reference code:

# Add metadata
import onnx

model = onnx.load_model('/path/to/model.onnx')
meta = model.metadata_props.add()
meta.key = 'dictionary'
meta.value = open('/path/to/ppocr_keys_v1.txt', 'r', -1, 'u8').read()

meta = model.metadata_props.add()
meta.key = 'shape'
meta.value = '[3,48,320]'

onnx.save_model(model, '/path/to/model.onnx')

# Read the metadata
import json
import onnxruntime as ort

sess = ort.InferenceSession('/path/to/model.onnx')
metamap = sess.get_modelmeta().custom_metadata_map
chars = metamap['dictionary'].splitlines()
input_shape = json.loads(metamap['shape'])
