raoyutian / paddleocrsharp


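PaddleOCRSharp is a .NET OCR library that wraps a modified version of the C++ code from Baidu PaddlePaddle's PaddleOCR. It provides text recognition, text detection, and table recognition. The project is optimized for cases where small images were recognized inaccurately, and it achieves higher recognition accuracy than the original PaddlePaddle code. It includes an ultra-lightweight Chinese OCR with a total model size of only 8.6 MB; a single model supports mixed Chinese, English, and digit recognition, vertical text recognition, and long text recognition. Multiple kinds of text detection are also supported.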

License: Apache License 2.0

C++ 23.24% C# 68.67% Go 1.52% Batchfile 0.06% Python 6.51%

paddleocrsharp's People

Contributors

raoyutian, sagarjgb

paddleocrsharp's Issues

Training Set For This Model

Hello,

I am very interested in this project. Compared with the original PaddleOCR, I've seen that your model performs very well when detecting small objects.

Could you please share what you changed in the model architecture and which training dataset you used to train the model?

I am just curious about the techniques. Thank you in advance.

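Can multiple copies be used side by side?

I'm using this library in .NET 6, but because it does not support multithreading, resource contention occurs and causes all sorts of exceptions,
so I had to add a lock and make the engine a singleton.
But this introduces a new problem:
I may have ten or so threads that need to run OCR at the same time.
If some of the slower recognitions are queued first, all the threads behind them are left waiting,
which reduces the recognition speed of those later threads.

Could I copy the DLL and its dependencies into a separate directory for each thread?
For example, with threads 1 to 10, I would create ten OCR objects, each pointing to its own library directory.

Or could it be used from multiple processes?

Or perhaps I'm overthinking this: would it also work to simply create a new OCR instance for each thread?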
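
One way to picture the "one instance per worker" idea is a small pool of engine objects that threads borrow and return, so a slow recognition only ties up one engine instead of a single global lock. The following is only a sketch in Python with a stand-in `FakeEngine` class (hypothetical, not the PaddleOCRSharp API); whether the native library tolerates several engine instances in one process is exactly the open question in this issue and is not confirmed here.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


class FakeEngine:
    """Hypothetical stand-in for an OCR engine that must not be shared."""

    def __init__(self, engine_id):
        self.engine_id = engine_id

    def detect_text(self, image_name):
        return f"text from {image_name}"


class EnginePool:
    """Fixed-size pool: each caller borrows one engine and returns it."""

    def __init__(self, size):
        self._engines = queue.Queue()
        for i in range(size):
            self._engines.put(FakeEngine(i))

    def recognize(self, image_name):
        engine = self._engines.get()  # blocks until an engine is free
        try:
            return engine.detect_text(image_name)
        finally:
            self._engines.put(engine)  # always hand the engine back


def recognize_all(image_names, pool_size=4):
    pool = EnginePool(pool_size)
    with ThreadPoolExecutor(max_workers=pool_size) as executor:
        # map() returns results in the same order as the inputs
        return list(executor.map(pool.recognize, image_names))
```

With a pool of size N, at most N recognitions run at once, and a long-running image delays only the callers waiting for that one engine.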

Error running PaddleOCRSharpDemo

It reports that the following cannot be found:

*\PaddleOCRSharp\PaddleOCRDemo\PaddleOCRSharpDemo\bin\Debug\inferenceserver\ch_ppocr_server_v2.0_det_infer
*\PaddleOCRSharp\PaddleOCRDemo\PaddleOCRSharpDemo\bin\Debug\inferenceserver\ch_ppocr_mobile_v2.0_cls_infer
*\PaddleOCRSharp\PaddleOCRDemo\PaddleOCRSharpDemo\bin\Debug\inferenceserver\ch_ppocr_server_v2.0_rec_infer
*\PaddleOCRSharp\PaddleOCRDemo\PaddleOCRSharpDemo\bin\Debug\inferenceserver\ppocr_keys.txt

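None of these are included in the project, so it cannot run, and your documentation does not cover this either. Please update the documentation.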

32bit Version

Hi,

Thanks for your work. Is it possible to have a 32-bit version?
If not, how can I build it to make it compatible?

Thanks.

Spaces Not Detected

Hi,

I have an image with spaces between the characters, but they are ignored. Kindly help me fix this issue.

Thank you

Unable to load DLL "PaddleOCR.dll": The specified module could not be found

Hi Raoyutian
Below is the problem I ran into when calling PaddleOCR. Please take a look. Thanks.

System.DllNotFoundException
HResult=0x80131524
Message=无法加载 DLL“PaddleOCR.dll”: 找不到指定的模块。 (异常来自 HRESULT:0x8007007E)。
Source=PaddleOCRSharp
StackTrace:
在 PaddleOCRSharp.PaddleOCREngine.Initialize(String det_infer, String cls_infer, String rec_infer, String keys, OCRParameter parameter)
在 PaddleOCRSharp.PaddleOCREngine..ctor(OCRModelConfig config, OCRParameter parameter) 在 D:\Worke\builds\Paddle\PaddleOCRSharp-main\PaddleOCRSharp\PaddleOCREngine.cs 中: 第 55 行
在 RSADemo.RSADetct.RSAOCR(String ImPath, String OutputSaveImPath, String& Result) 在 D:\Worke\builds\Paddle\RSADetctText\RSADetctText\RSADemo\RSADetct.cs 中: 第 37 行
在 RSADemo.Form1.button1_Click(Object sender, EventArgs e) 在 D:\Worke\builds\Paddle\RSADetctText\RSADetctText\RSADemo\Form1.cs 中: 第 27 行
在 System.Windows.Forms.Control.OnClick(EventArgs e)
在 System.Windows.Forms.Button.OnClick(EventArgs e)
在 System.Windows.Forms.Button.OnMouseUp(MouseEventArgs mevent)
在 System.Windows.Forms.Control.WmMouseUp(Message& m, MouseButtons button, Int32 clicks)
在 System.Windows.Forms.Control.WndProc(Message& m)
在 System.Windows.Forms.ButtonBase.WndProc(Message& m)
在 System.Windows.Forms.Button.WndProc(Message& m)
在 System.Windows.Forms.NativeWindow.DebuggableCallback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)
在 System.Windows.Forms.UnsafeNativeMethods.DispatchMessageW(MSG& msg)
在 System.Windows.Forms.Application.ComponentManager.System.Windows.Forms.UnsafeNativeMethods.IMsoComponentManager.FPushMessageLoop(IntPtr dwComponentID, Int32 reason, Int32 pvLoopData)
在 System.Windows.Forms.Application.ThreadContext.RunMessageLoopInner(Int32 reason, ApplicationContext context)
在 System.Windows.Forms.Application.ThreadContext.RunMessageLoop(Int32 reason, ApplicationContext context)
在 RSADemo.Program.Main() 在 D:\Worke\builds\Paddle\RSADetctText\RSADetctText\RSADemo\Program.cs 中: 第 19 行

Packaging conflict when PaddleOCRSharp and PaddleSegSharp are referenced together

Referencing both PaddleOCRSharp and PaddleSegSharp in one project causes a packaging conflict:

2>找到了多个具有相同相对路径的发布输出文件: E:\package\nuget\globalPackagesFolder\paddlesegsharp\1.1.0\build\PaddleSegLib\concrt140.dll, E:\package\nuget\globalPackagesFolder\paddleocrsharp\4.0.2\build\PaddleOCRLib\concrt140.dll, E:\package\nuget\globalPackagesFolder\paddlesegsharp\1.1.0\build\PaddleSegLib\libiomp5md.dll, E:\package\nuget\globalPackagesFolder\paddleocrsharp\4.0.2\build\PaddleOCRLib\libiomp5md.dll, E:\package\nuget\globalPackagesFolder\paddlesegsharp\1.1.0\build\PaddleSegLib\mfc140.dll, E:\package\nuget\globalPackagesFolder\paddleocrsharp\4.0.2\build\PaddleOCRLib\mfc140.dll, E:\package\nuget\globalPackagesFolder\paddlesegsharp\1.1.0\build\PaddleSegLib\mfcm140.dll, E:\package\nuget\globalPackagesFolder\paddleocrsharp\4.0.2\build\PaddleOCRLib\mfcm140.dll, E:\package\nuget\globalPackagesFolder\paddlesegsharp\1.1.0\build\PaddleSegLib\mklml.dll, E:\package\nuget\globalPackagesFolder\paddleocrsharp\4.0.2\build\PaddleOCRLib\mklml.dll, E:\package\nuget\globalPackagesFolder\paddlesegsharp\1.1.0\build\PaddleSegLib\msvcp140.dll, E:\package\nuget\globalPackagesFolder\paddleocrsharp\4.0.2\build\PaddleOCRLib\msvcp140.dll, E:\package\nuget\globalPackagesFolder\paddlesegsharp\1.1.0\build\PaddleSegLib\msvcp140_1.dll, E:\package\nuget\globalPackagesFolder\paddleocrsharp\4.0.2\build\PaddleOCRLib\msvcp140_1.dll, E:\package\nuget\globalPackagesFolder\paddlesegsharp\1.1.0\build\PaddleSegLib\msvcp140_2.dll, E:\package\nuget\globalPackagesFolder\paddleocrsharp\4.0.2\build\PaddleOCRLib\msvcp140_2.dll, E:\package\nuget\globalPackagesFolder\paddlesegsharp\1.1.0\build\PaddleSegLib\msvcp140_atomic_wait.dll, E:\package\nuget\globalPackagesFolder\paddleocrsharp\4.0.2\build\PaddleOCRLib\msvcp140_atomic_wait.dll, E:\package\nuget\globalPackagesFolder\paddlesegsharp\1.1.0\build\PaddleSegLib\msvcp140_codecvt_ids.dll, E:\package\nuget\globalPackagesFolder\paddleocrsharp\4.0.2\build\PaddleOCRLib\msvcp140_codecvt_ids.dll, 
E:\package\nuget\globalPackagesFolder\paddlesegsharp\1.1.0\build\PaddleSegLib\onnxruntime.dll, E:\package\nuget\globalPackagesFolder\paddleocrsharp\4.0.2\build\PaddleOCRLib\onnxruntime.dll, E:\package\nuget\globalPackagesFolder\paddlesegsharp\1.1.0\build\PaddleSegLib\paddle2onnx.dll, E:\package\nuget\globalPackagesFolder\paddleocrsharp\4.0.2\build\PaddleOCRLib\paddle2onnx.dll, E:\package\nuget\globalPackagesFolder\paddlesegsharp\1.1.0\build\PaddleSegLib\paddle_inference.dll, E:\package\nuget\globalPackagesFolder\paddleocrsharp\4.0.2\build\PaddleOCRLib\paddle_inference.dll, E:\package\nuget\globalPackagesFolder\paddlesegsharp\1.1.0\build\PaddleSegLib\vcamp140.dll, E:\package\nuget\globalPackagesFolder\paddleocrsharp\4.0.2\build\PaddleOCRLib\vcamp140.dll, E:\package\nuget\globalPackagesFolder\paddlesegsharp\1.1.0\build\PaddleSegLib\vccorlib140.dll, E:\package\nuget\globalPackagesFolder\paddleocrsharp\4.0.2\build\PaddleOCRLib\vccorlib140.dll, E:\package\nuget\globalPackagesFolder\paddlesegsharp\1.1.0\build\PaddleSegLib\vcomp140.dll, E:\package\nuget\globalPackagesFolder\paddleocrsharp\4.0.2\build\PaddleOCRLib\vcomp140.dll, E:\package\nuget\globalPackagesFolder\paddlesegsharp\1.1.0\build\PaddleSegLib\vcruntime140.dll, E:\package\nuget\globalPackagesFolder\paddleocrsharp\4.0.2\build\PaddleOCRLib\vcruntime140.dll, E:\package\nuget\globalPackagesFolder\paddlesegsharp\1.1.0\build\PaddleSegLib\vcruntime140_1.dll, E:\package\nuget\globalPackagesFolder\paddleocrsharp\4.0.2\build\PaddleOCRLib\vcruntime140_1.dll。

Is there any way to reduce the size?

Hello, I only added the dependencies via NuGet, yet the generated release build is 300 MB. Is there any way to reduce the size?

Multithreading

image

Hi, is multithreading supported now?

Error when calling DetectText in a loop

1. Package version

<PackageReference Include="PaddleOCRSharp" Version="4.2.0" />

2. PaddleOCREngine is initialized globally
3. Calling code

 foreach (var file in Request.Form.Files)
 {
     var bytes = new byte[file.Length];
     using (var fileStream = file.OpenReadStream())
     {
         fileStream.Read(bytes, 0, (int)file.Length);
     }
     var text = ocr.DetectText(bytes);
     result.Add(new { key = file.FileName, value = text });
 }
Errors:
1. JsonSerializationException: Error converting value {null} to type 'System.Single'. Path '[0].cls_score', line 1, position 131.
2. InvalidCastException: Null object cannot be converted to a value type.

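Performance loss caused by ToList()

PaddleOCREngine.cs contains this line:

y_max = blocks.OrderBy(x => x.BoxPoints[0].Y).ToList()[listys[i + 1]].BoxPoints[0].Y;

Here an IEnumerable is materialized with ToList() and then indexed with [index], which is a risky pattern. Given the context, the index is only taken once, so IEnumerable.ElementAt(index) is clearly the better choice. So what exactly is the difference between the two?

Consider the following code: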

var list = Enumerable.Range(0, 10).ToList(); // create a list of numbers [0..9]
var eighthElementAt = list.Order().Select(Print).ElementAt(8); // via ElementAt(index)
var eighthIndexOf   = list.Order().Select(Print).ToList()[8]; // via ToList()[index]
return;
T Print<T>(T value){ // print each value that passes through the computation
    Console.Write(value + ",");
    return value;
}

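Print is used along the way to show how many times the function is evaluated.
After execution, both eighthElementAt and eighthIndexOf are 8, but the process is completely different.
This is the output after execution: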

8,
0,1,2,3,4,5,6,7,8,9,

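As you can see, ElementAt skips the unnecessary intermediate computation and goes straight to the value at the requested index,
whereas ToList forces the whole expression to be evaluated into a List object before indexing, which adds many unnecessary iterations.
As the number of elements grows, the performance gap between the two becomes more pronounced.

I will submit a PR later to fix this.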

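Framework-related optimizations

Framework adaptation

By default, .NET adapts to the target framework: for example, when the project targets .NET Framework 4.0, it is automatically compatible with all later frameworks, i.e. 4.0 through 4.8. Declaring every framework explicitly is generally used for compiling differently per framework (controlled by preprocessor symbols) and for differentiating NuGet package references. .NET now has three families: .NET Standard, .NET Framework, and .NET. .NET fully supports .NET Standard; see: (.NET Standard version families and compatibility).

Declaring TargetFrameworks as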

<TargetFrameworks>netstandard2.0;net35</TargetFrameworks>

is enough to be compatible with all frameworks.

Then reference System.Drawing.Common for .NET Standard 2.0, so that the package is installed for the .NET Standard and .NET families.

<ItemGroup Condition="'$(TargetFramework)' == 'netstandard2.0'">
    <PackageReference Include="System.Drawing.Common" />
</ItemGroup>

System.Text.Json

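Since the move to .NET, Microsoft has optimized JSON serialization and, taking Newtonsoft.Json as a reference, introduced System.Text.Json, which is built into the .NET family. With a nearly identical API style, it offers better performance and, more importantly, supports NativeAOT on .NET 8.0, the value of which goes without saying. Migration does not cost much either: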

public static class SerializeExtension{
    public static string Serialize<T>(this T target){
#if NETSTANDARD2_0
        return System.Text.Json....Serialize(target);
#else
        return Newtonsoft.Json...SerializeObject(target);
#endif
    }
}

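With this snippet, the migration approach should be clear.

Split the package? Prepare for cross-platform?

Typically, a library with native interop is published as several NuGet packages of two kinds: a package containing the managed code and P/Invoke declarations, and packages containing the actual native dependencies for each OS and architecture. In this library, the managed references and the native binaries are bundled into a single package. Admittedly, as things stand, local inference with GPU acceleration seems to exist only for win-x64, but splitting the package makes build-time management easier, allows assemblies to be combined and trimmed as needed, and avoids bloat.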

External component has thrown an exception.

Stack trace:
在 PaddleOCRSharp.PaddleOCREngine.DetectByte(IntPtr engine, Byte[] imagebytedata, Int64 size, IntPtr& result)
在 PaddleOCRSharp.PaddleOCREngine.DetectText(Byte[] imagebyte)
在 PaddleOCRSharp.PaddleOCREngine.DetectText(Image image)
在 HuoHuan.Utils.PaddleUtil.GetImageText(Bitmap bitmap) 在 D:\Work\Code\HuoHuan\src\HuoHuan\Utils\PaddleUtil.cs 中: 第 20 行

Failing code:

    internal static class PaddleUtil
    {
        private static readonly PaddleOCREngine _engine = new(null, new OCRParameter());

        /// <summary>
        /// Recognizes and returns the text content of a Bitmap
        /// </summary>
        /// <param name="bitmap"></param>
        /// <returns></returns>
        public static string GetImageText(Bitmap bitmap)
        {
            try
            {
                var ocrResult = _engine.DetectText(bitmap);

                return ocrResult.Text;
            }
            catch (Exception ex)
            {
                return "";
            }
        }
    }

This happens intermittently; after a failure, subsequent calls work normally.

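4.0.1 cannot recognize tables

The demo uses version 3.0.1 and recognizes tables normally; after upgrading the NuGet package to 4.0.1, table recognition no longer works. After downgrading back to 3.0.1, tables are recognized normally again.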

Is GPU Mode possible?

Hello!
I am wondering about using a GPU (an NVIDIA card) to accelerate processing. There are GPU-related settings in the OCR parameters of the PaddleOCRSharp library, but they do not work. If GPU mode can be made to work, please let me know how to install and activate it.

regards,

Thanks in advance.

Error when using PaddleOCRSharp in a .NET 6.0 Web API inside Docker

System.DllNotFoundException: Unable to load shared library 'PaddleOCR.dll' or one of its dependencies. In order to help diagnose loading problems, consider setting the LD_DEBUG environment variable: libPaddleOCR.dll: cannot open shared object file: No such file or directory
at PaddleOCRSharp.PaddleOCREngine.Initialize(String det_infer, String cls_infer, String rec_infer, String keys, OCRParameter parameter)
at PaddleOCRSharp.PaddleOCREngine..ctor(OCRModelConfig config, OCRParameter parameter)

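How can this exception be resolved?
It runs without any problem locally on Windows 10, but the exception appears once it is deployed to an Ubuntu 20.04 Docker container. Looking for a solution.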

Could not load file or assembly 'MFCMIFC80, Version=1.0.0.0

image
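I'm using the plugin modules of the Prism framework and have added the OCR module to my plugin project (compiled for x64), but this error is always thrown when the app loads the plugin at startup, so it cannot start. However, when I build a standalone demo to test recognition directly, it runs normally.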

How do I enable the GPU?

I would like to use the GPU for inference,
but every time I run into the error "Please compile with gpu to EnableGpu()".
How should I proceed?

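System.AccessViolationException: "Attempted to read or write protected memory. This is often an indication that other memory is corrupt."

After many calls, the exception shown in the title appears. Debugging traced the failure to the PaddleOCREngine class in namespace PaddleOCRSharp, at the line: int textCount = Detect(Engine, imagefile, out ptrResult);, where the watched variable values are: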
Engine:0x0000023efee5e780
imagefile:"C:\Users\aaa\AppData\Local\Temp\26dc935f-5eac-4e80-aa49-e58936246a26.bmp"
ptrResult:0x0000000000000000

I suspect a pointer problem, but with my limited expertise I could not analyze it any further...

Below is part of the code from the library:
[DllImport("PaddleOCR.dll", CallingConvention = CallingConvention.Cdecl, SetLastError = true)]
internal static extern int Detect(IntPtr engine, string imagefile, out IntPtr result);

    /// <summary>
    /// Performs text recognition on an image file
    /// </summary>
    /// <param name="imagefile">Image file</param>
    /// <returns></returns>
    public OCRResult DetectText(string imagefile)
    {
        if (!System.IO.File.Exists(imagefile)) throw new Exception($"文件{imagefile}不存在");
        IntPtr ptrResult;
        int textCount = Detect(Engine, imagefile, out ptrResult);

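Memory usage

While using the library I noticed that recognizing a single image
makes memory usage jump to over 600 MB,
and it is not released afterwards.
I am not sure whether this is normal.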

Python can never find .\PaddleOCR.dll

Printing the path with

dll_path = ".\PaddleOCR.dll"
print("路径:", os.path.abspath(dll_path))

Output: 路径: C:\Users\Administrator\PaddleOCR.dll
I copied PaddleOCR.dll into C:\Users\Administrator, but

paddleOCR=cdll.LoadLibrary(".\PaddleOCR.dll")

still always fails with a not-found error.

Error message:
PS C:\Users\Administrator> & E:/python/python.exe e:/工作软件/项目/python/python/PaddleOCRCppPython.py
Traceback (most recent call last):
  File "e:/工作软件/项目/python/python/PaddleOCRCppPython.py", line 178, in <module>
    paddleOCR=cdll.LoadLibrary(".\PaddleOCR.dll")
  File "E:\python\lib\ctypes\__init__.py", line 434, in LoadLibrary
    return self._dlltype(name)
  File "E:\python\lib\ctypes\__init__.py", line 356, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] 找不到指定的模块。
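
Two things may be worth checking here. A relative path such as `.\PaddleOCR.dll` depends on the current working directory, and WinError 126 can also be raised when the DLL itself is found but one of the DLLs it depends on is not. A minimal sketch, assuming PaddleOCR.dll and its dependent DLLs sit in the same folder as the script (the helper names `resolve_dll_path` and `load_paddle_ocr` are hypothetical; `os.add_dll_directory` exists only on Windows with Python 3.8+):

```python
import ctypes
import os


def resolve_dll_path(script_file, dll_name="PaddleOCR.dll"):
    """Return the absolute path of dll_name located next to script_file."""
    base_dir = os.path.dirname(os.path.abspath(script_file))
    return os.path.join(base_dir, dll_name)


def load_paddle_ocr(script_file):
    """Load PaddleOCR.dll by absolute path (intended for Windows)."""
    dll_path = resolve_dll_path(script_file)
    if not os.path.isfile(dll_path):
        raise FileNotFoundError(dll_path)
    # On Python 3.8+ dependent DLLs are not looked up via PATH or the
    # current working directory; registering the folder explicitly is
    # harmless and makes the search location unambiguous.
    if hasattr(os, "add_dll_directory"):
        os.add_dll_directory(os.path.dirname(dll_path))
    return ctypes.CDLL(dll_path)
```

In the script this would be called as `paddleOCR = load_paddle_ocr(__file__)`. If it still fails with WinError 126 although the file exists, a missing dependency of PaddleOCR.dll is a likely cause.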
