goncalopp / simple-ocr-opencv

A simple python OCR engine using opencv

License: GNU Affero General Public License v3.0

Python 100.00%

ocr python-ocr supervised-learning opencv knn-algorithm machine-learning machinelearning machine-vision machinevision

simple-ocr-opencv's Introduction

Simple Python OCR

A simple pythonic OCR engine using opencv and numpy.

Originally inspired by this stackoverflow question

Essential Concepts

Segmentation

Before OCR can be performed on an image, several processing steps must be applied to the source image. Segmentation is the process of identifying the regions of the image that represent characters.

This project uses rectangles to model segments.
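As a rough illustration of what modelling segments as rectangles can look like, here is a minimal sketch. The `Segment` type and the `contains` helper are hypothetical names chosen for this example and are not taken from the project, which appears to store segments as rows of a numpy array.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """Axis-aligned rectangle: top-left corner plus size, in pixels (hypothetical type)."""
    x: int
    y: int
    w: int
    h: int


def contains(outer: Segment, inner: Segment) -> bool:
    """Return True if `inner` lies entirely within `outer`."""
    return (outer.x <= inner.x
            and outer.y <= inner.y
            and inner.x + inner.w <= outer.x + outer.w
            and inner.y + inner.h <= outer.y + outer.h)


page = Segment(0, 0, 100, 40)      # a large region
letter = Segment(10, 5, 12, 20)    # a character-sized region inside it
```

A containment test like this is the kind of check a filter could use to discard rectangles nested inside other rectangles.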

Supervised learning with a classification problem

The classification problem consists of identifying which class an observation belongs to (i.e.: which particular character is contained in a segment).

Supervised learning is a way of "teaching" a machine. Basically, an algorithm is trained through examples (i.e.: this particular segment contains the character f). After training, the machine should be able to apply its acquired knowledge to new data.

The k-NN algorithm, used in this project, is one of the simplest
classification algorithms.
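To make the idea concrete, here is a minimal pure-Python sketch of k-NN classification. It is not the project's implementation (the project relies on OpenCV's k-NN); the function name and the toy feature vectors are assumptions made for this example.

```python
from collections import Counter
import math


def knn_classify(train_features, train_labels, sample, k=3):
    """Label `sample` by majority vote among its k nearest training examples."""
    # Pair every training example with its Euclidean distance to the sample.
    distances = sorted(
        (math.dist(features, sample), label)
        for features, label in zip(train_features, train_labels)
    )
    # Count the labels of the k closest examples and pick the most common one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy training data: two clusters of 2-D feature vectors.
train_features = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
train_labels = ["a", "a", "b", "b"]
```

In an OCR setting the feature vectors would be derived from the pixels of each segment, and the labels would be the characters supplied during grounding.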

Grounding

Creating an example image with already classified characters, for training purposes. See ground truth.

How to understand this project

Unfortunately, documentation is a bit sparse at the moment (I gladly accept contributions). The project is well-structured, and most classes and functions have docstrings, so reading those is probably a good way to start.

If you need any help, don't hesitate to contact me. You can find my email on my github profile.

How to use

Please check example.py for basic usage with the existing pre-grounded images.

You can use your own images by placing them in the data directory. Images can be grounded interactively with grounding.UserGrounder. For more details, check example_grounding.py.

Copyright and notices

This project is available under the GNU AGPLv3 License; a copy should be available in LICENSE. If not, check out the link to learn more.

Copyright (C) 2012-2017 by the simple-ocr-opencv authors
All authors are the copyright owners of their respective additions

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU AGPLv3 License, as found in LICENSE.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU Affero General Public License
along with this program.  If not, see <http://www.gnu.org/licenses/>.


simple-ocr-opencv's Issues

Different Number of lines

Hi!

I really appreciate this project. I tested it with the examples and it works pretty well.
I'm actually quite interested in the step that shows where lines start and end ("showing line starts and ends (waiting for input)"), because with the lines I could experiment with LSTMs, which I already use.

The problem is that when I use another image, I get an error

Exception: different number of lines

I was debugging, and somehow

tops = self._guess_lines(segment_tops) 
bottoms = self._guess_lines(segment_bottoms)

are not the same (I mean, the length). But don't know why and how to fix it.

Maybe it's not a bug, but I need to do something else.

The pic I'm trying to work with is here:

It'd be OK if there are some errors. If the lines are not that good I could try to fix it and make a pull request, but it'd be nice if it at least tried :D

Example program (example.py) crashing at first run under python 3.9 windows

C:\users\david\simple-ocr-opencv\simpleocr\__init__.py:28: SyntaxWarning: "is" with a literal. Did you mean "=="?
  elif sys.platform is "win32":
showing after BlurProcessor (waiting for input)
showing ContourSegmenter contours (waiting for input)
showing image after segmentation by RawContourSegmenter (waiting for input)
showing segments filtered by LargeFilter (waiting for input)
showing segments filtered by SmallFilter (waiting for input)
showing segments filtered by LargeAreaFilter (waiting for input)
showing segments filtered by ContainedFilter (waiting for input)
Traceback (most recent call last):
  File "example.py", line 15, in <module>
    test_chars, test_classes, test_segments = ocr.ocr(test_image, show_steps=True)
  File "C:\users\david\simple-ocr-opencv\simpleocr\ocr.py", line 73, in ocr
    self.segmenter.display()
  File "C:\users\david\simple-ocr-opencv\simpleocr\processor.py", line 162, in display
    p.display(display_before=False)
  File "C:\users\david\simple-ocr-opencv\simpleocr\segmentation_aux.py", line 83, in display
    draw_lines(copy, self.lines_tops, (0, 0, 255))
  File "C:\users\david\simple-ocr-opencv\simpleocr\opencv_utils.py", line 115, in draw_lines
    cv2.line(image, (0, y), (image.shape[1], y), color, line_width)
TypeError: only integer scalar arrays can be converted to a scalar index
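The traceback suggests that the `y` passed to `cv2.line` is a numpy value rather than a plain integer. A possible workaround, sketched here under the assumption that each line position arrives as a size-1 float array, is to convert it to a Python `int` before drawing. The helper name `as_pixel` and the sample data are hypothetical.

```python
import numpy


def as_pixel(value):
    """Convert a numeric scalar or size-1 array to a plain Python int (rounded)."""
    # .item() returns a Python scalar and raises if the array holds more than one element.
    return int(round(numpy.asarray(value).item()))


# Assumed shape of the data: one row per line, one float per row.
line_tops = numpy.array([[12.4], [57.8]], dtype=numpy.float32)
pixel_rows = [as_pixel(y) for y in line_tops]
```

The drawing call could then use `(0, as_pixel(y))` and `(image.shape[1], as_pixel(y))` as its end points. Whether this matches the actual contents of `lines_tops` has not been verified.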

how to train it?

Make master compatible with both OpenCV 2 and 3

If possible, we should merge the opencv3 branch back into master in a way that makes it work for both versions. CI needs to be configured to test both.

@RedFantom 's comment:

Edit: using cv2.__version__, it is possible to retrieve a version number in str format, but the python-opencv package from the Ubuntu repositories provides $Rev 4227 as version number instead of 3.x.x or 2.x.x, so that complicates things.

I'm getting '2.4.9.1' on Debian stable (Jessie), so this might be Ubuntu-specific. It's annoying, but I guess we can hardcode specific values for the Ubuntu versions we know for now.
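One way to cope with both kinds of version string is to parse the major version defensively and treat anything unrecognised as unknown. This is only a sketch: the function name is hypothetical, and the fallback for unknown strings (hardcoding, as suggested above, or feature detection) is left to the caller.

```python
import re


def opencv_major_version(version_string):
    """Return the major version as an int, or None if the string is not 'N.N...'."""
    match = re.match(r"\s*(\d+)\.\d+", version_string)
    return int(match.group(1)) if match else None
```

With OpenCV installed, this would be called as `opencv_major_version(cv2.__version__)`; a `None` result signals that the caller needs another way to decide which API to use.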

ValueError: Input object to FillWithScalar is not a scalar

Hello,

First of all great project, really educational and thank you for sharing.

When I run the example.py file through my Terminal (OS X 10.7, Python 2.7.3) I get the following error, only after I press ANY key on the screen with the Pi number:

showing after BlurProcessor (waiting for input)
Traceback (most recent call last):
  File "example.py", line 15, in <module>
    test_classes, test_segments= ocr.ocr( test_image, show_steps=True )
  File "/Users/Ath/Source/CW/ocr_simple/ocr.py", line 42, in ocr
    self.segmenter.display()
  File "/Users/Ath/Source/CW/ocr_simple/processor.py", line 143, in display
    p.display( display_before= False )
  File "/Users/Ath/Source/CW/ocr_simple/segmentation.py", line 63, in display
    copy.fill( (255,255,255) )
ValueError: Input object to FillWithScalar is not a scalar

I would appreciate any help, as I am still a Python newbie...
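The error arises because `ndarray.fill` accepts only a single scalar, not a tuple. A minimal sketch of two alternatives follows, assuming `copy` is a 3-channel `uint8` image; the array here is a stand-in created for the example.

```python
import numpy

# Stand-in for a small 3-channel image (height 4, width 6).
image = numpy.zeros((4, 6, 3), dtype=numpy.uint8)

copy = image.copy()
# Slice assignment broadcasts the tuple across every pixel.
copy[:] = (255, 255, 255)

other = image.copy()
# When all channels get the same value, a scalar fill is enough.
other.fill(255)
```

Both variants produce an all-white image; slice assignment is the more general one because it also works for colours whose channels differ.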

error with the operator "-" in code

TypeError: numpy boolean subtract, the - operator, is deprecated, use the bitwise_xor, the ^ operator, or the logical_xor function instead.

class ContainedFilter(Filter):
    """desirable segments are not contained by any other"""

    def _good_segments(self, segments):
        m = contained_segments_matrix(segments)
        return (True - numpy.max(m, axis=1))

Should I install something, or do I need to change it manually?
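Nothing needs to be installed; recent numpy versions simply no longer allow `True - array` on boolean arrays. A possible manual change is to negate the boolean array explicitly, as the error message hints. The sketch below assumes the matrix holds `True` at row `i`, column `j` when segment `i` is contained in segment `j`; the function name and sample matrix are hypothetical.

```python
import numpy


def good_segments(containment_matrix):
    """Return a boolean mask that is True for segments not contained in any other."""
    contained = numpy.max(containment_matrix, axis=1)  # True if the row has any True
    return numpy.logical_not(contained)                # replaces `True - contained`


# Example: only segment 1 is contained in another segment (segment 0).
m = numpy.array([[False, False, False],
                 [True,  False, False],
                 [False, False, False]])
mask = good_segments(m)
```

For boolean arrays, `~contained` is an equivalent spelling of `numpy.logical_not(contained)`.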

Fix opening of files for non-test files

Migrated from PR 24

It seems the current code for files.py doesn't allow loading specific (non-test) files given a relative path (tested on linux).

There may be more issues as well. @RedFantom can you elaborate on the issues you found?

Doesn't work on 64bit python

/simple-ocr-opencv/clustering.py:45: RuntimeWarning: invalid value encountered in divide
  return min(tmp) / max_intra_distance
Traceback (most recent call last):
  File "example.py", line 15, in <module>
    test_classes, test_segments= ocr.ocr( test_image, show_steps=True )
  File "/simple-ocr-opencv/ocr.py", line 40, in ocr
    segments= self.segmenter.process( image_file.image )
  File "/simple-ocr-opencv/processor.py", line 87, in process
    output= self._process(arguments)
  File "/simple-ocr-opencv/processor.py", line 131, in _process
    arguments= p.process( arguments )
  File "/simple-ocr-opencv/processor.py", line 87, in process
    output= self._process(arguments)
  File "/simple-ocr-opencv/segmentation_aux.py", line 91, in _process
    raise Exception("different number of lines")
Exception: different number of lines

Merge contributions from RedFantom's fork

Continued discussion from #13

Fork at https://github.com/RedFantom/simple-ocr-opencv

@RedFantom
A PR sounds great, but please make sure it's in a reviewable state, i.e. formatting changes are separated from functional contributions, and ideally there is a small (squashed) number of commits.
If you don't have the time right now to squash/rebase everything, I'll try to replicate your changes incrementally.

Thanks!

TypeError: int() argument must be a string, a bytes-like object or a number, not 'map'

I am running the "opencv3" branch and just trying to run example.py, and I get:

Traceback (most recent call last):
  File "/home/yuriy/development/simple-ocr-opencv/example.py", line 12, in <module>
    ocr.train( ImageFile('digits1') )
  File "/home/yuriy/development/simple-ocr-opencv/files.py", line 61, in __init__
    self.ground.read()
  File "/home/yuriy/development/simple-ocr-opencv/files.py", line 40, in read
    self.classes, self.segments = read_boxfile(self.path)
  File "/home/yuriy/development/simple-ocr-opencv/tesseract_utils.py", line 16, in read_boxfile
    return classes_to_numpy(classes), segments_to_numpy(segments)
  File "/home/yuriy/development/simple-ocr-opencv/segmentation.py", line 23, in segments_to_numpy
    segments = numpy.array(segments, dtype=SEGMENT_DATATYPE, ndmin=2)  # each segment in a row
TypeError: int() argument must be a string, a bytes-like object or a number, not 'map'

I am on Ubuntu 14, 64 bit. OpenCV 3.1.
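The message is typical of Python 2 code run under Python 3, where `map` returns a lazy iterator instead of a list, so numpy receives `map` objects where it expects numbers. A minimal sketch of the usual remedy follows; the helper name and sample fields are hypothetical, and where exactly the `map` call sits in the project has not been checked.

```python
def to_int_list(fields):
    """Convert a sequence of numeric strings to a concrete list of ints."""
    # list(...) forces the lazy Python 3 map object into a real list.
    return list(map(int, fields))


fields = ["10", "20", "5", "8"]
lazy = map(int, fields)          # Python 3: an iterator, not a list
segment = to_int_list(fields)    # a plain list that numpy.array can consume
```

A list comprehension such as `[int(f) for f in fields]` behaves the same way on both Python 2 and 3.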

Open Source License?

Can the code be used under some open source license?

Maybe you even want to choose an OSS license and make it explicit in the repo.

OpenCV 4.0

Hi Goncalo,
any plan to make it compatible with OpenCV 4.0?
I was integrating your code in a new virtualenv, which pulled in the latest OpenCV version (4.0), and I ran into the error below after tweaking the version check

  File "/mnt/Database/moneygator/simpleocr/ocr.py", line 65, in train
    self.classifier.train(features, image_file.ground.classes)
  File "/mnt/Database/moneygator/simpleocr/classification.py", line 68, in train
    self.knn.train(features, classes)
TypeError: only size-1 arrays can be converted to Python scalars

Numpy deprecated operations

Awesome project. You should specify your dependencies (with versions), as numpy has deprecated some of the operators you use.

Is it possible to break this captcha?

https://github.com/aero2a/kape-

I can generate separate symbols, but how do I detect all of them?

how to

This is a really interesting project.

Is there a best way to create the grounding for an image file?

I did this:

git diff
diff --git a/classification.py b/classification.py
index a87a802..08585fc 100644
--- a/classification.py
+++ b/classification.py
@@ -14,7 +14,7 @@ def classes_to_numpy( classes ):
     #utf-32 starts with constant ''\xff\xfe\x00\x00', then has little endian 32 bits chars
     #this assumes little endian architecture!
     assert unichr(15).encode('utf-32')=='\xff\xfe\x00\x00\x0f\x00\x00\x00'
-    int_classes= array.array( "L", "".join(classes).encode('utf-32')[4:])
+    int_classes= array.array( "I", "".join(classes).encode('utf-32')[4:])
     assert len(int_classes) == len(classes)
     classes=  numpy.array( int_classes,  dtype=CLASS_DATATYPE, ndmin=2) #each class in a column. numpy is strange :(
     classes= classes if CLASSES_DIRECTION==1 else numpy.transpose(classes)
diff --git a/example.py b/example.py
index bafffab..5c537e4 100644
--- a/example.py
+++ b/example.py
@@ -1,4 +1,5 @@
 from files import ImageFile
+from grounding import UserGrounder
 from segmentation import ContourSegmenter, draw_segments
 from feature_extraction import SimpleFeatureExtractor
 from classification import KNNClassifier
@@ -11,9 +12,12 @@ ocr= OCR( segmenter, extractor, classifier )
 
 ocr.train( ImageFile('digits1') )
 
-test_image= ImageFile('digits2')
+test_image= ImageFile('train')
 test_classes, test_segments= ocr.ocr( test_image, show_steps=True )
 
+grounder= UserGrounder()
+grounder.ground(test_image, test_segments);
+
 print "accuracy:", accuracy( test_image.ground.classes, test_classes )
 print "OCRed text:\n", reconstruct_chars( test_classes )
 show_differences( test_image.image, test_segments, test_image.ground.classes, test_classes)

But somehow I don't think I should have :-)

How did you create the groundings?

OpenCV 3.0: cv2 does not have this method

  File "/Users/telescopeman/project/simple-ocr-opencv/classification.py", line 47, in __init__
    self.knn= cv2.KNearest()
AttributeError: 'module' object has no attribute 'KNearest'
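In OpenCV 3 the machine-learning classes moved into the `cv2.ml` submodule, where the k-NN model is created with `cv2.ml.KNearest_create()`. A possible compatibility shim is sketched below; the function name `create_knn` is hypothetical, and the module is passed in as a parameter so the example can be run with stand-in objects instead of a real OpenCV installation.

```python
from types import SimpleNamespace


def create_knn(cv2_module):
    """Return a k-NN model using whichever API the given cv2 module offers."""
    if hasattr(cv2_module, "ml"):
        # OpenCV 3 and later: models live in the cv2.ml submodule.
        return cv2_module.ml.KNearest_create()
    # OpenCV 2.4: the class is exposed directly on cv2.
    return cv2_module.KNearest()


# Stand-in modules, used only to show which branch is taken without OpenCV.
fake_cv2_v2 = SimpleNamespace(KNearest=lambda: "2.x model")
fake_cv2_v3 = SimpleNamespace(ml=SimpleNamespace(KNearest_create=lambda: "3.x model"))
```

Creating the model is only part of the migration. In OpenCV 3 training also takes a layout argument (`train(samples, cv2.ml.ROW_SAMPLE, responses)`) and prediction uses `findNearest` instead of `find_nearest`, so the calling code needs matching changes.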

Opencv

Hello,

I tried to use this repository, but I am getting an error

TypeError: only length-1 arrays can be converted to Python scalars

self.classifier.train( features, image_file.ground.classes )
File simple-ocr-opencv-master\classification.py", line 56, in train

I am using Python 2.7,
OpenCV 3.1.0 and
numpy version 1.9.1

I am fairly new to Python and OpenCV; any suggestions or pointers are much appreciated.

Thanks

OpenCV 3.0

I've just forked this repo and I'm migrating it to OpenCV 3 as I hit breaking changes. Are you interested in pull requests for an OpenCV 3 branch?

numpy boolean subtract error in segmentation_filters.py

This is the output I get when running example.py with numpy 1.14.0:

showing after BlurProcessor (waiting for input)
showing ContourSegmenter contours (waiting for input)
showing image after segmentation by RawContourSegmenter (waiting for input)
Traceback (most recent call last):
  File "example.py", line 15, in <module>
    test_chars, test_classes, test_segments = ocr.ocr(test_image, show_steps=True)
  File "C:\work\PREN\simple-ocr-opencv\simpleocr\ocr.py", line 73, in ocr
    self.segmenter.display()
  File "C:\work\PREN\simple-ocr-opencv\simpleocr\processor.py", line 162, in display
    p.display(display_before=False)
  File "C:\work\PREN\simple-ocr-opencv\simpleocr\segmentation_filters.py", line 27, in display
    draw_segments(copy, s[True - g], (0, 0, 255))
TypeError: numpy boolean subtract, the `-` operator, is deprecated, use the bitwise_xor, the `^` operator, or the logical_xor function instead.

As stated in the error message, the problem is located in segmentation_filters.py on line 27.
When I change the line from this

draw_segments(copy, s[True - g], (0, 0, 255))

to this

draw_segments(copy, s[1 - g], (0, 0, 255))

it works. However, I'm not sure if this is correct.
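The change avoids the exception, but it is probably not equivalent: `1 - g` yields an integer array of zeros and ones, which numpy treats as positions to pick rather than as a mask. The sketch below compares the two forms on made-up data, assuming `g` is a boolean mask of "good" segments as the variable name suggests.

```python
import numpy

s = numpy.array(["a", "b", "c", "d"])          # stand-in for the segments
g = numpy.array([True, False, True, False])    # stand-in for the "good" mask

rejected_by_mask = s[~g]      # boolean mask: keeps the items where g is False
rejected_by_ints = s[1 - g]   # integer indexing: picks items at positions 0 and 1
```

Here the mask form returns the two rejected items, while the integer form returns four items drawn only from the first two positions. `s[~g]` or `s[numpy.logical_not(g)]` therefore looks like the intended replacement for `s[True - g]`.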

Create better readme

Create a README that helps inexperienced Python users understand how to use the program. example.py doesn't show how to use custom images or how to "box" them.

Changing the training image & Creating grounding files

Step 1) Tried to change the training image in example.py

ocr.train( ImageFile('otherimagefile') )

Step 2) Bumped into an error

Exception: The provided file is not grounded

Step 3) Tried the solution issue 2 suggests

from files import ImageFile
from grounding import UserGrounder
from segmentation import ContourSegmenter, draw_segments
from feature_extraction import SimpleFeatureExtractor
from classification import KNNClassifier
from ocr import OCR, accuracy, show_differences, reconstruct_chars

segmenter=  ContourSegmenter( blur_y=5, blur_x=5, block_size=11, c=10)
extractor=  SimpleFeatureExtractor( feature_size=10, stretch=False )
classifier= KNNClassifier()
ocr= OCR( segmenter, extractor, classifier )

test_image= ImageFile('cap1')
test_classes, test_segments= ocr.ocr( test_image, show_steps=True )
grounder= UserGrounder()
grounder.ground(test_image, test_segments)
test_image.ground.write()

Step 4) But an error occurred:

OpenCV Error: Assertion failed (N >= K) in kmeans, file /build/buildd/opencv-2.4.8+dfsg1/modules/core/src/matrix.cpp, line 2702
Traceback (most recent call last):
  File "example.py", line 14, in <module>
    test_classes, test_segments= ocr.ocr( test_image, show_steps=True )
  File "/home/username/simple-ocr-opencv/ocr.py", line 40, in ocr
    segments= self.segmenter.process( image_file.image )
  File "/home/username/simple-ocr-opencv/processor.py", line 87, in process
    output= self._process(arguments)
  File "/home/username/simple-ocr-opencv/processor.py", line 131, in _process
    arguments= p.process( arguments )
  File "/home/username/simple-ocr-opencv/processor.py", line 87, in process
    output= self._process(arguments)
  File "/home/username/simple-ocr-opencv/segmentation_aux.py", line 63, in _process
    tops=               self._guess_lines( segment_tops )
  File "/home/username/simple-ocr-opencv/segmentation_aux.py", line 29, in _guess_lines
    compactness, classified_points, means = cv2.kmeans( data=ys, K=k, bestLabels=None, criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_MAX_ITER, 1, 10), attempts=2, flags=cv2.KMEANS_PP_CENTERS)
cv2.error: /build/buildd/opencv-2.4.8+dfsg1/modules/core/src/matrix.cpp:2702: error: (-215) N >= K in function kmeans
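The assertion `N >= K` means k-means was asked for more clusters than there are data points, which can happen when segmentation finds very few segments in the image. One defensive approach, sketched here with a hypothetical helper that is not part of the project, is to cap the cluster count before calling `cv2.kmeans`.

```python
def safe_cluster_count(points, requested_k):
    """Cap the requested number of clusters at the number of available points."""
    n = len(points)
    if n == 0:
        raise ValueError("no segments to cluster; check the segmenter parameters")
    return min(requested_k, n)
```

Capping only prevents the crash. If an image yields too few segments, the segmenter parameters (blur and threshold settings) are the more likely place to look.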

Accuracy exception

Context
Hello
I tried the below image

with Python 3

current behavior
After grounding the image above via example_grounding.py and then running example.py on the same image, I got the exception below

Traceback (most recent call last):
  File "example_amundi.py", line 17, in <module>
    print("accuracy:", accuracy(test_image.ground.classes, test_classes))
  File "/mnt/Documents/jb/Dev/python/projects/moneygator/test/castor/ocr/simple-ocr-opencv-master/simpleocr/ocr.py", line 36, in accuracy
    raise Exception("expected " + str(expected.shape) + ", got " + str(result.shape))
Exception: expected (10, 1), got (18, 1)

expected behavior
It should not trigger the exception, unless I am doing something wrong (does grounding need more numbers?)
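The message indicates that the ground truth holds 10 characters while the OCR run produced 18 segments, so the two cannot be compared element by element. The sketch below is a simplified stand-in for such an accuracy check, not the project's `accuracy` function, and shows why mismatched lengths have to be rejected.

```python
def accuracy(expected, result):
    """Fraction of positions where the recognised class matches the ground truth."""
    if len(expected) != len(result):
        # Different segment counts: positions no longer correspond to each other.
        raise ValueError(
            "expected %d classes, got %d" % (len(expected), len(result)))
    if not expected:
        raise ValueError("nothing to compare")
    matches = sum(e == r for e, r in zip(expected, result))
    return matches / len(expected)
```

A mismatch like this usually points to grounding and OCR having used different segmentations of the image, for example segments that were skipped while grounding or different segmenter parameters between the two scripts.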

get the zone detection in the order

current behavior
For some images, the detected zones are returned in an order different from the reading order (left to right, top to bottom).
Take the example below:

The number 3 arrives in the results before 7, whereas it should appear before 6

OCRed text:
q85920q61437

When I try grounding this particular picture, I can indeed see that the 3 is detected in the same position in the order as in the result above

expected behavior
The result should be as follows

OCRed text:
q85920q36147
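One way to obtain reading order is to group segments into text lines by their vertical position and then sort each line from left to right. The sketch below assumes segments are `(x, y, w, h)` tuples; the function name and the `line_tolerance` parameter are hypothetical, and this is not the ordering logic the project currently uses.

```python
def reading_order(segments, line_tolerance=10):
    """Sort (x, y, w, h) rectangles top to bottom, then left to right within a line."""
    lines = []
    for seg in sorted(segments, key=lambda s: s[1]):
        # A segment joins the current line if its top is close to that line's first top.
        if lines and abs(seg[1] - lines[-1][0][1]) <= line_tolerance:
            lines[-1].append(seg)
        else:
            lines.append([seg])
    # Within each line, order the segments by their left edge.
    return [seg for line in lines for seg in sorted(line, key=lambda s: s[0])]


segments = [(30, 2, 8, 12), (10, 0, 8, 12), (20, 41, 8, 12),
            (5, 40, 8, 12), (20, 1, 8, 12)]
ordered = reading_order(segments)
```

The tolerance has to be tuned to the character height. Text with strongly varying glyph heights or skewed lines would need a more robust grouping.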