
antimatter15 / ocrad.js


OCR in Javascript via Emscripten

Home Page: http://antimatter15.github.io/ocrad.js/demo.html

License: GNU General Public License v3.0

Makefile 0.37% C++ 10.61% Shell 0.37% C 0.15% JavaScript 87.87% Python 0.02% HTML 0.52% Roff 0.09%

ocrad.js's Introduction

ocrad.js

OCR in Javascript via Emscripten by Kevin Kwok

As with any minor stepping stone on the relentless trajectory of Atwood's Law, I probably don't need to justify the existence of yet another "x, but now in JavaScript!", but I might as well try. After all, we all would like to think that there's some ulterior motive to fulfilling that prophecy.

On tablets and other touchscreen devices, of which there are quite a number nowadays (as this is the New Year's Eve post, I am obliged to include conjecture about the technological zeitgeist), a library such as Ocrad.js might be used to add handwriting input in a device- and operating-system-agnostic manner. Oftentimes, capturing the strokes and sending them over to a server to process might entail unacceptably high latency. Maybe you're working on an offline-capable note-taking app, or a browser extension which indexes all the doge memes that you stumble upon while prowling the dark corners of the internet.

If you've been following my trail of blog posts recently, you'd probably be able to tell that I've been scrambling to finish the program that I prototyped many months ago overnight at a Hackathon. The idea of the extension was kind of simple and also kind of magical: a browser extension that allowed users to highlight, copy, and paste text from any image as if it were plain text. Of course the implementation is a bit difficult and actually relies on the advent of a number of newfangled technologies.

If you try to search for some open source text recognition engine, the first thing that comes up is Tesseract. That isn't a mistake, because it turns out that the competition is worlds away in terms of accuracy. It's actually pretty sad that the state of the art hasn't progressed substantially since the mid-nineties.

A month ago, I tried compiling Tesseract using Emscripten. Perhaps it was a bad thing to try first, but soon I learned that even if it did work out, it probably wouldn't have been practical anyway. I had figured that all OCR engines had been powered by artificial neural networks, support vector machines, k-nearest-neighbors and their machine learning kin. It turns out that this is hardly the norm except in the realm of the actually-accurate, whose open source provinces live under the protection of Lord Tesseract.

GOCR and Ocrad are essentially the only other open source OCR engines (there's technically also Cuneiform, but the source code is in a really really big zip file from some website in Russian, and it's also really slow according to benchmarks). And something I didn't realize until I had peered into the source code is that they are powered by (presumably) painstakingly written rules for each and every detectable glyph and variation. This kind of blew my mind.

Anyway, I tried to compile GOCR first and was immediately struck by how easy and painless it had been. I was on a roll, and decided to do Ocrad as well. It wasn't particularly hard- sure it was slightly more involved but still hardly anything.

If you know me in person, you'll probably know that I'm not a terribly decisive person. Oftentimes, I'll delay the decision until there isn't a choice left for me to make. Anyway, serially-indecisive-me strikes again, so I alternated between the development of GOCR.js and Ocrad.js, leading up to a simultaneous release.

But in the back of my mind, I knew that eventually I would have to pick one for building my image highlighting project.

What consistently amazes me about Optical Character Recognition isn't its astonishing quality or lack thereof. Rather, it's how utterly unpredictable the results can be. Sometimes there'll be some barely legible block of text that comes through absolutely pristine, and some other time there will be a perfectly clean input which outputs complete garbage. Maybe this is a testament to the sheer difficulty of computer vision or the incredible and underappreciated abilities of the human visual cortex.

At one point, I was talking to someone and I distinctly remembered (I know, all the best stories start this way) a sense of surprise when the person indicated that he had heard of Tesseract, the open source OCR engine. I had appraised it as somewhat more obscure than it evidently was. Some time later, I confided about the incident with a friend, and he said something along the lines of "OCR is one of those fields that everyone comes across once".

I guess I've kind of held onto that thought for a while now, and it certainly seems to have at least a grain of truth. Text embedded into the physical world is more or less the primary means we have for communication and expression. Technology is about building tools that augment human capacity and inevitably entails supplanting some human capability. Data input is a huge bottleneck, and we're kind of sidestepping the problem with things like QR codes, which bring the digital world into the physical. OCR is just one of those fundamental enabling technologies which ought to be as broad in scope as the set of humans who have interacted with a keyboard.

I can't help but feel that the rather large set of people who have interacted with the problem of character recognition have surveyed the available tools and reached the same conclusion as your miniature Magic 8 Ball desk ornament: "Try again later". It doesn't take long for one to discover an instance of perfectly crisp and legible type which results in line noise of such entropy that it'd give DUAL_EC_DRBG a run for its money. "No, there really isn't any way for this to be the state of the art." "Well, I guess if it is, then maybe it'll improve in a few years. Technology improves quickly, right?"

You would think that some analogue of Linus's Law would hold true: "given enough eyeballs, all bugs are shallow"- especially if you're dealing with literal eyeballs reading letters. But incidentally, the engine that absolutely everyone uses was developed three decades ago (It's older than I am!), abandoned for a decade before being acquired and released to the world (by our favorite benevolent overlords, Google).

In fact, what's absolutely stunning is the sheer universality of Tesseract. Just about everything which claims to have text recognition as a feature is backed by it. At one point, I was hoping that Mathematica had some clever routine using morphology and symbolic new kinds of sciences and evolved automata pattern recognition. Nope! Nestled deep within the gigabytes of code lies the Chuck Testa of textadermies: Tesseract.

ocrad.js's People

Contributors

antimatter15, bbosman, j8r, kba, mathiasbynens, t-cool, tomayac


ocrad.js's Issues

Doesn't seem to work for me using the test.png file

I went to the page http://antimatter15.com/ocrad.js/demo.html and tried to drag and drop test.png from the examples folder. It shows:

Cannot enlarge memory arrays in asm.js. Either (1) compile with -s TOTAL_MEMORY=X with X higher than the current value 16777216, or (2) set Module.TOTAL_MEMORY before the program runs.
 ocrad.js:7985 Uncaught abort() at Error
   at stackTrace (http://antimatter15.com/ocrad.js/ocrad.js:935:15)
   at abort (http://antimatter15.com/ocrad.js/ocrad.js:7985:25)

Wrong recognition

Hello,
I have tried the program and it does not recognize my number correctly. Do you know if this is normal?

Look:
[screenshot: ocrad js - optical character recognition in javascript - mozilla firefox]

Do you think this is normal?

Callback for nodejs

Hi,
I am trying to use your script in Node.js, but I can't get a callback to work with it.

When I call OCRAD(canvas, {numeric: true}, function(){console.log("test")})
I get an error:
/home/john/ocrad.js:1
aughtException",(function(ex){if(!(ex instanceof ExitStatus)){throw ex}}))}els
^
ReferenceError: window is not defined
at createWebWorkerFromString (/home/john/ocrad.js:43:34894)

Is it possible to call a callback after OCRAD's work is done with NodeJs ?

Thanks for your time

When I build with build.sh I get these errors. Could you help me? Thank you!

ocrad.js:54319 Uncaught Error: abort(Assertion failed: you need to wait for the runtime to be ready (e.g. wait for main() to be called)) at Error
    at jsStackTrace (ocrad.js:54516)
    at stackTrace (ocrad.js:54533)
    at abort (ocrad.js:54313)
    at assert (ocrad.js:53378)
    at Module._OCRAD_open (ocrad.js:57469)
    at ccall (ocrad.js:53431)
    at Object.open (ocrad.js:53440)
    at Object._simple (ocrad.js:58053)
    at Function._simple (ocrad.js:58273)
    at OCRAD (ocrad.js:58264)
    at abort (ocrad.js:54319)
    at assert (ocrad.js:53378)
    at Module._OCRAD_open (ocrad.js:57469)
    at ccall (ocrad.js:53431)
    at Object.open (ocrad.js:53440)
    at Object._simple (ocrad.js:58053)
    at Function._simple (ocrad.js:58273)
    at OCRAD (ocrad.js:58264)
    at recognize_image (numbers.html:39)
    at HTMLImageElement.onclick (numbers.html:51)

Ocrad on Node does not recognize .jpg

ocrad.js/examples/nodejs/app.js

In my code, after uploading image.jpg, this error is returned:

C:\Users\henni\Documents\mocaai\public\js\ocrad.js:121
throw ex;
^

Error: Image given has not completed loading
at C:\Users\henni\Documents\mocaai\app\routes\ocrRT.js:65:21
at FSReqWrap.readFileAfterClose [as oncomplete] (fs.js:511:3)

This happens only with .jpg files on Node! .png files work fine on Node. In the browser, both .jpg and .png run without problems!

Doesn't work with SSL URL

Hello,
Today I ran into a problem: when I use OCRAD with a photo URL served over SSL, it doesn't work.
Please help me.

Invalid Number

I'm trying to use ocrad.js with React Native. I installed ocrad with npm using npm i ocrad.js --save and then I attempted to import with var OCRAD = require('ocrad.js');. When I attempt to load the app I get a bundling error because of an invalid number.

error: bundling failed: SyntaxError: .../node_modules/ocrad.js/ocrad.js: Invalid number (1555:51)
  1553 |         }}};
  1554 |   var MEMFS={ops_table:null,CONTENT_OWNING:1,CONTENT_FLEXIBLE:2,CONTENT_FIXED:3,mount:function (mount) {
> 1555 |         return MEMFS.createNode(null, '/', 16384 | 0777, 0);
       |                                                    ^
  1556 |       },createNode:function (parent, name, mode, dev) {
  1557 |         if (FS.isBlkdev(mode) || FS.isFIFO(mode)) {
  1558 |           // no supported
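
The caret in the report above points at 0777, a legacy octal literal. Such literals are a syntax error in strict mode and in ES modules, which is presumably how the bundler parses the file (an assumption; the source does not say). A minimal sketch of the equivalent ES2015 spelling follows. The constant names are illustrative and not taken from ocrad.js, and whether hand-patching the generated file is advisable is not something the source confirms.

```javascript
// Legacy octal literals such as 0777 are rejected in strict mode and ES modules.
// ES2015 spells the same value with the 0o prefix.
const S_IFDIR = 16384;      // 0o40000, the directory flag used in the failing line
const PERMISSIONS = 0o777;  // the value the legacy literal 0777 denotes (511)

const mode = S_IFDIR | PERMISSIONS;

// parseInt with radix 8 yields the same value without any octal literal at all.
const sameMode = S_IFDIR | parseInt('777', 8);

console.log(mode, sameMode);
```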

browser example does not work

interesting project as I need the x,y location of the textareas but unfortunately I get

    Uncaught TypeError: Cannot read property '5' of null
    at parseOcradResultsFile (ocrad.js:96142)
    at postprocess (ocrad.js:96223)
    at OCRAD (ocrad.js:96242)
    at OCRImage (location.html:20)
    at HTMLImageElement.onload (location.html:43)

Zonal OCR

Hi,
How do I OCR a specific zone of the image?
Thanks.

Feature request - Specific Font OCR

This is a great tool and I have been playing around with it for the last couple of days. Is there any way to use a specific font as the basis of OCR, say Calibri? If the user knows the base font of the text or picture they are scanning, there would be a higher chance of conversion to the correct characters. That is what I am hoping for.

Please let me know if there is a way to do this.

Thanks!

Unable to scan numbers

Dear antimatter15,

when scanning numbers, such as the following image
http://lancelotlam.com/numbertest/numbers.png

ocrad.js only detects

___ - ______'''r'


-_________
_ ___7 __' __o,__o,7o,
__'T-| _T__T__T_O,
. E
_ _ _ _

  • ----_

most of the numbers were scanned as _

i have found the following enums in common.cc

const char * const charset_name[charsets] =
{ "ascii", "iso-8859-9", "iso-8859-15" };

and

const F_entry F_table[] =
{
{ "none", Filter::none },
{ "letters", Filter::letters },
{ "letters_only", Filter::letters_only },
{ "numbers", Filter::numbers },
{ "numbers_only", Filter::numbers_only },
{ 0, Filter::none }
};

But how do I apply that on the JavaScript side?

Thanks and regards

Uncaught DOMException: Failed to execute 'getImageData' on 'CanvasRenderingContext2D': The source width is 0.

It was working flawlessly, but after a while I started to get this error. I searched a bit and I know it's related to image.width and image.height, but I created this issue because it seems to be caused by the code itself rather than by my case. I have now tested on 4 different browsers with a single screenshot, and all give the same error.

[screenshot 2017-04-11 10 30 43]

PS: I tried to use image.naturalHeight and image.naturalWidth before .getImageData(0, 0, imgWidth, imgHeight), as described in the linked page, but did not succeed.
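
A zero source width usually means the image had not finished loading (or failed to decode) when its pixels were read. The sketch below is one possible guard, not a confirmed fix for this report: isImageReady and whenImageReady are hypothetical helper names, and the recognize callback stands in for whatever call hands the image to the OCR library.

```javascript
// Returns true only when the image has finished loading and has a non-empty bitmap.
function isImageReady(img) {
  return Boolean(img) &&
    img.complete === true &&
    img.naturalWidth > 0 &&
    img.naturalHeight > 0;
}

// Runs `recognize` immediately if the image is ready,
// otherwise once its load event fires.
function whenImageReady(img, recognize) {
  if (isImageReady(img)) {
    recognize(img);
  } else {
    img.addEventListener('load', () => recognize(img), { once: true });
  }
}
```

In a page this might be called as whenImageReady(document.querySelector('img'), runOcr), where runOcr is whatever function performs the recognition.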

Multiple Matches - extract from results file

Is it possible to extract more than one match per character from the OCRAD results file? Can you please show an example of this? It seems to return only one result, yet OCRAD refers to an array of matches with confidence levels.

Any help on this would be much appreciated. Thanks!

License?

Hi,

Very cool project!

For now I've added a demo at https://github.com/brettz9/webappfind/blob/master/demos/ocr.html for my WebAppFind Firefox add-on (currently Windows only) which lets a person open files via right-click from the Windows desktop (in this case a PDF file) into a web app. The web app demo uses PDF.js to successively render the pages of the PDF into a canvas, then uses the user choice of Ocrad.js or GOCR.js to obtain OCR results which are placed into a textarea.

I assumed your project was open source, but I didn't see a license though for either Ocrad.js or GOCR.js, so could you please add one to this repo (and to the GOCR.js file if you don't have it in a repo). Thanks!

Error in the demo

From the demo

Is this supposed to happen or is it something with my browser?

Chrome 43 on Windows 10

much faster than tesseract.js

tesseract was running too slow for basic images, which made it unusable for end consumers, but this seems much faster. how? thx!

OCR engine issues

  1. this image is recognized by ocrad:
    06_mcr

2a. that one is not (recognized):
11_mcr-extra

2b. removing all black boxes and lines from 2a, OCR works again:
11_mcr-extra-4

Mystery...

Can not get the nodejs example to work...

Forgive me if I am asking a foolish question here... I am new to NodeJS and have spent hours attempting to get the app.js to work. I have tried moving a copy to the main folder where ocrad.js file is. I have Googled for answers with no luck. I must be missing something... please someone enlighten me.

Cannot find module 'ocrad.js'

I have tesseract running with NodeJS. The browser examples for the ocrad work. Testing both it is clear ocrad OCR is significantly faster.

Running on Windows 10 Pro machine. Node 10.5.0

Thanks,
Will

Recognizes & instead of 8

As the title says: when I try to recognize the number 8, it always turns out to be '&', not 8.

There is at least one use case (an image I can't publish), when this happens, but I will try to provide you a better picture to test.

Recognize digits only

Hello,
I like your work. How do I get it to recognize digits only? I intend to add it to my mobile app, which uses Sencha Touch 2 and will need to read digits from captured images such as business cards, etc.
Awaiting your feedback

Examples broken with latest Node.js canvas versions

I'm just trying the examples here and noticed two problems:

  1. node canvas is not listed as a dependency in the project, i am not sure it should be but it seemed to me that it should (seeing the examples and how the library works).

  2. With latest canvas version (ex. 2.2.0 and above), new Canvas(...blah) is not a constructor and should be used this way: const canvas = Canvas.createCanvas(...blah). Maybe the examples should be updated.
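
The constructor change described in point 2 can be bridged with a small compatibility helper. This is a sketch, not the project's code: createCanvasCompat is a hypothetical name, and canvasModule is assumed to be whatever require('canvas') returned.

```javascript
// Creates a canvas with either node-canvas API generation.
function createCanvasCompat(canvasModule, width, height) {
  if (typeof canvasModule.createCanvas === 'function') {
    // node-canvas 2.x exposes a factory function.
    return canvasModule.createCanvas(width, height);
  }
  // node-canvas 1.x exported the Canvas constructor itself.
  return new canvasModule(width, height);
}
```

With node-canvas installed, the examples could then call createCanvasCompat(require('canvas'), width, height) instead of new Canvas(width, height).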

Cheers !

Optimal font for Base64

antimatter15,

Thank you so much for this lib! It is amazing, fast, and easy!

I'm currently implementing your library to read Base64 strings that would be impractical for a user to manually input.

Which font has the highest degree of accuracy & precision for reading Base64 strings?

Thank you so much in advance!

How to use it?

how to use

can you write some words to show how to use ocrad.js

Chinese support in Ocrad instead of Tesseract

I found in issue #21 that Tesseract supports Chinese, but Tesseract's Chinese results are not accurate and contain many spaces between words.

Ocrad is a better choice because it is faster. Could it support Chinese and improve accuracy?

ability to 'cut' ie apply target rectangle to input

https://www.gnu.org/software/ocrad/manual/ocrad_manual.html#Invoking-ocrad

-u left,top,width,height
--cut=left,top,width,height
Cut the input image by the rectangle defined by left, top, width and height. Values may be relative to the image size (-1.0 <= value <= +1.0), or absolute (abs( value ) > 1). Negative values of left, top are relative to the right-bottom corner of the image. Values of width and height must be positive. Absolute and relative values can be mixed. For example 'ocrad --cut 700,960,1,1' will extract from '700,960' to the right-bottom corner of the image.
The cutting is performed before any other transformation (rotation or mirroring) on the input image, and before scaling, layout analysis and recognition.

is this functionality available? is there an example? I'm interested in performing this on a file such as http://www.stormsurfing.com/stormuser2/images/grib/eoz_height_126hr.png
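
Whether ocrad.js exposes this option is not stated here. As a possible workaround, the rectangle could be resolved in JavaScript and the image cropped before recognition. The sketch below follows one reading of the manual text quoted above (values in [-1, 1] are relative, larger magnitudes are pixels, negative left/top count from the right-bottom corner); it is not taken from Ocrad's source, and resolveCut and resolveCutValue are hypothetical names.

```javascript
// Values in [-1, 1] are fractions of the image dimension; larger magnitudes are pixels.
function resolveCutValue(value, size) {
  return Math.abs(value) <= 1 ? value * size : value;
}

// Turns a { left, top, width, height } cut into an absolute pixel rectangle,
// clipped so that it never extends past the right-bottom corner of the image.
function resolveCut(cut, imageWidth, imageHeight) {
  if (cut.width <= 0 || cut.height <= 0) {
    throw new RangeError('width and height must be positive');
  }
  let left = Math.round(resolveCutValue(cut.left, imageWidth));
  let top = Math.round(resolveCutValue(cut.top, imageHeight));
  // Negative offsets are measured from the right-bottom corner.
  if (left < 0) left += imageWidth;
  if (top < 0) top += imageHeight;
  const width = Math.min(
    Math.round(resolveCutValue(cut.width, imageWidth)), imageWidth - left);
  const height = Math.min(
    Math.round(resolveCutValue(cut.height, imageHeight)), imageHeight - top);
  return { left, top, width, height };
}
```

The resulting rectangle can be passed to the standard canvas call ctx.getImageData(rect.left, rect.top, rect.width, rect.height); the cropped pixels could then be handed to the recognizer, assuming it accepts that kind of input.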

:( License!?

Hi!!

GPL license for a client side javascript application? Oh that's so sad. What that means is that, for commercial proprietary software developers, this wouldn't be of any help, as they won't be able to use it, at all! This seriously diminishes the potential of the application to reach a wide range of people!
Why not BSD or MIT, or Apache 2.0 perhaps??! ( Tesseract is under Apache 2.0 )

Won't work when uploading image with bigger file size

24_ap3 5_st1by8_iso800

It works pretty well for images that are not large, but when I upload this image (4.32 MB) it shows me these errors:

Cannot enlarge memory arrays. Either (1) compile with -s TOTAL_MEMORY=X with X higher than the current value 33554432, (2) compile with -s ALLOW_MEMORY_GROWTH=1 which adjusts the size at runtime but prevents some optimizations, (3) set Module.TOTAL_MEMORY to a higher value before the program runs, or if you want malloc to return NULL (0) instead of this abort, compile with -s ABORTING_MALLOC=0 ocrad.min.js:12

Cannot enlarge memory arrays. Either (1) compile with -s TOTAL_MEMORY=X with X higher than the current value 33554432, (2) compile with -s ALLOW_MEMORY_GROWTH=1 which adjusts the size at runtime but prevents some optimizations, (3) set Module.TOTAL_MEMORY to a higher value before the program runs, or if you want malloc to return NULL (0) instead of this abort, compile with -s ABORTING_MALLOC=0 ocrad.min.js:12

warning: build with -s DEMANGLE_SUPPORT=1 to link in libcxxabi demangling ocrad.min.js:12
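
Apart from the rebuild options listed in the message itself, one conceivable workaround is to downscale large images before recognition so that fewer pixels need to fit in the fixed heap. The sketch below only computes the target size. fitWithinPixelBudget is a hypothetical helper, the budget is an arbitrary illustrative number rather than a measured limit of ocrad.js, and downscaling may reduce recognition accuracy.

```javascript
// Computes dimensions that keep width * height at or below maxPixels,
// preserving the aspect ratio. Images already within budget are unchanged.
function fitWithinPixelBudget(width, height, maxPixels) {
  const pixels = width * height;
  if (pixels <= maxPixels) {
    return { width, height, scale: 1 };
  }
  const scale = Math.sqrt(maxPixels / pixels);
  return {
    width: Math.max(1, Math.floor(width * scale)),
    height: Math.max(1, Math.floor(height * scale)),
    scale
  };
}
```

The scaled size can be applied with the standard canvas call ctx.drawImage(img, 0, 0, size.width, size.height) before passing the canvas on for recognition.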

Only showing (dash) during recognition

When I browse to an image, it shows a dash (-) as the detected string

image

After clicking on the image or near to it (clicked a dot), i am getting the actual recognized string

image

I am having this problem of detection as dash (-) when i use the ocrad.js code as well. It is shown in the alert in javascript. Please refer the attachment for more info.

image

Kindly let me know what is causing this problem and how this can be resolved.
