gopirajumatta / ocr_caffe
This project forked from anant-agarwal/ocr_caffe
Segmentation using OpenCV and OCR using Caffe with Python
Optical Character Recognition using Neural Networks
By: Anant Agarwal (aa2387), Deekshith Belchappada (db786)

The code is based on Python 2.7 and uses Anaconda and Caffe.

Setting up the Anaconda package:
i) Download the Anaconda package for your operating system from https://docs.continuum.io/anaconda/install
ii) Make sure you download the "Python 2.7 version".
iii) Follow the instructions on the page; it should require only a few clicks.

Setting up Caffe:
i) Installing Caffe requires many intricate setup steps and the right set of dependencies. Describing all of these details here would make this README very complicated.
ii) Instead, we list the blog post that describes setting up Caffe in detail; we used it extensively to set up our version.
iii) Blog post: http://installing-caffe-the-right-way.wikidot.com/start
iv) Caffe official installation instructions: http://caffe.berkeleyvision.org/installation.html

How to run the code:
i) The code is contained in two Python notebooks: Segmentation.ipynb and OCR.ipynb
ii) Jupyter notebooks are part of Anaconda and can be started with the following command in a shell/terminal: jupyter notebook
iii) Once the notebook server is running, open these notebooks.
iv) Both notebooks are described below.

Segmentation.ipynb:
i) This file contains all the code related to segmentation. The file temp/1-this_is_a_test-1.png is provided as a sample image to run this code.
ii) It extracts characters from this sample image and keeps them in sorted order.
iii) You can change the input image to any other file; the output will be generated in temp/segmentation_pdfimage/
iv) The code is heavily commented and explains each chunk.
v) To run the code: in the top menu, go to Cell -> Run Cells.
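
Item ii) of the Segmentation.ipynb section says the extracted characters are kept in sorted order. The notebook's own ordering logic is not reproduced here. As a minimal sketch, assuming each character is described by a bounding box (x, y, w, h) of the kind OpenCV's cv2.boundingRect returns, the boxes could be put into reading order by grouping them into text lines and sorting each line left to right. The function name and the line_tolerance parameter are hypothetical and not part of the project.

```python
# Hypothetical helper, not taken from Segmentation.ipynb.
# Each box is a tuple (x, y, w, h) in pixels.

def sort_boxes_reading_order(boxes, line_tolerance=0.5):
    """Group boxes into text lines by vertical center, then sort each line by x."""
    if not boxes:
        return []
    # Sort by vertical center so boxes on the same text line become neighbors.
    by_center = sorted(boxes, key=lambda b: b[1] + b[3] / 2.0)
    lines = [[by_center[0]]]
    for box in by_center[1:]:
        prev = lines[-1][-1]
        prev_center = prev[1] + prev[3] / 2.0
        center = box[1] + box[3] / 2.0
        # Treat the box as part of the current line if its center is within
        # a fraction (line_tolerance) of the previous box's height.
        if abs(center - prev_center) <= line_tolerance * prev[3]:
            lines[-1].append(box)
        else:
            lines.append([box])
    ordered = []
    for line in lines:
        # Within a line, read left to right.
        ordered.extend(sorted(line, key=lambda b: b[0]))
    return ordered


# Example with two text lines of two characters each (made-up coordinates).
boxes = [(50, 10, 10, 20), (10, 12, 10, 20), (30, 60, 10, 20), (5, 58, 10, 20)]
ordered = sort_boxes_reading_order(boxes)
```

Grouping by line before sorting by x keeps characters from a lower line out of an upper line even when they sit further left; the tolerance would likely need tuning for a given font size.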

OCR.ipynb:
This file contains the following functionality:
i) Data generation: This code cell generates data using popular fonts. The code is explained by comments in the file. The output folder produced by this code, "data_gen_new", is included as well.
ii) Generating training and testing files: This code cell generates training_index.txt and test_index.txt. Caffe uses these files for training and testing. They contain the image file paths and the image class labels. Sample files produced by this code are in the data_gen_new folder, named training_index.txt and test_index.txt.
iii) Loading the Caffe model in Python: You need to train the Caffe model first, then provide its location to the Python code along with its configuration file. Training is described later in this file.
iv) Demo, testing, and classification code (you need to load the model first for any of this to run):
   a) Demo code: We take one image per class and try to classify it with the trained model. The input folder is provided at "temp/0-9A-Za-z". The code also calculates the Mean Reciprocal Rank using the top 3 guesses.
   b) Testing the model: This loads your test file and uses the model to classify the images and calculate accuracy. A sample test file is provided at "data_gen_new/test_index.txt".
   c) Classifying segmented images: This is essentially the same as above, except that the inputs are the images produced by character segmentation of an image.
   You can generate the inputs for this code by running the segmentation code in Segmentation.ipynb.

Training the Caffe model:
We have two models: lenet (with 62 classes) and alpha (referred to as 'LeNet Modified' in the report).
Caffe requires the following 3 files:
Model file: temp/lenet.prototxt, temp/alpha.prototxt
Solver file: temp/lenet_solver.prototxt, temp/alpha_solver.prototxt
Deploy file: temp/lenet_deploy.prototxt, temp/alpha_deploy.prototxt
Details for each of these are given in the report.

For training, do the following:
i) Fix the paths to training_index.txt and test_index.txt in the model file.
ii) Check that the data (training image files) are at the locations listed in the training and testing files.
iii) In the solver file, make sure the path to the model file on line 2 is correct.
iv) In a terminal/shell, run the following command with the desired solver file: caffe train -solver <solver file>
v) The model will start training, and a version of the model will be stored after every few iterations.

The details above, along with the report (which has additional details about Caffe), should be sufficient to get the code up and running. However, if you run into any trouble, please feel free to reach out to either of us ([email protected], [email protected]).

Thanks!
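
Training step ii) asks you to check that every image listed in the training and testing files exists. The sketch below shows one way to build and check such index files. It assumes each line has the form "image path, space, integer label", which is the format Caffe's ImageData layer reads; whether this project uses that layer, and the label order (digits, then uppercase, then lowercase, guessed from the "0-9A-Za-z" folder name), are assumptions. All function names are hypothetical.

```python
import os
import string

# Assumed label order for the 62 classes; the project's real mapping may differ.
CLASSES = string.digits + string.ascii_uppercase + string.ascii_lowercase
LABELS = {ch: i for i, ch in enumerate(CLASSES)}


def build_index_lines(samples):
    """Turn (image_path, character) pairs into 'path label' lines."""
    return ["%s %d" % (path, LABELS[char]) for path, char in samples]


def parse_index_line(line):
    """Split one index line into (path, label); the label is the last field."""
    path, label = line.strip().rsplit(" ", 1)
    return path, int(label)


def find_missing(lines, exists=os.path.exists):
    """Return the image paths from the index lines that do not exist."""
    missing = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        path, _label = parse_index_line(line)
        if not exists(path):
            missing.append(path)
    return missing


# Possible usage against the sample file shipped with the project:
# with open("data_gen_new/training_index.txt") as f:
#     print(find_missing(f))
```

An empty result from find_missing means every listed image was found; any returned path points to a line in the index file that needs fixing before running caffe train.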