
artificial-brix / brix-ocr


Hello, this is Brix OCR, an open-source project that aims to build an OCR engine with the help of contributors and to train a custom model.

Home Page: https://kwoc.kossiitkgp.org/projects

Jupyter Notebook 100.00%
ocr ocr-recognition kwoc2021 optical-character-recognition ocr-text-reader ocr-python ocr-engine ocr-service text-processing text

brix-ocr's Introduction

BRIX-OCR

OCR engines have been adapted into many domain-specific applications, such as receipt OCR, invoice OCR, check OCR, and legal billing document OCR. They can be used for data entry from business documents (for example cheques, passports, invoices, bank statements, and receipts) and for automatic number plate recognition. This project aims to build an OCR engine with the help of datasets and images.

Available libraries

Some of the available OCR engines are:

Datasets:

Some of the available datasets for testing and training an OCR engine:

Drive links to some datasets for testing and training an OCR engine:

Research papers:

TASKS to resolve:

  • Task 0: Both the custom model and the pretrained models need training data. Add dataset links, or download the datasets to a drive and add a hyperlink, in the Datasets section of readme.md.
  • Task 1: There are three folders: Newspaper, Posters, and Sheets. Each folder contains one sample image. Find similar images only and push them into the matching folder. A minimum of 50 images per folder is enough to make the dataset.
  • Task 2: Create a Jupyter notebook that uses some of the libraries listed in this README and tests their output on the images in the test images folder. Contribute the notebook with a name like this: Name_of_the_contributer.ipynb.
  • Task 3: This is the last step of the project. After trying all the libraries, build a custom model using the datasets, with the help of the research papers and your project mentor, and submit it as a Jupyter notebook.

How to contribute

Please don't push any commits to the main branch; a PR made that way will not be accepted. There are 4 tasks. To contribute, first join the Discord server, then comment under the respective issue, then fork the repo and start working. HAPPY CONTRIBUTING!!!

brix-ocr's People

Contributors

badman-returns, chinmay-jain767, debasish-dutta, i-am-sayantan, preyam2002, smruti2002


brix-ocr's Issues

TASK 1: Make the dataset of Newspaper, Posters and Sheets images and add them to their respective folders

TASK 1: Building the test dataset

Add 50 images in each folder of Posters, Newspapers and Sheets to make a contribution. A sample for each of them has been given in their respective folders.

  • About the task: For better understanding, please have a look here: TASK 1
  • Contribution: For making a contribution, please have a look here: Contribute

Since different people can build different datasets, multiple contributors can take part and each submit their own set as a contribution.
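Before opening a PR for this task, it may help to check that each folder meets the 50-image minimum. The following is a minimal sketch, not part of the repository: the function names, the list of image extensions, and the exact folder names (this page uses both "Newspaper" and "Newspapers") are assumptions to adjust to the actual repo layout.

```python
from pathlib import Path

# Assumptions: folder names and accepted extensions are illustrative,
# adjust them to match the repository.
DATASET_FOLDERS = ("Newspaper", "Posters", "Sheets")
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}
MIN_IMAGES = 50  # minimum per folder, as stated in the task


def count_images(folder: Path) -> int:
    """Count files directly inside `folder` that have an image extension."""
    if not folder.is_dir():
        return 0
    return sum(
        1
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def check_dataset(root: Path, folders=DATASET_FOLDERS, minimum=MIN_IMAGES) -> dict:
    """Return {folder_name: (image_count, meets_minimum)} for each folder."""
    report = {}
    for name in folders:
        n = count_images(root / name)
        report[name] = (n, n >= minimum)
    return report


if __name__ == "__main__":
    # Run from the repository root.
    for name, (n, ok) in check_dataset(Path(".")).items():
        print(f"{name}: {n} images, {'OK' if ok else 'needs more'}")
```

Only files directly inside each folder are counted; images in nested subfolders would need a recursive walk instead.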

TASK 0: Find open-source training datasets for OCR engine training and add the drive link to the README file

Find open-source datasets for training the OCR engine (minimum 3), add them to your drive, and then add the drive link to the dataset section of the README file.

  • About the task: For better understanding, please have a look here: TASK 0
  • Contribution: For making a contribution, please have a look here: Contribute

Since a person can find different datasets, multiple contributors can participate in it and submit their links as a contribution.

TASK 2: Make predictions on test images using open-source libraries available for OCR

Make a Jupyter notebook in which you write code to run OCR on the images in the test_images folder.

  • About the task: For better understanding, please have a look here: TASK 2
  • Contribution: For making a contribution, please have a look here: Contribute

Since there are multiple algorithms for making the predictions, multiple contributors can participate in it and make their own notebook as a contribution.
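As a starting point for such a notebook, here is a minimal sketch of a loop that runs an OCR engine over every image in test_images. It is an illustration only: `run_ocr`, `tesseract_engine`, and the extension list are hypothetical names, and pytesseract is used just as one example of an open-source library (it needs `pytesseract`, `Pillow`, and the Tesseract binary installed). The engine is passed in as a function so other libraries can be swapped in and compared on the same images.

```python
from pathlib import Path
from typing import Callable, Dict

# Assumption: extensions considered to be test images.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def tesseract_engine(image_path: Path) -> str:
    """Example engine using pytesseract (imported lazily, optional dependency)."""
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img)


def run_ocr(image_dir: Path, engine: Callable[[Path], str]) -> Dict[str, str]:
    """Apply `engine` to each image in `image_dir`; return {file_name: text}."""
    results: Dict[str, str] = {}
    if not image_dir.is_dir():
        return results
    for path in sorted(image_dir.iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            try:
                results[path.name] = engine(path).strip()
            except Exception as exc:  # keep going if one image fails
                results[path.name] = f"<error: {exc}>"
    return results


if __name__ == "__main__":
    for name, text in run_ocr(Path("test_images"), tesseract_engine).items():
        print(f"--- {name} ---\n{text}\n")
```

To try a different library, write another function with the same `Path -> str` shape and pass it to `run_ocr` in place of `tesseract_engine`.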
