This project forked from prp-e/persian_ocr_project

A FLOSS software for Persian Optical Character Recognition

License: GNU General Public License v3.0

Persian OCR Project

This repository is my biggest FLOSS project. I have had it in mind since last year, when I was working on an automatic license plate recognition project. I wanted to build a large FLOSS project, and that work gave me the idea of an OCR project.

I have now started working on the idea, and this repository will be updated at every phase of the project.

Important Notes

  • This project is published under the GNU GPL version 3.0 license. I assure everyone concerned that as long as I, Muhammadreza Haghiri, am in charge of this project, the license will remain the same.
  • If any party decides to acquire this project and wants to change the license, I will try to negotiate to keep it Free (as in freedom).
  • The .gitignore file excludes image files. This does not mean that our dataset won't be free: it will grow too large to keep in this repository, so we will make the data (raw or labeled) available for download in the near future.

Project Technical Details

  • Programming Language: Python 3 (3.9 on the local machine; the version on remote machines depends on where we run our tasks)
  • AI library: PyTorch
  • Model: YOLOv5
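
As a rough illustration of how these pieces fit together: YOLOv5 training reads a small dataset description file that lists the image folders and the class names. The sketch below builds the text of such a file in Python. The dataset root, folder names, and class names are hypothetical placeholders, not this project's actual layout.

```python
def build_data_yaml(root: str, class_names: list[str]) -> str:
    """Return the text of a YOLOv5-style dataset description file.

    `root`, the `images/train` and `images/val` folder names, and the
    class names are assumptions made for this example only.
    """
    lines = [
        f"path: {root}",        # dataset root directory
        "train: images/train",  # training images, relative to `path`
        "val: images/val",      # validation images, relative to `path`
        "names:",               # class index -> class name
    ]
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical ten-class digit dataset.
    digit_names = [f"digit_{i}" for i in range(10)]
    print(build_data_yaml("datasets/persian_digits", digit_names))
```

The resulting text could be saved to a `.yaml` file and passed to the YOLOv5 training script, assuming the images and labels are arranged as that file describes.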

Models and Datasets

Models for June 23rd 2022

Results

Number recognition

  • From the input data:

    data

  • Screenshot from Telegram:

    data

  • Final tests

    data

Letter recognition

Final test on letters

Project Phases

This part is divided into two. The first part is the lab phase, in which we work as a group of data scientists and AI enthusiasts to develop and deploy our model. The second part is the business/product phase, in which we present the result as a product to the outside world.

Lab phases

  • Number recognition
    • Data generation using Zarnevis.
    • Training YOLOv5 on generated data.
    • Testing the result.
      • Test on different numbers written in the same font.
      • Test on same or different numbers written in different fonts.
      • Test on hand-written numbers to find out how accurate our model is.
    • Asking participants to write down some random numbers (data generation for hand-written numbers).
    • Training YOLOv5 on both hand-written and digital numbers.
    • Final tests.
      • Test on different numbers, both hand-written and digital.
  • Letter recognition
    • Data generation using Zarnevis.
      • Instead of generating our own data, we used data gathered from Shotor.
    • Training YOLOv5 on generated data.
    • Testing the results.
      • Test on different words with the same font.
      • Test on the same or different words written in different fonts.
    • Gathering hand-written words data.
    • Final tests.
  • Word detection
    • Training the YOLOv5 model to detect words in a sentence.
      • Getting books and articles.
      • Converting PDFs to images for labeling.
      • Creating a labeled dataset for words, numbers and maybe English words.
  • Punctuation Detection
    • Training the YOLOv5 model to detect punctuation marks.
  • A Jupyter notebook for people who want to test the model.
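
Two small steps in the number recognition phase can be sketched in plain Python: writing a training label in the YOLO text format (class index followed by the normalized box center, width, and height), and turning per-digit detections back into a number. This is only a minimal sketch. It assumes class indices 0 to 9 stand for the Persian digits ۰ to ۹ and that detections arrive as (class index, x-center) pairs; neither detail is taken from this project's data. Persian digits are written left to right, so sorting by x-center gives the reading order.

```python
from typing import List, Tuple

# Assumed class order: index i stands for the Persian digit U+06F0 + i.
PERSIAN_DIGITS = "".join(chr(0x06F0 + i) for i in range(10))


def to_yolo_label(class_id: int,
                  box: Tuple[float, float, float, float],
                  image_size: Tuple[int, int]) -> str:
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def assemble_number(detections: List[Tuple[int, float]]) -> str:
    """Join detected digits into a number, ordered left to right by x-center."""
    ordered = sorted(detections, key=lambda det: det[1])
    return "".join(PERSIAN_DIGITS[class_id] for class_id, _ in ordered)


def to_ascii_digits(text: str) -> str:
    """Map Persian digits to ASCII digits, e.g. for numeric comparison in tests."""
    return text.translate(str.maketrans(PERSIAN_DIGITS, "0123456789"))


if __name__ == "__main__":
    # A hypothetical 20x40 px digit box in a 100x200 px image.
    print(to_yolo_label(3, (10, 20, 30, 60), (100, 200)))
    # Hypothetical detections given out of order.
    print(to_ascii_digits(assemble_number([(2, 0.7), (1, 0.1), (0, 0.4)])))
```

For letters and words the same sorting idea would need the opposite direction, since Persian text itself runs right to left.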

Business/Product phases

  • Designing a web service for production.
