NEW : this repository is new and experimental. Do not hesitate to open an issue if you have a question, find a bug, or want to suggest an improvement !
Check the CHANGELOG file for an overview of the latest modifications !
├── custom_architectures
│   ├── crnn_arch.py : defines the CRNN main architecture for OCR (with CTC decoding)
│   ├── unet_arch.py : defines variants of the UNet architectures used in the EAST detector
│   └── yolo_arch.py : defines the YOLOv2 architecture
├── custom_layers
├── custom_train_objects
├── datasets
├── hparams
├── loggers
├── models
│   ├── detection : used to detect texts in images (with the EAST detector)
│   ├── ocr
│   │   ├── base_ocr.py : abstract class for OCR models
│   │   └── crnn.py : main CRNN class (OCR)
├── pretrained_models
│   └── yolo_backend : directory where to save the yolo_backend weights
├── unitest
├── utils
├── example_crnn.ipynb
└── pcr.ipynb
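The CTC decoding mentioned for `crnn_arch.py` can be illustrated with a minimal greedy decoder. This is a sketch, not the code from this repository: the function name `ctc_greedy_decode`, the `vocab` layout and the blank index `0` are assumptions made for the example.

```python
import numpy as np

def ctc_greedy_decode(logits, vocab, blank_index=0):
    """Greedy CTC decoding: argmax per time step, merge repeats, drop blanks.

    `logits` has shape (time_steps, num_classes) and `vocab[i]` is the
    character of class i (the entry at `blank_index` is never emitted).
    """
    best = np.argmax(logits, axis=-1)  # most likely class at each time step
    chars = []
    previous = None
    for index in best:
        index = int(index)
        # Keep a symbol only if it differs from the previous step and is not blank
        if index != previous and index != blank_index:
            chars.append(vocab[index])
        previous = index
    return "".join(chars)

# One-hot scores for the class sequence a a - a b b - c ("-" is the blank)
steps = [1, 1, 0, 1, 2, 2, 0, 3]
logits = np.eye(4)[steps]
print(ctc_greedy_decode(logits, "-abc"))  # aabc
```

The blank between the two runs of `a` is what allows a doubled letter to survive the merge step.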
Check the main project for more information about the unextended modules / structure / main classes.
Check the detection project for more information about the detection module and the EAST Scene-Text Detection model.
- Detection (module `models.detection`) :

Feature | Function / class | Description |
---|---|---|
OCR | `ocr` | Performs OCR on the given image(s) |
You can check the `ocr` notebook for a concrete demonstration.
Available architectures :
Classes | Dataset | Architecture | Trainer | Weights |
---|---|---|---|---|
Models must be unzipped in the `pretrained_models/` directory !
The pretrained `CRNN` models come from the EasyOCR library. Weights are automatically downloaded given the language or the model's name, and converted to `tensorflow` ! The `easyocr` library is therefore not required, but `pytorch` is required to load the weights (for the conversion).
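As an illustration of what such a conversion involves, the sketch below reorders weight arrays from the PyTorch layout to the layout used by Keras layers. It is not the conversion code of this repository: the helper names are hypothetical, and only convolution and dense weights are covered (recurrent layers need additional handling).

```python
import numpy as np

def conv2d_kernel_to_tf(kernel):
    """PyTorch Conv2d weight (out, in, kh, kw) -> Keras Conv2D kernel (kh, kw, in, out)."""
    return np.transpose(kernel, (2, 3, 1, 0))

def linear_weight_to_tf(weight):
    """PyTorch Linear weight (out, in) -> Keras Dense kernel (in, out)."""
    return np.transpose(weight, (1, 0))

# Dummy arrays standing in for weights loaded from a PyTorch checkpoint
torch_kernel = np.zeros((64, 3, 5, 7), dtype=np.float32)
torch_linear = np.zeros((10, 256), dtype=np.float32)

print(conv2d_kernel_to_tf(torch_kernel).shape)  # (5, 7, 3, 64)
print(linear_weight_to_tf(torch_linear).shape)  # (256, 10)
```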
The pretrained version of EAST can be downloaded from this project. It should be placed at `pretrained_models/pretrained_weights/east_vgg16.pth` (`torch` is required to transfer the weights : `pip install torch`).
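Before running the weight transfer, it can help to check that the file is at the expected location. The helper below is a small sketch using only the standard library; its name `east_weights_available` is hypothetical and it assumes the path is relative to the repository root.

```python
from pathlib import Path

# Location given above, relative to the root of the repository
EAST_WEIGHTS = Path("pretrained_models") / "pretrained_weights" / "east_vgg16.pth"

def east_weights_available(root="."):
    """Return True if the EAST checkpoint exists under `root`."""
    return (Path(root) / EAST_WEIGHTS).is_file()

if not east_weights_available():
    print(f"EAST weights not found, expected at: {EAST_WEIGHTS.as_posix()}")
```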
- Clone this repository : `git clone https://github.com/yui-mhcp/ocr.git`
- Go to the root of this repository : `cd ocr`
- Install requirements : `pip install -r requirements.txt`
- Open the `detection` notebook and follow the instructions !
Important Note : some heavy requirements (e.g. `torch` and `transformers`) are not listed, to avoid installing them unnecessarily, as they are only used in very specific functions. An `ImportError` may therefore occur when using such functions, for example `TextEncoder.from_transformers_pretrained(...)`.
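A common way to handle such optional dependencies is to import them lazily and raise an explicit message when they are missing. The sketch below illustrates the pattern; the helper `require` is hypothetical and is not part of this repository.

```python
import importlib

def require(module_name, feature):
    """Import an optional dependency, or fail with an actionable message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as error:
        raise ImportError(
            f"'{module_name}' is required for {feature} : "
            f"install it with `pip install {module_name}`"
        ) from error

# Example with a standard-library module, which is always available
json = require("json", "this demonstration")
print(json.dumps({"ok": True}))
```

Calling `require("torch", "weights conversion")` inside the function that needs it keeps the import cost away from users who never call that function.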
- Make the TO-DO list
- Convert the `CRNN` architecture / weights from the `easyocr` library to `tensorflow`
- Convert the `CRNN + attention` architecture from this repo to `tensorflow`
- Add examples to initialize pretrained models (both EAST and CRNN)
- Add an example to perform OCR on image (with text detection)
- Add an example to perform OCR on camera
- Allow to combine texts in lines / paragraphs (as EAST detects individual words)
- Take into account the text rotation in the combination procedure
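To illustrate the "combine texts in lines" item, here is one possible approach, given as a sketch and not as the method used or planned in this repository. It assumes axis-aligned word boxes given as `(x_min, y_min, x_max, y_max, text)` tuples and ignores rotation; the function name and the `overlap_ratio` parameter are hypothetical.

```python
def group_words_into_lines(words, overlap_ratio=0.5):
    """Group word boxes into text lines based on their vertical overlap.

    `words` is a list of (x_min, y_min, x_max, y_max, text) tuples.
    Returns one string per line, from top to bottom.
    """
    lines = []
    # Process words from top to bottom, using their vertical center
    for word in sorted(words, key=lambda w: (w[1] + w[3]) / 2):
        _, y_min, _, y_max, _ = word
        for line in lines:
            overlap = min(line["bottom"], y_max) - max(line["top"], y_min)
            smallest = min(line["bottom"] - line["top"], y_max - y_min)
            if overlap >= overlap_ratio * smallest:
                line["words"].append(word)
                break
        else:
            # No existing line overlaps enough : start a new one
            lines.append({"top": y_min, "bottom": y_max, "words": [word]})
    # Within a line, read the words from left to right
    return [
        " ".join(w[4] for w in sorted(line["words"], key=lambda w: w[0]))
        for line in lines
    ]

words = [
    (120, 10, 200, 30, "world"),
    (10, 12, 100, 32, "Hello"),
    (10, 50, 90, 70, "second"),
    (100, 52, 150, 72, "line"),
]
print(group_words_into_lines(words))  # ['Hello world', 'second line']
```

Each line keeps the vertical extent of its first word, which keeps the sketch simple but may split lines whose words have very different heights.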
You can contact me at [email protected] or on discord at yui#0732
The objective of these projects is to facilitate the development and deployment of useful applications using Deep Learning for solving real-world problems and helping people. For this purpose, all the code is under the Affero GPL (AGPL) v3 licence.
All my projects are "free software", meaning that you can use, modify, deploy and distribute them on a free basis, in compliance with the Licence. They are not in the public domain and are copyrighted. Some conditions apply to their distribution, but their objective is to make sure that everyone is able to use and share any modified version of these projects.
Furthermore, if you want to use any project in a closed-source project, or in a commercial project, you will need to obtain another Licence. Please contact me for more information.
For my protection, it is important to note that all projects are available on an "As Is" basis, without any warranties or conditions of any kind, either explicit or implied. However, do not hesitate to report issues on the project's repository, or to make a Pull Request to solve them.
If you use this project in your work, please add this citation to give it more visibility !
@misc{yui-mhcp,
author = {yui},
title = {A Deep Learning projects centralization},
year = {2021},
publisher = {GitHub},
howpublished = {\url{https://github.com/yui-mhcp}}
}
The code for the CRNN architecture is highly inspired from the `easyocr` repo :
- [1] EasyOCR library : official repo of the `easyocr` library

The code for the EAST part of this project is highly inspired from this repo :
- [2] SakuraRiven pytorch implementation : pytorch implementation of the EAST paper.
Papers and tutorials :
- [1] An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition : the original CRNN paper
- [2] What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis : a great benchmark of OCR models + an open-source repository with pretrained models and datasets
- [3] U-Net: Convolutional Networks for Biomedical Image Segmentation : U-net original paper
- [4] EAST: An Efficient and Accurate Scene Text Detector : text detection (with possibly rotated bounding-boxes) with a segmentation model (U-Net).
Datasets :
- COCO Text dataset : an extension of COCO for text detection