Image Text Extraction Telegram Bot by Boramorka
🤖

How To Use • How To Run Locally • Build process • Feedback

You may be interested in this bot if you need to recognize text in an image. It's free and quick.

Supported languages:

✔️ English
✔️ Russian

The 🗝️ key technology is Tesseract OCR (long sponsored by Google), used here through its Python wrapper, pytesseract.

How To Use

🤖 Bot link: https://t.me/boramorka_text_extraction_bot

Send a photo of text. Type /lang to choose a language. ✔️
Make sure that your document has a white background and readable black letters, and that the picture is not rotated. ✔️
If you choose EN+RU mode, the bot recognizes both languages at the same time, but more artifacts may appear. If your document is in one language, please select that language. ✔️

How To Run Locally

# Clone this repository
$ git clone https://github.com/boramorka/text-extraction-app.git

# Go into the repository
$ cd text-extraction-app

# Install dependencies
$ pip install -r requirements.txt

# Run app
$ python bot.py

Build process

First, we create an app.py file for the main app. It contains:

# Path to pytesseract
pytesseract.pytesseract.tesseract_cmd

# Code for text recognition
def get_text():
...............
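
The outline above leaves the body of get_text elided. As a minimal sketch of what such a function could look like with pytesseract, under some assumptions: the `mode` parameter, the `LANG_CODES` table, and the `lang_for` helper are hypothetical names introduced for illustration and are not taken from the repository.

```python
# Hypothetical mapping from the bot's language modes to Tesseract language codes.
LANG_CODES = {"EN": "eng", "RU": "rus", "EN+RU": "eng+rus"}


def lang_for(mode):
    """Translate a bot mode such as 'EN+RU' into a Tesseract `lang` string."""
    try:
        return LANG_CODES[mode.upper()]
    except KeyError:
        raise ValueError(f"Unsupported language mode: {mode!r}")


def get_text(image_path, mode="EN"):
    """Run OCR on an image file and return the recognized text."""
    # Imported lazily so lang_for() works even without the OCR stack installed.
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(image, lang=lang_for(mode)).strip()
```

Calling `get_text("photo.jpg", mode="EN+RU")` would pass `lang="eng+rus"` to Tesseract, which asks it to consider both language models at once. This requires the Tesseract binary and both language packs to be installed.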

The bot.py script starts the bot. It uses aiogram, a simple and fully asynchronous framework for the Telegram Bot API, written in Python 3.7 with asyncio and aiohttp. It helps you build bots faster and with less code.

import os
from aiogram import Bot, Dispatcher

# The Bot class takes an API key to connect to the Telegram servers.
bot = Bot(token=os.getenv("TEXT_EXTRACTOR_API_KEY"))  # Note: the API key is read from an environment variable

"""
Dispatcher will process incoming updates: 
    • messages
    • edited messages
    • channel posts
    • edited channel posts
    • inline queries
    • chosen inline results
    • callback queries
    • shipping queries
    • pre-checkout queries.
"""
dp = Dispatcher(bot) 

# Decorator that takes a message and processes it.
@dp.message_handler(text=message)
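
The fragments above show the pieces but not a complete handler. The following is a rough sketch of how they might fit together, assuming aiogram 2.x (the version whose `Dispatcher(bot)` and `message_handler` API the snippets use). The `read_token`, `build_dispatcher`, and `main` functions, the reply texts, and the `photo.jpg` file name are all hypothetical and not taken from the repository.

```python
import os


def read_token(env=None):
    """Return the bot token from the environment, or fail with a clear message."""
    env = os.environ if env is None else env
    token = env.get("TEXT_EXTRACTOR_API_KEY")
    if not token:
        raise RuntimeError("Set the TEXT_EXTRACTOR_API_KEY environment variable first")
    return token


def build_dispatcher(token):
    """Create a Dispatcher with two illustrative handlers registered."""
    # Imported lazily so read_token() can be used without aiogram installed.
    from aiogram import Bot, Dispatcher, types

    bot = Bot(token=token)
    dp = Dispatcher(bot)

    @dp.message_handler(commands=["start"])
    async def greet(message: types.Message):
        await message.answer("Send me a photo of text.")

    @dp.message_handler(content_types=types.ContentType.PHOTO)
    async def on_photo(message: types.Message):
        # message.photo holds several sizes; the last entry is the largest.
        await message.photo[-1].download(destination_file="photo.jpg")
        await message.answer("Photo received.")

    return dp


def main():
    """Start long polling; call this from the script's entry point."""
    from aiogram import executor

    executor.start_polling(build_dispatcher(read_token()), skip_updates=True)
```

Failing early when the token is missing gives a clearer error than passing `None` to `Bot`. In a real bot, the photo handler would hand the downloaded file to the OCR function instead of only acknowledging it.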

Heroku deployment relies on these important files:

📄 bot.py: the bot application (refer to my GitHub for the source code)
📄 Aptfile: the third-party dependencies for Heroku to install (e.g. tesseract-ocr)
📄 Procfile: a list of process types in an app (on Heroku)
📄 requirements.txt: a list of dependencies to install
📄 runtime.txt: version of Python to run on Heroku (optional)
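
Since a deploy depends on the files listed above being present, a small pre-deploy check can catch a missing one early. This is only a sketch: the `missing_deploy_files` helper and the constant names are hypothetical, and the file names are taken from the list above.

```python
from pathlib import Path

# Files the list above treats as necessary, plus the one it marks as optional.
REQUIRED_FILES = ("bot.py", "Aptfile", "Procfile", "requirements.txt")
OPTIONAL_FILES = ("runtime.txt",)


def missing_deploy_files(project_dir="."):
    """Return the required deployment files that are absent from project_dir."""
    root = Path(project_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

Running `missing_deploy_files()` in the repository root before `git push heroku master` returns an empty list when everything required is in place.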

# HEROKU DEPLOYMENT PROCESS

# Note:
# Add this line to bot.py
pytesseract.pytesseract.tesseract_cmd = "/app/.apt/usr/bin/tesseract"
# (refer to my GitHub for the source code)

# Log in to Heroku and create a new app:
$ heroku login
$ git init
$ heroku create boramorka-text-extraction-app
$ heroku git:remote -a boramorka-text-extraction-app

# Add Buildpacks:
$ heroku buildpacks:add --index 1 https://github.com/heroku/heroku-buildpack-apt
$ heroku buildpacks:add --index 2 heroku/python

# Add Config Vars:
$ heroku config:set TESSDATA_PREFIX=/app/.apt/usr/share/tesseract-ocr/4.00/tessdata

# The heroku-20 stack has poor compatibility with Tesseract.
# You may need to switch the stack from heroku-20 to heroku-18 with this command:
$ heroku stack:set heroku-18

# Deploy app on Heroku:
$ git add .
$ git commit -m "Initial commit to Heroku"
$ heroku git:remote -a boramorka-text-extraction-app
$ git push heroku master

# Check worker status:
$ heroku ps

# Run worker
$ heroku ps:scale worker=1
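
The deployment notes hard-code the Tesseract path used on Heroku, which would break a local run of the same bot.py. One possible way to keep a single code path is sketched below. It assumes that the `DYNO` environment variable (which Heroku sets inside dynos) is an acceptable signal for "running on Heroku". The `tesseract_cmd_for` helper is hypothetical.

```python
import os

# Path from the deployment notes: where the apt buildpack places the binary.
HEROKU_TESSERACT_CMD = "/app/.apt/usr/bin/tesseract"


def tesseract_cmd_for(env=None):
    """Pick the Tesseract executable path for the current environment."""
    env = os.environ if env is None else env
    # Assumption: the presence of DYNO means the code is running on Heroku.
    if "DYNO" in env:
        return HEROKU_TESSERACT_CMD
    # Elsewhere, rely on `tesseract` being found on PATH.
    return "tesseract"
```

In bot.py this could be used as `pytesseract.pytesseract.tesseract_cmd = tesseract_cmd_for()`, so the same file works both locally and after deployment.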

Feedback

🤵 Feel free to send me feedback on Telegram. Feature requests are always welcome.

🧮 Check my other projects.
