pimvanderloos / handwriting_recognition

This project forked from rikvegter/handwriting_recognition

Shell 12.21% Python 87.31% Dockerfile 0.47%

Automatic Handwriting Recognition of Ancient Hebrew in the Dead Sea Scrolls

A Python program that performs OCR on binarized images of the Dead Sea Scrolls, and determines the style period based on character features.

Usage

Start by cloning the repository:

git clone https://github.com/rikvegter/handwriting_recognition.git

Using Docker

Running the tool using Docker ensures it will work as intended. Before trying this, make sure Docker is installed and running.

You can run the tool using the following command:

./run.sh /path/to/input

The first time the script is run, it downloads the complete Docker image (~2 GB), which might take some time. Once the image is downloaded, it is run on the images in the input folder. The output is saved to ./results.

Note that the pipeline may take a while to complete and may even appear to get stuck at times. This is also true when running it locally.

Running locally

You can also run the Python code directly. The code has been tested with Python 3.8.6; newer versions might work but are not guaranteed to. Before getting started, make sure all dependencies are installed:

python3 -m pip install --upgrade pip
python3 -m pip install -r requirements.txt
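
Because the code has only been tested with Python 3.8.6, it can be useful to check the interpreter version before running the pipeline. The following is a minimal sketch of such a check; the function name check_python_version is hypothetical and not part of the project.

```python
import sys
import warnings

# The README states that the code was tested with Python 3.8.6.
TESTED_VERSION = (3, 8)


def check_python_version(version_info=sys.version_info) -> bool:
    """Return True if the interpreter matches the tested major.minor version.

    Hypothetical helper: it only warns on a mismatch and never aborts,
    since newer versions might still work.
    """
    matches = tuple(version_info[:2]) == TESTED_VERSION
    if not matches:
        warnings.warn(
            f"Tested with Python 3.8; running {version_info[0]}.{version_info[1]}, "
            "which is not guaranteed to work."
        )
    return matches


if __name__ == "__main__":
    check_python_version()
```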

If you want to run locally, you need to provide some command line arguments:

python3 main.py
    [-h]                            # Show help for CLI arguments
    -i str                          # Directory containing files to process. If --single is set, the path to the file.
    [-o str]                        # Output directory
    [--stop_after {1,2,3,4,5}]      # Stop after a given step
    [-d]                            # Save intermediate images to a debug folder in the output directory
    [-s]                            # Process only a single image
    [--classifier str]              # The directory containing the trained character recognition model
    [--ngram_file str]              # The .xlsx file containing n-gram information

Options within square brackets are optional.
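
The option list above maps naturally onto Python's standard argparse module. The following is a minimal sketch of such a parser, not the project's actual code: the long names --input, --output and --debug and the default output directory are assumptions made for illustration, and the real main.py may parse its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="main.py")  # -h is added automatically
    # Required: a directory of images, or one image file when --single is set.
    # The long name --input is an assumption.
    parser.add_argument("-i", "--input", type=str, required=True)
    # Optional output directory; the long name and default are assumptions.
    parser.add_argument("-o", "--output", type=str, default="./results")
    # Stop the pipeline after the given step.
    parser.add_argument("--stop_after", type=int, choices=[1, 2, 3, 4, 5], default=None)
    # Save intermediate images; the long name --debug is an assumption.
    parser.add_argument("-d", "--debug", action="store_true")
    # Treat the input path as a single image file.
    parser.add_argument("-s", "--single", action="store_true")
    parser.add_argument("--classifier", type=str, default=None)
    parser.add_argument("--ngram_file", type=str, default=None)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["-i", "/path/to/images", "-d"])
    print(args.input, args.debug, args.single)
```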

An example of a simple valid input is the following:

python3 main.py -i /path/to/images

A more complex example would be the following:

python3 main.py --single -i /path/to/images/image-2.jpg -o ./custom/results/folder/ -d --stop_after 3

This will process only one image, write the results to a custom output location, save debugging images, and stop after step 3 (character recognition).
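
How the -i path is interpreted depends on whether --single is set: it is either one image file or a directory of images. A minimal sketch of that behaviour is shown here; the function name collect_inputs and the set of accepted file extensions are hypothetical and not taken from the project.

```python
from pathlib import Path
from typing import List

# Hypothetical set of accepted extensions; the project may accept others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".pgm"}


def collect_inputs(input_path: str, single: bool) -> List[Path]:
    """Return the image files to process.

    With single=True, input_path must point to one file. Otherwise it must
    be a directory, and its image files are returned in sorted order.
    """
    path = Path(input_path)
    if single:
        if not path.is_file():
            raise FileNotFoundError(f"Expected an image file: {path}")
        return [path]
    if not path.is_dir():
        raise NotADirectoryError(f"Expected a directory: {path}")
    return sorted(
        p for p in path.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
```

Failing early on a mismatched path gives a clearer error than letting the pipeline fail later on an unreadable input.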

Contributors

pimvanderloos, ramonmeffert, jeroenmuller, rikvegter
