csafe-isu / wordscissors

A Python package to split handwriting into individual words.

License: GNU Affero General Public License v3.0

Python 100.00%

wordscissors's Introduction

- NOTE: This is our beta version of handwriter
- For the most stable version of handwriter please use:
- install.packages("handwriter")
- to get handwriter from CRAN: https://CRAN.R-project.org/package=handwriter 

README

This readme covers the functionality of the Python-based handwriter routines.

Python Word Separation

Here are the step-by-step instructions for Python Word Separation:

  1. Ensure that Python is installed. There are several ways to install it; see https://www.python.org/downloads/
  2. Clone the handwriter repository to your local machine
  3. Install the following Python packages: pip install opencv-python matplotlib ipython jupyter
  4. Open up a terminal
  5. Change your working directory to the inst/python directory of the handwriter repository. For example, on macOS, you would run cd /path/to/cloned/handwriter/inst/python, replacing the path with the actual path on your local machine.
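
Before continuing, it can help to confirm that the packages from step 3 are visible to the interpreter you plan to use. The following is a minimal sketch, not part of the handwriter code: the helper name missing_modules is hypothetical, and the import names (cv2 for opencv-python, IPython for ipython, jupyter for jupyter) are assumed from the pip package names.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if find_spec(name) is None]


# Import names assumed for the pip packages listed in step 3:
# opencv-python -> cv2, matplotlib -> matplotlib,
# ipython -> IPython, jupyter -> jupyter.
REQUIRED = ["cv2", "matplotlib", "IPython", "jupyter"]

missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```

If anything is reported as missing, rerunning the pip install command from step 3 with the same interpreter is the likely fix.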

The next set of steps depends on whether you prefer to run the code (a) natively in Python, (b) from a Jupyter Notebook, or (c) from R.

Option A: Run Natively

  1. Launch the Python interpreter by calling:
python
  2. Import the module:
import word_separation as ws
  3. Split the lines:
input_image = "images/w0001_s03_pPHR_r01.png"

# Split the lines
split_images = ws.detect_lines(input_image)
split_images
  4. Separate the words and display each split image:
# Display each split image
for split_image in split_images:
    ws.show_image(ws.separate_word(file_name=split_image))
  5. Get the contours of every word:
# Get every word extracted as a contour
all_words = []
for split_image in split_images:
    im1_contours = ws.separate_word(file_name=split_image, ret="contours")
    all_words.append(ws.annotate_image(split_image, im1_contours))
  6. Display the bounding boxes:
# Let's take a look!
all_words
  7. Run batch processing:
batched = ws.batch_process("images")
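
The interactive steps above can also be collected into a single reusable function. This is a sketch under stated assumptions rather than code from the repository: the function name extract_annotated_words is hypothetical, and the module is passed in as the parameter ws so that the function does not depend on the current working directory. The calls themselves (detect_lines, separate_word with ret="contours", annotate_image) are the ones shown in the steps.

```python
def extract_annotated_words(ws, input_image):
    """Return one annotated image per detected line of handwriting.

    `ws` is expected to be the imported word_separation module. Passing it
    in as a parameter is a choice made for this sketch, not something the
    module itself requires.
    """
    annotated = []
    # detect_lines yields one entry per detected line; each entry is
    # passed to separate_word as file_name, as in the steps above.
    for split_image in ws.detect_lines(input_image):
        # Ask for the word contours rather than the default return value.
        contours = ws.separate_word(file_name=split_image, ret="contours")
        annotated.append(ws.annotate_image(split_image, contours))
    return annotated


# Usage from the inst/python directory (not executed here):
#   import word_separation as ws
#   words = extract_annotated_words(ws, "images/w0001_s03_pPHR_r01.png")
```

Keeping the module as an argument also makes it possible to try the control flow with a stand-in object before OpenCV is installed.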

Option B: Run Natively through Jupyter

  1. Launch the Jupyter Notebook client by calling:
jupyter notebook
  2. Open up word_separation.ipynb
  3. Execute the cells of the notebook

Option C: Run in R with reticulate

Execute the word_separation.R script in the inst/python directory:

# We will need reticulate to call the Python functions
library(reticulate)

# Make sure you're using the right version of Python.
# Replace this path with the path to the Python interpreter on your machine.
use_python("/Users/erichare/.pyenv/shims/python")

# Source in the word separation code from this directory!
source_python("word_separation.py")

# Configure the input image
input_image <- "images/w0001_s03_pPHR_r01.png"

# Split the lines
split_images <- detect_lines(input_image)
split_images

# Display each split image
for (split_image in split_images) {
  show_image(separate_word(file_name=split_image))
}

# Get every word extracted as a contour
all_words <- list()
for (split_image in split_images) {
  im1_contours <- separate_word(file_name=split_image, ret="contours")
  all_words[[length(all_words) + 1]] <- annotate_image(split_image, im1_contours)
}

# Let's take a look!
all_words

###
# Batch Processing
###
batch_process("images")
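
The README does not show how batch_process is implemented or what it returns. As a rough Python illustration of what a batch run over a directory could involve, the sketch below applies a per-image function to every PNG file in a folder. The names list_images and process_directory, the "*.png" pattern, and the dictionary return value are all assumptions made for this example, not the package's API.

```python
from pathlib import Path


def list_images(directory, pattern="*.png"):
    """Return paths in `directory` matching `pattern`, sorted by name."""
    return sorted(str(path) for path in Path(directory).glob(pattern))


def process_directory(directory, process_one, pattern="*.png"):
    """Apply `process_one` to each matching image.

    Returns a dict mapping each image path to the result for that image.
    Hypothetical helper; batch_process in word_separation.py may differ.
    """
    return {path: process_one(path) for path in list_images(directory, pattern)}
```

For example, process_directory("images", ws.detect_lines) would collect the detected lines for each image in the images directory, assuming ws is the imported word_separation module.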

wordscissors's People

Contributors

stephaniereinders
