casia's Introduction

Learning letter recognition for two alphabets

This repo houses the code for a research project on recognizing characters with a convolutional neural network (CNN). The eventual goal is to compare performance (learning speed and accuracy) when a network is trained on two alphabets simultaneously vs. when it is trained on either alphabet alone.

Currently I am trying to use Keras to train a CNN that recognizes offline isolated characters (.gnt files) from the CASIA dataset. I have included an example .gnt file in this repo, but you can download the rest here. Specifically:

http://www.nlpr.ia.ac.cn/databases/download/feature_data/HWDB1.1trn_gnt.zip (*)
http://www.nlpr.ia.ac.cn/databases/download/feature_data/HWDB1.1tst_gnt.zip
http://www.nlpr.ia.ac.cn/databases/Download/competition/competition-gnt.zip

(*) Note: I have not been able to decompress this particular file. It is very large and seems to require ALZip, and when I decompress it with ALZip, it fails with a CRC error. For this project, I've pooled the latter two datasets and reserved one-fifth of each character's examples as a test set.
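The per-character holdout described above could look roughly like the sketch below. It is not the repo's code: the function name `split_per_class`, the `(label, image)` pair layout, and the fixed seed are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_per_class(examples, test_fraction=0.2, seed=0):
    """Split (label, image) pairs so that each label contributes about
    test_fraction of its examples to the test set.

    Hypothetical helper, not part of this repo.
    """
    # Group the images by character label.
    by_label = defaultdict(list)
    for label, image in examples:
        by_label[label].append(image)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_label):
        images = by_label[label]
        rng.shuffle(images)
        n_test = round(len(images) * test_fraction)
        test.extend((label, img) for img in images[:n_test])
        train.extend((label, img) for img in images[n_test:])
    return train, test
```

Splitting within each character, rather than across the pooled data as a whole, keeps every class represented in both sets even when some characters have few examples.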

Try it out

See provision_gpu.sh, setup_gpu.sh, and extract_gnts.sh for information about setting up the GPU.

After the GPU is set up, you will need to preprocess the data. The following example generates preprocessed data for only eight of the CASIA character classes. (Using many more than eight classes runs into memory problems when building the model on the GPU. I'm trying to figure out how to fix that.)

import casia
c = casia.Casia().read_all_examples(8)
c.save()
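For orientation, here is a minimal sketch of how records in a .gnt file can be read with only the standard library. It assumes the commonly documented CASIA layout: a 10-byte header (4-byte little-endian record size, 2-byte GB tag code, 2-byte width, 2-byte height) followed by width × height grayscale bytes. The function name `read_gnt` is hypothetical, and this is not necessarily how `read_all_examples` is implemented.

```python
import io
import struct


def read_gnt(stream):
    """Yield (character, width, height, pixels) for each record in a
    binary .gnt stream. Assumed record layout, not taken from this repo."""
    while True:
        header = stream.read(10)
        if len(header) < 10:
            break  # end of file, or a truncated trailing header
        # '<' = little-endian, no padding: uint32, 2 raw bytes, uint16, uint16
        _record_size, tag, width, height = struct.unpack("<I2sHH", header)
        pixels = stream.read(width * height)
        if len(pixels) < width * height:
            break  # truncated bitmap
        # The two tag bytes are decoded as a GB2312 character.
        yield tag.decode("gb2312", errors="replace"), width, height, pixels
```

Usage would be along the lines of `with open("example.gnt", "rb") as f: records = list(read_gnt(f))`, where `example.gnt` is a placeholder filename.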

Then build and train a model to recognize these characters.

import casia, run, models
r = run.Runner(casia.Casia().load(8), models.vgg16)
r.run(15) # 15 epochs. It takes a while for vgg16 to converge.

Some benchmarks for this dataset: Table 3 of this paper.

CASIA viewer

A viewer for characters from the CASIA dataset. Usage:

  1. Start a web server in this directory. For example: python -m SimpleHTTPServer (Python 2) or python3 -m http.server (Python 3)
  2. Visit the server in a web browser. (In the example above, it's at localhost:8000.) Have fun!

Acknowledgements

My research advisor for this project is Greg Shakhnarovich.

The CNN architecture vgg16 is taken from this paper.

Conversion of GB-2312 character codes into displayable strings is accomplished using encoding.js and encoding-indexes.js, both of which are taken from inexorabletash/text-encoding.
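The viewer does this conversion in JavaScript. For illustration only, the same idea in Python is sketched below, assuming the character code is held as a 16-bit integer with the lead byte first; `gb2312_to_str` is a hypothetical name, not something in this repo.

```python
import struct


def gb2312_to_str(code):
    """Convert a 16-bit GB2312 code (e.g. 0xB0A1) to a displayable string."""
    # Pack as two big-endian bytes (lead byte first), then decode with
    # Python's built-in gb2312 codec.
    return struct.pack(">H", code).decode("gb2312")
```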
