page-david / python-mnist

This project forked from sorki/python-mnist

Simple MNIST data parser written in Python

License: Other

Python 95.07% Shell 4.93%

python-mnist

Simple MNIST and EMNIST data parser written in pure Python.

MNIST is a database of handwritten digits, available at http://yann.lecun.com/exdb/mnist/. EMNIST is an extended MNIST database, available at https://www.nist.gov/itl/iad/image-group/emnist-dataset.

Requirements

  • Python 2 or Python 3

Usage

  • git clone https://github.com/sorki/python-mnist
  • cd python-mnist
  • Get MNIST data:

    ./get_data.sh
  • Check preview with:

    PYTHONPATH=. ./bin/mnist_preview

Installation

Get the package from PyPI:

pip install python-mnist

or install with setup.py:

python setup.py install

Code sample:

from mnist import MNIST
mndata = MNIST('./dir_with_mnist_data_files')
images, labels = mndata.load_training()

To enable loading of gzipped files, use:

mndata.gz = True
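
As a rough illustration of what such a flag implies, the sketch below opens a data file either directly or through the standard gzip module. The helper name open_data_file is hypothetical and is not part of python-mnist; how the library handles this internally is not shown here.

```python
import gzip


def open_data_file(path, gz=False):
    """Open path for binary reading, via gzip when gz is true.

    Hypothetical helper for illustration only, not python-mnist's API.
    """
    if gz:
        return gzip.open(path, "rb")
    return open(path, "rb")
```

Both branches return a file-like object, so the parsing code that follows does not need to know whether the file was compressed.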

The library tries to load files named t10k-images-idx3-ubyte, train-labels-idx1-ubyte, train-images-idx3-ubyte and t10k-labels-idx1-ubyte. If loading throws an exception, check that your file names match these.
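
One way to check the names before loading is sketched below. The helper missing_mnist_files is hypothetical (it is not part of python-mnist), and the ".gz" suffix used for gzipped files is an assumption.

```python
import os

# File names quoted in the paragraph above.
EXPECTED_FILES = (
    "train-images-idx3-ubyte",
    "train-labels-idx1-ubyte",
    "t10k-images-idx3-ubyte",
    "t10k-labels-idx1-ubyte",
)


def missing_mnist_files(data_dir, gz=False):
    """Return the expected file names that are absent from data_dir.

    Hypothetical helper. The ".gz" suffix for gzipped data is an assumption.
    """
    suffix = ".gz" if gz else ""
    return [
        name + suffix
        for name in EXPECTED_FILES
        if not os.path.isfile(os.path.join(data_dir, name + suffix))
    ]
```

An empty list means all four files are present under the expected names.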

EMNIST

  • Get EMNIST data:

    ./get_emnist_data.sh
  • Check preview with:

    PYTHONPATH=. ./bin/emnist_preview

To use EMNIST datasets you need to call:

mndata.select_emnist('digits')

Here digits is one of the available EMNIST datasets. You can choose from:

  • balanced
  • byclass
  • bymerge
  • digits
  • letters
  • mnist

The EMNIST loader uses gzipped files by default. This can be disabled by setting:

mndata.gz = False

If you do that, you also need to unpack the EMNIST files yourself, as the get_emnist_data.sh script won't do it for you. The EMNIST loader also needs to mirror and rotate images, so it is a bit slower. If this is an issue for you, repack the data to avoid mirroring and rotating on each load.
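
A left-right mirror followed by a quarter-turn rotation in the matching direction is the same as a transpose, so the per-image work can be sketched as below. This assumes each image is a flat, row-major list of pixel values; the function name transpose_flat is hypothetical, and whether the library implements the step exactly this way is an assumption.

```python
def transpose_flat(image, width=28, height=28):
    """Transpose a flat, row-major image given as a list of pixel values.

    Returns a new flat list in which rows and columns are swapped, so the
    result has `width` rows of `height` pixels each.
    Hypothetical helper for illustration, not python-mnist's API.
    """
    if len(image) != width * height:
        raise ValueError("image length does not match width * height")
    return [image[r * width + c] for c in range(width) for r in range(height)]
```

Repacking would mean applying a step like this once and saving the result, instead of paying for it on every load.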

Notes

This package deliberately avoids numpy: when I tried to find a working implementation, all of them were based on some archaic version of numpy and none of them worked. This package loads the data files with struct.unpack instead.
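
To illustrate the struct.unpack approach, here is a minimal sketch that parses the IDX layout used by the MNIST files: big-endian 32-bit header fields followed by unsigned bytes, with magic number 2051 for image files and 2049 for label files. The function names are hypothetical and this is not the library's own code.

```python
import struct


def parse_idx_images(data):
    """Parse the bytes of an IDX3 image file into a list of flat pixel lists."""
    # Header: magic number, image count, rows, columns (big-endian uint32).
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != 2051:
        raise ValueError("unexpected magic number: %d" % magic)
    size = rows * cols
    pixels = struct.unpack(">%dB" % (count * size), data[16:16 + count * size])
    return [list(pixels[i * size:(i + 1) * size]) for i in range(count)]


def parse_idx_labels(data):
    """Parse the bytes of an IDX1 label file into a list of integers."""
    # Header: magic number and label count (big-endian uint32).
    magic, count = struct.unpack(">II", data[:8])
    if magic != 2049:
        raise ValueError("unexpected magic number: %d" % magic)
    return list(struct.unpack(">%dB" % count, data[8:8 + count]))
```

Both functions take the full file contents as bytes, for example the result of reading a file opened in binary mode.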

Example

$ PYTHONPATH=. ./bin/mnist_preview
Showing num: 3

............................
............................
............................
............................
............................
............................
.............@@@@@..........
..........@@@@@@@@@@........
.......@@@@@@......@@.......
.......@@@........@@@.......
.................@@.........
................@@@.........
...............@@@@@........
.............@@@............
.............@.......@......
.....................@......
.....................@@.....
....................@@......
...................@@@......
.................@@@@.......
................@@@@........
....@........@@@@@..........
....@@@@@@@@@@@@............
......@@@@@@................
............................
............................
............................
............................
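
A preview like the one above can be produced by thresholding each pixel, as in this minimal sketch. It assumes a flat, row-major list of 0-255 pixel values; the function name render and the threshold of 200 are assumptions, not taken from the preview script.

```python
def render(image, width=28, threshold=200):
    """Render a flat, row-major image as text: '@' for bright pixels, '.' otherwise.

    Hypothetical helper. The default threshold is an assumed value.
    """
    lines = []
    for start in range(0, len(image), width):
        row = image[start:start + width]
        lines.append("".join("@" if p > threshold else "." for p in row))
    return "\n".join(lines)
```

For a 28x28 digit, print(render(images[0])) would print 28 lines of 28 characters each, where images comes from mndata.load_training().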

python-mnist's People

Contributors

graingert, heduenas, page-david, sorki, stablum, tobloef
