ramidecodes / mnist

Part of the MIT course Machine Learning with Python. Implements digit recognition on the MNIST dataset.

License: MIT License

Python 100.00%

mnist

The MNIST database contains binary images of handwritten digits and is commonly used to train image processing systems. The digits were collected from Census Bureau employees and high school students. The database contains 60,000 training digits and 10,000 testing digits, all of which have been size-normalized and centered in a fixed-size image of 28 × 28 pixels. Many methods have been tested on this dataset, and in this project you will get a chance to experiment with classifying these images into the correct digit using some of the methods you have learned so far.

What's inside

  • part1/linear_regression.py where you will implement linear regression
  • part1/svm.py where you will implement support vector machine
  • part1/softmax.py where you will implement multinomial regression
  • part1/features.py where you will implement principal component analysis (PCA) dimensionality reduction
  • part1/kernel.py where you will implement polynomial and Gaussian RBF kernels
  • part1/main.py where you will use the code you write for this part of the project
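As a rough illustration of the two kernels named for part1/kernel.py, here is a minimal NumPy sketch. The function names, argument order, and the parameters c, p, and gamma are assumptions made for this example; the signatures expected by the project files may differ.

```python
import numpy as np


def polynomial_kernel(X, Y, c, p):
    """Polynomial kernel: K[i, j] = (<X[i], Y[j]> + c) ** p.

    X has shape (n, d) and Y has shape (m, d); the result has shape (n, m).
    The names c and p are illustrative, not taken from the project files.
    """
    return (X @ Y.T + c) ** p


def rbf_kernel(X, Y, gamma):
    """Gaussian RBF kernel: K[i, j] = exp(-gamma * ||X[i] - Y[j]||^2)."""
    # Expand ||x - y||^2 = ||x||^2 + ||y||^2 - 2 <x, y> to avoid explicit loops.
    sq_dists = (
        np.sum(X ** 2, axis=1)[:, None]
        + np.sum(Y ** 2, axis=1)[None, :]
        - 2.0 * (X @ Y.T)
    )
    # Clip tiny negative values caused by floating-point round-off.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-gamma * sq_dists)


# Usage on two small, made-up feature matrices.
A = np.array([[1.0, 0.0], [0.0, 1.0]])
B = np.array([[1.0, 1.0], [0.0, 0.0], [2.0, 0.0]])
print(polynomial_kernel(A, B, c=1.0, p=2).shape)  # (2, 3)
print(rbf_kernel(A, B, gamma=0.5).shape)          # (2, 3)
```

Both functions return the full kernel matrix in one vectorized step, which matters when the inputs are thousands of 784-dimensional rows.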

To get warmed up with the MNIST data set, run python main.py. This file reads the data from mnist.pkl.gz by calling the function get_MNIST_data, which is provided for you in utils.py. The call to get_MNIST_data returns four NumPy arrays:

  1. train_x : A matrix of the training data. Each row of train_x contains the features of one image, which are simply the raw pixel values flattened out into a vector of length 784 (28 × 28). The pixel values are floats between 0 and 1 (0 stands for black, 1 for white, and various shades of gray in between).
  2. train_y : The labels for each training datapoint, that is, the digit shown in the corresponding image (an integer from 0 to 9).
  3. test_x : A matrix of the test data, formatted like train_x.
  4. test_y : The labels for the test data, which should only be used to evaluate the accuracy of different classifiers in your report.

Next, we call the function plot_images to display the first 20 images of the training set. Look at these images and get a feel for the data (don't include these in your write-up).
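The array layout described above can be sketched with synthetic data. This example does not call get_MNIST_data or plot_images; the random arrays are stand-ins with the same shape and value range, and the helpers row_to_image and error_rate are hypothetical names introduced only for illustration.

```python
import numpy as np

# Synthetic stand-ins for the arrays described above. The real arrays come
# from get_MNIST_data in utils.py, which is not reproduced here.
rng = np.random.default_rng(0)
n_samples, side = 5, 28
train_x = rng.random((n_samples, side * side))  # floats in [0, 1), one row per image
train_y = rng.integers(0, 10, size=n_samples)   # integer labels from 0 to 9


def row_to_image(row, side=28):
    """Reshape one flattened row of pixel values into a side x side image."""
    return row.reshape(side, side)


def error_rate(predicted, actual):
    """Fraction of labels that were predicted incorrectly."""
    return 1.0 - np.mean(predicted == actual)


image = row_to_image(train_x[0])
print(train_x.shape, image.shape)     # (5, 784) (28, 28)
print(error_rate(train_y, train_y))   # 0.0
```

Reshaping a row back to 28 × 28 is what makes it displayable as an image, and an error rate like the one above is the kind of number test_y is reserved for.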
