ml_algorithms's Introduction

Overview

This is a machine learning library, made from scratch.

It uses:

  • numpy: for handling matrices/vectors
  • cvxopt: for convex optimization
  • networkx: for handling graphs in decision trees

It contains the following functionality:

  • Supervised Learning:
    • Regression
      • Linear Regression
      • Logistic Regression
      • Regularization
    • Support Vector Machines
      • Soft and hard margins
      • Kernels
    • Tree Methods
      • CART (classification and regression)
      • PRIM
      • AdaBoost
      • Gradient Boost
      • Random Forests
    • Kernel Methods
      • Nadaraya–Watson average
      • Local linear regression
      • Kernel density classification
    • Discriminant Analysis
      • LDA, QDA, RDA
    • Prototype Methods
      • KNN
      • LVQ
      • DANN
    • Perceptron
  • Unsupervised Learning
    • K-means/K-medoids clustering
    • PCA
  • Model Selection and Validation

Examples

Examples are shown in two dimensions for visualisation purposes; however, all methods can handle high-dimensional data.

Regression

  • Linear and logistic regression with regularization

Imgur

Imgur
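
As a rough illustration of regularized linear regression, the sketch below solves the L2-penalized (ridge) least-squares problem in closed form with numpy. The function name `ridge_fit` and the parameter `lam` are hypothetical and are not taken from this library's API.

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge solution: w = (X^T X + lam * I)^-1 X^T y.

    For simplicity this penalizes every coefficient, including the bias.
    """
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)

# Toy data generated from y = 1 + 2x; the first column of X is the bias term.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

w_ols = ridge_fit(X, y, lam=0.0)     # no penalty: ordinary least squares
w_ridge = ridge_fit(X, y, lam=10.0)  # penalty shrinks the coefficients
```

With `lam=0.0` the fit recovers the generating coefficients; increasing `lam` shrinks the coefficient vector toward zero.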

Support Vector Machines

  • Support vector machines maximize the margins between classes

Imgur

  • Using kernels, support vector machines can produce non-linear decision boundaries. The RBF kernel is shown below.

Imgur

Imgur
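
To make the RBF kernel concrete, here is a minimal numpy sketch that computes the kernel (Gram) matrix between two sets of points. The name `rbf_kernel` and the `gamma` parameterization, K(a, b) = exp(-gamma * ||a - b||^2), are assumptions for illustration; the library may parameterize its kernels differently.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Pairwise RBF kernel values between rows of A and rows of B."""
    # Squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_dists = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2 * A @ B.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = rbf_kernel(X, X, gamma=0.5)
```

The resulting matrix is symmetric with ones on the diagonal, and entries decay as points move apart.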

  • An alternative learning algorithm, the perceptron, can linearly separate classes. It does not maximize the margin, and it converges only when the classes are linearly separable.

SLiMG Image
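
The perceptron update rule can be sketched in a few lines. This is a minimal version assuming labels in {-1, +1}; `perceptron_fit` and its `epochs` parameter are hypothetical names, not this library's interface.

```python
import numpy as np

def perceptron_fit(X, y, epochs=100):
    """Classic perceptron: nudge (w, b) toward each misclassified point."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) <= 0:  # wrong side of (or on) the boundary
                w += yi * xi
                b += yi
                mistakes += 1
        if mistakes == 0:  # a full pass without errors: converged
            break
    return w, b

# Linearly separable toy data with labels in {-1, +1}.
X = np.array([[2.0, 1.0], [3.0, 2.0], [-1.0, -1.0], [-2.0, -3.0]])
y = np.array([1, 1, -1, -1])
w, b = perceptron_fit(X, y)
pred = np.sign(X @ w + b)
```

The loop stops at the first separating hyperplane it finds, which is why the margin is generally not maximal.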

Tree Methods

  • The library contains a large collection of tree methods, the basis of which is the decision tree for classification and regression.

Imgur
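
A decision tree is grown by repeatedly choosing the split that most reduces impurity. The sketch below finds a single best axis-aligned split using Gini impurity, a criterion commonly used with CART for classification. Whether this library uses Gini is an assumption, and `gini` and `best_split` are hypothetical helper names.

```python
import numpy as np

def gini(labels):
    """Gini impurity of a set of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(X, y):
    """Return (feature, threshold, weighted_gini) of the best single split."""
    best = (None, None, np.inf)
    n = len(y)
    for j in range(X.shape[1]):
        values = np.unique(X[:, j])
        # Candidate thresholds: midpoints between consecutive distinct values.
        for t in (values[:-1] + values[1:]) / 2:
            left = X[:, j] <= t
            score = (left.sum() * gini(y[left]) + (~left).sum() * gini(y[~left])) / n
            if score < best[2]:
                best = (j, t, score)
    return best

X = np.array([[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]])
y = np.array([0, 0, 1, 1])
feature, threshold, impurity = best_split(X, y)
```

A full tree would apply `best_split` recursively to the left and right subsets until a stopping rule is met.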

These decision trees can be aggregated, and the library supports the following ensemble methods:

  • AdaBoost
  • Gradient Boosting
  • Random Forests
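
As one example of how an ensemble combines weak trees, the sketch below shows a single reweighting round of discrete AdaBoost for labels in {-1, +1}: misclassified points gain weight so the next tree focuses on them. The function name `adaboost_reweight` is hypothetical, and the code assumes the weighted error lies strictly between 0 and 1.

```python
import numpy as np

def adaboost_reweight(weights, y_true, y_pred):
    """One AdaBoost round: return the learner's vote and the new sample weights."""
    miss = y_true != y_pred
    err = np.sum(weights[miss]) / np.sum(weights)  # weighted error rate
    alpha = 0.5 * np.log((1 - err) / err)          # vote of this weak learner
    # Up-weight mistakes, down-weight correct points, then renormalize.
    new_weights = weights * np.exp(np.where(miss, alpha, -alpha))
    return alpha, new_weights / new_weights.sum()

weights = np.full(4, 0.25)
y_true = np.array([1, 1, -1, -1])
y_pred = np.array([1, 1, -1, 1])  # the last point is misclassified
alpha, new_weights = adaboost_reweight(weights, y_true, y_pred)
```

After renormalization the misclassified points carry half of the total weight, which is a known property of this update.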

Kernel Methods

Kernel methods estimate the target function by fitting separate functions at each point using local smoothing of training data.

  • Nadaraya–Watson estimation uses a local weighted average

Imgur
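
The local weighted average can be written directly in numpy. This is a one-dimensional sketch assuming a Gaussian kernel; `nadaraya_watson` and `bandwidth` are illustrative names rather than the library's API.

```python
import numpy as np

def nadaraya_watson(x_query, x_train, y_train, bandwidth=1.0):
    """Kernel-weighted average of y_train at each query point (1-D inputs)."""
    u = (x_query[:, None] - x_train[None, :]) / bandwidth
    w = np.exp(-0.5 * u ** 2)  # Gaussian kernel weights, shape (n_query, n_train)
    return (w @ y_train) / w.sum(axis=1)

x_train = np.array([0.0, 1.0, 2.0, 3.0])
y_train = np.array([0.0, 1.0, 4.0, 9.0])
y_hat = nadaraya_watson(np.array([1.5]), x_train, y_train, bandwidth=0.5)
```

Because the estimate is a convex combination of the training targets, it always lies between their minimum and maximum.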

  • Local linear regression uses weighted least squares to locally fit an affine function to the data

Imgur
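
A minimal sketch of local linear regression at a single query point follows, again assuming a Gaussian kernel and one-dimensional inputs. The name `local_linear` is hypothetical.

```python
import numpy as np

def local_linear(x0, x_train, y_train, bandwidth=1.0):
    """Fit a weighted least-squares line around x0 and evaluate it at x0."""
    w = np.exp(-0.5 * ((x_train - x0) / bandwidth) ** 2)     # kernel weights
    B = np.column_stack([np.ones_like(x_train), x_train])    # design matrix [1, x]
    BtW = B.T * w                                            # equals B^T W with W = diag(w)
    beta = np.linalg.solve(BtW @ B, BtW @ y_train)           # weighted normal equations
    return beta[0] + beta[1] * x0

x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_train = 2.0 * x_train + 1.0
y0 = local_linear(2.5, x_train, y_train, bandwidth=1.0)
```

On exactly linear data the local fit reproduces the line, including at the edges of the data where a plain weighted average would be biased.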

  • The library also supports kernel density estimation (KDE), which is used for kernel density classification.

Imgur
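
Kernel density classification estimates one density per class and assigns each point to the class with the largest prior-weighted density. The one-dimensional sketch below assumes a Gaussian kernel and priors proportional to class sizes; `kde` and `kde_classify` are hypothetical names.

```python
import numpy as np

def kde(x, samples, bandwidth=1.0):
    """Gaussian kernel density estimate at points x (1-D)."""
    u = (x[:, None] - samples[None, :]) / bandwidth
    norm = len(samples) * bandwidth * np.sqrt(2 * np.pi)
    return np.exp(-0.5 * u ** 2).sum(axis=1) / norm

def kde_classify(x, class_samples, bandwidth=1.0):
    """Pick the class index maximizing prior * estimated density."""
    n_total = sum(len(s) for s in class_samples)
    scores = np.stack([len(s) / n_total * kde(x, s, bandwidth) for s in class_samples])
    return scores.argmax(axis=0)

class_samples = [np.array([-2.0, -1.5, -1.0]), np.array([1.0, 1.5, 2.0])]
labels = kde_classify(np.array([-1.2, 1.7]), class_samples, bandwidth=0.5)
```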

Discriminant Analysis

  • Linear Discriminant Analysis creates decision boundaries by assuming classes have the same covariance matrix.
  • LDA can only form linear boundaries.

SLiMG Image

  • Quadratic Discriminant Analysis creates decision boundaries by assuming each class has its own covariance matrix.
  • QDA can form non-linear boundaries.

SLiMG Image

  • Regularized Discriminant Analysis uses a combination of pooled and class covariance matrices to determine decision boundaries.

SLiMG Image
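
The three discriminant methods differ mainly in which covariance matrix each class uses. The sketch below blends the per-class and pooled estimates with a mixing weight `alpha`, so that `alpha=0` gives the LDA covariance and `alpha=1` gives the QDA covariances. The function name `rda_covariances` and the parameter name are assumptions; the library's own interface may differ.

```python
import numpy as np

def rda_covariances(X, y, alpha=0.5):
    """Return ({class: alpha * class_cov + (1 - alpha) * pooled_cov}, pooled_cov)."""
    classes = np.unique(y)
    n, p = X.shape
    class_covs = {}
    pooled = np.zeros((p, p))
    for k in classes:
        Xk = X[y == k]
        centered = Xk - Xk.mean(axis=0)
        scatter = centered.T @ centered
        class_covs[k.item()] = scatter / (len(Xk) - 1)  # unbiased class covariance
        pooled += scatter
    pooled /= n - len(classes)                          # pooled within-class covariance
    blended = {k: alpha * c + (1 - alpha) * pooled for k, c in class_covs.items()}
    return blended, pooled

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
              [4.0, 4.0], [6.0, 4.0], [4.0, 8.0]])
y = np.array([0, 0, 0, 1, 1, 1])
covs, pooled = rda_covariances(X, y, alpha=0.5)
```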

Prototype Methods

  • K-nearest neighbors determines target values from the k nearest data points, by averaging for regression and by majority vote for classification. The library supports both.

SLiMG Image
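
A brute-force version of k-nearest neighbors fits in a few lines of numpy. The sketch assumes Euclidean distance and, for classification, non-negative integer labels; `knn_predict` is a hypothetical name.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k=3, classify=True):
    """Predict by majority vote (classification) or mean (regression) of k neighbors."""
    # Pairwise Euclidean distances, shape (n_query, n_train).
    d = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    neighbor_targets = y_train[nearest]
    if classify:
        # np.bincount requires non-negative integer labels.
        return np.array([np.bincount(row).argmax() for row in neighbor_targets])
    return neighbor_targets.mean(axis=1)

X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
X_query = np.array([[0.5, 0.5], [5.5, 5.5]])
y_class = np.array([0, 0, 0, 1, 1, 1])
y_reg = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])

class_pred = knn_predict(X_train, y_class, X_query, k=3, classify=True)
reg_pred = knn_predict(X_train, y_reg, X_query, k=3, classify=False)
```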

  • Learning vector quantization is a prototype method where prototypes are iteratively repelled by out-of-class data and attracted to in-class data.

SLiMG Image
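
The attract/repel rule can be sketched as one pass of the basic LVQ1 update. Using LVQ1 specifically, and the names `lvq1_epoch` and `lr`, are assumptions made for illustration.

```python
import numpy as np

def lvq1_epoch(prototypes, proto_labels, X, y, lr=0.1):
    """One pass over the data: move the nearest prototype toward or away from each point."""
    prototypes = prototypes.copy()
    for xi, yi in zip(X, y):
        j = np.argmin(np.linalg.norm(prototypes - xi, axis=1))  # nearest prototype
        direction = 1.0 if proto_labels[j] == yi else -1.0      # attract or repel
        prototypes[j] += direction * lr * (xi - prototypes[j])
    return prototypes

prototypes = np.array([[0.0, 0.0], [1.0, 1.0]])
proto_labels = np.array([0, 1])
point = np.array([[0.2, 0.0]])

attracted = lvq1_epoch(prototypes, proto_labels, point, np.array([0]))  # same class
repelled = lvq1_epoch(prototypes, proto_labels, point, np.array([1]))   # other class
```

In practice the learning rate is usually decayed across passes so that the prototypes settle.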

  • Discriminant Adaptive Nearest Neighbors (DANN) adaptively elongates neighborhoods along boundary regions.
  • Useful for high dimensional data.

SLiMG Image

Unsupervised Learning

  • K-means and K-medoids clustering partition data into K clusters.

SLiMG Image
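
K-means is usually fitted with Lloyd's algorithm: assign each point to its nearest centroid, then move each centroid to the mean of its points. The sketch below takes the initial centroids as an argument to stay deterministic. The name `kmeans` and this signature are hypothetical.

```python
import numpy as np

def kmeans(X, centroids, n_iter=100):
    """Lloyd's algorithm from the given initial centroids; returns (centroids, labels)."""
    centroids = centroids.astype(float).copy()
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)  # assignment step
        # Update step; an empty cluster keeps its previous centroid.
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
                        for j in range(len(centroids))])
        if np.allclose(new, centroids):  # assignments have stabilized
            break
        centroids = new
    return centroids, labels

X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
centroids, labels = kmeans(X, X[[0, 1]])
```

K-medoids follows the same alternation but restricts each cluster center to be one of the data points.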

  • Principal Component Analysis (PCA) transforms a data set into an orthonormal basis whose leading directions maximize variance.

SLiMG Image
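
PCA can be computed from the singular value decomposition of the centered data. This is a minimal sketch; the name `pca` and its return values are illustrative, not the library's API.

```python
import numpy as np

def pca(X, n_components):
    """Return (projected data, principal axes, explained variances)."""
    Xc = X - X.mean(axis=0)                            # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]                     # orthonormal rows, sign is arbitrary
    explained_variance = S[:n_components] ** 2 / (len(X) - 1)
    return Xc @ components.T, components, explained_variance

# Points lying exactly on the line y = 2x.
X = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
projected, components, explained_variance = pca(X, n_components=2)
```

For this data the first principal axis points along the line, and the first component carries all of the variance.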

ml_algorithms's People

Contributors

christopherjenness
