machine-learning-knowledge

Notes and citations on machine learning, based on the Coursera class Machine Learning by Andrew Ng.

What is machine learning? (Two definitions)

  • Arthur Samuel's older and informal definition:
    "the field of study that gives computers the ability to learn without being explicitly programmed."

  • Tom Mitchell's more modern definition:
    "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."

    • Example: playing checkers.
    • E = the experience of playing many games of checkers
    • T = the task of playing checkers.
    • P = the probability that the program will win the next game.
In general, any machine learning problem can be assigned to one of two broad classifications: supervised learning and unsupervised learning.

***

Supervised Learning and Unsupervised Learning

Supervised Learning

Definition

  • The "right answer" is given for every training example.
  • In supervised learning, we are given a data set and already know what our correct output should look like, having the idea that there is a relationship between the input and the output.

Categories

  • Regression
    • We are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function.
    • ex. Given a picture of a person, predict their age on the basis of that picture.
  • Classification
    • We are instead trying to predict results in a discrete output. In other words, we are trying to map input variables into discrete categories.
    • ex. Given a patient with a tumor, predict whether the tumor is malignant or benign.

Unsupervised Learning

Definition

  • Approaching problems with little or no idea what our results should look like.
  • We can derive structure from data where we don't necessarily know the effect of the variables.
  • We can derive this structure by clustering the data based on relationships among the variables in the data.
  • With unsupervised learning there is no feedback based on the prediction results.

Categories

  • Clustering
    • ex. Take a collection of 1,000,000 different genes, and find a way to automatically group these genes into groups that are somehow similar or related by different variables, such as lifespan, location, roles, and so on.
  • Non-clustering
    • ex. The "Cocktail Party Algorithm" lets you find structure in a chaotic environment (i.e. identifying individual voices and music from a mesh of sounds at a cocktail party).

***

Cost Function

Definition

  • Half the average of the squared differences between the predicted values and the actual values.
  • The goal of training is, in other words, to find the parameters that minimize the cost function.
  • It is also called the "squared error function" or "mean squared error".

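Equation

$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

  • 1/m with the summation: averages the squared errors over the m training examples.
  • 1/2: a convenience factor that cancels the 2 produced when the squared term is differentiated for gradient descent.
  • If every training point (x, y) lies exactly on the hypothesis, the cost function equals 0.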
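
As a minimal sketch of the points above, the cost can be computed directly for the one-feature hypothesis h(x) = θ0 + θ1 * x. The function names and the toy data are assumptions made for illustration only.

```python
def hypothesis(theta0, theta1, x):
    """Linear hypothesis h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Squared error cost: J = 1/(2m) * sum((h(x) - y)^2)."""
    m = len(xs)
    total = sum((hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys))
    return total / (2 * m)


# Toy data (illustrative): every point lies on the line y = x.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

perfect_fit = cost(0.0, 1.0, xs, ys)  # hypothesis passes through every point
poor_fit = cost(0.0, 0.5, xs, ys)     # hypothesis is too flat
print(perfect_fit, poor_fit)
```

With θ0 = 0 and θ1 = 1 the hypothesis passes through all three points, so the cost is exactly 0; any other choice gives a positive cost.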

Visual Image

  • ex. h(X) = θ0 + θ1 * X
[Figures: 3D plot and contour plot of the cost function]

Two ways to minimize cost function

  • Gradient Descent (works in any case, including a large number of features)
  • Normal Equation (practical when n, the number of features, is below roughly 10,000)

***

Gradient Descent

Definition

  • Start with some initial value of θ (the parameters).
  • Keep changing θ to reduce J(θ) until we end up at a minimum.
  • "Batch" Gradient Descent: Each step of gradient descent uses all the training examples.

Algorithm

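  • Repeat until convergence, updating every θj simultaneously:

$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta)$$

  • As we approach a local minimum, the derivative shrinks, so gradient descent automatically takes smaller steps.
    So, there is no need to decrease α over time.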
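
A minimal sketch of batch gradient descent for h(x) = θ0 + θ1 * x with the squared error cost follows. The function name, the default α and iteration count, and the toy data are assumptions chosen for illustration, not values from the course.

```python
def gradient_descent(xs, ys, alpha=0.1, iterations=2000):
    """Batch gradient descent: every step uses all training examples."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0  # start with some theta
    for _ in range(iterations):
        # Prediction error for every training example under the current theta.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of J with respect to theta0 and theta1.
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Both parameters are updated together from the same old values.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Toy data generated from y = 1 + 2x (illustrative).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
theta0, theta1 = gradient_descent(xs, ys)
print(theta0, theta1)
```

The step size α stays fixed for the whole run; the updates still get smaller because the gradient itself shrinks near the minimum.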

Caution

[Figure: correct vs. incorrect way to update θ]
  • Gradient descent can get stuck at a local optimum.

Techniques for efficiency

  • Bring every input value into roughly the same range so that gradient descent converges quickly.
  • ideal range: −1 ≤ x(i) ≤ 1 or −0.5 ≤ x(i) ≤ 0.5
  • Feature Scaling + Mean Normalization:

$$x_i := \frac{x_i - \mu_i}{s_i}$$

Feature Scaling

  • "s" in the formula written above.
  • Divide the input values by the range (i.e. max − min) of the input variable.

Mean Normalization

  • "μ" in the formula written above.
  • Subtract the average value from the values for that input variable.
  • The average of the processed values is 0.
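
Feature scaling and mean normalization can be combined in one small helper, sketched below. The function name and the sample values are assumptions for illustration; the range (max − min) is used as the scale, as described above.

```python
def mean_normalize(values):
    """Return (v - mu) / s for each value, with mu = mean and s = max - min."""
    mu = sum(values) / len(values)
    s = max(values) - min(values)
    if s == 0:
        raise ValueError("all values are identical, so the range is zero")
    return [(v - mu) / s for v in values]


# Illustrative feature values on a large scale.
sizes = [1000.0, 1500.0, 2000.0]
scaled = mean_normalize(sizes)
print(scaled)
```

For these sample values the mean is 1500 and the range is 1000, so the results fall within −0.5 ≤ x ≤ 0.5 and average to 0.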

Learning Rate

  • "α" in the gradient descent formula.
  • If α is too small: slow convergence.
  • If α is too large: J(θ) may not decrease on every iteration and thus may not converge.
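
The effect of α can be seen on a deliberately simple cost, J(θ) = θ², whose derivative is 2θ. This toy cost and the three α values are assumptions chosen only to make the behaviour visible; they are not taken from the course.

```python
def run(alpha, steps=20, theta=1.0):
    """Gradient descent on J(theta) = theta**2; returns J after each step."""
    history = [theta ** 2]
    for _ in range(steps):
        theta = theta - alpha * 2 * theta  # derivative of theta**2 is 2 * theta
        history.append(theta ** 2)
    return history


too_small = run(alpha=0.001)  # J barely moves in 20 steps
reasonable = run(alpha=0.1)   # J shrinks quickly toward 0
too_large = run(alpha=1.1)    # each step overshoots the minimum and J grows

print(too_small[-1], reasonable[-1], too_large[-1])
```

Tracking J over the iterations, as `history` does here, is a simple way to notice that α is too large: the values increase instead of decreasing.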

Normal Equation

Definition

  • Minimize J by explicitly taking its derivatives with respect to the θj's and setting them to zero.
  • This allows us to find the optimum θ without iteration:

$$\theta = (X^T X)^{-1} X^T y$$
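
A minimal sketch of the normal equation in plain Python follows. Instead of forming the inverse explicitly, it solves the linear system (XᵀX)θ = Xᵀy by Gauss-Jordan elimination, which gives the same θ whenever XᵀX is invertible. The function name and the toy data are assumptions for illustration; each row of X starts with x0 = 1.

```python
def normal_equation(X, y):
    """Solve (X^T X) theta = X^T y for theta without iterating over steps."""
    n = len(X[0])
    # A = X^T X (n x n) and b = X^T y (length n).
    A = [[sum(row[i] * row[j] for row in X) for j in range(n)] for i in range(n)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(n)]
    # Augmented matrix [A | b], reduced by Gauss-Jordan with partial pivoting.
    M = [A[i] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular (not invertible)")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


# Toy data generated from y = 1 + 2x (illustrative); first column is x0 = 1.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
theta = normal_equation(X, y)
print(theta)
```

There is no learning rate and no loop over iterations here; the cost of the method comes from solving the n × n system, which is why it becomes slow when n is very large.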

Notice


***

Linear Regression

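name            logic
Model           $h_\theta(x) = \theta_0 + \theta_1 x$
Cost Function   $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$
Algorithm       repeat until convergence: $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$ (for j = 0 and j = 1, updated simultaneously)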

Linear Regression for Multiple variables

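  • The following formulas are all equivalent.
description        formula
Expanded           $h_\theta(x) = \theta_0 x_0 + \theta_1 x_1 + \cdots + \theta_n x_n$ (with $x_0 = 1$)
X:column(vector)   $h_\theta(x) = \theta^T x$
X:row              $h_\theta(X) = X\theta$ (each row of X is one training example)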
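
The equivalence of the expanded form, the column-vector form, and the row form can be checked numerically. The sketch below uses made-up values for θ and x, with x0 = 1, and a hypothetical `dot` helper; none of these come from the course.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


theta = [1.0, 2.0, 3.0]
x = [1.0, 4.0, 5.0]  # one example, with x0 = 1 as the first entry

# Expanded: theta0*x0 + theta1*x1 + theta2*x2
expanded = theta[0] * x[0] + theta[1] * x[1] + theta[2] * x[2]

# Column vector: theta^T x for a single example
vector_form = dot(theta, x)

# Row form: each row of X is one example, and X * theta gives all predictions
X = [[1.0, 4.0, 5.0], [1.0, 0.0, 1.0]]
matrix_form = [dot(row, theta) for row in X]

print(expanded, vector_form, matrix_form)
```

The first entry of `matrix_form` matches the single-example results because the first row of X is the same example x.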

Gradient Descent behind it

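  • Simultaneously update θj for every j = 0, …, n:

$$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$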
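
One way to guarantee a simultaneous update is to build the entire new parameter list from the old one before replacing anything. The sketch below shows a single step for any number of features; the function name and the sample data are assumptions for illustration.

```python
def gradient_step(theta, X, y, alpha):
    """One batch gradient descent step; each row of X starts with x0 = 1."""
    m = len(X)
    # Errors are computed once, using only the OLD theta.
    errors = [
        sum(t * xj for t, xj in zip(theta, row)) - yi
        for row, yi in zip(X, y)
    ]
    # Every new theta_j is derived from those same errors, so no component
    # sees a value that was already updated during this step.
    return [
        theta[j] - alpha * sum(e * row[j] for e, row in zip(errors, X)) / m
        for j in range(len(theta))
    ]


# Illustrative data: two examples, one feature plus the x0 = 1 column.
X = [[1.0, 1.0], [1.0, 2.0]]
y = [2.0, 4.0]
new_theta = gradient_step([0.0, 0.0], X, y, alpha=0.1)
print(new_theta)
```

Overwriting θ0 first and then using the new θ0 to compute the gradient for θ1 would give a different result, which is why the update is described as simultaneous.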
     

***

Binary Classification Problem(0 or 1)

  • Logistic Function / Sigmoid Function

Hypothesis

  • 0 ≤ h(x) ≤ 1

$$h_\theta(x) = g(\theta^T x), \qquad g(z) = \frac{1}{1 + e^{-z}}$$
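
A minimal sketch of the sigmoid (logistic) function and the resulting hypothesis follows. The function names are assumptions; the two-branch form is only a numerical safeguard so that `math.exp` never receives a large positive argument, and it computes the same value as 1 / (1 + e^(−z)).

```python
import math


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # z is negative here, so exp(z) cannot overflow
    return ez / (1.0 + ez)


def hypothesis(theta, x):
    """h(x) = g(theta^T x); x is expected to start with x0 = 1."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


print(sigmoid(0.0))  # 0.5
print(sigmoid(-1000.0), sigmoid(1000.0))
```

Whatever the value of z, the output stays between 0 and 1, which is what allows h(x) to be read as a probability-like score for the class y = 1.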



Key Concept

Decision Boundary

  • In order to get our discrete 0 or 1 classification, we can translate the output of the hypothesis function as follows:

$$h_\theta(x) \geq 0.5 \rightarrow y = 1, \qquad h_\theta(x) < 0.5 \rightarrow y = 0$$
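
A sketch of turning the hypothesis output into a 0 or 1 prediction, assuming the usual threshold of 0.5. Because the sigmoid is at least 0.5 exactly when its input is at least 0, the code only needs the sign of z = θᵀx. The function name and the example θ are hypothetical.

```python
def predict(theta, x):
    """Return 1 if h(x) >= 0.5, else 0; x is expected to start with x0 = 1."""
    z = sum(t * xi for t, xi in zip(theta, x))
    # sigmoid(z) >= 0.5 is equivalent to z >= 0, so no exponential is needed.
    return 1 if z >= 0 else 0


# Hypothetical parameters: the decision boundary is the line x1 + x2 = 3.
theta = [-3.0, 1.0, 1.0]

below = predict(theta, [1.0, 1.0, 1.0])        # x1 + x2 = 2, below the line
above = predict(theta, [1.0, 2.0, 2.0])        # x1 + x2 = 4, above the line
on_boundary = predict(theta, [1.0, 1.0, 2.0])  # exactly on the line, h(x) = 0.5
print(below, above, on_boundary)
```

The decision boundary is the set of points where z = 0; it is a property of θ, not of the training data.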

***

Appendix

Expression in equation

Label   Definition
x       input variable, feature
y       output/target variable
m       number of training examples
h       hypothesis: the function (relation) mapping x to y
θ       parameter in h
J       cost function
α       learning rate
ℝ       real number
