Stanford_ML

Machine Learning, Stanford University. Started 12th July 2021; concluded 3rd August 2021.


Stanford Machine Learning course

		Introduction

Supervised Learning
The Support Vector Machine allows a computer to deal with an infinite number of features.
Supervised Learning works with labelled data: a regression problem predicts a continuous-valued output, while a classification problem predicts a discrete-valued output.

Unsupervised Learning
Google News uses a clustering algorithm to group articles with the same or similar content.
Unsupervised Learning is like giving the computer a set of data and asking it to cluster the examples based on similarities, without giving it much more information.
Social-network analysis and astronomical data analysis both use clustering, i.e. Unsupervised Learning.
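As a sketch of what clustering does, here is a minimal k-means loop in plain Python. This is purely illustrative; the course presents clustering conceptually, and the data and starting centroids below are made up.

```python
# Minimal k-means clustering sketch (illustrative, 1-D for simplicity).

def kmeans_1d(points, centroids, iters=10):
    """Cluster 1-D points around the given initial centroids."""
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

# Two obvious groups around 1 and around 9.5 are found without labels.
print(kmeans_1d([1.0, 1.2, 0.8, 9.0, 9.5, 10.0], [0.0, 5.0]))
```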

Octave one-liner from the course for separating out mixed audio sources (the "cocktail party" problem):
[W,s,v] = svd((repmat(sum(x.*x,1),size(x,1),1).*x)*x');

	Model Representation

The training set feeds the learning algorithm with data.
The learning algorithm outputs a function h, called the hypothesis.
h takes an input x and maps it to a predicted value of y.
When y is predicted from a single input variable x, the model is called linear regression with one variable.
Cost Function
θ0 is the intercept of the line, and θ1 is its slope.
The cost function is also called the squared error function.
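A sketch of the one-variable hypothesis and its squared-error cost in plain Python (the course itself uses Octave; the data here is made up):

```python
# Hypothesis h(x) = theta0 + theta1 * x and squared-error cost J.

def h(theta0, theta1, x):
    """Hypothesis: a straight line through the data."""
    return theta0 + theta1 * x

def J(theta0, theta1, xs, ys):
    """Squared-error cost: mean of halved squared residuals."""
    m = len(xs)
    return sum((h(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

xs, ys = [1, 2, 3], [1, 2, 3]
print(J(0, 1, xs, ys))    # perfect fit: cost is 0
print(J(0, 0.5, xs, ys))  # worse fit: cost is positive
```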

Difference between hypothesis and cost function
The hypothesis is a function of x; in contrast, the cost function is a function of the parameter θ1.

The objective is to choose the value of θ1 that minimizes J(θ1).
Gradient descent repeats its update until convergence.
:= means assignment (assigning a value to something).
= means a truth assertion.

In the gradient descent algorithm:

Alpha is the learning rate, and it determines how big a step we take on each iteration of gradient descent.
The next term is a derivative term.
We simultaneously update θ0 and θ1, computing both right-hand sides before assigning either left-hand side.
Learning Rate: if alpha is too small, gradient descent is slow; if it is too large, it tends to overshoot the minimum.
The derivative term is the slope of the cost function at the current point.
If θ1 is already at the minimum, the derivative term is zero and θ1 remains unchanged.
With each step of gradient descent the derivative becomes smaller, until we converge to a local minimum. The local minimum is where the derivative is zero.
We apply the gradient descent algorithm to the linear regression model to minimize its cost function.
With each further step of gradient descent, the hypothesis changes.
Batch means that every step uses the entire training set.
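Batch gradient descent for one-variable linear regression can be sketched in plain Python as follows; note the simultaneous update of θ0 and θ1 (this is an illustration, not the course's Octave code, and alpha and the data are made up):

```python
# Batch gradient descent for h(x) = theta0 + theta1 * x.

def gradient_descent(xs, ys, alpha=0.1, iters=1000):
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iters):
        # Compute both partial derivatives from the current parameters...
        errs = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        d0 = sum(errs) / m
        d1 = sum(e * x for e, x in zip(errs, xs)) / m
        # ...then update theta0 and theta1 simultaneously.
        theta0, theta1 = theta0 - alpha * d0, theta1 - alpha * d1
    return theta0, theta1

t0, t1 = gradient_descent([1, 2, 3], [2, 4, 6])  # data on the line y = 2x
print(t0, t1)  # converges near theta0 = 0, theta1 = 2
```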

		Matrices and Vectors

A vector is a matrix with one column.
Matrix multiplication is:
- Not commutative
- Associative
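A quick check of non-commutativity, using a hand-rolled matrix multiply in plain Python (any linear-algebra library would show the same):

```python
# Matrix multiplication over nested lists.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
print(matmul(A, B))  # [[2, 1], [4, 3]]
print(matmul(B, A))  # [[3, 4], [1, 2]] -- a different result: AB != BA
```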

Mean normalization means adjusting the features so that they have approximately zero mean.
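A minimal sketch of mean normalization, scaling by the feature's range (one common choice; dividing by the standard deviation also works):

```python
# Shift a feature to roughly zero mean and scale by its range.

def mean_normalize(values):
    mu = sum(values) / len(values)
    rng = max(values) - min(values)
    return [(v - mu) / rng for v in values]

sizes = [1000, 2000, 3000]  # e.g. house sizes in square feet (made up)
print(mean_normalize(sizes))  # [-0.5, 0.0, 0.5]
```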

If you plot J(θ) against the number of iterations and J is increasing, that is a sign that gradient descent is not working and a smaller Learning Rate should be used. The increase is caused by the overshooting that large learning rates produce.
J(θ) should decrease after every iteration if an appropriate Learning Rate is used.
In Gradient Descent you need to choose a learning rate alpha, and many iterations are required; it works well even when the number of features n is large.
With the Normal Equation no learning rate needs to be chosen and no iterations are necessary, but it is slow when n is large, because you need to compute the inverse of XᵀX (the transpose of X multiplied by X).
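For one feature plus an intercept, the normal equation θ = (XᵀX)⁻¹Xᵀy reduces to inverting a 2×2 matrix, which can be done by hand. A sketch in plain Python (in practice you would use a linear-algebra library; the data is made up):

```python
# Normal equation for X = [[1, x_1], ..., [1, x_m]].

def normal_equation(xs, ys):
    m = len(xs)
    # Entries of X^T X ([[a, b], [c, d]]) and X^T y ([e, f]).
    a, b = m, sum(xs)
    c, d = sum(xs), sum(x * x for x in xs)
    e, f = sum(ys), sum(x * y for x, y in zip(xs, ys))
    # Explicit 2x2 inverse applied to X^T y.
    det = a * d - b * c
    theta0 = (d * e - b * f) / det
    theta1 = (a * f - c * e) / det
    return theta0, theta1

print(normal_equation([1, 2, 3], [3, 5, 7]))  # data on y = 2x + 1
```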

			Classification

Applying linear regression to a classification problem is often not a great idea.
The decision boundary is a property not of the training set but of the hypothesis and its parameters.
The decision boundary is the line that separates the region where y = 0 from the region where y = 1. It is created by our hypothesis function.
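A sketch of the logistic hypothesis and its decision boundary in plain Python: h(x) = g(θᵀx) with the sigmoid g, so predicting y = 1 whenever h(x) ≥ 0.5 is the same as θᵀx ≥ 0 (the parameter values below are made up):

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def predict(theta, x):
    # theta^T x >= 0  <=>  h(x) >= 0.5  <=>  predict y = 1
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1 if sigmoid(z) >= 0.5 else 0

theta = [-3, 1, 1]                # decision boundary: x1 + x2 = 3
print(predict(theta, [1, 1, 1]))  # 1 + 1 < 3 -> predicts 0
print(predict(theta, [1, 2, 2]))  # 2 + 2 > 3 -> predicts 1
```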
Advanced optimization algorithms are quite a bit more complex than gradient descent.
Multiclass Classification
This is a classification problem in which the examples must be classified into more than two classes.
With the idea of one-versus-all classification, we can accomplish multiclass classification.
We convert the multiclass classification problem into several binary classification problems.
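The one-versus-all prediction step can be sketched as: score the input with each per-class classifier and pick the most confident class. The per-class scorers below are hypothetical stubs standing in for trained logistic-regression hypotheses:

```python
# One-vs-all prediction over a single feature x in [0, 1].

def one_vs_all_predict(classifiers, x):
    scores = [h(x) for h in classifiers]
    return max(range(len(scores)), key=lambda i: scores[i])

# Three made-up per-class confidence functions.
classifiers = [
    lambda x: 1 - x,             # class 0 likes small x
    lambda x: 1 - abs(x - 0.5),  # class 1 likes x near 0.5
    lambda x: x,                 # class 2 likes large x
]
print(one_vs_all_predict(classifiers, 0.9))  # -> 2
```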

				Cost Function

There are two cases: binary classification and multiclass classification.
The Logistic regression cost function is different from the Neural Network cost function.

The backpropagation algorithm is an algorithm for minimizing the cost function.
Using this algorithm we compute the error of each node in each layer.
We apply forward propagation and backpropagation to one training example at a time.
When implementing advanced optimization, we unroll the parameter matrices into vectors.
We use the reshape function in Octave to restore the unrolled parameters to matrices.
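The unroll/reshape round trip can be sketched in plain Python (Octave's reshape fills column-by-column; this sketch fills row-by-row, but the idea of flattening for the optimizer and restoring matrices afterwards is the same; the matrices are made up):

```python
# Unroll parameter matrices into one flat vector and reshape them back.

def unroll(matrices):
    return [v for m in matrices for row in m for v in row]

def reshape(flat, shape):
    rows, cols = shape
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

Theta1 = [[1, 2, 3], [4, 5, 6]]  # 2x3
Theta2 = [[7, 8]]                # 1x2
flat = unroll([Theta1, Theta2])
print(flat)                      # [1, 2, 3, 4, 5, 6, 7, 8]
print(reshape(flat[:6], (2, 3))) # recovers Theta1
print(reshape(flat[6:], (1, 2))) # recovers Theta2
```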

		Debugging a learning algorithm
  • Get more training examples
  • Try smaller sets of features
  • Try getting additional features
  • Try adding polynomial features
  • Try decreasing lambda
  • Try increasing lambda

For hypothesis evaluation, it is better to divide a dataset into 3 parts:

  • Training Set (60%)
  • Cross-Validation Set (20%)
  • Testing Set (20%)
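The 60/20/20 split above can be sketched as follows (shuffling first so the three sets are representative; the fixed seed is just for reproducibility of the example):

```python
import random

# Split a dataset into 60% training, 20% cross-validation, 20% test.

def split_dataset(data, seed=0):
    data = list(data)
    random.Random(seed).shuffle(data)
    n = len(data)
    n_train, n_cv = int(0.6 * n), int(0.2 * n)
    train = data[:n_train]
    cv = data[n_train:n_train + n_cv]
    test = data[n_train + n_cv:]
    return train, cv, test

train, cv, test = split_dataset(range(10))
print(len(train), len(cv), len(test))  # 6 2 2
```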

For high bias (underfit), the training error will be high and the cross-validation error will be approximately equal to the training error.
For high variance (overfit), the training error will be low and the cross-validation error will be much greater than the training error.
In a high bias problem, getting more training data is unlikely to help.
In high variance problems, getting more training data is likely to help.
Decreasing lambda helps to fix high bias, and increasing lambda helps to fix high variance.
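The rules of thumb above can be collected into a tiny diagnostic sketch; the error thresholds here are made-up illustrations, not values from the course:

```python
# Compare training error and cross-validation error to guess the problem.

def diagnose(train_err, cv_err, high=0.3):
    if train_err > high and cv_err - train_err < 0.05:
        return "high bias (underfit): more data is unlikely to help"
    if train_err < high and cv_err - train_err > 0.05:
        return "high variance (overfit): more data is likely to help"
    return "looks reasonable"

print(diagnose(0.40, 0.42))  # both errors high and close -> high bias
print(diagnose(0.05, 0.35))  # large train/cv gap -> high variance
```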
