k-fold-cross-validation

Example of regression with fixed bases of features.

While the hold-out method is an intuitive approach to determining a properly fitting model, it suffers from an obvious flaw: having been chosen at random, the points assigned to the training set may not adequately describe the original data. However, we can easily extend and robustify the hold-out method; the result is k-fold cross-validation. Performing k-fold cross-validation is often the most computationally expensive component of solving a general regression problem. We provide pseudo-code for applying k-fold cross-validation below, followed by a runnable sketch.

________________________________________________________________________________________________________________________
Algorithm: k-fold cross-validation pseudo-code
Input: Data set $\left(\mathbf{x}_{p}, y_{p}\right)_{p=1}^{P}$, k (number of folds), a range of values of M to try, and a type of basis feature
Split the data into k folds of (as near as possible) equal size
for s = 1...k
    for each M (in the range of values to try)
        1) Train a model with M basis features on the sth fold's training set (i.e., all data outside the sth fold)
        2) Compute the corresponding testing error on the sth fold
Return: Value $M^{*}$ with the lowest average testing error over all k folds

_________________________________________________________________________________________________________________________
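The following is a minimal NumPy sketch of the pseudo-code above. The polynomial basis choice, the candidate range for M, the mean-squared-error metric, and the synthetic data are illustrative assumptions, not part of the original text.

```python
# Minimal sketch of k-fold cross-validation for choosing the number of
# basis features M; basis, data, and error metric are assumptions.
import numpy as np

def poly_features(x, M):
    """Map scalar inputs x to degree-M polynomial feature vectors f_p."""
    return np.column_stack([x**d for d in range(1, M + 1)])

def fit_least_squares(F, y):
    """Fit [b, w] by least squares on features F with a prepended bias column."""
    F_tilde = np.column_stack([np.ones(len(F)), F])  # f~_p = [1, f_p]
    w_tilde, *_ = np.linalg.lstsq(F_tilde, y, rcond=None)
    return w_tilde

def predict(x, M, w_tilde):
    F_tilde = np.column_stack([np.ones(len(x)), poly_features(x, M)])
    return F_tilde @ w_tilde

def k_fold_cv(x, y, k, M_range, seed=0):
    """Return the M with the lowest average testing error over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), k)  # k (near-)equal folds
    avg_errors = {}
    for M in M_range:
        errors = []
        for s in range(k):
            test_idx = folds[s]                      # the sth fold
            train_idx = np.concatenate([folds[t] for t in range(k) if t != s])
            w_tilde = fit_least_squares(poly_features(x[train_idx], M), y[train_idx])
            resid = predict(x[test_idx], M, w_tilde) - y[test_idx]
            errors.append(np.mean(resid**2))         # testing error on fold s
        avg_errors[M] = np.mean(errors)
    return min(avg_errors, key=avg_errors.get)       # M* over all k folds

# Hypothetical usage on synthetic data:
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 100)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(100)
print("M* =", k_fold_cv(x, y, k=5, M_range=range(1, 11)))
```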

Regression with fixed bases of features

To perform regression using a fixed basis of features (e.g., polynomial or Fourier) it is natural to choose a degree D and transform the input data using the associated basis functions. For example, employing a degree-D polynomial or Fourier basis for a scalar input, we transform each input $x_{p}$ to form an associated feature vector $\mathbf{f}_{p}=\begin{bmatrix} x_{p} & x_{p}^{2} & \dots & x_{p}^{D} \end{bmatrix}^{T}$ or $\mathbf{f}_{p}=\begin{bmatrix} \cos(2\pi x_{p}) & \sin(2\pi x_{p}) & \dots & \cos(2\pi D x_{p}) & \sin(2\pi D x_{p}) \end{bmatrix}^{T}$, respectively. For higher input dimensions N, fixed basis features can be used similarly; however, the sheer number of elements involved (the length of each $\mathbf{f}_{p}$) explodes for even moderate values of N and D.

In any case, once the feature vectors $\mathbf{f}_{p}$ have been constructed from the data, we can determine proper weights b and $\mathbf{w}$ by minimizing the Least Squares cost function $$\underset{b,\mathbf{w}}{\text{minimize}} \; \sum_{p=1}^{P}\left( b+\mathbf{f}_{p}^{T}\mathbf{w} - y_{p}\right)^{2}$$ Using the compact notation $\widetilde{\mathbf{w}}=\begin{bmatrix} b \\ \mathbf{w} \end{bmatrix}$ and $\widetilde{\mathbf{f}}_{p}=\begin{bmatrix} 1 \\ \mathbf{f}_{p} \end{bmatrix}$ for each p, we may rewrite the cost as $g\left( \widetilde{\mathbf{w}} \right)=\sum_{p=1}^{P}\left( \widetilde{\mathbf{f}}_{p}^{T} \widetilde{\mathbf{w}} - y_{p} \right)^{2}$, and checking the first order condition then gives the linear system of equations $$\left( \sum_{p=1}^{P}\widetilde{\mathbf{f}}_{p}\widetilde{\mathbf{f}}_{p}^{T} \right)\widetilde{\mathbf{w}}=\sum_{p=1}^{P}\widetilde{\mathbf{f}}_{p}\,y_{p}$$ which, when solved, recovers an optimal set of parameters $\widetilde{\mathbf{w}}$.
Thus, $$\widetilde{\mathbf{w}}=\left( \sum_{p=1}^{P}\widetilde{\mathbf{f}}_{p}\widetilde{\mathbf{f}}_{p}^{T} \right)^{-1}\sum_{p=1}^{P}\widetilde{\mathbf{f}}_{p}\,y_{p}$$
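Below is a minimal sketch of this closed-form solve, shown here with the Fourier basis. The data, the degree D, and the function names are illustrative assumptions; we solve the linear system directly rather than forming the matrix inverse, which is numerically preferable but mathematically equivalent to the formula above.

```python
# Sketch of the first-order-condition linear system for fixed-basis
# regression; the Fourier basis, degree, and data are assumptions.
import numpy as np

def fourier_features(x, D):
    """Map scalar inputs x to degree-D Fourier feature vectors f_p."""
    cols = []
    for m in range(1, D + 1):
        cols.append(np.cos(2 * np.pi * m * x))
        cols.append(np.sin(2 * np.pi * m * x))
    return np.column_stack(cols)

def solve_weights(F, y):
    """Solve (sum_p f~_p f~_p^T) w~ = sum_p f~_p y_p for w~ = [b, w]."""
    F_tilde = np.column_stack([np.ones(len(F)), F])  # f~_p = [1, f_p]
    A = F_tilde.T @ F_tilde    # sum_p f~_p f~_p^T
    rhs = F_tilde.T @ y        # sum_p f~_p y_p
    return np.linalg.solve(A, rhs)  # solve instead of explicitly inverting

# Hypothetical usage:
rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 50)
y = np.cos(2 * np.pi * x) + 0.05 * rng.standard_normal(50)
w_tilde = solve_weights(fourier_features(x, D=3), y)
print("b =", w_tilde[0], "w =", w_tilde[1:])
```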
