GithubHelp home page GithubHelp logo

matt8955 / dsc-pca-introduction-nyc-ds-033020 Goto Github PK

View Code? Open in Web Editor NEW

This project forked from learn-co-students/dsc-pca-introduction-nyc-ds-033020

0.0 0.0 0.0 7 KB

License: Other

Jupyter Notebook 100.00%

dsc-pca-introduction-nyc-ds-033020's Introduction

PCA - Introduction

Introduction

In this section, you'll learn about principal component analysis, or PCA, one of the most famous Unsupervised Learning Techniques. PCA is a dimensionality reduction technique. It allows you to compress a dataset into a lower dimensional space with fewer features while maintaining as much of the original information as possible.

The Curse of Dimensionality

The curse of dimensionality is a general mathematical problem relating to the exploding size of space as you continue to add additional dimensions. This can be particularly problematic when dealing with large datasets. The more features you have, the more data you have about the scenario, but the more difficult it might be to exhaustively explore combinations of these features.

PCA Use Cases

The curse of dimensionality is certainly one motivating factor for PCA. If you can't process all of the information at your disposal, then an alternative path around is necessary. Dimensionality reduction techniques such as PCA can be essential in such situations. PCA can also help improve regression and classification algorithms in many cases. In particular, algorithms are less prone to overfitting when the underlying data itself has first been compressed, reducing noise or other anomalies. Finally, PCA can also be helpful for visualizing the structure of large datasets. After all, you are limited to 2 or 3 dimensions when visualizing data. As such, reducing a dataset to 2 or 3 primary features is monumental in creating a visualization.

Summary

In this section, you'll explore PCA in depth using scikit-learn, and coding your own version from scratch using NumPy. Throughout this section, keep in mind use cases for PCA such as the curse of dimensionality.

dsc-pca-introduction-nyc-ds-033020's People

Contributors

h-parker avatar loredirick avatar mathymitchell avatar sumedh10 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.