This project forked from justmarkham/scikit-learn-videos

IPython notebooks from the scikit-learn video series

Home Page: http://blog.kaggle.com/author/kevin-markham/

Jupyter Notebook 99.55% CSS 0.45%

Introduction to machine learning with scikit-learn

This repo contains IPython notebooks from my scikit-learn video series, as seen on Kaggle's blog.

Want to learn even more about scikit-learn? I teach an online course, Machine Learning with Text in Python.

Entire series

Individual videos

  1. What is machine learning, and how does it work? (video, notebook, blog post)

    • What is machine learning?
    • What are the two main categories of machine learning?
    • What are some examples of machine learning?
    • How does machine learning "work"?
  2. Setting up Python for machine learning: scikit-learn and IPython Notebook (video, notebook, blog post)

    • What are the benefits and drawbacks of scikit-learn?
    • How do I install scikit-learn?
    • How do I use the IPython Notebook?
    • What are some good resources for learning Python?
  3. Getting started in scikit-learn with the famous iris dataset (video, notebook, blog post)

    • What is the famous iris dataset, and how does it relate to machine learning?
    • How do we load the iris dataset into scikit-learn?
    • How do we describe a dataset using machine learning terminology?
    • What are scikit-learn's four key requirements for working with data?
  4. Training a machine learning model with scikit-learn (video, notebook, blog post)

    • What is the K-nearest neighbors classification model?
    • What are the four steps for model training and prediction in scikit-learn?
    • How can I apply this pattern to other machine learning models?
  5. Comparing machine learning models in scikit-learn (video, notebook, blog post)

    • How do I choose which model to use for my supervised learning task?
    • How do I choose the best tuning parameters for that model?
    • How do I estimate the likely performance of my model on out-of-sample data?
  6. Data science pipeline: pandas, seaborn, scikit-learn (video, notebook, blog post)

    • How do I use the pandas library to read data into Python?
    • How do I use the seaborn library to visualize data?
    • What is linear regression, and how does it work?
    • How do I train and interpret a linear regression model in scikit-learn?
    • What are some evaluation metrics for regression problems?
    • How do I choose which features to include in my model?
  7. Cross-validation for parameter tuning, model selection, and feature selection (video, notebook, blog post)

    • What is the drawback of using the train/test split procedure for model evaluation?
    • How does K-fold cross-validation overcome this limitation?
    • How can cross-validation be used for selecting tuning parameters, choosing between models, and selecting features?
    • What are some possible improvements to cross-validation?
  8. Efficiently searching for optimal tuning parameters (video, notebook, blog post)

    • How can K-fold cross-validation be used to search for an optimal tuning parameter?
    • How can this process be made more efficient?
    • How do you search for multiple tuning parameters at once?
    • What do you do with those tuning parameters before making real predictions?
    • How can the computational expense of this process be reduced?
  9. Evaluating a classification model (video, notebook, blog post)

    • What is the purpose of model evaluation, and what are some common evaluation procedures?
    • How is classification accuracy used, and what are its limitations?
    • How does a confusion matrix describe the performance of a classifier?
    • What metrics can be computed from a confusion matrix?
    • How can you adjust classifier performance by changing the classification threshold?
    • What is the purpose of an ROC curve?
    • How does Area Under the Curve (AUC) differ from classification accuracy?
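
As a rough illustration of the workflow that videos 3 through 7 ask about, here is a minimal sketch. It is not taken from the notebooks. It assumes the "four steps" are import, instantiate, fit, and predict, and the choices of `n_neighbors=5`, `test_size=0.4`, `random_state=4`, and 10 folds are arbitrary values picked for the example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier  # step 1: import the model class

# Load the iris data: X is the feature matrix (150 x 4), y is the response vector.
iris = load_iris()
X, y = iris.data, iris.target

# Hold out 40% of the observations to estimate out-of-sample accuracy.
# random_state is fixed only to make the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=4
)

# Step 2: instantiate the estimator (K=5 is an assumed tuning parameter).
knn = KNeighborsClassifier(n_neighbors=5)

# Step 3: fit the model to the training data.
knn.fit(X_train, y_train)

# Step 4: predict the response for the held-out observations.
y_pred = knn.predict(X_test)
test_accuracy = float((y_pred == y_test).mean())

# K-fold cross-validation averages accuracy over several train/test splits,
# which reduces the dependence on any single split.
cv_scores = cross_val_score(
    KNeighborsClassifier(n_neighbors=5), X, y, cv=10, scoring="accuracy"
)

print("train/test split accuracy:", test_accuracy)
print("10-fold CV mean accuracy:", cv_scores.mean())
```

The same import, instantiate, fit, predict pattern should carry over to other scikit-learn estimators by swapping the class in steps 1 and 2.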
