rlugojr / sklearn_pandas_intro

This project forked from agramfort/sklearn_pandas_intro

Introduction to Scikit-Learn and Pandas

Introduction to predictive analytics with pandas and scikit-learn

This repository contains notebooks to get started with predictive analytics using scikit-learn and pandas.

This material is strongly inspired by the EuroPython 2014 scikit-learn tutorial, which was in turn inspired by http://github.com/jakevdp/sklearn_scipy2013 by Jake VanderPlas (@jakevdp | http://jakevdp.github.com).

Installation Notes

This tutorial will require recent installations of numpy, scipy, matplotlib, scikit-learn, pandas and Pillow (or PIL).
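
One way to check which of these packages are already installed is sketched below. It is a minimal example, not part of the tutorial materials: the distribution names are assumed to be the usual PyPI names (for instance `scikit-learn` and `Pillow`), and the helper name `installed_versions` is made up for illustration. It reports versions only and does not judge whether a version is recent enough.

```python
from importlib import metadata

# Assumed PyPI distribution names for the packages listed above.
REQUIRED = ["numpy", "scipy", "matplotlib", "scikit-learn", "pandas", "Pillow"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name:14s} {version or 'NOT INSTALLED'}")
```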

For users who do not yet have these packages installed, a relatively painless way to install all the requirements is to use a distribution such as Anaconda, which can be downloaded and installed for free.

Please download the datasets mentioned in Data Downloads in advance.

With the IPython/jupyter notebook

The recommended way to access the materials is to execute them in the IPython/jupyter notebook. If you have the notebook installed, you should download the materials (see below), go to the notebooks directory, and launch the notebook from there by typing:

cd notebooks
jupyter notebook  # ipython notebook if old version

in your terminal window. This will open a notebook panel in your web browser.

Downloading the Tutorial Materials

I would highly recommend using git, not only for this tutorial, but for the general betterment of your life. Once git is installed, you can clone the material in this tutorial by using the git address shown above.

If you can't or don't want to install git, there is a link above to download the contents of this repository as a zip file. I may make minor changes to the repository in the days before the tutorial, however, so cloning the repository is a much better option.

Data Downloads

The data for this tutorial is not included in the repository. We will be using several data sets during the tutorial: most are built into scikit-learn, which includes code that automatically downloads and caches them. Because the wireless network at conferences can often be spotty, it would be a good idea to download these data sets before arriving at the conference. You can do so by using the fetch_data.py script included in the tutorial materials.

You will also need:

https://dl.dropboxusercontent.com/u/2140486/data/titanic_train.csv
https://dl.dropboxusercontent.com/u/2140486/data/adult_train.csv
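
The two CSV files can be fetched ahead of time with a small script such as the sketch below. It is an illustration only and is not the fetch_data.py shipped with the materials: the helper names `ensure_downloaded` and `download_all` and the `data` target directory are assumptions, and the URLs are copied from the list above and may no longer be reachable. A file that already exists locally is not downloaded again.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve

# URLs copied from the list above; they may no longer be reachable.
DATA_URLS = [
    "https://dl.dropboxusercontent.com/u/2140486/data/titanic_train.csv",
    "https://dl.dropboxusercontent.com/u/2140486/data/adult_train.csv",
]


def ensure_downloaded(url, dest_dir="data"):
    """Download url into dest_dir unless the file already exists; return its path.

    The default directory name "data" is an assumption, not the location
    the notebooks necessarily expect.
    """
    os.makedirs(dest_dir, exist_ok=True)
    filename = os.path.basename(urlparse(url).path)
    path = os.path.join(dest_dir, filename)
    if not os.path.exists(path):
        urlretrieve(url, path)
    return path


def download_all(urls=DATA_URLS, dest_dir="data"):
    """Fetch every URL in urls and return the list of local paths."""
    return [ensure_downloaded(url, dest_dir) for url in urls]
```

Calling `download_all()` once while on a reliable connection leaves local copies that can be reused offline; later calls skip any file that is already present.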
