lcombs / yellowbrick

This project forked from districtdatalabs/yellowbrick

A suite of visual analysis and diagnostic tools to facilitate feature selection, model selection, and parameter tuning for machine learning.

License: Apache License 2.0

Makefile 0.38% Python 99.62%

Yellowbrick

Visual analysis and diagnostic tools to facilitate machine learning model selection.

Image: "Follow the yellow brick road" by Quatro Cinco, used with permission, Flickr Creative Commons.

This README is a guide for developers. If you're new to Yellowbrick, get started at our documentation.

What is Yellowbrick?

Yellowbrick is a suite of visual diagnostic tools called "Visualizers" that extend the Scikit-Learn API to allow human steering of the model selection process. In a nutshell, Yellowbrick combines Scikit-Learn with Matplotlib in the best tradition of the Scikit-Learn documentation, but to produce visualizations for your models!

Visualizers

Visualizers are estimators (objects that learn from data) whose primary objective is to create visualizations that allow insight into the model selection process. In Scikit-Learn terms, they can be similar to transformers when visualizing the data space, or they can wrap a model estimator, similar to how the "ModelCV" (e.g. RidgeCV, LassoCV) methods work. The primary goal of Yellowbrick is to create a sensible API similar to Scikit-Learn. Some of our most popular visualizers include:

Feature Visualization

  • Rank2D: pairwise ranking of features to detect relationships
  • Parallel Coordinates: horizontal visualization of instances
  • Radial Visualization: separation of instances around a circular plot

Classification Visualization

  • Class Balance: see how the distribution of classes affects the model
  • Classification Report: visual representation of precision, recall, and F1
  • ROC/AUC Curves: receiver operating characteristic and area under the curve

Regression Visualization

  • Prediction Error Plots: find model breakdowns along the domain of the target
  • Residuals Plot: show the difference in residuals of training and test data
  • Alpha Selection: show how the choice of alpha influences regularization

Text Visualization

  • Term Frequency: visualize the frequency distribution of terms in the corpus
  • TSNE: use t-distributed stochastic neighbor embedding to project documents

And more! Visualizers are being added all the time. Be sure to check the examples (or even the develop branch), and feel free to contribute your ideas for Visualizers!
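
The estimator-wrapping pattern described in this section can be sketched without Yellowbrick itself. This is only an illustration under stated assumptions, not Yellowbrick's implementation: `MeanRegressor`, `ScoreVisualizerSketch`, and the text "bars" are hypothetical names made up for this example. It shows an object that wraps a model, delegates `fit` to it, and turns a score into a drawing step.

```python
class MeanRegressor:
    """Hypothetical toy estimator: always predicts the mean training target."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class ScoreVisualizerSketch:
    """Hypothetical wrapper that mimics a fit/score/draw workflow."""

    def __init__(self, model):
        self.model = model
        self.errors_ = None

    def fit(self, X, y):
        # Delegate learning to the wrapped estimator.
        self.model.fit(X, y)
        return self

    def score(self, X, y):
        # Keep per-instance absolute errors for drawing, return their mean.
        predictions = self.model.predict(X)
        self.errors_ = [abs(p - t) for p, t in zip(predictions, y)]
        return sum(self.errors_) / len(self.errors_)

    def draw(self):
        # One text bar per instance; a real visualizer would draw with Matplotlib.
        return ["#" * round(e) for e in self.errors_]


# Hypothetical data: four instances with one feature each.
X = [[0], [1], [2], [3]]
y = [1.0, 3.0, 5.0, 7.0]

viz = ScoreVisualizerSketch(MeanRegressor()).fit(X, y)
mae = viz.score(X, y)
bars = viz.draw()
```

The wrapper never needs to know which model it holds, only that the model exposes `fit` and `predict`. That is the same duck-typed contract Scikit-Learn estimators follow.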

Using Yellowbrick

The Yellowbrick API is specifically designed to play nicely with Scikit-Learn. Here is an example of a typical workflow sequence with Scikit-Learn and Yellowbrick:

Feature Visualization

In this example, we see how Rank2D performs pairwise comparisons of each feature in the data set with a specific metric or algorithm, then returns them ranked as a lower left triangle diagram.

from yellowbrick.features import Rank2D

# Assumes X (feature matrix), y (targets), and features (column names) are already defined
visualizer = Rank2D(features=features, algorithm='covariance')
visualizer.fit(X, y)                # Fit the data to the visualizer
visualizer.transform(X)             # Transform the data
visualizer.poof()                   # Draw/show/poof the data
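
To make the idea of pairwise comparison concrete, here is a rough sketch of the kind of computation a covariance ranking involves. It is not Rank2D's code: the function names and the toy data are assumptions made for this example, and it only produces the numbers for the lower-left triangle rather than a plot.

```python
from itertools import combinations


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def lower_triangle_scores(X, features):
    """Score each unordered feature pair once, keyed by (row, column) names."""
    columns = list(zip(*X))  # column-wise view of the row-wise data
    scores = {}
    for i, j in combinations(range(len(features)), 2):
        # With j > i, (row j, column i) is a cell below the diagonal.
        scores[(features[j], features[i])] = covariance(columns[j], columns[i])
    return scores


# Hypothetical data: three features, four instances.
features = ["a", "b", "c"]
X = [
    [1.0, 2.0, 4.0],
    [2.0, 4.0, 3.0],
    [3.0, 6.0, 2.0],
    [4.0, 8.0, 1.0],
]

scores = lower_triangle_scores(X, features)
```

Because covariance is symmetric, each pair only needs to be scored once, which is why a triangle is enough to show every relationship.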

Model Visualization

In this example, we instantiate a Scikit-Learn classifier, and then we use Yellowbrick's ROCAUC class to visualize the tradeoff between the classifier's sensitivity and specificity.

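from sklearn.svm import LinearSVC
from yellowbrick.classifier import ROCAUC

# Assumes X (feature matrix) and y (class labels) are already defined
model = LinearSVC()
model.fit(X, y)                     # Fit the Scikit-Learn classifier
visualizer = ROCAUC(model)          # Wrap the fitted model in the visualizer
visualizer.score(X, y)              # Score the model and compute the curve
visualizer.poof()                   # Draw/show/poof the data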
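
For readers who want to see what sits behind a ROC curve, the following sketch computes the curve's points and the area under it in plain Python. It is an illustration only, not how ROCAUC is implemented: the function names and the sample labels and scores are assumptions made for this example, and it assumes binary labels (0 or 1) with no tied scores. The true positive rate is the sensitivity, and the false positive rate is one minus the specificity.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) points along the curve."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    # Visit instances from the highest score to the lowest (assumes no ties).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical labels and classifier scores for six instances.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(y_true, scores)
area = auc(points)
```

An area of 1.0 means every positive instance outscores every negative one, while 0.5 is what random scoring would give on average.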

For additional information on getting started with Yellowbrick, check out our examples notebook.

We also have a quick start guide.

Contributing to Yellowbrick

Yellowbrick is an open source tool designed to enable more informed machine learning through visualizations. If you would like to contribute, you can do so in the following ways:

This repository is set up in a typical production/release/development cycle as described in A Successful Git Branching Model. A typical workflow is as follows:

  1. Select a card from the dev board, preferably one that is "ready", then move it to "in-progress".

  2. Create a branch off of develop called "feature-[feature name]", work and commit into that branch.

    ~$ git checkout -b feature-myfeature develop
    
  3. Once you are done working (and everything is tested) merge your feature into develop.

    ~$ git checkout develop
    ~$ git merge --no-ff feature-myfeature
    ~$ git branch -d feature-myfeature
    ~$ git push origin develop
    
  4. Repeat. Releases will be routinely pushed into master via release branches, then deployed to the server.

yellowbrick's People

Contributors

bbengfort, jkeung, lauralorenz, lcombs, mariusvniekerk, mattandahalfew, naturallogofx, ndanielsen, nealhumphrey, ojedatony1616, pdamodaran, pvomelveny, rebeccabilbro, stampedpassp0rt, waffle-iron, xerebz
