j0x7c4 / facets

This project is forked from pair-code/facets.

Visualizations for machine learning datasets

Home Page: https://pair-code.github.io/facets/

License: Apache License 2.0

Python 1.75% HTML 21.41% Jupyter Notebook 68.81% TypeScript 8.01% JavaScript 0.03%

Introduction

The facets project contains two visualizations for understanding and analyzing machine learning datasets: Facets Overview and Facets Dive.

The visualizations are implemented as Polymer web components, backed by TypeScript code, and can be easily embedded into Jupyter notebooks or webpages.

Live demos of the visualizations can be found on the Facets project description page.

Facets Overview

Overview visualization of UCI census data

Overview gives a high-level view of one or more datasets. It produces a visual feature-by-feature statistical analysis, and can also be used to compare statistics across two or more datasets. The tool can process both numeric and string features, including multiple instances of a number or string per feature.

Overview can help uncover issues with datasets, including the following:

  • Unexpected feature values
  • Missing feature values for a large number of examples
  • Training/serving skew
  • Training/test/validation set skew

Key aspects of the visualization are outlier detection and distribution comparison across multiple datasets. Interesting values (such as a high proportion of missing data, or very different distributions of a feature across multiple datasets) are highlighted in red. Features can be sorted by values of interest such as the number of missing values or the skew between the different datasets.
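To make the idea of sorting features by missing values concrete, here is a minimal sketch in plain Python. It is not the Facets implementation and does not use the Facets API; the function names (missing_fraction, rank_features_by_missing) and the tiny example records are hypothetical, chosen only to illustrate the kind of per-feature, per-dataset statistic described above.

```python
from typing import Dict, List, Optional, Tuple

Example = Dict[str, Optional[object]]


def missing_fraction(examples: List[Example], feature: str) -> float:
    """Fraction of examples where `feature` is absent or None."""
    if not examples:
        return 0.0
    missing = sum(1 for ex in examples if ex.get(feature) is None)
    return missing / len(examples)


def rank_features_by_missing(
    datasets: Dict[str, List[Example]],
) -> List[Tuple[str, Dict[str, float]]]:
    """Return (feature, {dataset: fraction}) pairs, worst feature first."""
    features = sorted({f for exs in datasets.values() for ex in exs for f in ex})
    rows = [
        (f, {name: missing_fraction(exs, f) for name, exs in datasets.items()})
        for f in features
    ]
    # Sort by the largest missing fraction seen in any dataset, descending.
    return sorted(rows, key=lambda row: max(row[1].values()), reverse=True)


# Hypothetical records, loosely shaped like census data.
train = [{"age": 39, "workclass": "Private"}, {"age": None, "workclass": "Private"}]
test = [{"age": 25, "workclass": None}, {"age": None, "workclass": None}]

ranking = rank_features_by_missing({"train": train, "test": test})
for feature, fractions in ranking:
    print(feature, fractions)
```

In this sketch, a feature whose missing fraction differs sharply between datasets (here, workclass) rises to the top of the ranking, which is the sort of value a tool like Overview would flag for attention.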

Details about Overview usage can be found in its README.

Facets Dive

Dive visualization of UCI census data

Dive is a tool for interactively exploring up to tens of thousands of multidimensional data points, allowing users to seamlessly switch between a high-level overview and low-level details. Each example is represented as a single item in the visualization, and the points can be positioned by faceting/bucketing in multiple dimensions by their feature values. Combining smooth animation and zooming with faceting and filtering, Dive makes it easy to spot patterns and outliers in complex datasets.
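As an illustration of faceting/bucketing in two dimensions, the following sketch groups examples into a grid keyed by two feature values. It is plain Python, not Dive's own code; bucket_label, facet, and the sample records are hypothetical names and data introduced only for this example.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, Tuple

Example = Dict[str, object]


def bucket_label(value: int, width: int) -> str:
    """Label the fixed-width integer bucket containing `value`, e.g. '30-39'."""
    low = (value // width) * width
    return f"{low}-{low + width - 1}"


def facet(
    examples: List[Example],
    row_key: Callable[[Example], Hashable],
    col_key: Callable[[Example], Hashable],
) -> Dict[Tuple[Hashable, Hashable], List[Example]]:
    """Group examples into grid cells keyed by (row facet, column facet)."""
    grid = defaultdict(list)
    for ex in examples:
        grid[(row_key(ex), col_key(ex))].append(ex)
    return dict(grid)


# Hypothetical records: facet rows by age bucket, columns by sex.
examples = [
    {"age": 39, "sex": "Male"},
    {"age": 31, "sex": "Female"},
    {"age": 52, "sex": "Male"},
    {"age": 35, "sex": "Male"},
]
grid = facet(
    examples,
    row_key=lambda ex: bucket_label(ex["age"], 10),
    col_key=lambda ex: ex["sex"],
)
for cell, members in sorted(grid.items()):
    print(cell, len(members))
```

Each example lands in exactly one cell, so every data point stays individually visible while the grid layout exposes how the examples distribute across the two features.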

Details about Dive usage can be found in its README.

Setup

Installation

git clone https://github.com/PAIR-code/facets
cd facets

Enabling Usage in Jupyter Notebooks

Pre-built versions of the Jupyter extension visualization code can be found in the facets-dist directory.

To enable use of these visualizations in Jupyter notebooks:

  1. Install the Jupyter notebook software: http://jupyter.org/install.html
  2. Install the visualizations into Jupyter as an nbextension.
     • If Jupyter was installed with pip, run jupyter nbextension install facets-dist/ for a system-wide installation, or jupyter nbextension install facets-dist/ --user for a per-user installation (run either command from the facets top-level directory). You do not need to run any follow-up jupyter nbextension enable command for this extension.
     • Alternatively, you can manually install the nbextension by finding your Jupyter installation's share/jupyter/nbextensions folder and copying the facets-dist directory into it.
  3. To enable the Overview visualization, you must also have the Protocol Buffers Python runtime library installed: https://github.com/google/protobuf/tree/master/python. If you used pip or Anaconda to install Jupyter, you can use the same tool to install the runtime library.
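The manual installation option in the steps above amounts to a directory copy. The sketch below shows one way to script it with the Python standard library. The helper name copy_extension is hypothetical, and the location of the nbextensions folder varies by installation, so it is passed in rather than guessed.

```python
import shutil
from pathlib import Path


def copy_extension(facets_dist: Path, nbextensions_dir: Path) -> Path:
    """Copy the facets-dist directory into an nbextensions folder.

    Returns the path of the installed copy.
    """
    if not facets_dist.is_dir():
        raise FileNotFoundError(f"{facets_dist} is not a directory")
    target = nbextensions_dir / facets_dist.name
    # dirs_exist_ok (Python 3.8+) lets a reinstall overwrite an earlier copy.
    shutil.copytree(facets_dist, target, dirs_exist_ok=True)
    return target


# Example call, left commented out because the destination is an assumption:
# a typical per-user location on Linux, which may differ on your system.
# copy_extension(
#     Path("facets-dist"),
#     Path.home() / ".local" / "share" / "jupyter" / "nbextensions",
# )
```

Running the same call again after rebuilding the visualizations replaces the files of the earlier copy, which matches the reinstall step in the build instructions.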

Note: When visualizing a large amount of data, as is done in the Dive demo Jupyter notebook, you will need to start the notebook server with an increased IOPub data rate. This can be done with the command jupyter notebook --NotebookApp.iopub_data_rate_limit=10000000.
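A rough way to judge whether a dataset is large enough to need the raised limit is to measure its serialized size. The sketch below does this with the standard library only. The default limit of 1,000,000 bytes/sec is an assumption about the notebook server's default, the helper names are hypothetical, and the server applies the limit as a rate over a time window, so comparing a single payload against it is only an approximation.

```python
import json
from typing import List

# Assumed notebook server default, in bytes/sec (not stated in this README).
ASSUMED_DEFAULT_IOPUB_LIMIT = 1_000_000
# Value passed to --NotebookApp.iopub_data_rate_limit in the note above.
RAISED_IOPUB_LIMIT = 10_000_000


def payload_bytes(records: List[dict]) -> int:
    """Size in bytes of the records once serialized as UTF-8 JSON."""
    return len(json.dumps(records).encode("utf-8"))


def exceeds_limit(records: List[dict], limit_bytes_per_sec: int) -> bool:
    """Rough check: is the serialized payload larger than one second's budget?"""
    return payload_bytes(records) > limit_bytes_per_sec


# Hypothetical dataset of 50,000 small records.
records = [{"age": i % 90, "workclass": "Private"} for i in range(50_000)]

print("payload bytes:", payload_bytes(records))
print("over assumed default:", exceeds_limit(records, ASSUMED_DEFAULT_IOPUB_LIMIT))
print("over raised limit:", exceeds_limit(records, RAISED_IOPUB_LIMIT))
```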

Building the Visualizations

If you make code changes to the visualizations and would like to rebuild them for use in Jupyter notebooks, follow these directions:

  1. Install Bazel: https://bazel.build/
  2. Build the visualizations: bazel build facets:facets_jupyter (run from the facets top-level directory)
  3. Move the resulting vulcanized HTML file into the facets-dist directory.
  4. Reinstall the facets-dist Jupyter extension as in the previous section.

Known Issues

  • The Facets visualizations currently work only in Chrome - Issue 9.

Disclaimer: This is not an official Google product
