P6: Declarative Specification for Interactive Machine Learning and Visual Analytics

P6 is a research project for developing a declarative language to specify visual analytics processes that integrate machine learning methods with interactive visualization for data analysis and exploration. P6 uses P4 for GPU accelerated data processing and rendering, and leverages Scikit-Learn and other Python libraries for supporting machine learning algorithms.

Demo

Demos for using declarative specifications with clustering, dimension reduction, and regression here:

Installation

To run P6, first install both the JavaScript and Python dependencies and libraries:

npm install
pip install -r python/requirements.txt

Development and Examples

For development and for trying the example applications, start the server and client with:

npm start

Or start server and client on two different terminals/consoles:

npm run server
npm run client

The example applications can be accessed at http://localhost:8080/examples/

Usage

  //config 
  let app = p6()
    .data({url: 'data/babies.csv'}) // input data
    .analyze({
      // analyze the data using sklearn.decomposition.PCA and store the result in a new variable 'PC'
      PC: {
        module: 'decomposition',
        algorithm: 'PCA',
        n_components: 2,
        features: ['BabyWeight', 'MotherWeight', 'MotherHeight', 'MotherWgtGain', 'MotherAge'] 
      }
    })

  app.layout({
    container: "app", // id of the div
    viewport: [800, 400]
  })
  .visualize({
    chart: {
      mark: 'circle', size: 8,
      x: 'PC1', y: 'PC0',
      color: 'clusters', opacity: 0.5,
    }
  })

API

P6 provides a JavaScript API with a declarative language for specifying the operations in a visual analytics process: data processing, machine learning, visualization, and interaction.

Data

data({source, select, preprocess, transform})
  • source: source of the dataset. Example: {url: './data/babies.csv'}
  • select: select data subset by rows, columns, or data types. Example: {select: {nrows: 10000, columns: ['BabyWeight', 'BabyGender']}}
    • nrows - number of rows
    • columns - specify which data columns
    • dtype - select categorical or numerical data
  • preprocess: preprocess data by dtypes.
    • Example for using one-hot encoding on categorical data: {preprocess: {categorical: 'OneHot'}}
    • Example for dropping null values: {preprocess: {null: 'drop'}}
    • Example for filling null values by columns: {preprocess: {null: {fill: {BabyWeight: 8}}}}
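
The options above can be combined into one specification object. The sketch below is only an illustration: it assumes that url, select, and preprocess may appear together as top-level keys (the usage example shows url at the top level), and that several preprocess rules can share one object. It uses a plain object so that it runs without P6 loaded; the call to data() is shown as a comment.

```javascript
// Data specification assembled from the options documented above.
// Assumption: url, select, and preprocess can be combined in one object,
// and the categorical and null rules can share one preprocess object.
const dataSpec = {
  url: './data/babies.csv',
  select: {
    nrows: 10000,                          // read at most 10000 rows
    columns: ['BabyWeight', 'BabyGender']  // keep only these columns
  },
  preprocess: {
    categorical: 'OneHot',                 // one-hot encode categorical columns
    null: {fill: {BabyWeight: 8}}          // fill missing BabyWeight values with 8
  }
};

// With P6 loaded, the object would be passed as: p6().data(dataSpec)
console.log(JSON.stringify(dataSpec, null, 2));
```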

Machine Learning and Analytics

analyze({algorithm, features, scaling, [parameters]})
  • algorithm: supported algorithms and methods - clustering, dimension reduction, manifold
  • features: data fields as the input to the specified algorithm.
  • scaling: use StandardScaler, LabelEncoder, minmax_scale, or other preprocessors for scaling the input data
  • [parameters]: use the same parameter names as the corresponding functions in the Python libraries. In the usage example above, n_components is passed directly to sklearn.decomposition.PCA. Other parameters can be set in the same way.
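
To illustrate the parameter pass-through described above, the following sketch separates the keys named in this document (module, algorithm, features, scaling) from everything else, which is treated as keyword arguments for the Python function. splitAnalyzeSpec is a hypothetical helper written for this example; it is not part of P6 and does not claim to reflect how P6 is implemented. Resolving the module name under the sklearn package is also an assumption, based on the comment in the usage example.

```javascript
// Hypothetical helper (not part of P6): split one analyze() entry into the
// keys documented above and the parameters forwarded to the Python function.
const RESERVED_KEYS = ['module', 'algorithm', 'features', 'scaling'];

function splitAnalyzeSpec(spec) {
  const parameters = {};
  for (const [key, value] of Object.entries(spec)) {
    // Anything that is not a documented key is treated as a Python parameter.
    if (!RESERVED_KEYS.includes(key)) {
      parameters[key] = value;
    }
  }
  return {
    // Assumption: the module name is resolved under the sklearn package.
    target: `sklearn.${spec.module}.${spec.algorithm}`,
    features: spec.features,
    parameters
  };
}

// The PCA entry from the usage example.
const pca = splitAnalyzeSpec({
  module: 'decomposition',
  algorithm: 'PCA',
  n_components: 2,
  features: ['BabyWeight', 'MotherWeight', 'MotherHeight', 'MotherWgtGain', 'MotherAge']
});

console.log(pca.target);     // sklearn.decomposition.PCA
console.log(pca.parameters); // { n_components: 2 }
```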

Train model for classification and regression tasks

model({module, method, trainingData, features, target, [parameters]})
  • module: Python library and module containing the method for fitting the model. Example: sklearn.linear_model.
  • method: the function to be called for fitting the model. Example: LinearRegression.
  • trainingData: data for training the model
  • features: input features to the model
  • target: the data field for prediction
  • [parameters]: hyperparameters for the model
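
A model specification for linear regression might look like the sketch below. It follows the field names listed above, but the value formats are assumptions: how trainingData refers to a dataset is not documented here, so the {url: ...} object is a placeholder, and the choice of features and target from the babies dataset is illustrative. fit_intercept is a parameter of sklearn.linear_model.LinearRegression, included to show a hyperparameter alongside the documented fields.

```javascript
// Illustrative model specification; value formats are assumptions.
const modelSpec = {
  module: 'sklearn.linear_model',            // Python library and module
  method: 'LinearRegression',                // function used to fit the model
  trainingData: {url: './data/babies.csv'},  // placeholder: format not documented
  features: ['MotherWeight', 'MotherHeight', 'MotherAge'],
  target: 'BabyWeight',                      // field to predict
  fit_intercept: true                        // hyperparameter of LinearRegression
};

// With P6 loaded, the object would presumably be passed to model(modelSpec).
console.log(`${modelSpec.module}.${modelSpec.method} -> ${modelSpec.target}`);
```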

Visualization

To organize the views for visualization, the layout function can be used for configuring the views and layouts.

View Layout

layout({id, width, height, padding, [options]})

To visualize data or analysis results, call visualize to transform the data (optional), choose a visual mark, and specify the visual encoding that maps data to visual marks.

Visual Encoding/Mapping

visualize({transform, visualMark, [encoding]})

Publication

Jianping Kelvin Li and Kwan-Liu Ma. P6: A Declarative Language for Integrating Machine Learning in Visual Analytics. IEEE Transactions on Visualization and Computer Graphics (Proc. VAST), 2020.

Acknowledgement

This research was sponsored in part by the U.S. National Science Foundation through grant NSF IIS-1528203 and the U.S. Department of Energy through grant DE-SC0014917.
