direkshan-digital / pydaal-getting-started

This project forked from daaltces/pydaal-getting-started

Introduction and tutorials for using PyDAAL, i.e. the Python API of Intel Data Analytics Acceleration Library

License: Apache License 2.0

Jupyter Notebook 81.34% Python 18.66%

pydaal-getting-started's Introduction

This repository contains materials that introduce PyDAAL (the Python API of Intel Data Analytics Acceleration Library) and help Python and machine learning practitioners get started with PyDAAL concepts.

Helper functions and classes are also provided to simplify frequently performed PyDAAL operations.

Volumes 1, 2, and 3 of the PyDAAL Gentle Introduction Series are available as Jupyter Notebooks. These volumes give a quick introduction to the essential features of PyDAAL. The notebooks offer a collection of code examples that can be executed in the interactive command shell, along with helper functions that automate common PyDAAL tasks.

How to use?

Install Intel Distribution for Python (IDP) through conda. IDP consists of a large set of commonly used mathematical and statistical Python packages that are optimized for Intel architectures.

  1. Install the latest version of Anaconda.
     • Choose the Python 3.5 version.
  2. From the shell prompt (on Windows, use Anaconda Prompt), execute these commands:
    conda create --name idp intelpython3_full python=3 -c intel
    source activate idp (on Linux and OS X)
    activate idp (on Windows)

The IDP environment is now installed with the necessary packages and activated, ready to run these notebooks.

More detailed instructions can be found in this online article.
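One quick way to confirm that the activated environment is usable is to check from Python whether the packages can be located. The snippet below is a minimal sketch: it assumes the PyDAAL package is importable under the top-level name `daal`, and the helper name `has_module` is hypothetical. It only looks the module up and does not import it.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


# "daal" is assumed to be the top-level package name used by PyDAAL.
for module_name in ("daal", "numpy"):
    status = "found" if has_module(module_name) else "missing"
    print(f"{module_name}: {status}")
```

If `daal` is reported as missing, the most likely cause is that the `idp` environment is not the active one in the current shell.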

Each helper function class bundles the stages of the machine learning model building process into one unit. These classes are built on PyDAAL’s data management and algorithm libraries and cover a complete model deployment.

Stages supported by each helper function class

  1. Training
  2. Prediction
  3. Model Evaluation and Quality Metrics
  4. Trained Model Storage and Portability

More details on all these stages are available in Volume 3.
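To make the four stages concrete, here is a minimal sketch of how one class can bundle them. It is not the repository's helper class and does not use PyDAAL: the class name `SimpleLinearRegressionHelper`, its method names, and the pure-Python least-squares fit are all assumptions chosen for illustration, and `pickle` stands in for whatever storage mechanism the real helpers use.

```python
import os
import pickle
import tempfile


class SimpleLinearRegressionHelper:
    """Hypothetical stand-in that bundles the four stages for y = a*x + b."""

    def __init__(self):
        self.slope = None
        self.intercept = None

    def training(self, xs, ys):
        # Stage 1: ordinary least squares for a single feature.
        n = len(xs)
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        self.slope = sxy / sxx
        self.intercept = mean_y - self.slope * mean_x
        return self

    def predict(self, xs):
        # Stage 2: apply the fitted line to new inputs.
        return [self.slope * x + self.intercept for x in xs]

    def rmse(self, xs, ys):
        # Stage 3: one quality metric, the root mean squared error.
        preds = self.predict(xs)
        return (sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)) ** 0.5

    def save(self, path):
        # Stage 4: store the trained parameters on disk.
        with open(path, "wb") as handle:
            pickle.dump((self.slope, self.intercept), handle)

    @classmethod
    def load(cls, path):
        # Stage 4 (continued): restore a trained model from disk.
        model = cls()
        with open(path, "rb") as handle:
            model.slope, model.intercept = pickle.load(handle)
        return model


# Usage on a tiny made-up dataset that lies exactly on y = 2x + 1.
xs, ys = [1, 2, 3, 4], [3, 5, 7, 9]
model = SimpleLinearRegressionHelper().training(xs, ys)
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
model.save(path)
restored = SimpleLinearRegressionHelper.load(path)
print(restored.predict([5]), restored.rmse(xs, ys))
```

Keeping all stages behind one object means a notebook can train, evaluate, and reload a model without repeating setup code, which is the convenience the helper classes aim for.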

Currently, helper function classes are provided for:

  1. Linear Regression
  2. SVM: binary and multi-class classifiers

For practice, usage examples that apply these helper function classes to sample datasets are also provided.

PyDAAL APIs have been used to build Python modules that support common operations on DAAL's Data Management library.

Import the customUtils module to explore the basic utilities it provides for data retrieval and manipulation with DAAL's Data Management library:

  1. getArrayFromNT(): Extracts a numpy array from a numeric table
  2. getBlockOfNumericTable(): Slices a block of a numeric table for a specific range of rows and columns
  3. getBlockOfCols(): Extracts a block of a numeric table for a specific range of columns
  4. getNumericTableFromCSV(): Reads a CSV file into a numeric table
  5. serialize(): Serializes input data and stores the result in a local variable or on disk
  6. deserialize(): Deserializes serialized data from a local variable or from disk
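The operations listed above can be pictured with plain Python data structures. The sketch below does not call customUtils or PyDAAL, and it does not reproduce their signatures: a list of rows stands in for a numeric table, and the function names `table_from_csv`, `block_of_table`, `to_bytes`, and `from_bytes` are hypothetical. It only mirrors the described behavior: read CSV data, slice a block by row and column range, and round-trip the result through serialization.

```python
import csv
import io
import pickle
import zlib


def table_from_csv(text):
    """Parse CSV text into a list of rows of floats (stand-in for a numeric table)."""
    return [[float(v) for v in row] for row in csv.reader(io.StringIO(text)) if row]


def block_of_table(table, rows, cols):
    """Return the sub-table for half-open (start, stop) row and column ranges."""
    (r0, r1), (c0, c1) = rows, cols
    return [row[c0:c1] for row in table[r0:r1]]


def to_bytes(obj):
    """Serialize an object to compressed bytes held in a local variable."""
    return zlib.compress(pickle.dumps(obj))


def from_bytes(data):
    """Restore an object from bytes produced by to_bytes()."""
    return pickle.loads(zlib.decompress(data))


table = table_from_csv("1,2,3\n4,5,6\n7,8,9\n")
block = block_of_table(table, rows=(0, 2), cols=(1, 3))
restored = from_bytes(to_bytes(block))
print(block)     # [[2.0, 3.0], [5.0, 6.0]]
print(restored == block)  # True
```

The real utilities operate on DAAL numeric tables rather than lists, so consult the customUtils module itself for the actual argument names and return types.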

These tutorials are spread across a collection of Jupyter notebooks that combine a theoretical explanation of each algorithm with an interactive command shell for running it through the PyDAAL API.

Tutorial Notebooks

Data files used in the tutorials are in the mldata folder. These data files are downloaded from the UCI Machine Learning Repository.

Contributors

preethivenkatesh, daaltces, zhangzhang10, fschlimb, vika-f
