yarshev / unidist

This project forked from modin-project/unidist

Unified Distributed Execution

License: Apache License 2.0

Python 99.16% Cython 0.31% C++ 0.53%

Unified Distributed Execution

What is unidist?

unidist is a framework intended to provide a unified API for distributed execution by supporting various performant execution backends. At the moment, the following backends are supported under the hood: MPI, Dask, Ray and Python Multiprocessing.

unidist is designed to work in a task-based parallel model.

The framework also provides a Python Sequential backend (pyseq) that can be used for debugging.
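
To make the task-based model more concrete, here is a minimal sketch that uses only Python's standard library (`concurrent.futures`), not unidist itself. It is an analogy under stated assumptions: submitting a task returns a handle right away, and the actual result is fetched later. The names `square` and `run_tasks` are illustrative and do not come from unidist.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    """A small, independent unit of work (a "task")."""
    return x * x


def run_tasks(values, max_workers=4):
    """Submit one task per value and collect the results in submission order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # submit() returns immediately with a Future: a handle to a result
        # that may not have been computed yet.
        futures = [pool.submit(square, v) for v in values]
        # result() blocks until the corresponding task has finished.
        return [f.result() for f in futures]


if __name__ == "__main__":
    print(run_tasks(range(5)))
```

Replacing the pool with a plain loop that calls `square` directly gives the same results one task at a time, which is the kind of behaviour that makes a sequential backend convenient for debugging.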

Installation

Using pip

unidist can be installed with pip on Linux, Windows and macOS:

pip install unidist # Install unidist with dependencies for Python Multiprocessing and Python Sequential backends

unidist can also be used with the MPI, Dask or Ray execution backends. If you don't have MPI, Dask or Ray installed, install unidist with one of the following targets:

pip install unidist[all] # Install unidist with dependencies for all the backends
pip install unidist[mpi] # Install unidist with dependencies for MPI backend
pip install unidist[dask] # Install unidist with dependencies for Dask backend
pip install unidist[ray] # Install unidist with dependencies for Ray backend

unidist automatically detects which execution backends are installed and uses one of them to schedule computation.
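
If you want to check for yourself which backend packages are importable in the current environment, something like the sketch below may help. It is not unidist's own detection logic, which this README does not describe. The mapping from backend names to import names (`mpi4py`, `dask`, `ray`) and the function `installed_backends` are assumptions made for illustration.

```python
from importlib.util import find_spec

# Assumed mapping from a backend name to the top-level module it relies on.
BACKEND_MODULES = {"mpi": "mpi4py", "dask": "dask", "ray": "ray"}


def installed_backends(modules=BACKEND_MODULES):
    """Return the backend names whose module can be located, without importing it."""
    return [name for name, module in modules.items() if find_spec(module) is not None]


if __name__ == "__main__":
    # Prints a list such as ['dask'] depending on what is installed.
    print(installed_backends())
```

Note that finding `mpi4py` only shows that the Python wrapper is present; it does not prove that a working MPI implementation is available.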

Note: There are different MPI implementations, each of which can be used as a backend in unidist. The unidist[mpi] target installs the mpi4py package, which is just a Python wrapper for MPI. To run unidist on MPI, you need a working MPI implementation and certain software installed beforehand. Refer to the Installation page of the mpi4py documentation for details. You can also find some instructions on the MPI backend page.

Using conda

To install unidist with dependencies for the MPI and Dask execution backends into a conda environment, use the following command:

conda install unidist-mpi unidist-dask -c conda-forge

All backends can be made available in a conda environment by specifying:

conda install unidist-all -c conda-forge

or explicitly:

conda install unidist-mpi unidist-dask unidist-ray -c conda-forge

Note: There are different MPI implementations, each of which can be used as a backend in unidist. By default, the unidist-mpi package installs a default MPI implementation, which comes with the mpi4py package and is ready to use. The conda dependency solver decides which MPI implementation is installed. If you want to use a specific version of MPI, you can install the core dependencies for the MPI backend together with that version using conda install unidist-mpi <mpi>, as shown on the Installation page of the mpi4py documentation. That said, for the best performance it is highly encouraged to use your own MPI binaries, as described in the Using External MPI Libraries section of the conda-forge documentation.

For more information, refer to the Installation section.

Choosing an execution backend

If you want to choose a specific execution backend to run on, you can set the environment variable UNIDIST_BACKEND and unidist will do computation with that backend:

export UNIDIST_BACKEND=mpi  # unidist will use MPI
export UNIDIST_BACKEND=dask  # unidist will use Dask
export UNIDIST_BACKEND=ray  # unidist will use Ray

This can also be done within a notebook/interpreter before you initialize unidist:

from unidist.config import Backend

Backend.put("mpi")  # unidist will use MPI
Backend.put("dask")  # unidist will use Dask
Backend.put("ray")  # unidist will use Ray

If you have installed all the execution backends and haven't specified one, MPI is used by default. Currently, almost all MPI implementations require the mpiexec command to run an MPI program. If you use a backend other than MPI, you run the program as a regular Python script (see below).
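
The launch rule described above can be sketched as a small helper. This is an illustration, not part of unidist. The function `launch_command` is hypothetical, and it assumes `mpi` as the fallback when `UNIDIST_BACKEND` is unset, which matches the stated default only when all backends are installed.

```python
import os
import shlex


def launch_command(script, backend=None, n_procs=1):
    """Build the shell command that starts `script` for the given backend."""
    if backend is None:
        # Assumption: fall back to "mpi" when the variable is not set.
        backend = os.environ.get("UNIDIST_BACKEND", "mpi")
    command = ["python", script]
    if backend.lower() == "mpi":
        # MPI programs are typically started through mpiexec.
        command = ["mpiexec", "-n", str(n_procs)] + command
    # shlex.join quotes arguments that contain spaces or shell metacharacters.
    return shlex.join(command)


if __name__ == "__main__":
    print(launch_command("script.py", backend="mpi"))   # mpiexec -n 1 python script.py
    print(launch_command("script.py", backend="dask"))  # python script.py
```

Keeping the decision in one place avoids mixing the two launch styles by accident when the same script is run under different backends.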

Usage

# script.py

import unidist
unidist.init() # MPI backend is used by default

@unidist.remote
def foo(x):
    return x * x

# This will run `foo` on a pool of workers in parallel;
# `refs` will contain object references to actual data
refs = [foo.remote(i) for i in range(5)]
# To get the data call `unidist.get(...)`
print(unidist.get(refs))

Run script.py with:

$ mpiexec -n 1 python script.py  # for MPI backend
# $ python script.py  # for any other supported backend
[0, 1, 4, 9, 16]  # output

For more examples, refer to the Getting Started section of our documentation.

Powered by unidist

unidist is meant not only to be used directly by users to get better performance in their workloads, but also to serve as a core component of other libraries, powering them with performant execution backends. Refer to the Libraries powered by unidist section of the Using Unidist page for more information on which libraries already use unidist.

Full Documentation

Visit the complete documentation on readthedocs: https://unidist.readthedocs.io.
