shadensmith / torch-blocksparse

This project forked from ptillet/torch-blocksparse

Block-sparse primitives for PyTorch

License: MIT License

Python 96.03% Dockerfile 0.71% C++ 3.25%

Torch-Blocksparse

Block-sparse operations for PyTorch

Supported Operations

The following features are supported:

Convolutions with block-sparse weights: the layout has format [K//block, C//block, R, S] (K output channels, C input channels, R x S filter). Padding and stride are supported.
Sparse MultiHead Attention (https://arxiv.org/abs/1904.10509)
Batched Matrix Multiplication: SPARSE = op(DENSE) x op(DENSE)
Batched Matrix Multiplication: DENSE = op(SPARSE) x op(DENSE)
Batched Matrix Multiplication: DENSE = op(DENSE) x op(SPARSE)
Softmax: SPARSE = Softmax(SPARSE)

where op() is identity or transposition.

Inputs are FP32 or FP16 (with tensor cores).
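
To make the matrix-multiplication modes listed above concrete, here is a small dependency-free sketch of what SDD and DSD mean on plain Python lists. The helper names (expand_layout, sdd_reference, dsd_reference) are hypothetical and are not part of torch_blocksparse. The sketch keeps every matrix dense and simply masks blocks, whereas a real block-sparse kernel would presumably store and compute only the non-zero blocks.

```python
def expand_layout(layout, block):
    """Expand a block layout (rows of 0/1 flags) into an element-wise 0/1 mask."""
    mask = []
    for block_row in layout:
        # Repeat every flag `block` times along the columns...
        row = [bit for bit in block_row for _ in range(block)]
        # ...and repeat the resulting row `block` times along the rows.
        mask.extend(list(row) for _ in range(block))
    return mask


def matmul(a, b):
    """Plain dense matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_mask(matrix, mask):
    """Zero out every element whose mask bit is 0."""
    return [[v * m for v, m in zip(vr, mr)] for vr, mr in zip(matrix, mask)]


def sdd_reference(a, b, layout, block):
    """SPARSE = DENSE x DENSE: keep only the output blocks flagged in `layout`."""
    return apply_mask(matmul(a, b), expand_layout(layout, block))


def dsd_reference(a, b, layout, block):
    """DENSE = SPARSE x DENSE: blocks of `a` outside `layout` count as zero."""
    return matmul(apply_mask(a, expand_layout(layout, block)), b)


block = 2
layout = [[1, 0],
          [0, 1]]  # 2 x 2 blocks, i.e. a 4 x 4 element mask
a = [[1, 2, 3, 4] for _ in range(4)]
identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

c_sdd = sdd_reference(a, identity, layout, block)
c_dsd = dsd_reference(a, identity, layout, block)
```

With an identity right-hand side both modes return a with the off-diagonal blocks zeroed, which makes the effect of the layout easy to inspect by eye.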

Installation

Torch-Blocksparse depends on CUDA 10.1 and the Triton language and compiler, which requires llvm-{8,9}.

sudo apt-get install llvm-{8,9}-dev # Ubuntu

You can then install the latest stable version from pip:

pip install torch-blocksparse

Or install the latest development version from source:

python setup.py install

Usage

import torch
import torch_blocksparse

# Z: non-sparse batch dimension
# H: sparse batch dimension
# M: row dimension
# N: column dimension
# K: inner (reduction) dimension
Z, H, M, N, K = 4, 2, 256, 512, 384
# a is stored as (K, M) because trans_a=True below transposes it to (M, K)
a = torch.rand((Z, H, K, M), dtype=torch.float32).cuda()
b = torch.rand((Z, H, K, N), dtype=torch.float32).cuda()
# create sparsity layout
block = 16
layout = torch.randint(0, 2, (H, M//block, N//block))
# create object for Sparse = trans(Dense) x Dense (sdd)
# some overhead there as it pre-computes look-up tables 
# internally needed by GPU kernels
dot = torch_blocksparse.MatMul(layout, block, 'sdd', trans_a=True, trans_b=False)
c = dot(a, b)
# create object for Sparse = softmax(Sparse)
softmax = torch_blocksparse.Softmax(layout, block)
d = softmax(c)
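
The usage example draws a random layout. In practice the layout is usually structured, so the sketch below builds a causal (block lower-triangular) layout and shows what a softmax restricted to the kept entries computes. It is a plain-Python illustration, not library code: causal_block_layout and masked_row_softmax are hypothetical helpers, and it assumes that the sparse softmax normalizes each row over the kept blocks only. A nested list like this could be turned into a layout tensor with torch.tensor(layout).

```python
import math


def causal_block_layout(heads, n_blocks):
    """Lower-triangular block layout: block (i, j) is kept when j <= i."""
    return [[[1 if j <= i else 0 for j in range(n_blocks)]
             for i in range(n_blocks)]
            for _ in range(heads)]


def masked_row_softmax(row, row_mask):
    """Softmax over the entries of `row` whose mask bit is 1; the rest stay 0."""
    kept = [x for x, m in zip(row, row_mask) if m]
    if not kept:
        return [0.0] * len(row)
    peak = max(kept)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) if m else 0.0 for x, m in zip(row, row_mask)]
    total = sum(exps)
    return [e / total for e in exps]


layout = causal_block_layout(heads=2, n_blocks=4)
# Fraction of blocks that are kept for one head: 10 of 16.
density = sum(map(sum, layout[0])) / (4 * 4)
# The large masked-out scores do not influence the result.
probs = masked_row_softmax([0.0, 0.0, 5.0, 5.0], [1, 1, 0, 0])
```

The density figure is a quick way to estimate how much work a layout saves relative to the dense computation.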

Performance

[Figure: performance of this package compared to OpenAI blocksparse for the DDS layout (dense = dense x sparse) with square, non-transposed inputs.]

The file test.py includes simple benchmarking code.
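
When reading benchmark numbers for block-sparse kernels, it helps to count only the work done on non-zero blocks. The sketch below is not taken from test.py; it is a hypothetical helper pair that counts FLOPs for a given layout and times an arbitrary callable. For GPU code, passing sync=torch.cuda.synchronize would be needed so that asynchronous kernels are finished before the clock is read.

```python
import time


def blocksparse_matmul_flops(layout, block, dense_dim, batch=1):
    """FLOPs when only the non-zero blocks of `layout` are touched.

    `layout` is a nested list [heads][block_rows][block_cols] of 0/1 flags.
    `dense_dim` is the dimension that is not tiled by the layout: the
    reduction dimension for SDD, or the free dimension of the dense operand
    for DSD and DDS. Each kept block costs 2 * block * block * dense_dim
    FLOPs (one multiply and one add per term) per batch element.
    """
    nnz = sum(bit for head in layout for row in head for bit in row)
    return 2 * batch * nnz * block * block * dense_dim


def time_call(fn, repeats=10, sync=None):
    """Average wall-clock seconds per call; `sync` runs before each clock read."""
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    if sync is not None:
        sync()
    return (time.perf_counter() - start) / repeats


layout = [[[1, 0],
           [1, 1]]]  # one head, three non-zero blocks
flops = blocksparse_matmul_flops(layout, block=16, dense_dim=64, batch=4)
seconds = time_call(lambda: sum(range(1000)), repeats=5)
```

Dividing flops by the measured seconds for the real kernel gives an effective throughput that can be compared across layouts of different density.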

Contributors

arashashari, gangiman, ptillet
