ptoxide

ptoxide is a crate that allows NVIDIA CUDA PTX code to be executed on any machine. It was created as a project to learn more about the CUDA execution model.

Kernels are executed by compiling them to a custom bytecode format, which is then executed inside of a virtual machine.

To see how the library works in practice, check out the example below, and take a look at the integration tests in the tests directory.

Try running cargo run --example times_two to see it in action!

Supported Features

ptoxide supports most fundamental PTX features, such as:

  • Global, shared, and local (stack) memory
  • (Recursive) function calls
  • Thread synchronization using barriers
  • Various arithmetic operations on integers and floating point values
  • One-, two-, and three-dimensional thread grids and blocks

These features are sufficient to execute the kernels found in the kernels directory, such as simple vector operations, matrix multiplication, and matrix transposition using a shared buffer.
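
To make the grid and block model concrete, here is a minimal CPU sketch of how a one-dimensional launch maps threads to array elements. It is not ptoxide's implementation: `emulate_launch_1d` and `times_two_emulated` are hypothetical names chosen for illustration, and the "threads" run one after another instead of in parallel.

```rust
/// Calls `kernel(block_idx, block_dim, thread_idx)` once per thread of a 1D launch.
/// Hypothetical helper for illustration; not part of ptoxide's API.
fn emulate_launch_1d(grid_size: u32, block_size: u32, mut kernel: impl FnMut(u32, u32, u32)) {
    for block_idx in 0..grid_size {
        for thread_idx in 0..block_size {
            kernel(block_idx, block_size, thread_idx);
        }
    }
}

/// Doubles every element of `input`, the way a times-two kernel would.
fn times_two_emulated(input: &[f32], grid_size: u32, block_size: u32) -> Vec<f32> {
    let n = input.len();
    let mut output = vec![0.0; n];
    emulate_launch_1d(grid_size, block_size, |block_idx, block_dim, thread_idx| {
        // Same formula as blockIdx.x * blockDim.x + threadIdx.x in CUDA.
        let i = (block_idx * block_dim + thread_idx) as usize;
        // Threads whose index falls past the end of the data do nothing.
        if i < n {
            output[i] = 2.0 * input[i];
        }
    });
    output
}

fn main() {
    // 2 blocks of 4 threads cover 5 elements; the last 3 threads are idle.
    let result = times_two_emulated(&[1.0, 2.0, 3.0, 4.0, 5.0], 2, 4);
    println!("{:?}", result);
}
```

The bounds check matters because a launch always starts whole blocks, so the total thread count is usually larger than the number of elements.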

However, many features and instructions are still missing, and you will probably encounter todo!s and parsing errors when attempting to execute more complex programs. Pull requests to implement missing features are always greatly appreciated!

Internals

The code of the library itself is not yet well-documented. However, here is a general overview of the main modules comprising ptoxide:

  • The ast module implements the logic for parsing PTX programs.
  • The vm module defines a bytecode format and implements the virtual machine to execute it.
  • The compiler module implements a simple single-pass compiler to translate a PTX program given as an AST to bytecode.
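
The three stages listed above can be pictured with a toy pipeline. The sketch below is not ptoxide's code: the `Stmt` and `Instr` types and the `slot`, `compile`, and `run` functions are hypothetical, and the hand-built statement list is a drastically reduced stand-in for a parsed PTX program. It only illustrates the general idea of a single pass that lowers statements to register-based bytecode, followed by an interpreter loop.

```rust
use std::collections::HashMap;

/// Toy AST: every statement writes one named virtual register.
#[derive(Clone, Copy)]
enum Stmt {
    Mov { dst: &'static str, value: i64 },
    Add { dst: &'static str, a: &'static str, b: &'static str },
    Mul { dst: &'static str, a: &'static str, b: &'static str },
}

/// Toy bytecode: register names are resolved to slot indices.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Instr {
    Const { dst: usize, value: i64 },
    Add { dst: usize, a: usize, b: usize },
    Mul { dst: usize, a: usize, b: usize },
}

/// Returns the slot for `name`, allocating a new one on first use.
fn slot(slots: &mut HashMap<&'static str, usize>, name: &'static str) -> usize {
    let next = slots.len();
    *slots.entry(name).or_insert(next)
}

/// Single pass over the statements: one instruction is emitted per statement.
fn compile(program: &[Stmt]) -> (Vec<Instr>, HashMap<&'static str, usize>) {
    let mut slots = HashMap::new();
    let mut code = Vec::new();
    for &stmt in program {
        let instr = match stmt {
            Stmt::Mov { dst, value } => Instr::Const { dst: slot(&mut slots, dst), value },
            Stmt::Add { dst, a, b } => {
                let (a, b) = (slot(&mut slots, a), slot(&mut slots, b));
                Instr::Add { dst: slot(&mut slots, dst), a, b }
            }
            Stmt::Mul { dst, a, b } => {
                let (a, b) = (slot(&mut slots, a), slot(&mut slots, b));
                Instr::Mul { dst: slot(&mut slots, dst), a, b }
            }
        };
        code.push(instr);
    }
    (code, slots)
}

/// The "virtual machine": executes the instructions in order on a register file.
fn run(code: &[Instr], num_regs: usize) -> Vec<i64> {
    let mut regs = vec![0i64; num_regs];
    for &instr in code {
        match instr {
            Instr::Const { dst, value } => regs[dst] = value,
            Instr::Add { dst, a, b } => regs[dst] = regs[a] + regs[b],
            Instr::Mul { dst, a, b } => regs[dst] = regs[a] * regs[b],
        }
    }
    regs
}

fn main() {
    // r = (2 + 3) * 4, written as one statement per operation.
    let program = [
        Stmt::Mov { dst: "a", value: 2 },
        Stmt::Mov { dst: "b", value: 3 },
        Stmt::Add { dst: "sum", a: "a", b: "b" },
        Stmt::Mov { dst: "c", value: 4 },
        Stmt::Mul { dst: "r", a: "sum", b: "c" },
    ];
    let (code, slots) = compile(&program);
    let regs = run(&code, slots.len());
    println!("r = {}", regs[slots["r"]]);
}
```

Resolving names to slot indices at compile time means the interpreter loop only does array indexing, which is one common reason to compile to bytecode instead of walking the AST directly.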

Example

The following code snippet shows how to invoke a kernel that scales a vector of floats by a factor of 2. The full example is in the examples directory; run it with cargo run --example times_two.

use ptoxide::{Context, Argument, LaunchParams};

fn times_two(kernel: &str) {
    let a: Vec<f32> = vec![1., 2., 3., 4., 5.];
    let mut b: Vec<f32> = vec![0.; a.len()];

    let n = a.len();

    let mut ctx = Context::new_with_module(kernel).expect("compile kernel");

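    const BLOCK_SIZE: u32 = 256;
    // Round up so that a partially filled last block is still launched.
    let grid_size = (n as u32 + BLOCK_SIZE - 1) / BLOCK_SIZE;

    // Allocate buffers in the VM's memory for the input and the output.
    let da = ctx.alloc(n);
    let db = ctx.alloc(n);

    // Copy the input vector into the first buffer.
    ctx.write(da, &a);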
    ctx.run(
        LaunchParams::func_id(0)
            .grid1d(grid_size)
            .block1d(BLOCK_SIZE),
        &[
            Argument::ptr(da),
            Argument::ptr(db),
            Argument::U64(n as u64),
        ],
    ).expect("execute kernel");

    ctx.read(db, &mut b);
    // prints [2.0, 4.0, 6.0, 8.0, 10.0]
    println!("{:?}", b);
}
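
The `grid_size` expression in the example is an integer ceiling division. The following sketch works through a few sizes; `grid_size_for` is a hypothetical helper name, not a ptoxide function.

```rust
/// Smallest number of blocks such that `blocks * block_size >= n`.
/// Hypothetical helper for illustration.
fn grid_size_for(n: u32, block_size: u32) -> u32 {
    (n + block_size - 1) / block_size
}

fn main() {
    const BLOCK_SIZE: u32 = 256;
    for n in [5u32, 256, 257, 1000] {
        let grid = grid_size_for(n, BLOCK_SIZE);
        // Threads beyond `n` have no element to work on.
        let idle = grid * BLOCK_SIZE - n;
        println!("n = {n}: {grid} block(s), {idle} idle thread(s)");
    }
}
```

For the five-element vector above, this yields a single block of 256 threads, 251 of which have nothing to do. That is presumably why the element count is passed to the kernel as its third argument: it lets each thread check whether its index is in range.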

Reading PTX

To learn more about the PTX ISA, check out NVIDIA's documentation.

License

ptoxide is dual-licensed under the Apache License, Version 2.0 and the MIT license, at your option.
