
b0nes164 / shaderonesweep


A compute shader implementation of the OneSweep sorting algorithm.

License: MIT License

Languages: HLSL 73.44%, C# 26.56%
Topics: compute-shader, gpgpu, gpgpu-computing, hlsl, parallel-computing, parallel-sorting, radix-sort, shader, sorting, unity

shaderonesweep's Introduction

NOTICE: This repository has been archived. Development and maintenance of its contents have moved to https://github.com/b0nes164/GPUSorting.

ShaderOneSweep

This project is an HLSL compute shader implementation of the current state-of-the-art GPU sorting algorithm, Adinets and Merrill's OneSweep, an LSD radix sort that uses Merrill and Garland's Chained Scan with Decoupled Lookback to reduce the overall global data movement during a digit-binning pass from $3n$ to $2n$.
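For readers new to LSD radix sorting, the digit-binning pass mentioned above can be sketched on the CPU. This is a minimal Python reference under stated assumptions, not the repository's shader: the function name `lsd_radix_sort` and the 8-bit digit width are illustrative choices. In this naive form each pass reads the keys twice (histogram, then scatter) and writes them once, which matches the $3n$ figure; the sketch does not attempt the chained scan that OneSweep uses to avoid the extra read.

```python
def lsd_radix_sort(keys, radix_bits=8, key_bits=32):
    """CPU reference: sort unsigned integers, least significant digit first."""
    radix = 1 << radix_bits
    mask = radix - 1
    src = list(keys)
    for shift in range(0, key_bits, radix_bits):
        # 1. Histogram: count the keys that fall into each digit bin (read n).
        hist = [0] * radix
        for k in src:
            hist[(k >> shift) & mask] += 1
        # 2. Exclusive prefix sum: first output index of each bin.
        offsets = [0] * radix
        total = 0
        for d in range(radix):
            offsets[d] = total
            total += hist[d]
        # 3. Stable scatter: move each key to its bin (read n, write n).
        dst = [0] * len(src)
        for k in src:
            d = (k >> shift) & mask
            dst[offsets[d]] = k
            offsets[d] += 1
        src = dst
    return src
```

Because every scatter is stable, the order established by earlier (less significant) digits survives later passes, which is what makes the least-significant-digit-first ordering correct.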

On a 2080 Super, with an input of $2^{28}$ uniform random 32-bit keys, this implementation achieves a harmonic mean throughput of 9.55 G keys/sec, compared with the 10.9 G keys/sec achieved by the CUDA CUB library (roughly 88% of CUB's throughput).
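The source does not say how individual runs were aggregated beyond naming the harmonic mean. As a hedged illustration of why that mean suits throughput figures: when every run sorts the same number of keys, the harmonic mean of the per-run rates equals total keys divided by total time. The timings below are hypothetical and are not measurements from this project.

```python
from statistics import harmonic_mean

def throughput_from_times(n_keys, seconds_per_run):
    """Aggregate throughput: total keys sorted divided by total elapsed time."""
    return n_keys * len(seconds_per_run) / sum(seconds_per_run)

n_keys = 1 << 28                     # input size quoted in the text
hypothetical_times = [0.030, 0.025]  # seconds per run, made up for illustration
rates = [n_keys / t for t in hypothetical_times]

hm = harmonic_mean(rates)
total = throughput_from_times(n_keys, hypothetical_times)
```

An arithmetic mean of the same rates would come out higher, because it gives the fast run the same weight as the slow one even though the slow run takes more of the total time.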

To Use This Project

  1. Download or clone the repository.
  2. Drag the contents of src into a desired folder within a Unity project.
  3. Each sort has a compute shader and a dispatcher. Attach the desired sort's dispatcher to an empty game object. All sort dispatchers are named SortNameHere.cs.
  4. Attach the matching compute shader to the game object. All compute shaders are named SortNameHere.compute. The dispatcher will return an error if you attach the wrong shader.
  5. Ensure the slider is set to a non-zero value.


shaderonesweep's People

Contributors

b0nes164, initialneil

shaderonesweep's Issues

minimum e_size

Is there a minimum size for the array to be sorted? In OneSweepKeyValue.compute, it seems that G_HIST_PART_SIZE gets set to zero if e_size is smaller than G_HIST_TBLOCKS (2048).

thanks!
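The behaviour described in this issue is what truncating integer division would produce. The sketch below assumes, without having checked the shader source, that the partition size is derived as `e_size / G_HIST_TBLOCKS`; the helper names are hypothetical, and the ceiling-division variant is a common remedy rather than the repository's actual fix.

```python
G_HIST_TBLOCKS = 2048  # value quoted in the issue

def part_size_floor(e_size, blocks=G_HIST_TBLOCKS):
    """Truncating division: yields 0 whenever e_size < blocks."""
    return e_size // blocks

def part_size_ceil(e_size, blocks=G_HIST_TBLOCKS):
    """Ceiling division: at least 1 for any non-empty input."""
    return (e_size + blocks - 1) // blocks
```

With ceiling division the last thread blocks may have fewer (or zero) elements to process, so a kernel using it would also need a bounds check against `e_size`.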

keys and values

Great work!

The current b_sort holds the keys to be sorted. How can we perform a key-value sort, i.e. one where the value array is rearranged to follow the sorted order of the keys?
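In general terms, a radix sort becomes a key-value sort by writing each value to the same destination index as its key during the scatter step. The following is a CPU sketch of that idea in Python, not the repository's shader; `radix_pass_pairs` and `sort_pairs` are hypothetical names.

```python
def radix_pass_pairs(keys, values, shift, radix_bits=8):
    """One stable digit-binning pass that moves values alongside their keys."""
    radix = 1 << radix_bits
    mask = radix - 1
    # offsets[d] becomes the number of keys whose digit is smaller than d.
    offsets = [0] * (radix + 1)
    for k in keys:
        offsets[((k >> shift) & mask) + 1] += 1
    for d in range(radix):
        offsets[d + 1] += offsets[d]
    out_keys = [0] * len(keys)
    out_vals = [None] * len(values)
    for k, v in zip(keys, values):
        d = (k >> shift) & mask
        dst = offsets[d]
        out_keys[dst] = k   # the key goes to its bin ...
        out_vals[dst] = v   # ... and the value goes to the same index
        offsets[d] = dst + 1
    return out_keys, out_vals

def sort_pairs(keys, values, key_bits=32, radix_bits=8):
    """Full LSD sort of (key, value) pairs by key."""
    for shift in range(0, key_bits, radix_bits):
        keys, values = radix_pass_pairs(keys, values, shift, radix_bits)
    return keys, values
```

Only the keys are inspected for digits; the values are pure payload, so the extra cost per pass is one additional read and write per element.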

Warp Level Multisplit

This implementation is brilliant and simple. I've tried reading through the CUB implementation and wow. This is so much better.

I was hoping you could answer a question for me about the block of code below.


uint4 bits = countbits(firstOffsets << LANE_MASK - LANE);
int index = (firstKeys.x & RADIX_MASK) + (WAVE_INDEX << RADIX_LOG);
uint prev = g_waveHists[index];
if (bits.x == 1)
    g_waveHists[index] += countbits(firstOffsets.x);
firstOffsets.x = prev + bits.x - 1;

Here is where you are calculating the wave offsets by using the WaveActiveBallot to check whether the thread pulled a key with a set bit.

But what is the code block above doing? I don't understand - for example - why you check to see if the counted bit is 1. The lane mask in the first few lines is also puzzling to me as it seems to shift away all of the bits in the lowest lanes.

Would you mind if I try my hand at implementing a version of your code in Metal? MSL is sorely lacking in freely available algorithms such as this one and your implementation is really impressive.

Originally posted by @bentoboxlimited in #1 (comment)
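One possible reading of the quoted snippet can be emulated on the CPU. The sketch assumes (none of this is confirmed by the source) a 32-lane wave, `LANE_MASK` equal to 31, and that `firstOffsets` holds the ballot mask of lanes whose key has the same digit. Under those assumptions, shifting left by `LANE_MASK - LANE` discards the bits of higher lanes, so the popcount is the number of matching lanes at or below the current one. A count of 1 identifies the lowest matching lane, which adds the whole group's size to the wave histogram on behalf of all of them.

```python
WAVE_SIZE = 32             # assumed wave width
LANE_MASK = WAVE_SIZE - 1  # assumed to be 31
RADIX_MASK = 0xFF          # assumed 8-bit digits

def popcount(x):
    return bin(x & 0xFFFFFFFF).count("1")

def wave_multisplit(keys, wave_hist):
    """Emulate one wave: return each lane's offset and update wave_hist."""
    assert len(keys) == WAVE_SIZE
    digits = [k & RADIX_MASK for k in keys]
    # Ballot result: per lane, the mask of lanes holding the same digit.
    peers = [sum(1 << o for o in range(WAVE_SIZE) if digits[o] == digits[lane])
             for lane in range(WAVE_SIZE)]
    # Lanes run in lockstep, so every lane reads the histogram before any write.
    prev = [wave_hist[d] for d in digits]
    # Matching lanes at or below this lane (inclusive rank within the group).
    rank = [popcount(peers[lane] << (LANE_MASK - lane)) for lane in range(WAVE_SIZE)]
    for lane in range(WAVE_SIZE):
        if rank[lane] == 1:  # lowest lane of the group commits the group total
            wave_hist[digits[lane]] += popcount(peers[lane])
    return [prev[lane] + rank[lane] - 1 for lane in range(WAVE_SIZE)]
```

For example, if lanes 0, 4, 8, ... all hold digit 0, they receive offsets 0, 1, 2, ... in lane order, and a second call with the same keys continues from where the histogram left off.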

Support for 64-bit KV sort?

I'm wondering if the shaders currently have support for 64-bit keys with payloads? If not, how would I go about implementing this?

Thanks for the great work!
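The source does not state whether 64-bit keys are supported. As a hedged sketch of one way to approach it, a 64-bit key can be stored as two 32-bit words, with each pass taking its 8-bit digit from whichever word contains it; a full sort then needs eight passes instead of four. The names `split64` and `digit64` are hypothetical, and the two-word layout is an assumption rather than the repository's design.

```python
RADIX_BITS = 8
RADIX_MASK = (1 << RADIX_BITS) - 1
PASSES_64 = 64 // RADIX_BITS  # eight digit-binning passes for 64-bit keys

def split64(key):
    """Split a 64-bit key into (low word, high word)."""
    return key & 0xFFFFFFFF, (key >> 32) & 0xFFFFFFFF

def digit64(lo, hi, pass_index):
    """Digit for the given pass; passes 0-3 read lo, passes 4-7 read hi."""
    shift = pass_index * RADIX_BITS
    word = lo if shift < 32 else hi
    return (word >> (shift & 31)) & RADIX_MASK
```

Because 8 divides 32 evenly, no digit straddles the two words, so the rest of a pass (histogram, prefix sum, scatter of both key words plus the payload) can stay structurally the same.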
