delrocco / cellcputest
A matrix multiply and image convolution on the STI Cell processor (Sony PlayStation 3), circa 2008.
License: MIT License
----------------------------------------
CELL CPU TEST
----------------------------------------

A matrix multiply and image convolution on the STI Cell processor (Sony PlayStation 3), circa 2008.

----------------------------------------
NOTES
----------------------------------------

Both implementations use double buffering and all the available SPEs. However, they don't use vectorization (SIMD) efficiently. I would have liked to get that working, but didn't have the time. Also, convolution only supports filter sizes that are powers of two, because of alignment issues when DMA-ing chunks that are not multiples of 16 bytes.

---------------------------------------
a. MATRIX MULTIPLY
---------------------------------------

matrix_spu.c - 77 statements
matrix_spu.assembly - 223 instructions

Matrix B is transposed on the PPU so that the SPEs can read rows instead of columns. Each SPE works on certain rows of the solution matrix (SPE 0 works on rows 0,6,12,18 and SPE 1 works on rows 1,7,13,19, etc). Below is the algorithm for each SPE.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For each row to work on
    dma equivalent row from input matrix A
    For each "column" in matrix B (actually a row because it's transposed)
        dma "column"
        sum up row x column
        set appropriate element in resulting row
    dma resulting row back to output matrix on PPU
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Double buffering is done between the column reads to overlap them with computation. And I don't block on writing the result row back until the next write. Known issues are the lack of vectorization and, since I read entire rows at a time, a maximum supported matrix size of 4096x4096: 4096 x 4 bytes = 16 KB, the largest single DMA transfer.

---------------------------------------
b. IMAGE CONVOLUTION
---------------------------------------

convolution_spu.c - 124 statements
convolution_spu.assembly - 445 instructions

First of all, I flipped the algorithm to:
    h[j,k]*a[m+j,n+k]
instead of
    h[j,k]*a[m-j,n-k]
Thus the values convolve to the top-left instead of the bottom-right.
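As a rough illustration of the flipped form (not the actual SPU code), the plain C sketch below computes c[m,n] as the sum of h[j,k]*a[m+j,n+k]. The names convolve_flipped and pixel_or_zero, the float pixel type, and the row-major layout are assumptions made for this example. Reads past the right and bottom edges return zero, which has the same effect as zero padding on those two sides.

```c
#include <assert.h>
#include <stddef.h>

/* Read a pixel, treating everything past the right and bottom edges as zero.
 * This stands in for physically padding the image with zeros. */
static float pixel_or_zero(const float *a, size_t rows, size_t cols,
                           size_t r, size_t c)
{
    return (r < rows && c < cols) ? a[r * cols + c] : 0.0f;
}

/* Flipped form: c[m,n] = sum over j,k of h[j,k] * a[m+j, n+k].
 * a and c are rows x cols, h is fsize x fsize, all row-major (assumed). */
void convolve_flipped(const float *a, float *c, size_t rows, size_t cols,
                      const float *h, size_t fsize)
{
    assert(a && c && h && fsize > 0);
    for (size_t m = 0; m < rows; m++) {
        for (size_t n = 0; n < cols; n++) {
            float sum = 0.0f;
            for (size_t j = 0; j < fsize; j++)
                for (size_t k = 0; k < fsize; k++)
                    sum += h[j * fsize + k] *
                           pixel_or_zero(a, rows, cols, m + j, n + k);
            c[m * cols + n] = sum;
        }
    }
}
```

Because the indices only ever grow (m+j, n+k), no read can land above or to the left of the image, so there is no read offset to track.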
Flipping the indices makes it easier to simply pad the matrix on the right and bottom with zeros, instead of padding on the top and left sides, and you don't have to worry about an offset into the image when reading. Essentially the algorithm is the same, just a little easier to program.

Each SPE only works on certain rows of tiles of the resulting image (SPE 0 works on tile rows 0,6,12,18 and SPE 1 works on tile rows 1,7,13,19, etc). Below is the algorithm for each SPE.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
First dma entire filter(h) into local store
For each tile row in image to work on
    For each tile in that row
        dma a "block" from input matrix A (tile+pad to account for h overlap)
        do convolution algorithm, set results in a local store tile
        dma resulting tile back to PPU
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Double buffering is done between tile reads from image A. And I don't block on writing resulting tiles back to image C until the next write. Known issues are the lack of vectorization, and that it only supports filter sizes that are powers of two: 2x2, 4x4, 8x8 and 16x16.
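The per-SPE tile loop above can be sketched on an ordinary host CPU, with plain copies standing in for the DMA transfers. This is only an illustration under assumptions, not the original convolution_spu.c: TILE, MAXF, BLOCK, fetch_block and convolve_tiles are hypothetical names, pixels are assumed to be float, and the image dimensions are assumed to be multiples of the tile size. The two block buffers mirror the double-buffering structure, although on a host the "prefetch" is just a synchronous copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TILE 4                  /* hypothetical tile edge, in pixels */
#define MAXF 16                 /* largest filter edge listed above */
#define BLOCK (TILE + MAXF - 1) /* tile plus pad for the filter overlap */

/* Stand-in for the DMA read: copy a (TILE+f-1)^2 block whose top-left corner
 * is (r0, c0) into local storage, zero-filling past the right/bottom edges. */
static void fetch_block(const float *img, size_t rows, size_t cols,
                        size_t r0, size_t c0, size_t f,
                        float block[BLOCK][BLOCK])
{
    size_t edge = TILE + f - 1;
    for (size_t r = 0; r < edge; r++)
        for (size_t c = 0; c < edge; c++)
            block[r][c] = (r0 + r < rows && c0 + c < cols)
                              ? img[(r0 + r) * cols + (c0 + c)] : 0.0f;
}

/* Work done by one simulated SPE: tile rows spe_id, spe_id + num_spes, ...
 * rows and cols are assumed to be multiples of TILE. */
void convolve_tiles(const float *img, float *out, size_t rows, size_t cols,
                    const float *h, size_t f, size_t spe_id, size_t num_spes)
{
    float buf[2][BLOCK][BLOCK]; /* two buffers for double buffering */
    float tile[TILE][TILE];     /* local result tile */
    size_t tiles_per_row = cols / TILE;

    assert(f >= 1 && f <= MAXF && num_spes > 0);
    assert(rows % TILE == 0 && cols % TILE == 0);

    for (size_t tr = spe_id; tr < rows / TILE; tr += num_spes) {
        int cur = 0;
        fetch_block(img, rows, cols, tr * TILE, 0, f, buf[cur]);
        for (size_t tc = 0; tc < tiles_per_row; tc++) {
            /* Start the next read before computing on the current block.
             * On real hardware this read would be asynchronous. */
            if (tc + 1 < tiles_per_row)
                fetch_block(img, rows, cols, tr * TILE, (tc + 1) * TILE, f,
                            buf[1 - cur]);

            /* Flipped convolution on the current block. */
            for (size_t m = 0; m < TILE; m++)
                for (size_t n = 0; n < TILE; n++) {
                    float sum = 0.0f;
                    for (size_t j = 0; j < f; j++)
                        for (size_t k = 0; k < f; k++)
                            sum += h[j * f + k] * buf[cur][m + j][n + k];
                    tile[m][n] = sum;
                }

            /* Stand-in for the DMA write of the finished tile. */
            for (size_t m = 0; m < TILE; m++)
                memcpy(&out[(tr * TILE + m) * cols + tc * TILE], tile[m],
                       TILE * sizeof(float));
            cur = 1 - cur;      /* swap buffers */
        }
    }
}
```

Calling convolve_tiles once for each spe_id from 0 to num_spes - 1 fills the whole output image; on the Cell each of those calls would run on its own SPE.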