tugrul512bit / cekirdekler

Multi-device OpenCL kernel load balancer and pipeliner API for C#. Uses a shared-distributed memory model to keep GPUs updated quickly while running the same kernel on all devices (for simplicity).

License: GNU General Public License v3.0

C# 100.00%
opencl-kernels iterative load-balancer pipelining multi-device gpgpu multi-gpu zero-copy gpu-computing gpu-acceleration

cekirdekler's Issues

Lazy compute

There is no lazy compute for now.

var compute1 = array1.queueCompute()
var compute2 = compute1.nextStep(array2.queueCompute()).compute()

A chained, lazy API like this could be useful because it needs fewer synchronizations.
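A minimal sketch of how such deferred execution might look, assuming a hypothetical `LazyCompute` type that is not part of Cekirdekler. Each step is only recorded, and nothing runs until `compute()` is called, so one synchronization point could cover the whole chain:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only: LazyCompute is not a Cekirdekler type.
public sealed class LazyCompute
{
    // Steps recorded so far, in execution order.
    private readonly List<Action> steps = new List<Action>();

    public LazyCompute(Action firstStep)
    {
        steps.Add(firstStep);
    }

    // Appends the recorded steps of another chain; nothing is executed here.
    public LazyCompute nextStep(LazyCompute other)
    {
        steps.AddRange(other.steps);
        return this;
    }

    // Runs every recorded step in order and returns how many were run.
    // A real implementation would enqueue all commands and synchronize once at the end.
    public int compute()
    {
        foreach (Action step in steps)
        {
            step();
        }
        int executed = steps.Count;
        steps.Clear();
        return executed;
    }
}
```

Here the `Action` delegates stand in for whatever work `queueCompute()` would record.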

Re-creating (and computing on) a C++ array wrapper in a loop throws CL_INVALID_MEM_OBJECT, but a prepared N-array of C++ arrays works

Root cause found: re-creating the array inside a loop can return the same pointer again (C++ / OS memory management), so the USE_HOST_PTR-flagged buffer throws an error because duplicate buffer objects end up using the same pointer.

Todo: in Cores/ClNumberCruncher, release the buffer that is bound to the hash code of the ClArray<T> being destructed.


  • C# probably generates the same hash after some iterations, which makes the API reuse an OpenCL buffer created with the USE_HOST_PTR flag that still holds the old, deleted array pointer; USE_HOST_PTR buffers need to be re-checked whenever they are accessed.
    or

  • Parallel.For and buffer reads/writes (or workers[i].kernelArgument) use overlapping (or even out-of-bounds) addresses, which throws an AggregateException_ctor_DefaultMessage error plus System.AccessViolationException.

  • There is no problem with C# arrays.

  • Probably caused by a USE_HOST_PTR buffer allocation failure, which is not error-checked yet.

  • Or the OpenCL implementation misbehaves when a pointer is deleted while it is still used by an OpenCL buffer created with CL_MEM_USE_HOST_PTR.
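One possible way to do the re-check described above, sketched under assumptions: `HostPtrBufferRegistry` and `StubBuffer` are hypothetical names, not Cekirdekler types, and the `IDisposable` stands in for an OpenCL buffer wrapper. The registry reuses a cached buffer only when both the owner hash and the host pointer match, and releases the stale buffer when either one has been reused:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an OpenCL buffer created with USE_HOST_PTR.
public sealed class StubBuffer : IDisposable
{
    public bool Disposed;
    public void Dispose() { Disposed = true; }
}

// Hypothetical sketch; not part of Cekirdekler.
public sealed class HostPtrBufferRegistry
{
    private sealed class Entry
    {
        public IntPtr HostPtr;
        public IDisposable Buffer;
    }

    private readonly Dictionary<int, Entry> byHash = new Dictionary<int, Entry>();
    private readonly Dictionary<IntPtr, int> hashByPtr = new Dictionary<IntPtr, int>();

    public int Count => byHash.Count;

    public IDisposable GetOrCreate(int ownerHash, IntPtr hostPtr, Func<IntPtr, IDisposable> createBuffer)
    {
        // Same owner and same memory: the cached buffer is still valid.
        if (byHash.TryGetValue(ownerHash, out Entry cached) && cached.HostPtr == hostPtr)
        {
            return cached.Buffer;
        }

        // The hash was reused by an array that lives at a different address.
        Release(ownerHash);

        // The address was reused by a different array.
        if (hashByPtr.TryGetValue(hostPtr, out int staleOwner))
        {
            Release(staleOwner);
        }

        IDisposable created = createBuffer(hostPtr);
        byHash[ownerHash] = new Entry { HostPtr = hostPtr, Buffer = created };
        hashByPtr[hostPtr] = ownerHash;
        return created;
    }

    // Intended to be called when the owning array wrapper is destructed.
    public void Release(int ownerHash)
    {
        if (byHash.TryGetValue(ownerHash, out Entry entry))
        {
            entry.Buffer.Dispose();
            byHash.Remove(ownerHash);
            hashByPtr.Remove(entry.HostPtr);
        }
    }
}
```

Whether this would resolve the reported error is not confirmed; it only illustrates the release-on-reuse idea from the Todo.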

Disposing unused buffers with warning message

The API creates a new buffer for each unique array passed as a parameter; with enough arrays, it could run out of resources.

  • LRU cache that holds at most N buffers (regardless of individual sizes), with a total size constraint
    (default = RAM / 2?)
  • save data to disk when a buffer is disposed, read it back from disk when the buffer is re-created
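A rough sketch of the LRU idea, assuming hypothetical `BufferLruCache` and `FakeClBuffer` types that do not exist in Cekirdekler. It enforces both a maximum buffer count and a total size limit, and prints a warning before disposing the least recently used buffer. Saving to disk is left out:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an OpenCL buffer wrapper.
public sealed class FakeClBuffer : IDisposable
{
    public bool Disposed;
    public void Dispose() { Disposed = true; }
}

// Hypothetical sketch; not part of Cekirdekler.
public sealed class BufferLruCache<TKey>
{
    private sealed class Entry
    {
        public TKey Key;
        public long SizeBytes;
        public IDisposable Buffer;
    }

    private readonly int maxBuffers;
    private readonly long maxTotalBytes;
    private long totalBytes;
    // Most recently used entry at the front, least recently used at the back.
    private readonly LinkedList<Entry> order = new LinkedList<Entry>();
    private readonly Dictionary<TKey, LinkedListNode<Entry>> map =
        new Dictionary<TKey, LinkedListNode<Entry>>();

    public BufferLruCache(int maxBuffers, long maxTotalBytes)
    {
        this.maxBuffers = maxBuffers;
        this.maxTotalBytes = maxTotalBytes;
    }

    public int Count => map.Count;
    public long TotalBytes => totalBytes;

    // Looks up a buffer and marks it as most recently used.
    public bool TryGet(TKey key, out IDisposable buffer)
    {
        if (map.TryGetValue(key, out LinkedListNode<Entry> node))
        {
            order.Remove(node);
            order.AddFirst(node);
            buffer = node.Value.Buffer;
            return true;
        }
        buffer = null;
        return false;
    }

    public void Add(TKey key, IDisposable buffer, long sizeBytes)
    {
        if (map.TryGetValue(key, out LinkedListNode<Entry> existing))
        {
            Remove(existing);
        }
        map[key] = order.AddFirst(new Entry { Key = key, SizeBytes = sizeBytes, Buffer = buffer });
        totalBytes += sizeBytes;

        // Evict from the back until both limits hold; the newest entry is never evicted.
        while (order.Count > 1 && (order.Count > maxBuffers || totalBytes > maxTotalBytes))
        {
            LinkedListNode<Entry> victim = order.Last;
            Console.Error.WriteLine("Warning: disposing unused buffer for key " + victim.Value.Key);
            Remove(victim);
        }
    }

    private void Remove(LinkedListNode<Entry> node)
    {
        order.Remove(node);
        map.Remove(node.Value.Key);
        totalBytes -= node.Value.SizeBytes;
        node.Value.Buffer.Dispose();
    }
}
```

The key type is left generic because the text does not fix how arrays are identified; an array hash code or reference would both fit.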

Arrays: bounds check before compute.

Just like the check for work items, but with the "elementsPerWorkItem" value taken into account against the total work size and the array size.

Arrays will be allowed to be bigger than the used range, but not smaller.
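The check could be as small as the following sketch. `ArrayBoundsCheck` and its method names are hypothetical, and the sketch assumes every work item touches exactly `elementsPerWorkItem` consecutive elements:

```csharp
using System;

// Hypothetical helper; not part of Cekirdekler.
public static class ArrayBoundsCheck
{
    // Number of elements a launch touches: one group of elementsPerWorkItem per work item.
    public static long RequiredLength(long totalWorkSize, int elementsPerWorkItem)
    {
        if (totalWorkSize < 0)
            throw new ArgumentOutOfRangeException(nameof(totalWorkSize));
        if (elementsPerWorkItem < 1)
            throw new ArgumentOutOfRangeException(nameof(elementsPerWorkItem));
        return checked(totalWorkSize * elementsPerWorkItem);
    }

    // Larger arrays pass; arrays smaller than the used range are rejected before compute.
    public static void EnsureCoversRange(long arrayLength, long totalWorkSize, int elementsPerWorkItem)
    {
        long required = RequiredLength(totalWorkSize, elementsPerWorkItem);
        if (arrayLength < required)
        {
            throw new ArgumentException(
                "Array has " + arrayLength + " elements but the compute range needs " + required + ".");
        }
    }
}
```

For example, 1024 work items with 4 elements each need at least 4096 elements, so an 8192-element array passes and a 4095-element array throws.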

Explicit Pipelining

pipeline1.push(a.nextParam(b).read()).push(c.compute()).push(d.write()).finish()

pipeline1.overlap(pipeline2,pipeline3).finish()

Error handling for every single opencl command.

This may cost some performance but gives a better description when something goes wrong. There is already a Test class for testing the implementation, but developer mistakes also need to be handled.

For now, it only reports OpenCL kernel compile errors such as "float5 is not defined" and similar.

  • added error handling for function calls that return an error code.
  • still need to add error handling for buffer creation and buffer mapping (where the error comes from a parameter, not the returned value)

Explicit device selection

Can be useful when the developer doesn't need all GPUs at once in OpenCL. Maybe something like device lists in different categories:

  ClDevice.getGpuList()
  ClDevice.getAccList()                     // random order, with device names so the user can choose
  ClDevice.getDeviceWithMaxComputeUnit()    // a 20-thread CPU is not the same as a 20-CU HD7870!
  ClDevice.getDeviceWithBenchmark("nbody"); // gets the device with the top benchmark score
  ClDevice.activateDynamicDeviceSwitching() // switches to another device when performance oscillates too much (GTX_titan 1ms 3ms 2ms 3ms 1ms, then switches to gtx_950 10ms 11ms 10ms 9ms)

Sequential kernel executions in same `compute()` method

array.compute(cruncher, 1, "kernel1 kernel2 kernel3", globalSize, localSize)

Here, all kernels listed in the parameter run with the same globalSize and localSize. globalSize and localSize should support multiple values, perhaps by overloading compute with an array/list parameter.
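A sketch of how an array-based overload might pair kernels with sizes. `KernelLaunch` and `KernelLaunchPlan` are hypothetical names, not existing Cekirdekler types. A single-element array is applied to every kernel, which keeps the current behavior available:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record of one kernel launch; not part of Cekirdekler.
public sealed class KernelLaunch
{
    public string Kernel;
    public int GlobalSize;
    public int LocalSize;
}

// Hypothetical helper; not part of Cekirdekler.
public static class KernelLaunchPlan
{
    // Pairs each kernel name in a space-separated list with its own global and local size.
    public static List<KernelLaunch> Build(string kernelNames, int[] globalSizes, int[] localSizes)
    {
        string[] names = kernelNames.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var plan = new List<KernelLaunch>();
        for (int i = 0; i < names.Length; i++)
        {
            plan.Add(new KernelLaunch
            {
                Kernel = names[i],
                GlobalSize = Pick(globalSizes, i, names.Length, nameof(globalSizes)),
                LocalSize = Pick(localSizes, i, names.Length, nameof(localSizes))
            });
        }
        return plan;
    }

    // One value is shared by all kernels; otherwise there must be one value per kernel.
    private static int Pick(int[] sizes, int index, int kernelCount, string name)
    {
        if (sizes.Length == 1) return sizes[0];
        if (sizes.Length == kernelCount) return sizes[index];
        throw new ArgumentException(name + " must have 1 or " + kernelCount + " elements.");
    }
}
```

With this shape, `Build("kernel1 kernel2 kernel3", new[] { 1024, 2048, 512 }, new[] { 64 })` gives each kernel its own global size while all three share a local size of 64.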

No offline compiler

Adding clCreateProgramWithBinary() support might be useful for FPGA owners. An FPGA may take hours to compile a single kernel, while a gaming GPU can do it in seconds.
