General-purpose GPU compute framework built on Vulkan to support thousands of cross-vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous, and optimized for advanced GPU data processing use cases. Backed by the Linux Foundation.
Currently, allocating a tensor requires passing a std::vector<> to initialize it. This ticket is to add the possibility of creating a tensor by passing only its size.
If I understand correctly, this puts a barrier between a memory-transfer operation (host to GPU) and shader read operations.
Two things are uncertain about this approach:
When chaining two operations, where the second consumes the tensor(s) produced by the first, the barrier's access and stage flags should probably describe a shader-write to shader-read dependency on the compute stage.
The current implementation inserts a barrier for every tensor of the algorithm, regardless of whether the kernel reads it or only writes to it. I suppose that for write-only tensors the barrier should not be needed.
The original implementation of Algorithm::destroy() does not free the push-constant and specialization-constant data; the relevant code is commented out. This, however, leads to a memory leak, as the data does not actually seem to be freed as described in the comment.