cg-tuwien / auto-vk
Afterburner for Vulkan development; Auto-Vk is a modern C++ low-level convenience and productivity layer atop Vulkan-Hpp.
License: MIT License
With the official `VK_KHR_ray_tracing` extension, additional buffer usage flags have been introduced (and `vk::BufferUsageFlagBits::eRayTracingKHR` has been removed).
Add new meta data for the new buffer usages:
avk.cpp#L6309
avk.hpp#L256
Can't compile with the -Wall flag. There are a lot of warnings and some errors.
Which one to use: `eShaderDeviceAddressKHR` or `eShaderDeviceAddress`? Why is there a `KHR` version anyway? Can't we just use the non-extension version?
Currently, `eShaderDeviceAddressKHR` is (probably) used everywhere. => Choose one and unify usage.
Instead of evaluating the header version, i.e. e.g.
#if VK_HEADER_VERSION >= 135
the following would be more appropriate for such cases:
#if defined(VK_VERSION_1_2)
Definition of done:
Every check in `#if VK_HEADER_VERSION`-fashion has been investigated to determine whether it guards a feature that was added to the standard with SDK version 1.2. If so, it has been replaced with `#if defined(VK_VERSION_1_2)`.
Modern GPUs typically have a (relatively small) memory region which is both `vk::MemoryPropertyFlagBits::eDeviceLocal | vk::MemoryPropertyFlagBits::eHostCoherent`. See the following output of `context().print_available_memory_types();`:
INFO: ========== MEMORY PROPERTIES OF DEVICE 'NVIDIA GeForce RTX 3070'
INFO: -------------------------------------------------------------
INFO: HEAP TYPES:
INFO: heap-idx | bytes | heap flags
INFO: -------------------------------------------------------------
INFO: 0 | 8,433,696,768 | { DeviceLocal }
INFO: 1 | 17,141,145,600 | {}
INFO: 2 | 224,395,264 | { DeviceLocal }
INFO: =============================================================
INFO: MEMORY TYPES:
INFO: mem-idx | heap-idx | memory property flags
INFO: -------------------------------------------------------------
INFO: 0 | 1 | {}
INFO: 1 | 0 | { DeviceLocal }
INFO: 2 | 1 | { HostVisible | HostCoherent }
INFO: 3 | 1 | { HostVisible | HostCoherent | HostCached }
INFO: 4 | 2 | { DeviceLocal | HostVisible | HostCoherent }
INFO: =============================================================
The problem is just that `enum struct memory_usage` only supports `device` OR `host_coherent`, but not the combination of both.
Furthermore, the framework probably disregards which memory region exactly a `memory_usage::device` allocation is made from (in the example above, the options would be the memory regions at heap indices 0 or 2; the framework probably does not really care which one it chooses -- but it should!).
`memory_usage::device` should allocate from the memory region that has the flag `vk::MemoryPropertyFlagBits::eDeviceLocal`, but does not have `vk::MemoryPropertyFlagBits::eHostCoherent`! Furthermore, this flag should probably be renamed to `memory_usage::device_local`!
`memory_usage::host_coherent` should allocate from the memory region that has the flag `vk::MemoryPropertyFlagBits::eHostCoherent`, but does not have `vk::MemoryPropertyFlagBits::eDeviceLocal`!
And a new `memory_usage::device_local_and_host_coherent` should allocate from the memory region that has both flags: `vk::MemoryPropertyFlagBits::eDeviceLocal | vk::MemoryPropertyFlagBits::eHostCoherent`. The question is just whether there is a better name for this. (I think there was a dedicated name for this memory region, but I can't remember.)
...to enable an image view with a different format than the image
When using `image_view_t::as_storage_image` or similar (intended to be temporary) objects of the "as_something" pattern, the following situation can occur:
An `image_view_as_storage_image` stores the `vk::ImageView` handle internally, and its lifetime might exceed the lifetime of the `image_view` it has taken the handle from. That's not good. Optimally, the framework would prevent such usages.
One option would be that the `image_view_as_storage_image` co-owns the `image_view`, which would imply (for most cases, probably) that `enable_shared_ownership()` is applied to the `image_view`. That is a potentially huge overhead for an operation which produces an object that shall only be used as a temporary anyway. So, what to do?
The best option would probably be to modify `image_view_as_storage_image` in a way so that it cannot be stored somewhere but can only be used as a temporary object. Maybe all functions/methods that take a parameter of type `image_view_as_storage_image` shall take it as rvalue only (i.e. `image_view_as_storage_image&&`). If, furthermore, its move constructor and move assignment operator were deleted, that might prevent the user from `std::move`-ing it into those functions/methods that take an `image_view_as_storage_image&&` parameter.
This shall be implemented for the following types:
class image_view_as_input_attachment
class image_view_as_storage_image
class buffer_descriptor
class buffer_view_descriptor
shader_binding_table_ref
Pattern for disabling all kinds of stuff for type `T`:
T() = delete;
T(T&&) noexcept = delete;
T(const T&) = delete;
T& operator=(T&&) noexcept = delete;
T& operator=(const T&) = delete;
~T() = delete;
Update:
With the introduction of the functions `as_uniform_buffers`, `as_uniform_texel_buffer_views`, `as_storage_buffers`, `as_storage_texel_buffer_views`, `as_storage_images`, and `as_input_attachments` in bindings.hpp, the request to make those helper classes non-storable has become a bit more challenging, because they are stored in those functions.
Idea: Can constructors be declared as being `friend` of something, so that only those functions may use them, but not "ordinary" users of the framework? I'm not a big fan of restricted usage either, but we need to ensure correctness, and if those classes do not own the resources they are referencing, we cannot ensure correctness.
Hi,
I hope this request is not too outrageous, I greatly appreciate the work you are doing!
Might it be possible to integrate MoltenVK into Auto-Vk, or to use the two together?
I'm an absolutely clueless beginner, and I'm trying to write a cross-platform audio plug-in that displays a GLSL fragment shader in its window, like on shadertoy.com.
If you want to use pipeline barriers within a renderpass or subpass, a self-dependency on the subpass the barrier is used in needs to be specified during renderpass creation. Details here.
This can currently be circumvented by using the alterConfigBeforeCreation feedback, but it needs to be done with native Vulkan configs. A suitable abstraction (maybe in the renderpass_sync?) would be great for that.
In a few months' time, just deprecate all previous ray tracing code (i.e. the original NVIDIA version and also the `VK_KHR_ray_tracing` beta version from header version 135).
Only support the official `VK_KHR_ray_tracing` extension, which was introduced with header version 162.
=> Search for all
#if VK_HEADER_VERSION >= 162
#else
#endif
patterns and delete all the `#else` branches!
The shader binding table (SBT) is created right after the ray tracing pipeline is created by means of `vk::Device::createRayTracingPipelineKHR` and put into the following buffer:
result.mShaderBindingTable = create_buffer(
    memory_usage::host_coherent,
    vk::BufferUsageFlagBits::eRayTracingKHR,
    generic_buffer_meta::create_from_size(shaderBindingTableSize)
);
This has been done because then no synchronization is required. However, `host_coherent` is suboptimal => it should be `device` memory. In order to store the SBT entries in `device` memory, synchronization has to be added.
I would like to ask or suggest adding support for memory export to OpenGL: namely, built-in support for `GL_EXT_external_objects`, `EXT_external_objects_win32`, `EXT_external_objects_fd`… in short, everything needed to support interoperability with OpenGL. Of course, I can try to implement it myself, but it depends on whether such a pull request would be accepted, whether I can do it, and how much it would change the code…
As pointed out in some C++ talk that I'll have to find once again, returning by reference or even by const reference can be very dangerous: if the object the reference refers to dies, the reference is left dangling. Therefore, one should always return by value.
There are some exceptions to this rule where it is okay to return by reference, namely when it can be ensured with certainty that the referenced object outlives anything that could be done with the returned reference. Specifically, that applies to:
root::physical_device
root::device
root::dynamic_dispatch
root::memory_allocator
In many cases (e.g. returning a handle), return by value probably has exactly the same performance cost as returning a reference, i.e. in many cases you don't gain anything from returning a reference. In some cases, return by reference can be cheaper, e.g. when returning a vector of something. However, correctness is always more important than performance.
Actually, there is also a potential "security" risk: by `const_cast`-ing away the `const` of a `const T&`, one could modify the internal state through the returned reference.
Definition of Done:
Hi,
Would you mind adding a .gitignore?
I get `modified: gears_vk (untracked content)` from `git status` when I build gears_vk, because auto_vk produces an out/ directory.
As auto_vk is a submodule, I do not wish to change anything inside it, and an external .gitignore has no effect on folders/files inside submodules.
Thanks so much!
In most places, `const auto&` is returned, but handles are just (64-bit, I think) integers, so returning `auto` would also be a good option. What's better?
Example: `vk::Image handle()` instead of `const vk::Image& handle()`
...so that one can build fewer instances than the maximum number.
Also ensure that `numInstances` is less than the maximum.
`std::thread::id` is used in descriptor_cache.hpp, but `<thread>` is not included in avk.hpp.
The current `avk::image_usage` is a somewhat inflexible and suboptimal design. Furthermore, it contains several flags which are no longer required since image layout transitions are now explicitly handled by users (therefore, `image_usage::read_only`, `image_usage::presentable`, and `image_usage::shared_presentable` can be removed).
Better: Refactor the existing `enum struct image_usage` into the following format:
namespace image_usage
{
struct image_usage_data
{
vk::ImageUsageFlags mImageUsageFlags;
vk::ImageTiling mImageTiling;
vk::ImageCreateFlags mImageCreateFlags;
};
}
and let such a struct be created by users in a convenient manner in the style of `avk::stage` or `avk::access`, using `operator|` or `operator+` to conveniently put together such a configuration.
Furthermore (and this is probably the big part of the work for this issue): refactor all the places where `enum struct image_usage` is currently used accordingly!
Hint: See `extern std::tuple<vk::ImageUsageFlags, vk::ImageTiling, vk::ImageCreateFlagBits> to_vk_image_properties(avk::image_usage aImageUsage)` for further details!
Implement a method similar to `command_buffer_t::draw_vertices` and `command_buffer_t::draw_indexed` that performs an indirect draw call. The indirect draw call data is stored in a buffer that must be passed to `vk::CommandBuffer::drawIndexedIndirect`.
Furthermore, an additional buffer meta data type will have to be added (=> buffer_meta.hpp) which stores all the usual offsets, sizes, and counts, and which demands the `vk::BufferUsageFlagBits::eIndirectBuffer` usage flag for the buffer.
Static friend declarations (e.g. the equality operator in descriptor_set) are not valid in Clang, and thus Auto-Vk does not compile:
"error: 'static' is invalid in friend declarations"
The image_t currentLayout and targetLayout are not always set to the actual current layouts, which makes it hard to perform pipeline barriers (image memory barriers) inside of a renderpass, for example, as these layouts are likely undefined in that case (unless some previous step sets the layout correctly).
Layout tracking probably needs to be reworked to reflect the correct layout at the time of calling the getter.
The current workaround is to set the layout manually before making a call to the barrier or other functions that rely on the layout. Or you can always perform the direct Vulkan calls manually.
...which actually should be named "descriptor_bindings.hpp".
Add comments explaining what all the code in that file can be used for and how. Add usage examples!
During the cg_base -> Auto-Vk + Gears-Vk refactoring, descriptor sets are now generally passed by copy. Concretely, they are passed around via `std::vector<descriptor_set>`.
`descriptor_set` stores several members of the kind `std::vector<...>`. Copying them might introduce substantial and probably unnecessary overhead. It would most likely be beneficial to convert all of them into `std::shared_ptr<std::vector<...>>`.
The `avk::descriptor_cache` shall be usable from parallel threads. This is partly already prepared by means of different descriptor_pools being used from different threads (see `std::unordered_map<std::thread::id, std::vector<std::weak_ptr<descriptor_pool>>> mDescriptorPools;` and `avk::descriptor_cache::get_descriptor_pool_for_layouts`).
What needs to be added is a `std::mutex` that synchronizes access: add a `std::mutex` to class descriptor_cache and lock it where necessary (e.g. via a `std::scoped_lock<std::mutex>`).
To test this issue, cg-tuwien/Gears-Vk#41 would have to be implemented.
Definition of done:
`std::mutex`-based synchronization has been added to `avk::descriptor_cache::get_descriptor_pool_for_layouts`. It has been investigated whether synchronization is also required elsewhere in `avk::descriptor_cache`, and if so, the necessary measures have been taken. Usage of `descriptor_cache` from parallel threads has been tested (e.g. by using a parallel_invoker)
)...instead, sync::not_required()
is appropriate!
Definition of done:
buffer_t::fill
Use `#if VK_HEADER_VERSION > 141` or similar preprocessor statements to make Auto-Vk compatible with SDK version 1.1. Some features (e.g. the `VK_KHR_ray_tracing` extension) have been added with Vulkan 1.2 and were not available before.
Concepts are a new feature introduced with C++20. Make yourself familiar with the topic. You can get an overview of it in the following video:
At many places in the framework, SFINAE constructs are used which basically serve the same purpose, but are more complicated to use. The SFINAE classes are the following:
class is_dereferenceable
class has_resize
class has_size_and_iterators
class has_nested_value_types
Definition of done:
All usages of `std::enable_if` have been replaced with concepts => i.e. the newly introduced C++20 `requires` keyword is used.
...and replace with `command`/`commands`/`work`.
nuff said
For `vkCmdBindVertexBuffers`, which is invoked from the following methods:
command_buffer_t::draw_indexed
command_buffer_t::draw_vertices
an offset of 0 is set for each vertex buffer passed to those methods. Other offsets should be possible. But where to define those offsets? Meta data does not seem right. => Probably in the `input_description`.
Some functions/methods take an arbitrary number of arguments that should all be of the same type. An example of such a method is `command_buffer_t::draw_indexed`. It accepts any type, but optimally, it would only accept any number of arguments, each of the SAME type, namely `const buffer_t&`.
Evaluate if it is possible (with C++20?!) to specify that all variadic arguments shall have the same type!
Article that describes this issue in general: Fluent C++: How to Define a Variadic Number of Arguments of the Same Type
C++ proposal: Homogeneous variadic function parameters
In the case of `command_buffer_t::draw_indexed`, the big advantage would be that -- for instance in the model_loader example -- the buffers could not only be passed like follows:
cmdbfr->draw_indexed(
*drawCall.mIndexBuffer,
*drawCall.mPositionsBuffer, *drawCall.mTexCoordsBuffer, *drawCall.mNormalsBuffer
);
but also without explicitly invoking `ak::owning_resource::operator*`, i.e. like follows:
cmdbfr->draw_indexed(
drawCall.mIndexBuffer,
drawCall.mPositionsBuffer, drawCall.mTexCoordsBuffer, drawCall.mNormalsBuffer
);
AND even a mixture would be possible, because the implicit cast operator `ak::owning_resource::operator const T&` would be invoked automatically by the compiler. The following invocation would then also be viable:
cmdbfr->draw_indexed(
drawCall.mIndexBuffer,
*drawCall.mPositionsBuffer, drawCall.mTexCoordsBuffer, *drawCall.mNormalsBuffer
);
...or any other combination of passing `const owning_resource<buffer_t>&` or `const buffer_t&` for that matter, since all would be cast to `const buffer_t&` because the method declaration states the type explicitly.
Initializer Lists?
An alternative would maybe be to look into using `std::initializer_list`, but it would have to capture references and pass them on. Not sure if that would lead to nice syntax. In the optimal case, with a `std::initializer_list`, the following might be possible:
cmdbfr->draw_indexed(
drawCall.mIndexBuffer,
{ *drawCall.mPositionsBuffer, drawCall.mTexCoordsBuffer, *drawCall.mNormalsBuffer }
);
...and `avk::read` does not perform such a barrier. Not sure yet if this issue is a bug or an enhancement, though -- i.e., whether it is the responsibility of `avk::read` to perform the barrier or the user's. But this should be investigated.
Anyways, the Vulkan synchronization examples include the following example:
CPU read back of data written by a compute shader
This example shows the steps required to get data, written to a buffer by a compute shader, back to the CPU.
vkCmdDispatch(...);
VkMemoryBarrier2KHR memoryBarrier = {
...
.srcStageMask = VK_PIPELINE_STAGE_2_COMPUTE_SHADER_BIT_KHR,
.srcAccessMask = VK_ACCESS_2_SHADER_WRITE_BIT_KHR,
.dstStageMask = VK_PIPELINE_STAGE_2_HOST_BIT_KHR,
.dstAccessMask = VK_ACCESS_2_HOST_READ_BIT_KHR};
VkDependencyInfoKHR dependencyInfo = {
...
1, // memoryBarrierCount
&memoryBarrier, // pMemoryBarriers
...
}
vkCmdPipelineBarrier2KHR(commandBuffer, &dependencyInfo);
vkEndCommandBuffer(...);
vkQueueSubmit2KHR(..., fence); // Submit the command buffer with a fence
Currently, the `build`/`update` methods of both `class bottom_level_acceleration_structure_t` and `class top_level_acceleration_structure_t` support only "pointer to an array"-style input.
Add support for:
The relevant code that needs to be updated/extended in order to support these different build/update types is contained in the methods:
top_level_acceleration_structure_t::build_or_update
bottom_level_acceleration_structure_t::build_or_update
More information about the "array of pointers" input format can be found here: VkAccelerationStructureBuildGeometryInfoKHR => the description of `VkBool32 geometryArrayOfPointers;`
More information about host builds of acceleration structures can be found here: 36.5. Host Acceleration Structure Operations
Definition of done:
It has been checked whether `assert(sizeof(VkAabbPositionsKHR) == result.mSizeOfOneElement);` in buffer_meta.hpp#L833 is really appropriate.
It has been checked whether `assert(sizeof(VkAccelerationStructureInstanceKHR) == result.mSizeOfOneElement);` in buffer_meta.hpp#L912 is really appropriate.
So I have encountered difficulties when using this library. To be specific, I find it difficult to integrate it into my project. I think the lack of examples and documentation probably pushes people away from Auto-Vk, even though it looks like a really nice library.
CPPLINQ must be removed due to an incompatible license. Sad story.
However, there is a perfect replacement for it: C++20 ranges.
A quick overview of C++20 ranges is given in the following video:
Definition of done:
All places where `operator>>` has been used have been ported (probably better to search for usages of CPPLINQ's `from`, or just compile after removing it and fix the errors): avk.cpp#L3952, avk.cpp#L4100, avk.cpp#L4122, avk.cpp#L4161, avk.cpp#L4181, include/
`std::basic_string<CharT,Traits,Allocator>::ends_with` (used in avk.cpp) is a C++20 feature.
There has been some refactoring w.r.t. the binding of buffer_views:
class buffer_view_descriptor was introduced
`as_uniform_texel_buffer` and `as_storage_texel_buffer` have been removed from buffer
`as_uniform_texel_buffer_view` and `as_storage_texel_buffer_view` have been added to buffer_view
`as_uniform_texel_buffer_views` and `as_storage_texel_buffer_views` have been adapted accordingly, and so have `struct binding_data` and all associated functions/methods.
It is a bit unclear whether the structure is flawless. Double-check the actual descriptor infos that are written, where the data is stored, how the data is gathered, and whether the `vk::BufferView` bindings are created properly.
This can best be investigated in the "ray_tracing_triangle_meshes" example, because it uses buffer_views. See if everything still works, then bind explicitly using `as_uniform_texel_buffer_views`, and also try to bind `as_storage_texel_buffer_views` (that will need some adaptation in shaders).
VMA is included in the most recent Vulkan SDKs! We should (probably?!) use that version instead of the one which comes bundled with this repository: see file vk_mem_alloc.h.
Make the necessary configuration changes!
Something seems to be confusing about
struct input_binding_to_location_mapping
{
vertex_input_buffer_binding mGeneralData;
buffer_element_member_meta mMemberMetaData;
};
Why are the parameters distributed across different structs: stride on the one hand, and offset and format on the other?
But maybe it's okay, because that distribution is also present when compiling the configuration for the graphics pipeline in `graphics_pipeline root::create_graphics_pipeline` => steps 1 and 2, when the `vk::VertexInputBindingDescription` and `vk::VertexInputAttributeDescription` elements are compiled.
Everything is working, so it's probably okay.
But just double-check everything once again, especially whether the data structures in `input_binding_to_location_mapping` are okay as they are or could maybe be optimized somehow.