try / tempest
3d graphics engine
License: MIT License
This ticket tracks ideas and known solutions for GPU-driven rendering.
Use VK_NV_mesh_shader as a starting point, and build an emulation layer to enable mesh shaders on a wider range of hardware.
In version 1.22 of openal-soft they forgot to add an extra include for FreeBSD.
It was fixed in this commit kcat/openal-soft@b26ca6b and improved here kcat/openal-soft@2f3acdf.
I fixed it temporarily with this patch, but an update to 1.23 would be the better option.
+++ b/Engine/thirdparty/openal-soft/core/rtkit.cpp
@@ -41,6 +41,9 @@
#include <unistd.h>
#include <sys/types.h>
#include <sys/syscall.h>
+#if defined(__FreeBSD__)
+#include <sys/thr.h>
+#endif
This seems to be a strictly firmware-level bug, but it doesn't manifest in DXVK's generated mesh shader, nor in other Vulkan users of it, so it's possible Tempest does something odd to trigger it.
https://gitlab.freedesktop.org/mesa/mesa/-/issues/10360
A small reproduction example or ideas why this might happen are appreciated.
This was mainly tested with the Steam Overlay. Normally if you add a game as a non-steam game, it'll try and load up the overlay as well. In this case, it never appears.
I attempted to add an activation of the layer, which the logs seem to show that the layer was being enabled. A link to a patch showing what was changed: 0001-Attempt-at-getting-steam-overlay-enabled.zip
I'm wondering if it might have something to do with the hidden window being created and then destroyed, as in possibly the overlay layer library is seeing that thinking it needs to inject into it and getting confused.
Was curious if you had any pointers or had ideas as to a fix.
The time for vkQueuePresentKHR on my laptop is sometimes 20-50 ms.
After a reboot it is back to normal (0-2 ms), so it may be an issue with my machine.
In any case, the presentation engine should be reviewed.
Samsung present thread practice:
https://www.khronos.org/assets/uploads/developers/library/2019-vulkanised/02_Live%20Long%20And%20Optimise-May19.pdf
vkAcquireFullScreenExclusiveModeEXT is worth implementing for Windows.
Official open-source headers:
https://github.com/microsoft/DirectX-Headers
Currently Pixmap::Format doesn't match TextureFormat; possibly the two can be simplified into a single enum.
Things to do:
Split VectorImage into VectorImage, as CPU data + VectorImage::Mesh, as GPU data.
Rename Uniforms to something else, probably DescriptorSet.
Things to think about:
VK_EXT_inline_uniform_block
Bindless is quite messy in every API, so a nice top-level API with a reasonable underlying implementation needs to be designed.
GLSL is the main language in Tempest, so a dedicated section is a must. GLSL features 2 ways:
layout(binding = 0) uniform sampler2D tex[]; // unbound array of textures
layout(binding = 1) uniform sampler2D img[]; // another unbound array of textures
layout(binding = 1, std140) readonly buffer Input {
vec4 val[];
} ssbo[]; // unbound array of buffers
std::vector<const Tempest::Texture2d*> ptex(tex.size());
for(size_t i=0; i<tex.size(); ++i)
ptex[i] = &tex[i];
auto desc = device.descriptors(pso);
desc.set(0,ptex); // taking vector or c-array
This doesn't fit the engine perfectly: support for samplers and non-combined textures needs to be added on top of it.
Caps-list:
VkPhysicalDeviceDescriptorIndexingFeatures::runtimeDescriptorArray; // support for unbound array declaration (tex[])
// Support of nonuniformEXT, per resource-type
VkPhysicalDeviceDescriptorIndexingFeatures::shaderUniformBufferArrayNonUniformIndexing;
VkPhysicalDeviceDescriptorIndexingFeatures::shaderSampledImageArrayNonUniformIndexing;
VkPhysicalDeviceDescriptorIndexingFeatures::shaderStorageBufferArrayNonUniformIndexing;
VkPhysicalDeviceDescriptorIndexingFeatures::shaderStorageImageArrayNonUniformIndexing;
VK_DESCRIPTOR_BINDING_VARIABLE_DESCRIPTOR_COUNT_BIT can be used (in theory), but only for the very last binding in a descriptor set, which doesn't fit the GLSL side.
Alternatively, it's sufficient to use VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT_EXT with a very large descriptor array. The size of the array has to be defined in C++ upfront, at VkDescriptorSetLayout creation.
The current implementation of Tempest can recreate VkDescriptorSetLayout and VkDescriptorSet on the fly if the preallocated array is not big enough. But this also requires recreating VkPipeline at runtime, based on the descriptor set size, which is hard to implement without extra performance cost.
VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT is useless by itself, but the spec defines special behavior for this type of descriptor:
... layouts which may be much higher than the pre-existing limits. The old limits only count descriptors in non-updateAfterBind descriptor set layouts, and the new limits count descriptors in all descriptor set layouts in the pipeline layout.
maxUpdateAfterBindDescriptorsInAllPools = 500,000+ // Eh, probably can't do anything sensible about it
maxPerStageUpdateAfterBindResources = 500,000+
maxPerStageDescriptorUpdateAfterBindSamplers = 500,000+
maxPerStageDescriptorUpdateAfterBindUniformBuffers = 12+
maxPerStageDescriptorUpdateAfterBindStorageBuffers = 500,000+
maxPerStageDescriptorUpdateAfterBindSampledImages = 500,000+
maxPerStageDescriptorUpdateAfterBindStorageImages = 500,000+
maxPerStageDescriptorUpdateAfterBindAccelerationStructures = 500,000+
maxDescriptorSetUpdateAfterBindSamplers = 500,000+
maxDescriptorSetUpdateAfterBindUniformBuffers = 72+ // n × PerStage
maxDescriptorSetUpdateAfterBindStorageBuffers = 500,000+
maxDescriptorSetUpdateAfterBindSampledImages = 500,000+
maxDescriptorSetUpdateAfterBindStorageImages = 500,000+
maxDescriptorSetUpdateAfterBindAccelerationStructures = 500,000+
Naturally, as there is only a single descriptor set, one can just take the min of the PerStage and DescriptorSet limits.
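The "take the min" rule can be sketched as below. This is only an illustration: `UpdateAfterBindLimits`, `BindlessCaps` and `bindlessCaps` are hypothetical names, and a real implementation would fill the fields from VkPhysicalDeviceDescriptorIndexingProperties instead of a hand-written struct.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical plain struct mirroring a few update-after-bind limits.
// In real code these values would be read from
// VkPhysicalDeviceDescriptorIndexingProperties.
struct UpdateAfterBindLimits {
  uint32_t perStageSampledImages;
  uint32_t descriptorSetSampledImages;
  uint32_t perStageStorageBuffers;
  uint32_t descriptorSetStorageBuffers;
  uint32_t perStageResources;
};

// Hypothetical engine-side caps derived from the limits above.
struct BindlessCaps {
  uint32_t maxTextures;
  uint32_t maxStorageBuffers;
};

// With a single descriptor set, the usable array size is bounded by the
// per-stage limit, the per-set limit and the per-stage resource total,
// so the smallest of them wins.
inline BindlessCaps bindlessCaps(const UpdateAfterBindLimits& l) {
  BindlessCaps c;
  c.maxTextures = std::min({l.perStageSampledImages,
                            l.descriptorSetSampledImages,
                            l.perStageResources});
  c.maxStorageBuffers = std::min({l.perStageStorageBuffers,
                                  l.descriptorSetStorageBuffers,
                                  l.perStageResources});
  return c;
}
```

Note that the per-stage resource total is shared by all resource types, so the sketch is optimistic when several large arrays are bound at once.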
Other limits of concern (obsolete):
VkPhysicalDeviceLimits::maxPerStageDescriptorSamplers = 16+;
VkPhysicalDeviceLimits::maxPerStageDescriptorUniformBuffers = 12+;
VkPhysicalDeviceLimits::maxPerStageDescriptorStorageBuffers = 4+;
VkPhysicalDeviceLimits::maxPerStageDescriptorSampledImages = 16+;
VkPhysicalDeviceLimits::maxPerStageDescriptorStorageImages = 4+;
VkPhysicalDeviceLimits::maxPerStageResources = 128+;
VkPhysicalDeviceLimits::maxDescriptorSetSamplers = 96+; // n × PerStage
VkPhysicalDeviceLimits::maxDescriptorSetUniformBuffers = 72+; // n × PerStage
VkPhysicalDeviceLimits::maxDescriptorSetStorageBuffers = 24+; // n × PerStage
VkPhysicalDeviceLimits::maxDescriptorSetSampledImages = 96+; // n × PerStage
VkPhysicalDeviceLimits::maxDescriptorSetStorageImages = 24+; // n × PerStage
With such limits, realloc has to manage the per-stage, per-resource and per-set limits somehow.
Note: Tempest uses spirv-cross to generate HLSL, but the produced HLSL is not valid:
// error: more than one unbounded resource (ssbo and tex) in space 0
ByteAddressBuffer ssbo[] : register(t1, space0);
Texture2D<float4> tex[] : register(t0, space0);
SamplerState _tex_sampler[] : register(s0, space0);
RWTexture2D<unorm float4> ret : register(u2, space0);
Apparently spirv-cross follows the VARIABLE_DESCRIPTOR_COUNT workflow. This maps directly to
D3D12_DESCRIPTOR_RANGE::NumDescriptors = -1
with the same limitation of only one runtime array per set. In theory, this can be worked around by instrumenting the SPIR-V:
OpDecorate %tex DescriptorSet 0 -> OpDecorate %tex DescriptorSet UNIQ_SPACE
Limits:
Resources Available to the Pipeline | Tier 1 | Tier 2 | Tier 3 |
---|---|---|---|
Feature levels | 11.0+ | 11.0+ | 11.1+ |
Maximum number of descriptors in a CBV/SRV/UAV heap used for rendering | 1,000,000 | 1,000,000 | 1,000,000+ |
Maximum number of CBV in all descriptor tables per shader stage | 14 | 14 | full heap |
Maximum number of SRV in all descriptor tables per shader stage | 128 | full heap | full heap |
Maximum number of UAV in all descriptor tables per shader stage | 64 for feature levels 11.1+; 8 for feature level 11.0 | 64 | full heap |
Maximum number of Samplers in all descriptor tables per shader stage | 16 | 2048 | 2048 |
ID3D12GraphicsCommandList::SetDescriptorHeaps
Only one descriptor heap of each type can be set at one time, which means a maximum of 2 heaps (one sampler, one CBV/SRV/UAV) can be set at one time.
DX12 is a bit awkward, because the limit is shared by all descriptor types except samplers. Probably one can "just" split the heap into equal partitions.
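A minimal sketch of the equal-partition idea, assuming the heap is addressed by descriptor index. `HeapPartition` and `heapPartition` are hypothetical names, not part of Tempest or the D3D12 API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of one slice of a shared CBV/SRV/UAV heap.
struct HeapPartition {
  uint32_t offset; // index of the first descriptor in the slice
  uint32_t size;   // number of descriptors in the slice
};

// Splits `heapSize` descriptors into `count` equal slices and returns
// slice number `index`. Any remainder at the end of the heap stays unused.
inline HeapPartition heapPartition(uint32_t heapSize, uint32_t count, uint32_t index) {
  assert(count > 0 && index < count);
  const uint32_t size = heapSize / count;
  return HeapPartition{index * size, size};
}
```

For example, a 1,000,000-entry heap split in two gives each resource class 500,000 descriptors, which happens to line up with the Metal Tier 2 numbers.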
Limits (per-app resources available at any given time are):
Resources Available to the Pipeline | Tier1(ios) | Tier1 | Tier2 |
---|---|---|---|
Buffers(and TLAS'es) | 31 | 64 | 500,000 |
Textures | 31 | 128 | 500,000 |
Samplers | 16 | 16 | 2048 |
For both tiers, the maximum number of argument buffer entries in each function argument table is 8.
*Writable textures aren’t supported within an argument buffer.
Tier 1 argument buffers can’t be accessed through pointer indexing, nor can they include pointers to other argument buffers.
Tier 2 argument buffers can be accessed through pointer indexing, as shown in the following example.
T1 argument buffers are practically the same as descriptor sets in Vulkan and offer nothing useful here.
T2 allows pointer indexing and can be leveraged for bindless arrays.
Sources:
https://gist.github.com/DethRaid/0171f3cfcce51950ee4ef96c64f59617
https://docs.microsoft.com/en-us/windows/win32/api/d3d12/ns-d3d12-d3d12_descriptor_range
https://learn.microsoft.com/en-us/windows/win32/direct3d12/hardware-support?redirectedfrom=MSDN
https://developer.apple.com/documentation/metal/buffers/about_argument_buffers
https://developer.apple.com/documentation/metal/buffers/managing_groups_of_resources_with_argument_buffers
An unbound array of descriptors has 2 meanings:
Base spec:
uniform sampler2D tex[] -> OpTypeArray %8 %uint_1
The size of the array depends on the highest index that has been used in the code.
GL_EXT_nonuniform_qualifier:
May work the same as the base spec if a runtime index is not in use; otherwise:
uniform sampler2D tex[] -> OpTypeRuntimeArray %8
// legal only if driver supports descriptor-indexing
[wip]
Generally, a Metal-like model is a good middle ground:
maxUAV = 500'000; // ssbo + tlas + imageStore
maxTextures = 500'000;
maxSamplers = 2048;
// can skip maxUbo - hard in Vulkan and not very useful
// combined image consumes both Texture and Samplers limits
In DX, UAV/Tex can be achieved by splitting the heap into 2 parts.
In Vulkan, UAV is probably the min over all applicable resources.
API:
drawIndirect
drawIndexedIndirect
dispatch
dispatchMesh
GLSL:
gl_VertexIndex
and gl_InstanceIndex
gl_NumWorkGroups
Alignment:
offset must be a multiple of 4 in Vulkan
offset must be a multiple of 4 in DX12 (https://learn.microsoft.com/en-us/windows/win32/api/d3d12/nf-d3d12-id3d12graphicscommandlist-executeindirect#remarks)
offset in Metal: "check for offset alignment requirements for buffers in device and constant address space." - not documented(?!)
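The 4-byte rule for Vulkan and DX12 can be checked or enforced with a small helper like the one below. The names are hypothetical, and since the Metal requirement is not documented, treating 4 as sufficient there is an untested assumption.

```cpp
#include <cstddef>

// Alignment in bytes required for indirect-argument offsets in Vulkan and DX12.
constexpr std::size_t kIndirectOffsetAlign = 4;

// True if `offset` can be passed to an indirect draw/dispatch as-is.
constexpr bool isIndirectOffsetValid(std::size_t offset) {
  return offset % kIndirectOffsetAlign == 0;
}

// Rounds `offset` up to the next multiple of 4.
constexpr std::size_t alignIndirectOffset(std::size_t offset) {
  return (offset + kIndirectOffsetAlign - 1) / kIndirectOffsetAlign * kIndirectOffsetAlign;
}
```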
I've noticed two warnings when compiling on Linux Fedora 32 using GCC 10.2.1:
/OpenGothic/lib/MoltenTempest/Engine/formats/image/pixmapcodeccommon.cpp: In function ‘void stbiSkip(void*, int)’:
/OpenGothic/lib/MoltenTempest/Engine/formats/image/pixmapcodeccommon.cpp:30:41: warning: ignoring return value of function declared with attribute ‘warn_unused_result’ [-Wunused-result]
30 | reinterpret_cast<IDevice*>(user)->seek(size_t(n));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~
/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vulkanapi_impl.cpp: In constructor ‘Tempest::Detail::VulkanApi::VulkanApi(bool)’:
/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vulkanapi_impl.cpp:36:51: warning: assignment from temporary ‘initializer_list’ does not extend the lifetime of the underlying array [-Winit-list-lifetime]
36 | validationLayers = checkValidationLayerSupport();
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
The UniqueHandle class in Vulkan.hpp is not declared because of the definition add_definitions(-DVULKAN_HPP_NO_SMART_HANDLE), but it is used later (line 13075).
I fixed it by just adding the definition add_definitions(-DVULKAN_HPP_DISABLE_IMPLICIT_RESULT_VALUE_CAST) to skip the code. Not sure if that's okay, though.
I'm using the vulkan-headers 1.2.137-1.
Otherwise a GUI app doesn't show any text by default.
Some small free font, like ProggyClean.ttf or Roboto, would be great to bake into the *.dll.
Push constant size mismatches are the most common issue now. Some safety/assert mechanism is needed.
a) Push constant size can be asserted against the expected size
b) Push buffer size may not match between MSL and GLSL
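Option a) could look roughly like the sketch below. It assumes the expected size is available from shader reflection. `checkPushSize` and the `Push` struct are hypothetical and do not exist in Tempest.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical check: `expected` would come from shader reflection,
// `actual` from the size of the C++ struct being pushed.
inline void checkPushSize(std::size_t expected, std::size_t actual) {
  if(expected != actual)
    throw std::logic_error("push constant size mismatch: shader expects " +
                           std::to_string(expected) + " bytes, got " +
                           std::to_string(actual));
}

// Example payload: a 4x4 matrix plus a color, 80 bytes in total.
struct Push {
  float mvp[16];
  float color[4];
};
```

A call such as `checkPushSize(reflectedSize, sizeof(Push))` at bind time would turn a silent mismatch into an immediate error.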
In the engine we see the following code for 1D to 4D points:
Tempest/Engine/utility/utility.h
Line 120 in 1f83234
However, the Manhattan length is defined by
std::abs(x) + std::abs(y) + std::abs(z)
The length currently implemented is the Euclidean length, as it should be. I recommend simply renaming it to
T Length() const { return T(std::sqrt(x*x+y*y+z*z)); }
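To make the difference concrete, here is a small sketch with both lengths side by side. `Vec3` is a stand-in written for this example, not the engine's point class.

```cpp
#include <cmath>
#include <cstdlib>

// Stand-in 3D vector, used only to contrast the two definitions.
template<class T>
struct Vec3 {
  T x, y, z;

  // Euclidean (L2) length: square root of the sum of squares.
  T length() const { return T(std::sqrt(x*x + y*y + z*z)); }

  // Manhattan (L1) length: sum of absolute component values.
  T manhattanLength() const { return std::abs(x) + std::abs(y) + std::abs(z); }
};
```

For the vector (3, -4, 0) the Euclidean length is 5, while the Manhattan length is 7.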
Trying to type non-ASCII characters (e.g. ü ö ä) in the G2 save menu causes the game to freeze. It enters an infinite loop here, because l remains 0:
Tempest/Engine/utility/textcodec.cpp
Lines 47 to 49 in 783c650
ü gives the code 0xfc on Linux since XLookupString returns Latin-1 encoding, not UTF-8.
Fix: nothings/stb#835
Need to update the stb version once the PR is merged into the main stb repo.
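Returning to the non-ASCII input freeze described above: one possible way to deal with Latin-1 bytes coming from XLookupString is to convert them to UTF-8 before they reach the text codec. The helper below is a sketch of that conversion only. `latin1ToUtf8` is a hypothetical name, and this is not necessarily how the issue was resolved in the engine.

```cpp
#include <string>

// Converts ISO-8859-1 (Latin-1) bytes to UTF-8.
// Bytes below 0x80 are plain ASCII and are copied unchanged;
// every other byte becomes a two-byte UTF-8 sequence.
inline std::string latin1ToUtf8(const std::string& in) {
  std::string out;
  out.reserve(in.size() * 2);
  for(unsigned char c : in) {
    if(c < 0x80) {
      out.push_back(char(c));
    } else {
      out.push_back(char(0xC0 | (c >> 6)));
      out.push_back(char(0x80 | (c & 0x3F)));
    }
  }
  return out;
}
```

With this, the single byte 0xfc (ü in Latin-1) becomes the two bytes 0xc3 0xbc.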
The engine is throwing SIGSEGV in copyUpsample when trying the first memset on line 16.
In Vulkan/Dx12, R32UINT and R32INT images always support atomic operations.
In Metal it's complicated:
Ruins the opportunity for many instanced techniques.
Affected:
gl_BaseInstance // zero
gl_InstanceIndex // starts with zero
ss-command buffers are the only option on macOS, so we need to enforce them (or emulate reusable buffers on Mac and iOS)
- device.submit(cmd,sync);
+ device.submit(std::move(cmd),sync);
Since 28e8515 the mouse wheel is not behaving correctly.
Prior to the commit, using the wheel behaved like forward and backward movement (W and S).
Now it seems to also do a motion and a selection. I have not had time to test this further.
I will investigate when I have more time.
There is a SPIR-V to HLSL compiler issue, causing OpenGothic (DX12) to fail at load time.
Full description: KhronosGroup/SPIRV-Cross#1645
Currently shader registers are set to be automatic after cross compilation:
cbuffer SPIRV_Cross_VertexInfo
{
int SPIRV_Cross_BaseVertex;
int SPIRV_Cross_BaseInstance;
};
Yet there is a problem: so far there is no good way to guess which register it is going to be.
Attempt #1 - find the minimal value across unused ones. Normally it's the same as max_bind+1. - Doesn't work if the shader has unused bindings.
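The two guesses from attempt #1 can be written down as below. This only illustrates when they agree and when they diverge. Both function names are hypothetical, and which register the compiler actually picks is not established here.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Smallest register index that does not appear in `used`.
inline uint32_t firstUnusedBinding(std::vector<uint32_t> used) {
  std::sort(used.begin(), used.end());
  uint32_t next = 0;
  for(uint32_t b : used) {
    if(b == next)
      ++next;       // index taken, try the following one
    else if(b > next)
      break;        // found a gap below b
  }
  return next;
}

// The "max_bind+1" guess.
inline uint32_t maxBindingPlusOne(const std::vector<uint32_t>& used) {
  return used.empty() ? 0u : *std::max_element(used.begin(), used.end()) + 1u;
}
```

For bindings {0, 1, 2} both return 3, but for {0, 2} they return 1 and 3, so any gap in the binding list makes the guess ambiguous.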
Attempt #2 - use reflection api.
ComPtr<IDxcUtils> pUtils;
hr = DxcCreateInstance(CLSID_DxcUtils, __uuidof(IDxcUtils), reinterpret_cast<void**>(&pUtils.get()));
if(FAILED(hr))
return hr;
ComPtr<IDxcBlob> pReflectionData;
hr = result->GetOutput(DXC_OUT_REFLECTION, IID_PPV_ARGS(&pReflectionData.get()), nullptr);
if(FAILED(hr))
return hr;
// Create reflection interface.
DxcBuffer ReflectionData;
ReflectionData.Encoding = DXC_CP_ACP;
ReflectionData.Ptr = pReflectionData->GetBufferPointer();
ReflectionData.Size = pReflectionData->GetBufferSize();
ComPtr<ID3D12ShaderReflection> pReflection;
hr = pUtils->CreateReflection(&ReflectionData, IID_PPV_ARGS(&pReflection.get()));
if(FAILED(hr))
return hr;
D3D12_SHADER_INPUT_BIND_DESC desc = {};
pReflection->GetResourceBindingDescByName("SPIRV_Cross_VertexInfo",&desc);
vertexInfoBind = desc.BindPoint;
Hey!
Cool stuff you have here! :D I'm trying to compile on a pi4 just for fun. Is this actually possible? It had problems with sse2 before compiling squish but I just removed the commands. Currently I'm stuck at this. Coming from C# and still learning C++. Maybe you can help and try to describe what it means.
[ 55%] Building CXX object lib/MoltenTempest/Engine/CMakeFiles/MoltenTempest.dir/gapi/vulkan/vcommandbuffer.cpp.o
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp: In member function ‘void Tempest::Detail::VCommandBuffer::changeLayout(VkImage, VkFormat, VkImageLayout, VkImageLayout, uint32_t, bool)’:
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:395:10: error: ‘VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL’ was not declared in this scope
     case VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:395:10: note: suggested alternative: ‘VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL’
     case VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:396:10: error: ‘VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL’ was not declared in this scope
     case VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:396:10: note: suggested alternative: ‘VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL’
     case VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:397:10: error: ‘VK_IMAGE_LAYOUT_STENCIL_ATTACHMENT_OPTIMAL’ was not declared in this scope
     case VK_IMAGE_LAYOUT_STENCIL_ATTACHMENT_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:397:10: note: suggested alternative: ‘VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL’
     case VK_IMAGE_LAYOUT_STENCIL_ATTACHMENT_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:398:10: error: ‘VK_IMAGE_LAYOUT_STENCIL_READ_ONLY_OPTIMAL’ was not declared in this scope
     case VK_IMAGE_LAYOUT_STENCIL_READ_ONLY_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:398:10: note: suggested alternative: ‘VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL’
     case VK_IMAGE_LAYOUT_STENCIL_READ_ONLY_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:447:10: error: ‘VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL’ was not declared in this scope
     case VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:447:10: note: suggested alternative: ‘VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL’
     case VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:448:10: error: ‘VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL’ was not declared in this scope
     case VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:448:10: note: suggested alternative: ‘VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL’
     case VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:449:10: error: ‘VK_IMAGE_LAYOUT_STENCIL_ATTACHMENT_OPTIMAL’ was not declared in this scope
     case VK_IMAGE_LAYOUT_STENCIL_ATTACHMENT_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:449:10: note: suggested alternative: ‘VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL’
     case VK_IMAGE_LAYOUT_STENCIL_ATTACHMENT_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:450:10: error: ‘VK_IMAGE_LAYOUT_STENCIL_READ_ONLY_OPTIMAL’ was not declared in this scope
     case VK_IMAGE_LAYOUT_STENCIL_READ_ONLY_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:450:10: note: suggested alternative: ‘VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL’
     case VK_IMAGE_LAYOUT_STENCIL_READ_ONLY_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL
make[2]: *** [lib/MoltenTempest/Engine/CMakeFiles/MoltenTempest.dir/build.make:585: lib/MoltenTempest/Engine/CMakeFiles/MoltenTempest.dir/gapi/vulkan/vcommandbuffer.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:1059: lib/MoltenTempest/Engine/CMakeFiles/MoltenTempest.dir/all] Error 2
make: *** [Makefile:130: all] Error 2
Currently, the DDS codec only supports the DXT compression format:
Tempest/Engine/formats/image/pixmapcodecdds.cpp
Lines 35 to 57 in 8886578
This becomes an issue when trying to load RGB from DDS files since processing just fails. In Try/OpenGothic#271 this leads to the wrong image being loaded.
A solution specifically for the OpenGothic issue (apart from implementing support for this format) would be to allow the creation of a Pixmap from raw bytes. This would require adding more formats (like R5G6B5 and BGR8) to Pixmap::Format.
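As an alternative to adding new enum entries, raw 16-bit pixels could be expanded to an already supported 8-bit format on load. The sketch below shows the per-pixel conversion, assuming red sits in the high bits (masks 0xF800, 0x07E0, 0x001F). `Rgb8` and `unpackR5G6B5` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical 8-bit-per-channel pixel.
struct Rgb8 {
  uint8_t r, g, b;
};

// Expands a 16-bit R5G6B5 pixel to 8 bits per channel.
// The high bits of each channel are replicated into the low bits,
// so that full intensity maps to 255 and zero stays zero.
inline Rgb8 unpackR5G6B5(uint16_t px) {
  const uint32_t r = (px >> 11) & 0x1F;
  const uint32_t g = (px >> 5)  & 0x3F;
  const uint32_t b =  px        & 0x1F;
  return Rgb8{uint8_t((r << 3) | (r >> 2)),
              uint8_t((g << 2) | (g >> 4)),
              uint8_t((b << 3) | (b >> 2))};
}
```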
So Linux has at least two video backends that should be supported: X11 and Wayland. You generally only know at runtime which one to use, which means that the Vulkan instance extensions required for the WSI can differ depending on the window type (X11/Wayland). The code has to be restructured so that Vulkan instance creation already has access to the window.
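A rough sketch of choosing the instance extensions at runtime follows. The extension strings are the standard Vulkan WSI names. Everything else is an assumption for illustration: `LinuxWsi`, `detectWsi` and `instanceExtensions` are hypothetical, and checking WAYLAND_DISPLAY is only a heuristic, not a reliable way to detect the session type.

```cpp
#include <string>
#include <vector>

enum class LinuxWsi { X11, Wayland };

// Heuristic (an assumption): a non-empty WAYLAND_DISPLAY value
// means the process runs inside a Wayland session.
inline LinuxWsi detectWsi(const char* waylandDisplay) {
  const bool wayland = waylandDisplay != nullptr && waylandDisplay[0] != '\0';
  return wayland ? LinuxWsi::Wayland : LinuxWsi::X11;
}

// Instance extensions needed to create a surface for the chosen backend.
inline std::vector<std::string> instanceExtensions(LinuxWsi wsi) {
  std::vector<std::string> ext = {"VK_KHR_surface"};
  ext.push_back(wsi == LinuxWsi::Wayland ? "VK_KHR_wayland_surface"
                                         : "VK_KHR_xlib_surface");
  return ext;
}
```

A caller would pass `std::getenv("WAYLAND_DISPLAY")` to `detectWsi` before creating the instance. Deriving the choice from the actual window, as the text suggests, would be more robust.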
The other point here is that you really should use SDL2 to replace the various SystemApi implementations.
Relevant APIs:
https://wiki.libsdl.org/CategoryInit
https://wiki.libsdl.org/CategoryVideo
https://wiki.libsdl.org/CategoryEvents
https://wiki.libsdl.org/CategoryVulkan
Job: https://ci.appveyor.com/project/Try/tempest/builds/43007972/job/c0ob1eqnofs8vmr5
[----------] 24 tests from DirectX12Api
[ RUN ] DirectX12Api.DirectX12Api
[ OK ] DirectX12Api.DirectX12Api (39 ms)
[ RUN ] DirectX12Api.Vbo
Microsoft Basic Render Driver
Microsoft Basic Render Driver
createBuffer 218
alloc 000002DFCEAF3E70 64
alloc 000002DFCEAF4CC0 65
unknown file: error: SEH exception with code 0x87a thrown in the test body.
[ FAILED ] DirectX12Api.Vbo (124 ms)
[ RUN ] DirectX12Api.VboInit
createBuffer 218
alloc 000002DFD03481C0 64
alloc 000002DFCEAF4CC0 65
The error is not reproducible locally or over RDP; it could be an issue with the Microsoft Basic Render Driver.
This ticket tracks ideas and solutions for pipeline barrier generation. It is mostly about the Vulkan perspective, yet DirectX12 should also be taken into consideration.
All image resources assume a default ready-to-read state.
VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL
VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT
VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL
VK_PIPELINE_STAGE_{EARLY|LATE}_FRAGMENT_TESTS_BIT
(no real layout, just ready-to-read)
VK_PIPELINE_STAGE_ALL_COMMANDS_BIT
The dest stage is ALL_COMMANDS (except for Depth) for now.
Compute:
All storage resources are tracked individually; if the current pipeline has an unordered access, a new set of barriers has to be issued.
auto cmd = device.commandBuffer();
{
auto enc = cmd.startEncoding(device);
// Assume 'ready-to-read' state
enc.dispatch(...);
// tex1: ALL -> COLOR
enc.setFramebuffer({{tex1,Vec4(0,0,1,1),Tempest::Preserve}});
enc.draw(...);
// tex1: COLOR -> ALL
// tex2: ALL -> COLOR
enc.setFramebuffer({{tex2,Tempest::Discard,Tempest::Preserve}});
enc.draw(...);
// tex2: COLOR -> ALL
}
Optimize:
7.9. Host Write Ordering Guarantees
When batches of command buffers are submitted to a queue via a queue submission command, it defines a memory dependency with prior host operations, and execution of command buffers submitted to the queue.
This makes things easier on the resource upload/uniform buffer side, yet the command buffer must still assume any commands submitted before it.
If vkCmdPipelineBarrier2KHR is recorded within a render pass instance, the synchronization scopes are limited to operations within the same subpass.
This may cause trouble if barriers are delayed.
7.6.1. Subpass Self-dependency
vkCmdPipelineBarrier or vkCmdPipelineBarrier2KHR must not be called within a render pass instance started with vkCmdBeginRenderingKHR.
Since VK_KHR_dynamic_rendering is the go-to extension, barriers must not be issued inside a render pass.
This limitation basically blocks any split-barrier or partial-barrier approaches.
The Vulkan API on this side is quite messy; a nice engine-level API needs to be designed.
Use-cases:
case 3 is native - DescriptorSet::set
case 2 almost works, as push constant (128 byte limit is an issue)
case 1 can work as push, except it doesn't fit there.
Starting a new game or loading a save with -dx12 causes loading to take about three minutes. Then it force closes with:
---crashlog(ExceptionFilter)---
GPU: NVIDIA GeForce RTX 2080
0x00007ff77f01c857: [unknown function] in [unknown module]
0x00007ff77f01c95b: [unknown function] in [unknown module]
0x00007ff77f01cb38: [unknown function] in [unknown module]
0x00007fff83db5dcc: UnhandledExceptionFilter in C:\Windows\System32\KERNELBASE.dll
0x00007fff8677859d: RtlMoveMemory in C:\Windows\SYSTEM32\ntdll.dll
0x00007fff8675f047: _C_specific_handler in C:\Windows\SYSTEM32\ntdll.dll
0x00007fff86773e1f: _chkstk in C:\Windows\SYSTEM32\ntdll.dll
0x00007fff866eeae6: RtlFindCharInUnicodeString in C:\Windows\SYSTEM32\ntdll.dll
0x00007fff86724af5: RtlRaiseException in C:\Windows\SYSTEM32\ntdll.dll
0x00007fff83cefb1c: RaiseException in C:\Windows\System32\KERNELBASE.dll
0x00007fff78106720: CxxThrowException in C:\Windows\SYSTEM32\VCRUNTIME140.dll
0x00007fff1b02a998: stbtt_MakeGlyphBitmapSubpixel in C:\Users\Daniel\Desktop\opengothic_win\Tempest.dll
0x00007fff1b03710d: stbtt_MakeGlyphBitmapSubpixel in C:\Users\Daniel\Desktop\opengothic_win\Tempest.dll
0x00007fff1b0378cf: stbtt_MakeGlyphBitmapSubpixel in C:\Users\Daniel\Desktop\opengothic_win\Tempest.dll
0x00007fff1b030fda: stbtt_MakeGlyphBitmapSubpixel in C:\Users\Daniel\Desktop\opengothic_win\Tempest.dll
0x00007fff1b0b749a: stbtt_MakeGlyphBitmapSubpixel in C:\Users\Daniel\Desktop\opengothic_win\Tempest.dll
0x00007fff1b0b752d: stbtt_MakeGlyphBitmapSubpixel in C:\Users\Daniel\Desktop\opengothic_win\Tempest.dll
0x00007ff77ef7d1bb: [unknown function] in [unknown module]
0x00007ff77ef7c360: [unknown function] in [unknown module]
0x00007ff77ef7c5ae: [unknown function] in [unknown module]
0x00007ff77efa5368: [unknown function] in [unknown module]
0x00007fff1b0cb46f: stbtt_MakeGlyphBitmapSubpixel in C:\Users\Daniel\Desktop\opengothic_win\Tempest.dll
0x00007fff1b0cb28b: stbtt_MakeGlyphBitmapSubpixel in C:\Users\Daniel\Desktop\opengothic_win\Tempest.dll
0x00007ff77ef994e5: [unknown function] in [unknown module]
0x00007ff77f36310c: [unknown function] in [unknown module]
0x00007fff84ef244d: BaseThreadInitThunk in C:\Windows\System32\KERNEL32.DLL
0x00007fff8672df78: RtlUserThreadStart in C:\Windows\SYSTEM32\ntdll.dll
P.S. Are the WIP Windows builds MSVC2022 Debug or Release builds? These logs look a bit spartan to me.
Based on #33
The initial implementation is practically working; this ticket is to track technical debt and profiling work.
TODO:
flat and other interpolators
in uvec3 gl_WorkGroupID pollution
in uvec3 gl_NumWorkGroups - polluted due to dispatch indirect
in uvec3 gl_GlobalInvocationID // polluted, since it is a byproduct of gl_WorkGroupID
ERR (won't fix):
In complex workloads it gets annoying to sanitize zero-sized buffers (and avoid draws/dispatches).
Meanwhile, on the shader side (in terms of algorithms), in most cases it is perfectly fine to have .length()==0.