try / tempest

3d graphics engine

License: MIT License

CMake 1.34% C++ 87.30% Makefile 0.46% M4 0.45% C 0.94% Python 7.99% Shell 0.47% GLSL 0.21% Objective-C++ 0.85%
engine 3d vulkan cpp directx directx12 vulkan-engine

tempest's People

Contributors

accessory, alexjakegreen, axel-dd, christophhaag, d10sfan, errorflexxx, ezamelczyk, galyam, lmichaelis, muttleyxd, nindaleth, raphaelahrens, reveares, sebbestune, sjavora, swick, try

tempest's Issues

GPU driven rendering

Porting to FreeBSD: openal-soft 1.22 does not build on FreeBSD

In version 1.22 of openal-soft they forgot to add an extra include for FreeBSD.
It was fixed in this commit kcat/openal-soft@b26ca6b and improved here kcat/openal-soft@2f3acdf.

I fixed it temporarily with this patch, but an update to 1.23 would be the better option.

+++ b/Engine/thirdparty/openal-soft/core/rtkit.cpp
@@ -41,6 +41,9 @@
 #include <unistd.h>
 #include <sys/types.h>
 #include <sys/syscall.h>
+#if defined(__FreeBSD__)
+#include <sys/thr.h>
+#endif

[Linux] Overlay Layers Do not Appear To Work

This was mainly tested with the Steam Overlay. Normally if you add a game as a non-steam game, it'll try and load up the overlay as well. In this case, it never appears.

I attempted to activate the layer explicitly, and the logs seem to show that the layer was being enabled. A link to a patch showing what was changed: 0001-Attempt-at-getting-steam-overlay-enabled.zip

I'm wondering if it has something to do with the hidden window being created and then destroyed: possibly the overlay layer library sees that window, thinks it needs to inject into it, and gets confused.

Was curious if you had any pointers or had ideas as to a fix.

Rethinking the engine structure

Things to do:

  • Split VectorImage into VectorImage, as CPU data + VectorImage::Mesh, as GPU data.
  • Rename Uniforms to something else, probably DescriptorSet
  • Add readable depth texture

Things to think about:

  • Implement Metal-api
  • Reconsider the design of cross-queue/CPU synchronization
  • Investigate usefulness of VK_EXT_inline_uniform_block

Bindless support

Bindless is quite messy in every API, so we need to design a nice top-level API with a reasonable underlying implementation.

GLSL

GLSL is the main language in Tempest, so a dedicated section is a must. GLSL offers two ways:

  1. Unbound array of descriptors. - nice and easy to use
  2. Device address. - not portable to Metal; hard to track hazards

layout(binding = 0) uniform sampler2D tex[]; // unbound array of textures
layout(binding = 1) uniform sampler2D img[]; // another unbound array of textures
layout(binding = 1, std140) readonly buffer Input {
  vec4 val[];
} ssbo[]; // unbound array of buffers

Engine-side

std::vector<const Tempest::Texture2d*> ptex(tex.size());
for(size_t i=0; i<tex.size(); ++i)
  ptex[i] = &tex[i];
auto desc = device.descriptors(pso);
desc.set(0,ptex); // taking vector or c-array

Doesn't fit the engine perfectly: support for separate (non-combined) samplers and textures needs to be added on top of it.

Vulkan

Caps-list:

VkPhysicalDeviceDescriptorIndexingFeatures::runtimeDescriptorArray; // support for unbound array declaration (tex[])
// Support of nonuniformEXT, per resource-type 
VkPhysicalDeviceDescriptorIndexingFeatures::shaderUniformBufferArrayNonUniformIndexing;
VkPhysicalDeviceDescriptorIndexingFeatures::shaderSampledImageArrayNonUniformIndexing;
VkPhysicalDeviceDescriptorIndexingFeatures::shaderStorageBufferArrayNonUniformIndexing;
VkPhysicalDeviceDescriptorIndexingFeatures::shaderStorageImageArrayNonUniformIndexing;

VK_DESCRIPTOR_BINDING_VARIABLE_DESCRIPTOR_COUNT_BIT can be used (in theory), but only for the very last binding in a descriptor set, which doesn't fit the GLSL side.
Alternatively, it's sufficient to use VK_DESCRIPTOR_BINDING_PARTIALLY_BOUND_BIT_EXT with a very large descriptor array. The size of the array has to be defined in C++ upfront, at VkDescriptorSetLayout creation.
The current implementation of Tempest can recreate VkDescriptorSetLayout and VkDescriptorSet on the go if the preallocated array is not big enough. But that also requires recreating VkPipeline at runtime, based on the descriptor set size, which is hard to implement without extra performance cost.

VK_DESCRIPTOR_BINDING_UPDATE_AFTER_BIND_BIT - useless by itself, but the spec defines special behavior for this type of descriptor:

... layouts which may be much higher than the pre-existing limits. The old limits only count descriptors in non-updateAfterBind descriptor set layouts, and the new limits count descriptors in all descriptor set layouts in the pipeline layout.

maxUpdateAfterBindDescriptorsInAllPools = 500,000+ // Eh, probably can't do anything sensible about it
maxPerStageUpdateAfterBindResources   = 500,000+

maxPerStageDescriptorUpdateAfterBindSamplers = 500,000+
maxPerStageDescriptorUpdateAfterBindUniformBuffers = 12+
maxPerStageDescriptorUpdateAfterBindStorageBuffers = 500,000+
maxPerStageDescriptorUpdateAfterBindSampledImages = 500,000+
maxPerStageDescriptorUpdateAfterBindStorageImages = 500,000+
maxPerStageDescriptorUpdateAfterBindAccelerationStructures = 500,000+

maxDescriptorSetUpdateAfterBindSamplers = 500,000+
maxDescriptorSetUpdateAfterBindUniformBuffers = 72+ // n × PerStage
maxDescriptorSetUpdateAfterBindStorageBuffers = 500,000+
maxDescriptorSetUpdateAfterBindSampledImages = 500,000+
maxDescriptorSetUpdateAfterBindStorageImages = 500,000+
maxDescriptorSetUpdateAfterBindAccelerationStructures = 500,000+

Naturally, as there is only a single descriptor set, the engine can just take the min of the PerStage and DescriptorSet limits.
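The min-of-limits rule can be sketched as follows. This is only an illustration: `UpdateAfterBindLimit` and `usableArraySize` are hypothetical names, not Tempest or Vulkan types, and the two fields are assumed to be filled from the matching maxPerStageDescriptorUpdateAfterBind* and maxDescriptorSetUpdateAfterBind* values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical holder for the update-after-bind limits of one resource type.
struct UpdateAfterBindLimit {
  uint32_t perStage; // e.g. maxPerStageDescriptorUpdateAfterBindSampledImages
  uint32_t perSet;   // e.g. maxDescriptorSetUpdateAfterBindSampledImages
};

// With a single descriptor set, the usable array size is bounded by both limits.
inline uint32_t usableArraySize(const UpdateAfterBindLimit& l) {
  return std::min(l.perStage, l.perSet);
}
```

For uniform buffers with the minimums listed above (12 per stage, 72 per set), this yields 12.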

Other limits of concern (obsolete):

VkPhysicalDeviceLimits::maxPerStageDescriptorSamplers = 16+;
VkPhysicalDeviceLimits::maxPerStageDescriptorUniformBuffers = 12+;
VkPhysicalDeviceLimits::maxPerStageDescriptorStorageBuffers = 4+;
VkPhysicalDeviceLimits::maxPerStageDescriptorSampledImages = 16+;
VkPhysicalDeviceLimits::maxPerStageDescriptorStorageImages = 4+;
VkPhysicalDeviceLimits::maxPerStageResources = 128+;

VkPhysicalDeviceLimits::maxDescriptorSetSamplers = 96+;
VkPhysicalDeviceLimits::maxDescriptorSetUniformBuffers = 72+;
VkPhysicalDeviceLimits::maxDescriptorSetStorageBuffers = 24+;
VkPhysicalDeviceLimits::maxDescriptorSetSampledImages = 96+;
VkPhysicalDeviceLimits::maxDescriptorSetStorageImages = 24+;

With such limits, realloc has to manage per-stage + per-resource + per_set limit somehow.

DirectX12

Note: Tempest uses spirv-cross to generate HLSL, but the HLSL it produces is not valid:

// error: more than one unbounded resource (ssbo and tex) in space 0
ByteAddressBuffer         ssbo[]        : register(t1, space0);
Texture2D<float4>         tex[]         : register(t0, space0);
SamplerState             _tex_sampler[] : register(s0, space0);
RWTexture2D<unorm float4> ret           : register(u2, space0);

Apparently spirv-cross follows the VARIABLE_DESCRIPTOR_COUNT workflow. This maps directly to
D3D12_DESCRIPTOR_RANGE::NumDescriptors = -1, with the same limitation of only one runtime array per set. In theory this can be worked around by instrumenting the SPIR-V:
OpDecorate %tex DescriptorSet 0 -> OpDecorate %tex DescriptorSet UNIQ_SPACE

Limits:

Resources Available to the Pipeline | Tier 1 | Tier 2 | Tier 3
Feature levels | 11.0+ | 11.0+ | 11.1+
Maximum number of descriptors in a CBV/SRV/UAV heap used for rendering | 1,000,000 | 1,000,000 | 1,000,000+
Maximum number of CBV in all descriptor tables per shader stage | 14 | 14 | full heap
Maximum number of SRV in all descriptor tables per shader stage | 128 | full heap | full heap
Maximum number of UAV in all descriptor tables per shader stage | 64 for feature levels 11.1+, 8 for feature level 11 | 64 | full heap
Maximum number of Samplers in all descriptor tables per shader stage | 16 | 2048 | 2048

ID3D12GraphicsCommandList::SetDescriptorHeaps
Only one descriptor heap of each type can be set at one time, which means a maximum of 2 heaps (one sampler, one CBV/SRV/UAV) can be set at one time.
DX12 is a bit awkward, because the limit is shared by all descriptor types except samplers. Probably the heap can "just" be split into equal partitions.
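A minimal sketch of the equal-partition idea, assuming a shared CBV/SRV/UAV heap addressed by descriptor index. `HeapPartition` and `partitionOf` are hypothetical names used for illustration only; they are not part of D3D12 or Tempest.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of one slice of a shared CBV/SRV/UAV heap.
struct HeapPartition {
  uint32_t offset; // index of the first descriptor in the slice
  uint32_t size;   // number of descriptors in the slice
};

// Splits `heapSize` descriptors into `count` equal slices and returns slice `index`.
// Any remainder at the end of the heap is left unused.
inline HeapPartition partitionOf(uint32_t heapSize, uint32_t count, uint32_t index) {
  assert(count > 0 && index < count);
  const uint32_t size = heapSize / count;
  return HeapPartition{index * size, size};
}
```

With a 1,000,000-entry heap split in two (for example one slice for textures and one for UAV-like resources), each slice holds 500,000 descriptors, which lines up with the Metal Tier 2 numbers quoted later in this issue.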

Metal [3]

Limits (per-app resources available at any given time are):

Resources Available to the Pipeline | Tier 1 (iOS) | Tier 1 | Tier 2
Buffers (and TLASes) | 31 | 64 | 500,000
Textures | 31 | 128 | 500,000
Samplers | 16 | 16 | 2048

For both tiers, the maximum number of argument buffer entries in each function argument table is 8.

*Writable textures aren’t supported within an argument buffer.
Tier 1 argument buffers can’t be accessed through pointer indexing, nor can they include pointers to other argument buffers.
Tier 2 argument buffers can be accessed through pointer indexing, as shown in the following example.

T1 argument buffers are practically the same as descriptor sets in Vulkan and offer nothing useful here.
T2 allows pointer indexing and can be leveraged for a bindless array.

Sources:
https://gist.github.com/DethRaid/0171f3cfcce51950ee4ef96c64f59617
https://docs.microsoft.com/en-us/windows/win32/api/d3d12/ns-d3d12-d3d12_descriptor_range
https://learn.microsoft.com/en-us/windows/win32/direct3d12/hardware-support?redirectedfrom=MSDN
https://developer.apple.com/documentation/metal/buffers/about_argument_buffers
https://developer.apple.com/documentation/metal/buffers/managing_groups_of_resources_with_argument_buffers

GLSL

Unbound array of descriptors has 2 meanings:
Base spec:
uniform sampler2D tex[] -> OpTypeArray %8 %uint_1
The size of the array depends on the highest index used in the code.

GL_EXT_nonuniform_qualifier:
May work the same as the base spec if no runtime index is in use; otherwise:
uniform sampler2D tex[] -> OpTypeRuntimeArray %8 // legal only if driver supports descriptor-indexing

Engine side

[wip]
Generally a Metal-like model is a good middle ground:

maxUAV      = 500'000; // ssbo + tlas + imageStore
maxTextures = 500'000;
maxSamplers = 2048;
// can skip maxUbo - hard in vulkan and not very useful
// combined image consumes both Texture and Samplers limits

In DX, UAV/Tex can be achieved by splitting the heap into 2 parts
In Vulkan, UAV is probably the min over all applicable resources

Support indirect draw/compute

API:

  • drawIndirect
  • drawIndexedIndirect
  • dispatch
  • dispatchMesh

GLSL:

  • gl_VertexIndex and gl_InstanceIndex
    • Just works on Metal (afaik)
    • first-instance has only 90% coverage (and some 2023 devices do not support it)
    • Just doesn't work in DX
  • gl_NumWorkGroups
    • Should work on metal (?)
    • Just works on Vulkan
    • Just doesn't work in DX - can be emulated with push-descriptor

Alignment:
offset must be a multiple of 4 in Vulkan
offset must be a multiple of 4 in DX12 (https://learn.microsoft.com/en-us/windows/win32/api/d3d12/nf-d3d12-id3d12graphicscommandlist-executeindirect#remarks)
offset in Metal: "check for offset alignment requirements for buffers in device and constant address space." - not documented(?!)
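A small sketch of how the 4-byte rule for Vulkan and DX12 could be checked or enforced on the engine side. The helper names are hypothetical, and the Metal requirement is left out because it is not established above.

```cpp
#include <cassert>
#include <cstddef>

// Alignment required for indirect-argument offsets in Vulkan and DX12, as noted above.
constexpr std::size_t kIndirectOffsetAlign = 4;

// True if `offset` can be used as an indirect-argument buffer offset.
constexpr bool isValidIndirectOffset(std::size_t offset) {
  return offset % kIndirectOffsetAlign == 0;
}

// Rounds `offset` up to the next valid indirect-argument offset.
constexpr std::size_t alignIndirectOffset(std::size_t offset) {
  return (offset + kIndirectOffsetAlign - 1) / kIndirectOffsetAlign * kIndirectOffsetAlign;
}
```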

Compilation warnings on GCC 10.2.1

I've noticed two warnings when compiling on Linux Fedora 32 using GCC 10.2.1:

/OpenGothic/lib/MoltenTempest/Engine/formats/image/pixmapcodeccommon.cpp: In function ‘void stbiSkip(void*, int)’:
/OpenGothic/lib/MoltenTempest/Engine/formats/image/pixmapcodeccommon.cpp:30:41: warning: ignoring return value of function declared with attribute ‘warn_unused_result’ [-Wunused-result]
   30 |   reinterpret_cast<IDevice*>(user)->seek(size_t(n));
      |   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~

/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vulkanapi_impl.cpp: In constructor ‘Tempest::Detail::VulkanApi::VulkanApi(bool)’:
/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vulkanapi_impl.cpp:36:51: warning: assignment from temporary ‘initializer_list’ does not extend the lifetime of the underlying array [-Winit-list-lifetime]
   36 |     validationLayers = checkValidationLayerSupport();
      |                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~

Vulkan compile error on linux

The UniqueHandle class in Vulkan.hpp is not declared, because of the definition
add_definitions(-DVULKAN_HPP_NO_SMART_HANDLE), but it is still used later (line 13075).

I fixed it by just adding the definition
add_definitions(-DVULKAN_HPP_DISABLE_IMPLICIT_RESULT_VALUE_CAST)
to skip the code. Not sure if that's okay, though.

I'm using the vulkan-headers 1.2.137-1.

Bake default font to engine

Otherwise a GUI app doesn't show any text by default.
A small free font, like ProggyClean.ttf or Roboto, would be great to bake into the *.dll.

Push constants issue

Push constant size mismatches are the most common issue now. Some safety/assert mechanism is needed.

a) Push constant size can be asserted vs expected size
b) Push buffer size may not match between msl and glsl
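Option (a) could look roughly like the sketch below. `PushData` and `pushSizeMatches` are hypothetical names, and `expectedBytes` is assumed to come from reflection of the compiled shader; how Tempest actually obtains that value is not covered here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical CPU-side mirror of a shader's push-constant block.
struct PushData {
  float mvp[16];
  float tint[4];
};

// Compares the CPU struct size with the size the shader expects.
// `expectedBytes` is assumed to be taken from shader reflection.
template<class T>
constexpr bool pushSizeMatches(std::size_t expectedBytes) {
  return sizeof(T) == expectedBytes;
}
```

A call such as `assert(pushSizeMatches<PushData>(reflectedSize))` at pipeline-bind time would then turn a silent mismatch into an immediate failure in debug builds.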

Misleading method name "manhattanLength"

In the engine we see the following code for 1D to 4D points:

T manhattanLength() const { return T(std::sqrt(x*x+y*y+z*z)); }

However, the Manhattan length is defined by

std::abs(x) + std::abs(y) + std::abs(z)

The length currently implemented is the Euclidean length, as it should be. I recommend simply renaming it to

T Length() const { return T(std::sqrt(x*x+y*y+z*z)); }
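To make the difference concrete, here is a small self-contained comparison of the two norms. `Vec3` is a stand-in type for illustration, not the engine's point class.

```cpp
#include <cassert>
#include <cmath>

// Stand-in 3D vector, used only to contrast the two definitions.
struct Vec3 {
  float x, y, z;

  // Manhattan (L1) length: sum of absolute components.
  float manhattanLength() const { return std::abs(x) + std::abs(y) + std::abs(z); }

  // Euclidean (L2) length: square root of the sum of squares.
  float length() const { return std::sqrt(x*x + y*y + z*z); }
};
```

For the vector (3, -4, 0) the Manhattan length is 7 while the Euclidean length is 5, so the two names are not interchangeable.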

[Linux] Unable to decode üöä

Trying to type non-ASCII characters (e.g. ü ö ä) in the G2 save menu causes the game to freeze. It enters an infinite loop here, because l remains 0:

for(size_t i=0;s[i];) {
  uint32_t cp = 0;
  size_t l = Detail::utf8ToCodepoint(&s[i],cp);

ü gives the code 0xfc on Linux, since XLookupString returns Latin-1 encoding, not UTF-8.
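One possible direction, shown only as a sketch, is to convert the Latin-1 byte to UTF-8 before it reaches the decoding loop. `latin1ToUtf8` is a hypothetical helper; whether the engine should convert the input or switch to a different lookup call is left open.

```cpp
#include <cassert>
#include <string>

// Converts one Latin-1 byte to UTF-8.
// Latin-1 code points map 1:1 to U+0000..U+00FF, so at most two bytes are needed.
inline std::string latin1ToUtf8(unsigned char c) {
  std::string out;
  if(c < 0x80) {
    out.push_back(char(c));                 // ASCII is unchanged
  } else {
    out.push_back(char(0xC0 | (c >> 6)));   // leading byte: 110xxxxx
    out.push_back(char(0x80 | (c & 0x3F))); // continuation byte: 10xxxxxx
  }
  return out;
}
```

For 0xfc (ü) this produces the two bytes 0xC3 0xBC, which is a valid UTF-8 sequence.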

Segmentation fault

The engine throws SIGSEGV in copyUpsample at the first memset, on line 16.

Add mandatory atomic-image formats

In Vulkan/DX12, R32UINT and R32INT images always support atomic operations.

In Metal it's complicated:

  • Metal 3.1 has native support
  • Before 3.1, a software workaround is possible (similar to MoltenVK)

Single time submit command buffers

Motivation:

Single-submit (ss) command buffers are the only option on macOS, so we need to enforce them (or emulate reusable buffers on Mac and iOS)

Proposed api:

- device.submit(cmd,sync);
+ device.submit(std::move(cmd),sync);
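The intent of the `std::move` in the proposed call can be sketched with a move-only type. `CommandBuffer` and `Device` below are simplified stand-ins, not the Tempest classes, and the `sync` argument is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in command buffer: movable but not copyable.
struct CommandBuffer {
  std::vector<int> commands;
  CommandBuffer() = default;
  CommandBuffer(CommandBuffer&&) = default;
  CommandBuffer& operator=(CommandBuffer&&) = default;
  CommandBuffer(const CommandBuffer&) = delete;
  CommandBuffer& operator=(const CommandBuffer&) = delete;
};

// Stand-in device that only counts the commands it has consumed.
struct Device {
  std::size_t submitted = 0;
  // Taking the buffer by value forces callers to std::move it,
  // so a buffer cannot be submitted twice by accident.
  void submit(CommandBuffer cmd) { submitted += cmd.commands.size(); }
};
```

With this shape, `device.submit(cmd)` fails to compile and `device.submit(std::move(cmd))` hands ownership to the device, which matches the single-submit model.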

Mouse wheel on X11 is not working.

Since 28e8515 the mouse wheel is not behaving correctly.

Prior to the commit, using the wheel behaved like forward and backward movement (W and S).
Now it also seems to trigger a motion and a selection. I have not had time to test this further.
I will investigate when I have more time.

Robust support for push_constant's and SPIRV_Cross_VertexInfo in DX12

Currently shader registers are assigned automatically after cross compilation:

cbuffer SPIRV_Cross_VertexInfo
{
    int SPIRV_Cross_BaseVertex;
    int SPIRV_Cross_BaseInstance;
};

Yet there is a problem: so far there is no good way to guess which register it is going to be.

Attempt #1 - find the minimal value among the unused ones. Normally it's the same as max_bind+1. - Doesn't work if the shader has unused bindings.
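The two guesses from attempt #1 can be compared with a small sketch. Both helpers are hypothetical and only illustrate why the heuristics diverge once there is a gap in the binding list; which register the compiler really picks is not established here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Smallest register index that does not appear in `used`.
inline uint32_t firstFreeRegister(std::vector<uint32_t> used) {
  std::sort(used.begin(), used.end());
  uint32_t next = 0;
  for(uint32_t r : used) {
    if(r == next)
      ++next;        // index taken, try the following one
    else if(r > next)
      break;         // found a gap
  }
  return next;
}

// Highest used register index plus one (0 if nothing is bound).
inline uint32_t maxBindPlusOne(const std::vector<uint32_t>& used) {
  return used.empty() ? 0 : *std::max_element(used.begin(), used.end()) + 1;
}
```

For bindings {0, 1, 2} both return 3, but for {0, 2} the first returns 1 and the second returns 3.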

Attempt #2 - use reflection api.

    ComPtr<IDxcUtils> pUtils;
    hr = DxcCreateInstance(CLSID_DxcUtils, __uuidof(IDxcUtils), reinterpret_cast<void**>(&pUtils.get()));
    if(FAILED(hr))
      return hr;

    ComPtr<IDxcBlob> pReflectionData;
    hr = result->GetOutput(DXC_OUT_REFLECTION, IID_PPV_ARGS(&pReflectionData.get()), nullptr);
    if(FAILED(hr))
      return hr;

    // Create reflection interface.
    DxcBuffer ReflectionData;
    ReflectionData.Encoding = DXC_CP_ACP;
    ReflectionData.Ptr      = pReflectionData->GetBufferPointer();
    ReflectionData.Size     = pReflectionData->GetBufferSize();

    ComPtr<ID3D12ShaderReflection> pReflection;
    hr = pUtils->CreateReflection(&ReflectionData, IID_PPV_ARGS(&pReflection.get()));
    if(FAILED(hr))
      return hr;

    D3D12_SHADER_INPUT_BIND_DESC desc = {};
    pReflection->GetResourceBindingDescByName("SPIRV_Cross_VertexInfo",&desc);
    vertexInfoBind = desc.BindPoint;
  • This does not work at all, as the register is assigned after shader link-package

Compiling on Raspberry Pi 4

Hey!
Cool stuff you have here! :D I'm trying to compile on a pi4 just for fun. Is this actually possible? It had problems with sse2 before compiling squish but I just removed the commands. Currently I'm stuck at this. Coming from C# and still learning C++. Maybe you can help and try to describe what it means.

[ 55%] Building CXX object lib/MoltenTempest/Engine/CMakeFiles/MoltenTempest.dir/gapi/vulkan/vcommandbuffer.cpp.o
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp: In member function ‘void Tempest::Detail::VCommandBuffer::changeLayout(VkImage, VkFormat, VkImageLayout, VkImageLayout, uint32_t, bool)’:
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:395:10: error: ‘VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL’ was not declared in this scope
     case VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:395:10: note: suggested alternative: ‘VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL’
     case VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:396:10: error: ‘VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL’ was not declared in this scope
     case VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:396:10: note: suggested alternative: ‘VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL’
     case VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:397:10: error: ‘VK_IMAGE_LAYOUT_STENCIL_ATTACHMENT_OPTIMAL’ was not declared in this scope
     case VK_IMAGE_LAYOUT_STENCIL_ATTACHMENT_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:397:10: note: suggested alternative: ‘VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL’
     case VK_IMAGE_LAYOUT_STENCIL_ATTACHMENT_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:398:10: error: ‘VK_IMAGE_LAYOUT_STENCIL_READ_ONLY_OPTIMAL’ was not declared in this scope
     case VK_IMAGE_LAYOUT_STENCIL_READ_ONLY_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:398:10: note: suggested alternative: ‘VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL’
     case VK_IMAGE_LAYOUT_STENCIL_READ_ONLY_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:447:10: error: ‘VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL’ was not declared in this scope
     case VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:447:10: note: suggested alternative: ‘VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL’
     case VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:448:10: error: ‘VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL’ was not declared in this scope
     case VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:448:10: note: suggested alternative: ‘VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL’
     case VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:449:10: error: ‘VK_IMAGE_LAYOUT_STENCIL_ATTACHMENT_OPTIMAL’ was not declared in this scope
     case VK_IMAGE_LAYOUT_STENCIL_ATTACHMENT_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:449:10: note: suggested alternative: ‘VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL’
     case VK_IMAGE_LAYOUT_STENCIL_ATTACHMENT_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:450:10: error: ‘VK_IMAGE_LAYOUT_STENCIL_READ_ONLY_OPTIMAL’ was not declared in this scope
     case VK_IMAGE_LAYOUT_STENCIL_READ_ONLY_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/pi/OpenGothic/lib/MoltenTempest/Engine/gapi/vulkan/vcommandbuffer.cpp:450:10: note: suggested alternative: ‘VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL’
     case VK_IMAGE_LAYOUT_STENCIL_READ_ONLY_OPTIMAL:
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL
make[2]: *** [lib/MoltenTempest/Engine/CMakeFiles/MoltenTempest.dir/build.make:585: lib/MoltenTempest/Engine/CMakeFiles/MoltenTempest.dir/gapi/vulkan/vcommandbuffer.cpp.o] Fehler 1
make[1]: *** [CMakeFiles/Makefile2:1059: lib/MoltenTempest/Engine/CMakeFiles/MoltenTempest.dir/all] Fehler 2
make: *** [Makefile:130: all] Fehler 2

Missing support for DDS RGB formats

Currently, the DDS codec only supports the DXT compression format:

int compressType = squish::kDxt1;
switch(ddsd.ddpfPixelFormat.dwFourCC) {
  case FOURCC_DXT1:
    bpp          = 3;
    compressType = squish::kDxt1;
    frm          = Pixmap::Format::DXT1;
    break;
  case FOURCC_DXT3:
    bpp          = 4;
    compressType = squish::kDxt3;
    frm          = Pixmap::Format::DXT3;
    break;
  case FOURCC_DXT5:
    bpp          = 4;
    compressType = squish::kDxt5;
    frm          = Pixmap::Format::DXT5;
    break;
  default:
    return nullptr;
  }

This becomes an issue when trying to load RGB from DDS files since processing just fails. In Try/OpenGothic#271 this leads to the wrong image being loaded.

A solution specifically for the OpenGothic issue (apart from implementing support for this format) would be to allow the creation of a Pixmap from raw bytes. This would require adding more formats (like R5G6B5 and BGR8) to Pixmap::Format.
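As an illustration of what supporting such a raw format involves, the sketch below expands one R5G6B5 pixel to 8 bits per channel. The `Rgb8` struct and `decodeR5G6B5` function are hypothetical, and the bit layout (red in the high bits) is an assumption that would need to be checked against the DDS pixel-format masks.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 8-bit-per-channel pixel.
struct Rgb8 {
  uint8_t r, g, b;
};

// Expands a 16-bit R5G6B5 pixel (red assumed in the high bits) to 8 bits per channel.
// The top bits are replicated into the low bits so full intensity maps to 255.
inline Rgb8 decodeR5G6B5(uint16_t px) {
  const uint32_t r5 = (px >> 11) & 0x1F;
  const uint32_t g6 = (px >> 5)  & 0x3F;
  const uint32_t b5 =  px        & 0x1F;
  return Rgb8{ uint8_t((r5 << 3) | (r5 >> 2)),
               uint8_t((g6 << 2) | (g6 >> 4)),
               uint8_t((b5 << 3) | (b5 >> 2)) };
}
```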

Linux support (and SDL2)

Linux has at least two video backends that should be supported: X11 and Wayland. You generally only know at runtime which one to use, which means that the Vulkan instance extensions required for WSI can differ depending on the window type (X11/Wayland). The code has to be restructured so that Vulkan instance creation already has access to the window.
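A rough sketch of the runtime selection described above. The `WindowSystem` enum and `instanceExtensionsFor` function are hypothetical; the extension names are the standard Vulkan WSI ones, and the choice of the Xlib surface extension rather than XCB for X11 is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical tag for the window system detected at runtime.
enum class WindowSystem { X11, Wayland };

// Returns the instance extensions needed to create a surface for `ws`.
inline std::vector<std::string> instanceExtensionsFor(WindowSystem ws) {
  std::vector<std::string> ext = {"VK_KHR_surface"};
  switch(ws) {
    case WindowSystem::X11:
      ext.push_back("VK_KHR_xlib_surface");
      break;
    case WindowSystem::Wayland:
      ext.push_back("VK_KHR_wayland_surface");
      break;
  }
  return ext;
}
```

The point of the sketch is the ordering: the window system has to be known before the instance is created, because the extension list is an input to instance creation.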

The other point here is that you really should use SDL2 to replace the various SystemApi implementations.

Relevant APIs:
https://wiki.libsdl.org/CategoryInit
https://wiki.libsdl.org/CategoryVideo
https://wiki.libsdl.org/CategoryEvents
https://wiki.libsdl.org/CategoryVulkan

DX12 test crashes on appveyor

Job: https://ci.appveyor.com/project/Try/tempest/builds/43007972/job/c0ob1eqnofs8vmr5

[----------] 24 tests from DirectX12Api
[ RUN      ] DirectX12Api.DirectX12Api
[       OK ] DirectX12Api.DirectX12Api (39 ms)
[ RUN      ] DirectX12Api.Vbo
Microsoft Basic Render Driver
Microsoft Basic Render Driver
createBuffer 218
alloc 000002DFCEAF3E70 64
alloc 000002DFCEAF4CC0 65
unknown file: error: SEH exception with code 0x87a thrown in the test body.
[  FAILED  ] DirectX12Api.Vbo (124 ms)
[ RUN      ] DirectX12Api.VboInit
createBuffer 218
alloc 000002DFD03481C0 64
alloc 000002DFCEAF4CC0 65

The error is not reproducible locally or over RDP; it could be an issue with the Microsoft Basic Render Driver.

automatic PipelineBarriers investigation

This ticket tracks ideas/solutions for pipeline barrier generation. It is mostly written from the Vulkan perspective, yet DirectX12 also has to be taken into consideration.

Strategy:

All image resources assume a default ready-to-read state.

  • Color -> VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT
  • Depth -> VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL VK_PIPELINE_STAGE_{EARLY|LATE}_FRAGMENT_TESTS_BIT
  • Buffer -> (no real layout, just ready-to-read) VK_PIPELINE_STAGE_ALL_COMMANDS_BIT

The dest stage is ALL_COMMANDS (except for Depth) for now.
Compute:
All storage resources are tracked individually; if the current pipeline has unordered access, a new set of barriers has to be issued.

Code sample:

auto cmd  = device.commandBuffer();
{
  auto enc = cmd.startEncoding(device);
  // Assume 'ready-to-read' state
  enc.dispatch(...);
  // tex1: ALL -> COLOR
  enc.setFramebuffer({{tex1,Vec4(0,0,1,1),Tempest::Preserve}});
  enc.draw(...);
  // tex1: COLOR -> ALL 
  // tex2: ALL -> COLOR
  enc.setFramebuffer({{tex2,Tempest::Discard,Tempest::Preserve}});
  enc.draw(...);
  // tex2: COLOR -> ALL 
}
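The transitions in the comments above can be modelled with a very small state tracker. This is a sketch under simplifying assumptions (one color attachment at a time, resources identified by an integer id); `BarrierTracker`, `Barrier` and `State` are hypothetical names, not the engine's implementation.

```cpp
#include <cassert>
#include <vector>

enum class State { ReadyToRead, ColorAttachment };

// One layout transition for the resource with the given id.
struct Barrier {
  int   resource;
  State from, to;
};

class BarrierTracker {
  public:
    // Returns the barriers needed to make `attachment` the current render target.
    // The previous target, if any, is returned to the ready-to-read state first.
    std::vector<Barrier> setFramebuffer(int attachment) {
      std::vector<Barrier> out;
      if(current == attachment)
        return out;
      if(current >= 0)
        out.push_back({current, State::ColorAttachment, State::ReadyToRead});
      out.push_back({attachment, State::ReadyToRead, State::ColorAttachment});
      current = attachment;
      return out;
    }

    // Returns the barrier that restores the last render target at the end of encoding.
    std::vector<Barrier> endEncoding() {
      std::vector<Barrier> out;
      if(current >= 0)
        out.push_back({current, State::ColorAttachment, State::ReadyToRead});
      current = -1;
      return out;
    }

  private:
    int current = -1; // id of the active color attachment, -1 if none
};
```

Replaying the sample: setting tex1 yields one barrier, switching to tex2 yields two (tex1 back to ready-to-read, tex2 to color), and ending the encoding yields the final one for tex2.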

Points of interest

Optimize:

  • Depth -> Frag case
  • Color -> Frag case
  • Assure correctness in general
  • Describe limitations, when it comes to rendering vs compute cases

Problems:

  1. Readable depth doesn't "fit" into this paradigm.
  2. COLOR -> ALL pipeline bubble, for shadow-maps (and most of other rendering scenarios)
  3. [DX12] UAV barriers are not allowed inside a renderpass.

Api-limitations

7.9. Host Write Ordering Guarantees

When batches of command buffers are submitted to a queue via a queue submission command, it defines a memory dependency with prior host operations, and execution of command buffers submitted to the queue.

This makes things easier on the resource upload/uniform buffer side, yet a command buffer must still account for any commands submitted before it.

7.6. Pipeline Barriers

If vkCmdPipelineBarrier2KHR is recorded within a render pass instance, the synchronization scopes are limited to operations within the same subpass.

This may cause trouble if barriers are delayed.

7.6.1. Subpass Self-dependency

vkCmdPipelineBarrier or vkCmdPipelineBarrier2KHR must not be called within a render pass instance started with vkCmdBeginRenderingKHR.

Since VK_KHR_dynamic_rendering is the go-to extension, barriers must not be issued inside a renderpass.
This limitation basically blocks any split-barrier or partial-barrier approaches.

Uniform data/buffers

The Vulkan API on this side is quite messy; a nice engine-level API needs to be designed.

Use-cases:

  1. Per-frame immediate data (view/proj matrix, main light)
  2. Per-draw immediate data (obj matrix)
  3. Constant data (vbo/ibo, animation skeleton to some extent)

Case 3 is native - DescriptorSet::set
Case 2 almost works as a push constant (the 128-byte limit is an issue)
Case 1 can work as a push constant, except the data doesn't fit there.

-dx12 seems to be broken

Starting a new game or loading a save with -dx12 causes loading to take about three minutes. Then it force closes with:

---crashlog(ExceptionFilter)---
GPU: NVIDIA GeForce RTX 2080
0x00007ff77f01c857: [unknown function] in [unknown module]
0x00007ff77f01c95b: [unknown function] in [unknown module]
0x00007ff77f01cb38: [unknown function] in [unknown module]
0x00007fff83db5dcc: UnhandledExceptionFilter in C:\Windows\System32\KERNELBASE.dll
0x00007fff8677859d: RtlMoveMemory in C:\Windows\SYSTEM32\ntdll.dll
0x00007fff8675f047: _C_specific_handler in C:\Windows\SYSTEM32\ntdll.dll
0x00007fff86773e1f: _chkstk in C:\Windows\SYSTEM32\ntdll.dll
0x00007fff866eeae6: RtlFindCharInUnicodeString in C:\Windows\SYSTEM32\ntdll.dll
0x00007fff86724af5: RtlRaiseException in C:\Windows\SYSTEM32\ntdll.dll
0x00007fff83cefb1c: RaiseException in C:\Windows\System32\KERNELBASE.dll
0x00007fff78106720: CxxThrowException in C:\Windows\SYSTEM32\VCRUNTIME140.dll
0x00007fff1b02a998: stbtt_MakeGlyphBitmapSubpixel in C:\Users\Daniel\Desktop\opengothic_win\Tempest.dll
0x00007fff1b03710d: stbtt_MakeGlyphBitmapSubpixel in C:\Users\Daniel\Desktop\opengothic_win\Tempest.dll
0x00007fff1b0378cf: stbtt_MakeGlyphBitmapSubpixel in C:\Users\Daniel\Desktop\opengothic_win\Tempest.dll
0x00007fff1b030fda: stbtt_MakeGlyphBitmapSubpixel in C:\Users\Daniel\Desktop\opengothic_win\Tempest.dll
0x00007fff1b0b749a: stbtt_MakeGlyphBitmapSubpixel in C:\Users\Daniel\Desktop\opengothic_win\Tempest.dll
0x00007fff1b0b752d: stbtt_MakeGlyphBitmapSubpixel in C:\Users\Daniel\Desktop\opengothic_win\Tempest.dll
0x00007ff77ef7d1bb: [unknown function] in [unknown module]
0x00007ff77ef7c360: [unknown function] in [unknown module]
0x00007ff77ef7c5ae: [unknown function] in [unknown module]
0x00007ff77efa5368: [unknown function] in [unknown module]
0x00007fff1b0cb46f: stbtt_MakeGlyphBitmapSubpixel in C:\Users\Daniel\Desktop\opengothic_win\Tempest.dll
0x00007fff1b0cb28b: stbtt_MakeGlyphBitmapSubpixel in C:\Users\Daniel\Desktop\opengothic_win\Tempest.dll
0x00007ff77ef994e5: [unknown function] in [unknown module]
0x00007ff77f36310c: [unknown function] in [unknown module]
0x00007fff84ef244d: BaseThreadInitThunk in C:\Windows\System32\KERNEL32.DLL
0x00007fff8672df78: RtlUserThreadStart in C:\Windows\SYSTEM32\ntdll.dll

P.S. Are the WIP Windows builds MSVC2022 Debug or Release builds? These logs look a bit spartan to me.

Mesh shader emulation over draw-indirect

Based on #33

The initial implementation is practically working; this ticket tracks technical debt and profiling work.

TODO:

  • Lines/Points
  • test flat and other interpolators
  • Fix in uvec3 gl_WorkGroupID - pollution
  • Fix in uvec3 gl_NumWorkGroups - polluted due to dispatch indirect
  • Fix in uvec3 gl_GlobalInvocationID // polluted, since it is byproduct of gl_WorkGroupID
  • Control/sanitize out-of-memory case
  • perprimitiveEXT - no immediate need

ERR (won't fix):

  • Draw order is lost inside a single draw-call (not an issue for 3D)

Binding of zero-sized SSBO

In complex workloads it gets annoying to sanitize zero-sized buffers (and avoid draws/dispatches).
Meanwhile, on the shader side (in terms of algorithms), in most cases it is perfectly fine to have .length()==0.
