gpuopen-drivers / xgl
Vulkan API Layer
License: MIT License
When using a timeline semaphore that will be signaled from the same thread after submit, this line in vk_queue.cpp (and potentially others, but that's where I found it while debugging) can receive a PAL timeout, causing the function to return early with VK_TIMEOUT, which is not a valid return code for vkQueueSubmit/vkQueueSubmit2. The timeout also causes other spec-compliance issues, but they all appear to be consequences of the underlying timeout.
Reproducing requires a signal operation in the same submit, but the issue only presents itself when the wait semaphore is not yet signaled. The presence or absence of a command buffer in the submit does not appear to affect the result.
Reproduced using a device group of two W5700 GPUs on Ubuntu 22.04 through vkQueueSubmit2, with a timeline semaphore wait executed on device index 1 for a semaphore that will be signaled after submit. The same operation on device index 0 completes as expected after signaling on the host. In a single-device device group, both physical devices behave correctly.
If needed, I can provide code that reproduces the issue on my machine.
When I run the shaderdb test Ext16bitStorage_TestTesOutput.tese, it hits an infinite loop in the DAG combiner during backend code generation.
dEQP-VK.pipeline.creation_feedback.graphics_tests.vertex_stage_fragment_stage_delayed_destroy
This seems to be related to pull request #68, but I'm not quite sure.
There has been a lot of work in LLPC over the past 9 months to implement relocatable shaders. These were intended to provide a way to compile shaders "offline", that is, without running a Vulkan application. However, for that to be useful, there must be a way for a Vulkan application to make use of the precompiled shaders.
To this end, we want to write a tool that will take the precompiled shaders and build a file whose contents can be passed as the initial data to vkCreatePipelineCache. This tool should live in the XGL repository because XGL controls the format of the cache when it is serialized by vkGetPipelineCacheData.
This will be a standalone tool that will have its own subdirectory in the tools directory.
Use amdllpc to compile shaders and get an ELF file. The tool will then be invoked as:

xgl_cache_creator [options] <input elf files>

Options:
-o <filename>          - The filename to output the cache data to. Required.
-device_id=<device id> - The device id of the device this cache will be used on.
                         The device id can be found at
                         https://devicehunt.com/view/type/pci/vendor/1002.
                         If this option is not provided, the device id will be
                         queried from the runtime.
-uuid=<uuid>           - The uuid for the specific driver and machine.
                         <How can the uuid be found?>
                         If this option is not provided, the uuid will be
                         queried from the runtime.
The output file will contain data in the format returned by vkGetPipelineCacheData, with each entry identified by an llpc_cache_hash containing the 128-bit hash for the ELF file being generated.

To make this possible, some XGL code should be refactored so the tool can reuse it:

- Update PhysicalDevice::InitializePlatformKey and PhysicalDevice::CalculateHashId to replace the pInstance parameter with the allocation and deallocation functions, so that an Instance is not needed.
- Move the vkGetPipelineCacheData code that writes the header into a function (WriteVkCacheHeader?) the cache creator tool can call.

The tool will then use InitializePlatformKey, CalculateHashId, and WriteVkCacheHeader. It will use ElfReader to read the ELF and extract the PAL metadata, and MsgPackReader to read the PAL metadata to get the hash.

This is as of the latest build on GitHub. Fedora 28.
I think the implementation of DefaultReallocFunc (Line 75 in 63761ec) reads past the end of the original allocation when the new size is more than the size of the original allocation. This is because it copies up to the size of the destination buffer, which will typically be more than the original buffer. It also does not behave like the free function when size is 0.
Vulkan specs: https://www.khronos.org/registry/vulkan/specs/1.2-extensions/man/html/PFN_vkReallocationFunction.html
The deviceMask field of VkCommandBufferSubmitInfo isn't read by Queue::Submit; only the device mask from VkDeviceGroupSubmitInfo is read, and only for vkQueueSubmit. The submit instead behaves as if every device mask were 1, i.e., only the primary/first GPU. This includes a device mask of zero, which for vkQueueSubmit2 means execute on all devices, unlike a missing VkDeviceGroupSubmitInfo, which means primary only.
From my reading of the spec, using gl_SampleID or gl_SamplePosition should force per-sample execution; however, the spec doesn't seem to mention the gl_SampleMask input causing per-sample execution.
AMDVLK sets m_pResUsage->builtInUsage.fs.runAtSampleRate for gl_SampleMask.
When running my application with a debug build of AMDVLK, I get an assertion here:
Line 7911 in 0387177
vkCmdBeginRenderPass is being called with a normal non-imageless VkFramebuffer, and the VkRenderPassBeginInfo chain contains a VkRenderPassAttachmentBeginInfo structure with its attachmentCount == 0.
This is explicitly permitted by the spec for non-imageless framebuffers here:
If framebuffer was created with a VkFramebufferCreateInfo::flags value that did not include VK_FRAMEBUFFER_CREATE_IMAGELESS_BIT, and the pNext chain includes a VkRenderPassAttachmentBeginInfo structure, its attachmentCount must be zero.
It looks like the following call to SetImageViews(pRenderPassAttachmentBeginInfo) doesn't actually do anything when attachmentCount == 0, so this is probably just an incorrect assertion.
I would like to point out that identifiers like "__COLOR_SPACE_HELPER_H__" and "__VK_CMDBUFFER_H__" do not fit the naming rules of the C++ language standard: identifiers containing a double underscore are reserved for the implementation. Would you like to adjust your selection of unique names?
From the spec:
Precision and Operation of SPIR-V Instructions
The following rules apply to both single and double-precision floating point instructions:
Positive and negative infinities and positive and negative zeros are generated as dictated by IEEE 754, but subject to the precisions allowed in the following table.
We've faced the same issue in radv, and the spec seems quite clear on it.
The Undefined Behavior sanitizer reports an uninitialized read in a CacheInfo unit test:
Script:
--
/vulkandriver/builds/ci-build/tools/test/cache-creator/unittests/./CacheCreatorUnitTests --gtest_filter=CacheInfoTest.ValidBlobOneEntry
--
Note: Google Test filter = CacheInfoTest.ValidBlobOneEntry
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from CacheInfoTest
[ RUN ] CacheInfoTest.ValidBlobOneEntry
/vulkandriver/drivers/xgl/tools/cache_creator/unittests/cache_info_tests.cpp:160:38: runtime error: constructor call on misaligned address 0x608000000264 for type 'vk::BinaryCacheEntry', which requires 8 byte alignment
0x608000000264: note: pointer points here
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
^
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /vulkandriver/drivers/xgl/tools/cache_creator/unittests/cache_info_tests.cpp:160:38 in
For the current code to work, cache blob entry sizes need to be multiples of alignof(vk::BinaryCacheEntry). We should either make sure that this is the case or copy raw bytes to aligned memory before reading.
/home/constantine/AMDVLK/pal/inc/util/palAssert.h:96:20: error: ISO C++ forbids comparison between pointer and integer [-fpermissive]
if ((_expr) == false) \
^
/home/constantine/AMDVLK/pal/inc/util/palAssert.h:104:27: note: in expansion of macro ‘PAL_ASSERT_MSG’
#define PAL_ASSERT(_expr) PAL_ASSERT_MSG(_expr, "%s", "Unknown")
^~~~~~~~~~~~~~
/home/constantine/AMDVLK/xgl/icd/api/include/vk_utils.h:54:19: note: in expansion of macro ‘PAL_ASSERT’
#define VK_ASSERT PAL_ASSERT
^~~~~~~~~~
/home/constantine/AMDVLK/xgl/icd/api/include/vk_physical_device.h:138:9: note: in expansion of macro ‘VK_ASSERT’
VK_ASSERT(pVkIndex);
^~~~~~~~~
make[2]: *** [icd/CMakeFiles/xgl.dir/api/app_profile.cpp.o] Error 1
make[1]: *** [icd/CMakeFiles/xgl.dir/all] Error 2
make: *** [all] Error 2
constantine@linux:~/AMDVLK$ gcc --version
gcc (Ubuntu 7.2.0-18ubuntu2) 7.2.0
ERROR: Fails to verify module (llpc-pass-external-lib-link): Invalid bitcast
%1 = bitcast i32 %texel to <4 x float>
Shader generated from HLSL through DXC: test.spvas.txt
The image format is R32ui, but all the library functions for image writes seem to expect four component vectors (from g_glslImageOpEmu.ll):
define void @llpc.image.write.u32.Buffer(i32 %resourceDescSet, i32 %resourceBinding, i32 %resourceIdx ,i32 %coord ,<4 x i32> %texel ,i1 %glc ,i1 %slc ,i32 %imageCallMeta) #1
{
%resource = call <4 x i32> @llpc.descriptor.load.texelbuffer(i32 %resourceDescSet, i32 %resourceBinding, i32 %resourceIdx)
%1 = bitcast <4 x i32> %texel to <4 x float>
call void @llvm.amdgcn.buffer.store.format.v4f32(<4 x float> %1, <4 x i32> %resource, i32 %coord, i32 0, i1 %glc, i1 %slc)
ret void
}
That image write is invoked in the shader as:
call void @llpc.image.write.u32.Buffer(i32 4, i32 69, i32 0, i32 %6, i32 64, i1 false, i1 false, i32 326) #0
The texel operand has a mismatched type.
In addition to making cache creator usable for our internal partners/developers, we want to reach an agreement on how to make the open source XGL cache creator implementation usable. Open-source users/developers should be able to compile shaders offline and bundle them as a valid pipeline binary cache file on a machine without the target GPU, and then be able to load it on a machine with a compatible AMDVLK/AMDGPU installation.
We propose changing how the hashes included in the pipeline cache blobs are calculated, and adding a new tool, cache-info, to enable this cache creation flow.
Our prototype cache-creator implementation is available on a public XGL fork. The intended offline relocatable shader cache creation flow looks as follows:
1. Get the pipeline shaders as .spv files.
2. Compile the .spv files offline with amdllpc into relocatable shader elfs:
   amdllpc -gfxip=9.0.0 -unlinked -enable-relocatable-shader-elf shader.spv -o ./elfs/shader.elf
3. Bundle the relocatable ELFs into a pipeline cache file with the cache-creator tool:
   tools/cache-creator --device-id=0x6860 --uuid=6d4570f9-78b4-ef2f-fb61-f46919af88b4 -o cache.bin ./elfs/*.elf
4. In the application, pass the cache file contents as the initial data to vkCreatePipelineCache.

Before loading the cache contents, the ICD performs a series of checks to make sure that the cache blob: (1) is not corrupted, (2) matches the host driver/GPU. Both are performed based on the cache headers:
(1) vk::PipelineCacheHeaderData::headerLength and vk::PipelineBinaryCachePrivateHeader::hashId.
(2) vk::PipelineCacheHeaderData::{headerVersion,vendorID,deviceID,UUID} and vk::PipelineBinaryCachePrivateHeader::hashId.
With AMDVLK, headerVersion and vendorID are known constants, while deviceID is known on a per-GPU-model basis. UUID and hashId are calculated based on the runtime host system configuration and include bits derived from build timestamps. Some unnecessary runtime system information leaking into hashId makes it difficult to build caches offline on a non-target system, which affects both the usability and the testability of the cache-creator tool.
We want to be able to run the cache-creator tool on machines without the target GPU. This will also allow us to set up testing infrastructure on any VM without AMD GPU. As in the original cache-creator tool proposal, we want to be able to specify all the necessary hardware/system information through command line arguments.
The device ID (--device-id) and pipeline cache UUID (--uuid) will be extracted from a valid pipeline cache file created on the target machine.

A sample cache-info tool invocation looks like this:
$ tools/cache-info cache.bin --elf-source-dir=./elfs
Read: cache.bin, 407588 B
=== Vulkan Pipeline Cache Header ===
header length: 32
header version: 1
vendor ID: 0x1002
device ID: 0x6860
pipeline cache UUID: e333659f-a1b8-bafe-f19b-04dc792f9e99
=== Pipeline Binary Cache Private Header ===
header length: 20
hash ID: f1a233ff dd873311 0004432 ffff0ddf 2a303455
=== Cache Blob Info ===
content size: 407536
*** Entry 0 ***
hash ID: 4198f242 044ad73b 02283345 4845466c
data size: 2144
MD5 sum: 940c7b25c14d997209f7726611c73c1c
source elf: ./elfs/attachmentread.frag.elf
*** Entry 1 ***
hash ID: 67af7c86 06930f37 7bcf4d76 57426594
data size: 1552
MD5 sum: 1c4d73ed2b8ee5535e876940cd710fb6
source elf: ./elfs/attachmentread.vert.elf
Note that the second argument, --elf-source-dir, is optional and only used for testing/debugging.
vk::PipelineBinaryCachePrivateHeader::hashId is calculated with a hash algorithm and initial data provided by Util::IPlatformKey, which includes the full VkPhysicalDeviceProperties struct (including the pipeline cache UUID). We cannot read this data off a pipeline cache file, because it blends in both the system information and the pipeline cache data. Some possible solutions are:

1. Exclude the unnecessary runtime system information from hashId.
2. Either (a) calculate the platform key from only the fields that actually affect cache compatibility, or (b) serialize the IPlatformKey and save it in a new field in PipelineBinaryCachePrivateHeader. This way we can read it off existing pipeline binary cache files and pass it as a new argument to the cache-creator tool.

The changes in 2. could be conditionally enabled only in the relocatable compilation mode if you don't want to change the behavior in regular online full-pipeline compilation. We would prefer the first option (2a) and would like to understand which physical properties and system info fields actually affect cache compatibility and use only those.
If 2a sounds like a good way forward, we could further simplify the cache creation flow by extending the vulkaninfo tool to print pipeline cache UUID, so that we can take IDs from its output instead of having to create a cache file with some Vulkan application and read those values with cache-info.
We propose to use a unit testing framework in XGL, e.g., Doctest or Google Test. You can find the prototype unit test implementation with Doctest here. The main difference between the two frameworks is that Doctest consists of a single header file, which makes it a very small dependency easy to build with any build system. Concretely, unit tests will be a new cmake executable target and will execute on any build machine, i.e., a GPU won't be required.
Unit tests will only be built when the rest of cache creator is enabled with -DXGL_ENABLE_CACHE_CREATOR=ON, but we could further hide them behind an additional cmake option. We will run these unit tests in the public LLPC CI and could also easily set up a similar CI for XGL based on GitHub Actions.
We further propose to add offline end-to-end tests based on the LIT testing infrastructure. LIT is used in LLPC to run the shaderdb tests. The testing flow will look as follows:
Note that this doesn't require a GPU, so these tests can be placed either in XGL or LLPC's shaderdb.
We don't have a full proposal for full GPU tests that would check if we are getting expected cache hits. These would have to run under Jenkins, as public VMs don't come with AMD GPUs.
One important change that we propose is to add an option to disable the ArchiveLayer in PipelineBinaryCache, so that we don't get unintended cache hits based on system-level persistent cache files.
We propose to make cache-creator usable and testable by using a flow where cache header IDs can be read from valid pipeline cache files using a new tool, cache-info, and then provided to the cache-creator tool. To make this possible, we need to modify how the cache header field hashId is handled.
We propose to introduce two levels of public tests: unit tests in XGL and LIT tests in XGL or LLPC, which won't require a GPU to run. To make it easier to test cache hits with Vulkan applications, we propose to add an option to disable the ArchiveLayer in PipelineBinaryCache.
The prototype implementation that enables the proposed offline flows and unit tests is available at https://github.com/kuhar/xgl/tree/cct/tools/cache_creator.
xgl/icd/api/vk_render_pass.cpp
Lines 578 to 579 in 6288905
This can crash, as it may dereference a NULL pointer:
pDepthStencilResolveAttachment is NULL or a pointer to a VkAttachmentReference2 structure defining the depth/stencil resolve attachment for this subpass and its layout.
If pDepthStencilResolveAttachment is NULL, or if its attachment index is VK_ATTACHMENT_UNUSED, it indicates that no depth/stencil resolve attachment will be used in the subpass.
On ANGLE, we were relying on drivers accepting that pDepthStencilResolveAttachment can be NULL. The crash was noticed on Windows with Adrenalin version 23.5.2, but I suspect the open source amdvlk driver shares some of the same code.
When compiling a shader using relocatable shaders in LLPC, the shader compiler does not have access to the value of the immutable sampler. In fact, the compiler does not even know that a sampler will be an immutable sampler. This means that if a pipeline has an immutable sampler, LLPC has to fall back to whole-pipeline compilation.
I would like to fix this by having XGL write the value of the immutable sampler to its descriptor set, so it will look like a regular sampler to the shader. Let me know if there are any objections to this. I can work on the implementation if we decide to move forward with it.
Writing out the value can be conditional on the EnableRelocatableShaders setting if you want.
When running DOOM or Wolfenstein II under Wine, I'm getting a black screen and the game freezes there. I haven't been able to run those two games for at least a month now. It works fine using RADV.
Using Ryzen 1700X / R9 Fury.
Hi, we've encountered that vkCreateGraphicsPipelines/vkCreateComputePipelines return the number of CPU ticks in VkPipelineCreationFeedbackEXT::duration instead of nanoseconds on Windows. By specification, VkPipelineCreationFeedbackEXT::duration is the duration spent creating a pipeline or pipeline stage, in nanoseconds.
Hi, I hope this is the right repository,
I'm testing out a pattern where, instead of using RESET_COMMAND_BUFFER_BIT and freeing individual command buffers, I reset the entire pool. I currently record one command buffer with one vkCmdFillBuffer, one barrier, then bind things and do 30 vkCmdDispatches with it. On the next frame, I synchronize to make sure the execution of that command buffer finishes, then reset the entire pool. Without the transient flag, the reset takes around 13 ms; with the transient flag enabled, it still takes 4 ms. So I ended up destroying the pool instead and recreating a brand new one to replace it, which is pretty much free by comparison; it doesn't even register when profiling.
Is that a known performance cliff?
The Vulkan spec states:
supportedStages is a bitfield of VkShaderStageFlagBits describing the shader stages that group
operations with subgroup scope are supported in. supportedStages will have the
VK_SHADER_STAGE_COMPUTE_BIT bit set if any of the physical device’s
queues support VK_QUEUE_COMPUTE_BIT.
This is exposed for a wide variety of unsupported ray-tracing related shader stages
VK_SHADER_STAGE_RAYGEN_BIT_KHR
VK_SHADER_STAGE_CLOSEST_HIT_BIT_KHR
VK_SHADER_STAGE_MISS_BIT_KHR
VK_SHADER_STAGE_INTERSECTION_BIT_KHR
VK_SHADER_STAGE_CALLABLE_BIT_KHR
despite these shader stages not being supported at all.
xgl/icd/api/vk_physical_device.cpp
Line 4168 in e5c5ad4
Therefore, these shader stages do not support these subgroup operations and they should not be exposed here.
This extension will be required by https://github.com/doitsujin/dxvk to efficiently support instancing step rates. It's already exposed by latest RADV and nvidia drivers.
According to the spec, "...If pCounterBufferOffsets is NULL, then it is assumed the offsets are zero." So NULL pCounterBufferOffsets should be legal.
However, in both the BeginTransformFeedback() and EndTransformFeedback() functions, pCounterBufferOffsets is accessed without checking for NULL, which causes a segfault when the application passes NULL to either function.
The Vulkan CTS has a number of OOM tests that caused double-free errors in InternalMemMgr: CreateMemoryPoolAndSubAllocate accesses uninitialized pointers and tries to free them.
The following snippet of code gets executed when the pool creation fails; here pInternalMemory would be uninitialized on failure.
else
{
    auto it = pOwnerList->Begin();

    bool needEraseFromOwnerList = pOwnerList->NumElements() > 0 ?
        (it.Get()->groupMemory.PalMemory(DefaultDeviceIndex) ==
         pInternalMemory->groupMemory.PalMemory(DefaultDeviceIndex)) : false;

    // Unmap any persistently mapped memory
    pInternalMemory->groupMemory.Unmap();

    // Destroy the buddy allocator
    if (pInternalMemory->pBuddyAllocator != nullptr)
    {
        PAL_DELETE(pInternalMemory->pBuddyAllocator, m_pSysMemAllocator);
    }

    // Release this pool's base allocation
    FreeBaseGpuMem(pInternalMemory);

    // Remove this memory pool from the list if we added it
    if (needEraseFromOwnerList)
    {
        pOwnerList->Erase(&it);
    }
}
When running ManiaPlanet under Wine with DXVK and amdvlk 9e6992f, I'm getting some pipeline compilation failures.
SPIR-V dumps are available here: http://dl.free.fr/bWMhdEP08
LLPC_CLIENT_INTERFACE_MAJOR_VERSION helps us avoid build failures when we use an old XGL with a new LLPC.
In my opinion, a document briefly explaining the rule would be helpful for understanding the context.
In addition, I am still confused about how to quickly check the next LLPC_CLIENT_INTERFACE_MAJOR_VERSION number. If there were a specific file somewhere containing the next number, that would be helpful.
Please let me know if I missed an already existing file for it.
Hi, dear AMD Vulkan developers:
I noticed that in a recent release, DeviceMaintenance3Properties.maxMemoryAllocationSize changed from 4 GB to 2 GB. What is the rationale behind this?
In most cases, Vulkan developers expect this value to be 4 GB (before this change, both NVIDIA and AMD actually reported that), but now that it has changed to 2 GB, it will cause some inconvenience and degrade apps that need large allocations.
Thanks for any hint about this change.
The VK_EXT_graphics_pipeline_library extension has been available on NVIDIA GPUs for a while, and also in AMD's RADV driver, but not in the AMD Windows driver yet. It's a very important extension, as it completely fixes the stutters while loading shaders in some games, and it is also very important for using DXVK on Windows.
Please let us know if any updates toward supporting this extension are going to happen soon.
At vk_cmdbuffer.cpp:5382, I believe the memset() should go inside the preceding conditional.
The Vulkan spec says that pSizes = NULL is a valid argument (to indicate the whole buffer), but if pSizes is NULL we'll segfault at line 5401.
I think the for loop at line 5392 is wrong: it should be for (i = 0; i < bindingCount; i++). Then, inside the loop, the m_pTransformFeedbackState->params.target[i] array should be indexed with firstBinding + i instead, and the 1 << i expression should then be 1 << (firstBinding + i).
I'm getting the following build error with 03a38de and gcc 7.3.1 20180303 (Red Hat 7.3.1-5)
[1579/1909] Building CXX object llpc/CMakeFiles/llpc.dir/context/llpcContext.cpp.o
FAILED: llpc/CMakeFiles/llpc.dir/context/llpcContext.cpp.o
/usr/bin/c++ -DICD_VULKAN_1_1 -DLITTLEENDIAN_CPU -D_SPIRV_LLVM_API -I../icd/api/llpc/include -I../icd/api/llpc/context -I../icd/api/llpc/imported/metrohash/src -I../icd/api/llpc/lower -I../icd/api/llpc/patch -I../icd/api/llpc/patch/gfx6/chip -I../icd/api/llpc/patch/gfx9/chip -I../icd/api/llpc/patch/gfx9 -I../icd/api/llpc/patch/generate -I../icd/api/llpc/translator -I../icd/api/llpc/translator/libSPIRV -I../icd/api/llpc/translator/Mangler -I../icd/api/llpc/util -I../../pal/src/core/hw/gfxip/gfx6/chip -I../../pal/src/core/hw/gfxip/gfx9/chip -I../../pal/inc/core -I../../pal/inc/util -I/builddir/build/BUILD/amdvlk-vulkan-driver-2.19/llvm/include -Illvm/include -I../icd/api/llpc/../include/khronos -O2 -g -DNDEBUG -Wno-unused-parameter -Wno-type-limits -Wno-switch -Wno-parentheses -Wno-maybe-uninitialized -Wno-delete-non-virtual-dtor -Wno-sign-compare -Wno-delete-incomplete -Wno-unused -Wno-ignored-qualifiers -Wno-missing-field-initializers -Wno-invalid-offsetof -fno-strict-aliasing -std=c++0x -fno-rtti -fPIC -MD -MT llpc/CMakeFiles/llpc.dir/context/llpcContext.cpp.o -MF llpc/CMakeFiles/llpc.dir/context/llpcContext.cpp.o.d -o llpc/CMakeFiles/llpc.dir/context/llpcContext.cpp.o -c ../icd/api/llpc/context/llpcContext.cpp
../icd/api/llpc/context/llpcContext.cpp: In constructor 'Llpc::Context::Context(Llpc::GfxIpVersion)':
../icd/api/llpc/context/llpcContext.cpp:159:63: error: no matching function for call to 'CloneModule(llvm::Module&)'
m_pNativeGlslEmuLib = CloneModule(*m_pGlslEmuLib.get());
^
In file included from ../icd/api/llpc/context/llpcContext.cpp:47:0:
/builddir/build/BUILD/amdvlk-vulkan-driver-2.19/llvm/include/llvm/Transforms/Utils/Cloning.h:52:25: note: candidate: std::unique_ptr<llvm::Module> llvm::CloneModule(const llvm::Module*)
std::unique_ptr<Module> CloneModule(const Module *M);
^~~~~~~~~~~
/builddir/build/BUILD/amdvlk-vulkan-driver-2.19/llvm/include/llvm/Transforms/Utils/Cloning.h:52:25: note: no known conversion for argument 1 from 'llvm::Module' to 'const llvm::Module*'
/builddir/build/BUILD/amdvlk-vulkan-driver-2.19/llvm/include/llvm/Transforms/Utils/Cloning.h:53:25: note: candidate: std::unique_ptr<llvm::Module> llvm::CloneModule(const llvm::Module*, llvm::ValueToValueMapTy&)
std::unique_ptr<Module> CloneModule(const Module *M, ValueToValueMapTy &VMap);
^~~~~~~~~~~
/builddir/build/BUILD/amdvlk-vulkan-driver-2.19/llvm/include/llvm/Transforms/Utils/Cloning.h:53:25: note: candidate expects 2 arguments, 1 provided
/builddir/build/BUILD/amdvlk-vulkan-driver-2.19/llvm/include/llvm/Transforms/Utils/Cloning.h:60:1: note: candidate: std::unique_ptr<llvm::Module> llvm::CloneModule(const llvm::Module*, llvm::ValueToValueMapTy&, llvm::function_ref<bool(const llvm::GlobalValue*)>)
CloneModule(const Module *M, ValueToValueMapTy &VMap,
^~~~~~~~~~~
/builddir/build/BUILD/amdvlk-vulkan-driver-2.19/llvm/include/llvm/Transforms/Utils/Cloning.h:60:1: note: candidate expects 3 arguments, 1 provided
Sampling from the same image with both an ordinary texture sampler and a depth-compare sampler causes the shader compiler to fail with the following error:
LLVM ERROR: Error: Attempt to redefine function:
; Function Attrs: nounwind
declare spir_func %spirv.SampledImage.float_1_1_0_0_1_0_0 addrspace(1)* @_Z12SampledImagePU3AS140__spirv_SampledImage_float_1_0_0_0_1_0_0i(%spirv.SampledImage.float_1_0_0_0_1_0_0 addrspace(1)*, i32) #0
=> %spirv.SampledImage.float_1_0_0_0_1_0_0 addrspace(1)* (%spirv.SampledImage.float_1_0_0_0_1_0_0 addrspace(1)*, i32)
An example shader which triggers the issue can be found here (both disassembled SPIR-V and GLSL):
https://gist.github.com/doitsujin/bfc03c0a432c441426f10f0493c1d4f7
Changing the SPIR-V shader so that the same OpTypeSampledImage is used for both operations fixes the error, although it obviously leads to incorrect results.
AMDVLK dropped support for pre-Polaris at some point.
The switchable graphics layer shouldn't filter out these cards because they're still supported in RADV.
Right now, if a user installs AMDVLK by accident, or upgrades AMDVLK and finds their card no longer supported, they can no longer use Vulkan on their system.
https://gitlab.freedesktop.org/mesa/mesa/-/issues/5196#note_1023202
When querying for surface capabilities via vkGetPhysicalDeviceSurfaceCapabilitiesKHR, the supportedCompositeAlpha field always reports opaque-only, even if a 32-bit visual is used. Alpha surface support would be useful for Vulkan-rendered client-side window decorations and splash screens.
When an X surface is in use, the window surface alpha-composites despite the swapchain being created opaque.
When a Wayland surface is in use, the window appears opaque when the swapchain is created that way, and PAL implements VK_COMPOSITE_ALPHA_PRE_MULTIPLIED_BIT_KHR for Wayland surfaces. Unfortunately, this upsets the Vulkan validation layers, because they reconcile the swapchain create flags with the results from vkGetPhysicalDeviceSurfaceCapabilitiesKHR.
cmake version 3.10.2 on archlinux.
cmake -H. -Bbuilds/Release64 -DCMAKE_BUILD_TYPE=Release -DCMAKE_MODULE_PATH=$PWD/../pal/cmake/Modules -DXGL_PAL_PATH:PATH=$PWD/../pal \
-DCMAKE_C_FLAGS="-DLINUX -D__x86_64__ -D__AMD64__" -DCMAKE_CXX_FLAGS="-DLINUX -D__x86_64__ -D__AMD64__" -DXGL_LLVM_SRC_PATH=$PWD/../llvm-amdvlk
I'm not sure what's happening there but this seems excessive.
When compiling a shader where the OpEntryPoint declares one string (matching the VkPipelineShaderStageCreateInfo name), but the entry point has an OpName with a different string, the pipeline compile crashes.
The code that identifies the SPIR-V function that is the entry point we want is here in SPIRVToLLVM::translate:
// Find the targeted entry-point in this translation
for (unsigned I = 0, E = BM->getNumFunctions(); I != E; ++I) {
auto BF = BM->getFunction(I);
ExecutionModel ExecModel = ExecutionModelMax;
bool IsEntry = BM->isEntryPoint(BF->getId(), &ExecModel);
if (IsEntry && ExecModel == EntryExecModel && BF->getName() == EntryName) {
EntryTarget = BF;
break;
}
}
However, getName() returns the OpName; the code should be checking only against the declared entry point name. Even if there isn't a mismatch, there could be an entry point declared earlier with a different entry point name but with the OpName we are searching for; an OpName can be anything, including a duplicate.
Note that if there is no OpName at all for the entry point ID, then everything works for me.
Here is a minimal repro compute shader in SPIR-V that will crash when passed to vkCreateComputePipelines with appropriate boilerplate pointing to entry point "rdc":
OpCapability Shader
OpMemoryModel Logical GLSL450
OpEntryPoint GLCompute %not_rdc "rdc"
OpExecutionMode %not_rdc LocalSize 128 1 1
OpName %main "main"
OpName %not_rdc "not_rdc"
%void = OpTypeVoid
%4 = OpTypeFunction %void
%main = OpFunction %void None %4
%5 = OpLabel
OpReturn
OpFunctionEnd
%not_rdc = OpFunction %void None %4
%6 = OpLabel
%7 = OpFunctionCall %void %main
OpReturn
OpFunctionEnd
If you remove the OpName %not_rdc "not_rdc" it will work OK.
FYI I ran into this on amdvlk as built from latest of today (GPUOpen-Drivers/pal@91e30f1, 3e2d125, GPUOpen-Drivers/llvm@eb5eb1c).
In some situations, the settings folder that XGL reads its settings files from (/etc/amd
) is not ideal. For instance, during development or testing when those settings should be changed, modifying files in /etc/amd
would require root access. The folder used to read the settings files from should instead be made configurable (e.g., via an environment variable).