Comments (7)
Would it have made your life easier if I had something like:
#ifdef __i386__
#error "you need to use a 64-bit compiler for llamafile"
#endif
from llamafile.
@savant117 I know the solution: run
"C:\Program Files (x86)\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\Build\vcvarsall.bat" x64
in the x64_x86 Cross Tools Command Prompt for VS 2022, and then it works.
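If it helps anyone else, the full sequence looks roughly like this (the path assumes a default Enterprise install; adjust the edition and version for your machine):

```shell
:: Run inside the "x64_x86 Cross Tools Command Prompt for VS 2022"
call "C:\Program Files (x86)\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\Build\vcvarsall.bat" x64

:: Confirm the environment now targets x64 before building
echo %VSCMD_ARG_TGT_ARCH%

:: Then retry the build from the same prompt
nvcc --verbose ggml-cuda.cu
```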
You can ignore the warnings; I plan to remove those soon. As for the CUDA header error, I've tested the CUDA 12.3 SDK on Windows 10 and didn't encounter this issue. I don't currently have access to a Windows 11 system to try reproducing it there. Contributions are welcome. You may also want to try building llama.cpp on your machine, and if the error still happens, file an issue with the upstream project.
I got the same error with both CUDA 12.3 and 12.1, but when I build upstream llama.cpp with CMake, it works fine.
My specs:
Windows 10 64-bit
NVIDIA GTX 1070 notebook edition, 8 GB
Driver version 546.12, which was included in CUDA 12.3 Update 1
I ran [Guru3D.com] Display Driver Uninstaller and erased everything on my system related to NVIDIA, then did a clean install of CUDA 12.3; no difference.
My Visual Studio environment is the latest, but I had issues with v17.8.0 as well.
** Visual Studio 2022 Developer Command Prompt v17.8.2
I ran nvcc with --verbose to get a better idea of what was going on; here is the log:
D:\llama-model-data\.llamafile>nvcc --verbose ggml-cuda.cu
#$ C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/HostX86/x86/../../../../../../../VC/Auxiliary/Build/vcvars64.bat
D:\llama-model-data\.llamafile>call "C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/HostX86/x86/../../../../../../../VC/Auxiliary/Build/vcvars64.bat"
**********************************************************************
** Visual Studio 2022 Developer Command Prompt v17.8.2
** Copyright (c) 2022 Microsoft Corporation
**********************************************************************
[vcvarsall.bat] Environment initialized for: 'x64'
#$ CUDA_PATH=D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3
#$ CUDA_PATH_V12_3=D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3
#$ DevEnvDir=C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\
#$ ExtensionSdkDir=C:\Program Files (x86)\Microsoft SDKs\Windows Kits\10\ExtensionSDKs
#$ EXTERNAL_INCLUDE=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\include;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\ATLMFC\include;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include;C:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\um;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\shared;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\winrt;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\cppwinrt;C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\include;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\ATLMFC\include;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include;C:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\um;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\shared;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\winrt;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\cppwinrt;C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um
#$ HOMEDRIVE=C:
#$ IFCPATH=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\ifc\x86
#$ INCLUDE=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\include;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\ATLMFC\include;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include;C:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\um;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\shared;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\winrt;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\cppwinrt;C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\include;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\ATLMFC\include;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include;C:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\um;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\shared;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\winrt;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\cppwinrt;C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um
#$ is_x64_arch=true
#$ LIB=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\ATLMFC\lib\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\lib\x64;C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\lib\um\x64;C:\Program Files (x86)\Windows Kits\10\lib\10.0.22621.0\ucrt\x64;C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\ATLMFC\lib\x86;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\lib\x86;C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\lib\um\x86;C:\Program Files (x86)\Windows Kits\10\lib\10.0.22621.0\ucrt\x86;C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x86
#$ LIBPATH=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\ATLMFC\lib\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\lib\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\lib\x86\store\references;C:\Program Files (x86)\Windows Kits\10\UnionMetadata\10.0.22621.0;C:\Program Files (x86)\Windows Kits\10\References\10.0.22621.0;C:\WINDOWS\Microsoft.NET\Framework64\v4.0.30319;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\ATLMFC\lib\x86;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\lib\x86;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\lib\x86\store\references;C:\Program Files (x86)\Windows Kits\10\UnionMetadata\10.0.22621.0;C:\Program Files (x86)\Windows Kits\10\References\10.0.22621.0;C:\WINDOWS\Microsoft.NET\Framework\v4.0.30319
#$ MSBUILDDISABLENODEREUSE=1
#$ NUMBER_OF_PROCESSORS=8
#$ OS=Windows_NT
#$ Path=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX64\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\VC\VCPackages;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TestWindow;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TeamFoundation\Team Explorer;C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Current\bin\Roslyn;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\Performance Tools\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\Performance Tools;C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\x64\;C:\Program Files (x86)\HTML Help Workshop;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\FSharp\Tools;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\DiagnosticsHub\Collector;C:\Program Files (x86)\Windows Kits\10\bin\10.0.22621.0\\x64;C:\Program Files (x86)\Windows Kits\10\bin\\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\\MSBuild\Current\Bin\amd64;C:\WINDOWS\Microsoft.NET\Framework64\v4.0.30319;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\Tools\;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX86\x86;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\VC\VCPackages;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TestWindow;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TeamFoundation\Team Explorer;C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Current\bin\Roslyn;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\Performance Tools;C:\Program Files (x86)\Microsoft 
SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\;C:\Program Files (x86)\HTML Help Workshop;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\FSharp\Tools;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\DiagnosticsHub\Collector;C:\Program Files (x86)\Windows Kits\10\bin\10.0.22621.0\\x86;C:\Program Files (x86)\Windows Kits\10\bin\\x86;C:\Program Files\Microsoft Visual Studio\2022\Community\\MSBuild\Current\Bin\amd64;C:\WINDOWS\Microsoft.NET\Framework\v4.0.30319;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\Tools\;D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin;D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\libnvvp;;C:\windows\system32;C:\windows;C:\WINDOWS\System32\Wbem;C:\Program Files\CMake\bin;C:\Program Files\Java\jdk-17.0.6.10-hotspot\bin;C:\WINDOWS\System32\WindowsPowerShell\v1.0\
#$ PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC;.PY;.PYW
#$ Platform=x64
#$ PROCESSOR_ARCHITECTURE=AMD64
#$ PROCESSOR_IDENTIFIER=Intel64 Family 6 Model 158 Stepping 9, GenuineIntel
#$ PROCESSOR_LEVEL=6
#$ PROCESSOR_REVISION=9e09
#$ ProgramData=C:\ProgramData
#$ ProgramFiles=C:\Program Files
#$ ProgramFiles(x86)=C:\Program Files (x86)
#$ ProgramW6432=C:\Program Files
#$ PROMPT=$P$G
#$ PUBLIC=C:\Users\Public
#$ PYTHONIOENCODING=UTF-8
#$ SESSIONNAME=Console
#$ SystemDrive=C:
#$ SystemRoot=C:\WINDOWS
#$ UCRTVersion=10.0.22621.0
#$ UniversalCRTSdkDir=C:\Program Files (x86)\Windows Kits\10\
#$ VCIDEInstallDir=C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\VC\
#$ VCINSTALLDIR=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\
#$ VCToolsInstallDir=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\
#$ VCToolsRedistDir=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Redist\MSVC\14.38.33130\
#$ VCToolsVersion=14.38.33130
#$ VisualStudioVersion=17.0
#$ VS140COMNTOOLS=C:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\Tools\
#$ VS170COMNTOOLS=C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\Tools\
#$ VSCMD_ARG_app_plat=Desktop
#$ VSCMD_ARG_HOST_ARCH=x64
#$ VSCMD_ARG_TGT_ARCH=x64
#$ VSCMD_VER=17.8.2
#$ VSINSTALLDIR=C:\Program Files\Microsoft Visual Studio\2022\Community\
#$ windir=C:\WINDOWS
#$ WindowsLibPath=C:\Program Files (x86)\Windows Kits\10\UnionMetadata\10.0.22621.0;C:\Program Files (x86)\Windows Kits\10\References\10.0.22621.0
#$ WindowsSdkBinPath=C:\Program Files (x86)\Windows Kits\10\bin\
#$ WindowsSdkDir=C:\Program Files (x86)\Windows Kits\10\
#$ WindowsSDKLibVersion=10.0.22621.0\
#$ WindowsSdkVerBinPath=C:\Program Files (x86)\Windows Kits\10\bin\10.0.22621.0\
#$ WindowsSDKVersion=10.0.22621.0\
#$ WindowsSDK_ExecutablePath_x64=C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\x64\
#$ WindowsSDK_ExecutablePath_x86=C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\
#$ ytdash=-N 10 --extractor-args "youtube:formats=dashy"
#$ _NT_DEBUGGER_EXTENSION_PATH=C:\programming\debuggers\windbg_extensions
#$ _NT_SYMBOL_PATH=cache*d:\symbols;srv*d:\symbols*http://msdl.microsoft.com/download/symbols;
#$ __DOTNET_ADD_32BIT=1
#$ __DOTNET_ADD_64BIT=1
#$ __DOTNET_PREFERRED_BITNESS=64
#$ __VSCMD_PREINIT_PATH=D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin;D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\libnvvp;;C:\windows\system32;C:\windows;C:\WINDOWS\System32\Wbem;C:\Program Files\CMake\bin;C:\Program Files\Java\jdk-17.0.6.10-hotspot\bin;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;
#$ __VSCMD_PREINIT_VCToolsVersion=14.38.33130
#$ __VSCMD_PREINIT_VS170COMNTOOLS=C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\Tools\
#$ PATH=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX86\x86;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX64\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\VC\VCPackages;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TestWindow;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TeamFoundation\Team Explorer;C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Current\bin\Roslyn;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\Performance Tools\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\Performance Tools;C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\x64\;C:\Program Files (x86)\HTML Help Workshop;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\FSharp\Tools;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\DiagnosticsHub\Collector;C:\Program Files (x86)\Windows Kits\10\bin\10.0.22621.0\\x64;C:\Program Files (x86)\Windows Kits\10\bin\\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\\MSBuild\Current\Bin\amd64;C:\WINDOWS\Microsoft.NET\Framework64\v4.0.30319;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\Tools\;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX86\x86;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\VC\VCPackages;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TestWindow;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TeamFoundation\Team Explorer;C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Current\bin\Roslyn;C:\Program Files\Microsoft Visual 
Studio\2022\Community\Team Tools\Performance Tools;C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\;C:\Program Files (x86)\HTML Help Workshop;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\FSharp\Tools;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\DiagnosticsHub\Collector;C:\Program Files (x86)\Windows Kits\10\bin\10.0.22621.0\\x86;C:\Program Files (x86)\Windows Kits\10\bin\\x86;C:\Program Files\Microsoft Visual Studio\2022\Community\\MSBuild\Current\Bin\amd64;C:\WINDOWS\Microsoft.NET\Framework\v4.0.30319;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\Tools\;D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin;D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\libnvvp;;C:\windows\system32;C:\windows;C:\WINDOWS\System32\Wbem;C:\Program Files\CMake\bin;C:\Program Files\Java\jdk-17.0.6.10-hotspot\bin;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;
#$ _NVVM_BRANCH_=nvvm
#$ _SPACE_=
#$ _CUDART_=cudart
#$ _HERE_=D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin
#$ _THERE_=D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin
#$ _TARGET_SIZE_=
#$ _TARGET_DIR_=
#$ _TARGET_SIZE_=64
#$ _WIN_PLATFORM_=x64
#$ TOP=D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin/..
#$ NVVMIR_LIBRARY_DIR=D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin/../nvvm/libdevice
#$ PATH=D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin/../nvvm/bin;D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin;D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin/../lib;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX86\x86;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX64\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\VC\VCPackages;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TestWindow;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TeamFoundation\Team Explorer;C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Current\bin\Roslyn;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\Performance Tools\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\Performance Tools;C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\x64\;C:\Program Files (x86)\HTML Help Workshop;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\FSharp\Tools;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\DiagnosticsHub\Collector;C:\Program Files (x86)\Windows Kits\10\bin\10.0.22621.0\\x64;C:\Program Files (x86)\Windows Kits\10\bin\\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\\MSBuild\Current\Bin\amd64;C:\WINDOWS\Microsoft.NET\Framework64\v4.0.30319;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\Tools\;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX86\x86;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\VC\VCPackages;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TestWindow;C:\Program Files\Microsoft Visual 
Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TeamFoundation\Team Explorer;C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Current\bin\Roslyn;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\Performance Tools;C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\;C:\Program Files (x86)\HTML Help Workshop;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\FSharp\Tools;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\DiagnosticsHub\Collector;C:\Program Files (x86)\Windows Kits\10\bin\10.0.22621.0\\x86;C:\Program Files (x86)\Windows Kits\10\bin\\x86;C:\Program Files\Microsoft Visual Studio\2022\Community\\MSBuild\Current\Bin\amd64;C:\WINDOWS\Microsoft.NET\Framework\v4.0.30319;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\Tools\;D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin;D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\libnvvp;;C:\windows\system32;C:\windows;C:\WINDOWS\System32\Wbem;C:\Program Files\CMake\bin;C:\Program Files\Java\jdk-17.0.6.10-hotspot\bin;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;
#$ INCLUDES="-ID:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin/../include"
#$ LIBRARIES= "/LIBPATH:D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin/../lib/x64"
#$ CUDAFE_FLAGS=
#$ PTXAS_FLAGS=
#$ erase D:/TEMP/tmpxft_000008a0_00000000-11_a_dlink.reg.c
ggml-cuda.cu
#$ resource file D:\TEMP/tmpxft_000008a0_00000000-13.res: [-D__CUDA_ARCH_LIST__=520 -nologo -E -TP -EHsc -D__CUDACC__ -D__NVCC__ "-ID:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin/../include" -D__CUDACC_VER_MAJOR__=12 -D__CUDACC_VER_MINOR__=3 -D__CUDACC_VER_BUILD__=103 -D__CUDA_API_VER_MAJOR__=12 -D__CUDA_API_VER_MINOR__=3 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -FI "cuda_runtime.h" "ggml-cuda.cu" ]
#$ cl.exe @"D:\TEMP/tmpxft_000008a0_00000000-13.res" > "D:/TEMP/tmpxft_000008a0_00000000-9_ggml-cuda.cpp4.ii"
ggml-cuda.cu
#$ erase D:\TEMP/tmpxft_000008a0_00000000-13.res
#$ cudafe++ --microsoft_version=1938 --msvc_target_version=1938 --compiler_bindir "C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/HostX86/x86/../../../../../../.." --sdk_dir "C:/Program Files (x86)/Windows Kits/10/" --display_error_number --orig_src_file_name "ggml-cuda.cu" --orig_src_path_name "D:\llama-model-data\.llamafile\ggml-cuda.cu" --allow_managed --m64 --parse_templates --gen_c_file_name "D:/TEMP/tmpxft_000008a0_00000000-10_ggml-cuda.cudafe1.cpp" --stub_file_name "tmpxft_000008a0_00000000-10_ggml-cuda.cudafe1.stub.c" --gen_module_id_file --module_id_file_name "D:/TEMP/tmpxft_000008a0_00000000-8_ggml-cuda.module_id" "D:/TEMP/tmpxft_000008a0_00000000-9_ggml-cuda.cpp4.ii"
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1906): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.nc.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1912): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.nc.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1918): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cg.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1924): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cg.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1930): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.ca.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1936): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.ca.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1942): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cs.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1948): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cs.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1954): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.lu.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1960): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.lu.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1966): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cv.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1972): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cv.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1977): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.wb.b32 [%0], %1;" :: "r"(ptr), "r"(*(reinterpret_cast<const unsigned int *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1981): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.wb.b16 [%0], %1;" :: "r"(ptr), "h"(*(reinterpret_cast<const unsigned short *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1985): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.cg.b32 [%0], %1;" :: "r"(ptr), "r"(*(reinterpret_cast<const unsigned int *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1989): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.cg.b16 [%0], %1;" :: "r"(ptr), "h"(*(reinterpret_cast<const unsigned short *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1993): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.cs.b32 [%0], %1;" :: "r"(ptr), "r"(*(reinterpret_cast<const unsigned int *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1997): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.cs.b16 [%0], %1;" :: "r"(ptr), "h"(*(reinterpret_cast<const unsigned short *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(2001): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.wt.b32 [%0], %1;" :: "r"(ptr), "r"(*(reinterpret_cast<const unsigned int *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(2005): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.wt.b16 [%0], %1;" :: "r"(ptr), "h"(*(reinterpret_cast<const unsigned short *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(3428): error: asm operand type size(8) does not match type/size implied by constraint 'r'
: "r"(address), "h"(*(reinterpret_cast<const unsigned short *>(&(val))))
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1830): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.nc.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1836): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.nc.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1842): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cg.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1848): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cg.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1854): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.ca.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1860): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.ca.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1866): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cs.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1872): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cs.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1878): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.lu.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1884): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.lu.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1890): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cv.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1896): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cv.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1902): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.wb.b32 [%0], %1;" :: "r"(ptr), "r"(*(reinterpret_cast<const unsigned int *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1906): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.wb.b16 [%0], %1;" :: "r"(ptr), "h"(*(reinterpret_cast<const unsigned short *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1910): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.cg.b32 [%0], %1;" :: "r"(ptr), "r"(*(reinterpret_cast<const unsigned int *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1914): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.cg.b16 [%0], %1;" :: "r"(ptr), "h"(*(reinterpret_cast<const unsigned short *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1918): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.cs.b32 [%0], %1;" :: "r"(ptr), "r"(*(reinterpret_cast<const unsigned int *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1922): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.cs.b16 [%0], %1;" :: "r"(ptr), "h"(*(reinterpret_cast<const unsigned short *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1926): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.wt.b32 [%0], %1;" :: "r"(ptr), "r"(*(reinterpret_cast<const unsigned int *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1930): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.wt.b16 [%0], %1;" :: "r"(ptr), "h"(*(reinterpret_cast<const unsigned short *>(&(value)))) : "memory");
^
ggml-cuda.cu(4137): error: identifier "__CUDA_ARCH__" is undefined
(void)( (!!((__CUDA_ARCH__, 0))) || (_wassert(L"(__CUDA_ARCH__, 0)", L"ggml-cuda.cu", (unsigned)(4137)), 0) );
^
ggml-cuda.cu(5957): warning #69-D: integer conversion resulted in truncation
size_t best_diff = 1ull << 36;
^
Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"
42 errors detected in the compilation of "ggml-cuda.cu".
# --error 0x2 --
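For what it's worth, the "asm operand type size(8) does not match type/size implied by constraint 'r'" errors look like what happens when the 64-bit pointer operands in cuda_bf16.hpp get bound to a 32-bit register constraint — note the verbose log above shows nvcc picking the HostX86\x86 host compiler, which fits the vcvarsall x64 workaround mentioned earlier. In PTX inline asm, "r" is a 32-bit register constraint, so an 8-byte pointer can't satisfy it; a 64-bit pointer needs "l". A hypothetical sketch (not the actual header code):

```cuda
// Hypothetical sketch: binding a pointer to a PTX inline-asm constraint.
// On a 64-bit build a pointer is 8 bytes, so the 32-bit "r" constraint
// triggers "asm operand type size(8) does not match ... constraint 'r'";
// the 64-bit "l" constraint accepts it.
__device__ void store_u32(unsigned int *ptr, unsigned int value) {
    asm("st.global.wb.b32 [%0], %1;" :: "l"(ptr), "r"(value) : "memory");
}
```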
The "__CUDA_ARCH__ is undefined" error only happens when running nvcc manually; otherwise the output is just about identical to what I see when llamafile-server invokes nvcc.
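To get a manual nvcc run closer to the working build, the flags can be lifted straight from the upstream CMake log below — something like this sketch (paths are from my machine; adjust for yours, and note it forces the 64-bit HostX64\x64 host compiler):

```shell
rem Sketch: mirror the flags the upstream CMake build passes to nvcc
rem (taken from the build log; paths are machine-specific).
"D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin\nvcc.exe" --use-local-env ^
  -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX64\x64" ^
  -x cu -I. --machine 64 --compile -cudart static -use_fast_math ^
  --generate-code=arch=compute_52,code=[compute_52,sm_52] ^
  --generate-code=arch=compute_61,code=[compute_61,sm_61] ^
  --generate-code=arch=compute_70,code=[compute_70,sm_70] ^
  -DGGML_USE_CUBLAS -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 ^
  -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 ^
  ggml-cuda.cu
```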
Upstream build:
cmake .. -DLLAMA_CUBLAS=ON
cmake --build . --config Release
Upstream log
D:\llama-model-data\llama.cpp\llama.cpp-master\build>cmake ..
-- Building for: Visual Studio 17 2022
-- Selecting Windows SDK version 10.0.22621.0 to target Windows 10.0.19045.
-- The C compiler identification is MSVC 19.38.33130.0
-- The CXX compiler identification is MSVC 19.38.33130.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/Hostx64/x64/cl.exe - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/Hostx64/x64/cl.exe - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: C:/msys64/usr/bin/git.exe (found version "2.42.1")
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - not found
-- Found Threads: TRUE
-- CMAKE_SYSTEM_PROCESSOR: AMD64
-- CMAKE_GENERATOR_PLATFORM:
-- x86 detected
-- Performing Test HAS_AVX_1
-- Performing Test HAS_AVX_1 - Success
-- Performing Test HAS_AVX2_1
-- Performing Test HAS_AVX2_1 - Success
-- Performing Test HAS_FMA_1
-- Performing Test HAS_FMA_1 - Success
-- Performing Test HAS_AVX512_1
-- Performing Test HAS_AVX512_1 - Failed
-- Performing Test HAS_AVX512_2
-- Performing Test HAS_AVX512_2 - Failed
CMake Warning at common/CMakeLists.txt:24 (message):
Git repository not found; to enable automatic generation of build info,
make sure Git is installed and the project is a Git repository.
-- Configuring done (14.3s)
-- Generating done (0.6s)
-- Build files have been written to: D:/llama-model-data/llama.cpp/llama.cpp-master/build
D:\llama-model-data\llama.cpp\llama.cpp-master\build>cmake .. -DLLAMA_CUBLAS=ON
-- Selecting Windows SDK version 10.0.22621.0 to target Windows 10.0.19045.
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
-- Found CUDAToolkit: D:/NVIDIA GPU Computing Toolkit/CUDA/v12.3/include (found version "12.3.103")
-- cuBLAS found
-- The CUDA compiler identification is NVIDIA 12.3.103
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: D:/NVIDIA GPU Computing Toolkit/CUDA/v12.3/bin/nvcc.exe - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Using CUDA architectures: 52;61;70
-- CMAKE_SYSTEM_PROCESSOR: AMD64
-- CMAKE_GENERATOR_PLATFORM:
-- x86 detected
CMake Warning at common/CMakeLists.txt:24 (message):
Git repository not found; to enable automatic generation of build info,
make sure Git is installed and the project is a Git repository.
-- Configuring done (18.0s)
-- Generating done (2.1s)
-- Build files have been written to: D:/llama-model-data/llama.cpp/llama.cpp-master/build
D:\llama-model-data\llama.cpp\llama.cpp-master\build>cmake --build . --config Release
MSBuild version 17.8.3+195e7f5a3 for .NET Framework
Checking Build System
Generating build details from Git
-- Found Git: C:/msys64/usr/bin/git.exe (found version "2.42.1")
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/common/CMakeLists.txt
build-info.cpp
build_info.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\common\build_info.dir\Release\build_info.lib
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/CMakeLists.txt
Compiling CUDA source file ..\ggml-cuda.cu...
D:\llama-model-data\llama.cpp\llama.cpp-master\build>"D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin\nvcc.exe" --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX64\x64" -x cu -I"D:\llama-model-data\llama.cpp\llama.cpp-master\." -I"D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include" -I"D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include" --keep-dir x64\Release -use_fast_math -maxrregcount=0 --machine 64 --compile -cudart static --generate-code=arch=compute_52,code=[compute_52,sm_52] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=compute_70,code=[compute_70,sm_70] -Xcompiler="/EHsc -Ob2" -D_WINDOWS -DNDEBUG -DGGML_USE_CUBLAS -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -D_CRT_SECURE_NO_WARNINGS -D_XOPEN_SOURCE=600 -D"CMAKE_INTDIR=\"Release\"" -D_MBCS -DWIN32 -D_WINDOWS -DNDEBUG -DGGML_USE_CUBLAS -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -D_CRT_SECURE_NO_WARNINGS -D_XOPEN_SOURCE=600 -D"CMAKE_INTDIR=\"Release\"" -Xcompiler "/EHsc /W3 /nologo /O2 /FS /MD /GR" -Xcompiler "/Fdggml.dir\Release\ggml.pdb" -o ggml.dir\Release\ggml-cuda.obj "D:\llama-model-data\llama.cpp\llama.cpp-master\ggml-cuda.cu"
ggml-cuda.cu
tmpxft_00002f9c_00000000-7_ggml-cuda.compute_70.cudafe1.cpp
ggml.c
ggml-alloc.c
ggml-backend.c
ggml-quants.c
D:\llama-model-data\llama.cpp\llama.cpp-master\ggml-backend.c(875,21): warning C4477: 'fprintf' : format string '%lu' requires an argument of type 'unsigned long', but variadic argument 1 has type 'unsigned __int64' [D:\llama-model-data\llama.cpp\llama.cpp-master\build\ggml.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\ggml-backend.c(875,21):
consider using '%llu' in the format string
D:\llama-model-data\llama.cpp\llama.cpp-master\ggml-backend.c(875,21):
consider using '%Iu' in the format string
D:\llama-model-data\llama.cpp\llama.cpp-master\ggml-backend.c(875,21):
consider using '%I64u' in the format string
D:\llama-model-data\llama.cpp\llama.cpp-master\ggml-quants.c(627,26): warning C4244: '=': conversion from 'float' to 'int8_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\ggml.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\ggml-quants.c(845,36): warning C4244: '=': conversion from 'float' to 'int8_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\ggml.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\ggml-quants.c(846,36): warning C4244: '=': conversion from 'float' to 'int8_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\ggml.vcxproj]
Generating Code...
ggml.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\ggml.dir\Release\ggml.lib
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/CMakeLists.txt
llama.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\llama.cpp(1207,31): warning C4305: 'initializing': truncation from 'double' to 'float' [D:\llama-model-data\llama.cpp\llama.cpp-master\build\llama.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\llama.cpp(2498,69): warning C4566: character represented by universal-character-name '\u010A' cannot be represented in the current code page (1252) [D:\llama-model-data\llama.cpp\llama.cpp-master\build\llama.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\llama.cpp(9814,28): warning C4146: unary minus operator applied to unsigned type, result still unsigned [D:\llama-model-data\llama.cpp\llama.cpp-master\build\llama.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\llama.cpp(9844,28): warning C4146: unary minus operator applied to unsigned type, result still unsigned [D:\llama-model-data\llama.cpp\llama.cpp-master\build\llama.vcxproj]
llama.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\Release\llama.lib
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/llava/CMakeLists.txt
llava.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\llava.cpp(32,24): warning C4244: 'initializing': conversion from 'double' to 'float', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
clip.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(251,20): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(465,9): warning C4297: 'clip_model_load': function assumed not to throw an exception but does [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(465,9):
__declspec(nothrow), throw(), noexcept(true), or noexcept was specified on the function
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(714,46): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(817,44): warning C4244: 'initializing': conversion from 'const _Ty' to 'uint8_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(817,44): warning C4244: with [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(817,44): warning C4244: [ [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(817,44): warning C4244: _Ty=float [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(817,44): warning C4244: ] [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(817,34): warning C4244: 'initializing': conversion from 'const _Ty' to 'const uint8_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(817,34): warning C4244: with [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(817,34): warning C4244: [ [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(817,34): warning C4244: _Ty=float [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(817,34): warning C4244: ] [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(855,20): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(999,88): warning C4244: 'argument': conversion from 'int64_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(999,71): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(1002,88): warning C4244: 'argument': conversion from 'int64_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(1002,71): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(1005,88): warning C4244: 'argument': conversion from 'int64_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(1005,71): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(1008,88): warning C4244: 'argument': conversion from 'int64_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(1008,71): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(1011,88): warning C4244: 'argument': conversion from 'int64_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(1011,71): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(1073,42): warning C4244: 'return': conversion from 'int64_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
Generating Code...
llava.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.dir\Release\llava.lib
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/common/CMakeLists.txt
common.cpp
sampling.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\common\sampling.cpp(75,45): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\common\common.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\common\sampling.cpp(75,20): warning C4267: 'initializing': conversion from 'size_t' to 'const int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\common\common.vcxproj]
console.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\common\console.cpp(253,30): warning C4267: 'initializing': conversion from 'size_t' to 'DWORD', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\common\common.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\common\console.cpp(407,28): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\common\common.vcxproj]
grammar-parser.cpp
train.cpp
Generating Code...
D:\llama-model-data\llama.cpp\llama.cpp-master\common\common.cpp(887): warning C4715: 'gpt_random_prompt': not all control paths return a value [D:\llama-model-data\llama.cpp\llama.cpp-master\build\common\common.vcxproj]
common.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\common\Release\common.lib
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/baby-llama/CMakeLists.txt
baby-llama.cpp
baby-llama.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\baby-llama.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/batched/CMakeLists.txt
batched.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\batched\batched.cpp(72,45): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\batched\batched.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\batched\batched.cpp(72,24): warning C4267: 'initializing': conversion from 'size_t' to 'const int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\batched\batched.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\batched\batched.cpp(114,50): warning C4267: 'argument': conversion from 'size_t' to 'int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\batched\batched.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\batched\batched.cpp(118,48): warning C4267: 'argument': conversion from 'size_t' to 'llama_pos', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\batched\batched.vcxproj]
batched.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\batched.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/batched-bench/CMakeLists.txt
batched-bench.cpp
batched-bench.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\batched-bench.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/beam-search/CMakeLists.txt
beam-search.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\beam-search\beam-search.cpp(163,83): warning C4267: 'argument': conversion from 'size_t' to 'int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\beam-search\beam-search.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\beam-search\beam-search.cpp(168,31): warning C4267: '+=': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\beam-search\beam-search.vcxproj]
beam-search.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\beam-search.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/benchmark/CMakeLists.txt
benchmark-matmult.cpp
benchmark.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\benchmark.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/convert-llama2c-to-ggml/CMakeLists.txt
convert-llama2c-to-ggml.cpp
convert-llama2c-to-ggml.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\convert-llama2c-to-ggml.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/embedding/CMakeLists.txt
embedding.cpp
embedding.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\embedding.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/export-lora/CMakeLists.txt
export-lora.cpp
export-lora.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\export-lora.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/finetune/CMakeLists.txt
finetune.cpp
finetune.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\finetune.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/CMakeLists.txt
ggml_static.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\Release\ggml_static.lib
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/infill/CMakeLists.txt
infill.cpp
infill.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\infill.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/llama-bench/CMakeLists.txt
llama-bench.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llama-bench\llama-bench.cpp(72,13): warning C4244: 'initializing': conversion from 'double' to 'T', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llama-bench\llama-bench.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llama-bench\llama-bench.cpp(72,13): warning C4244: with [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llama-bench\llama-bench.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llama-bench\llama-bench.cpp(72,13): warning C4244: [ [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llama-bench\llama-bench.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llama-bench\llama-bench.cpp(72,13): warning C4244: T=uint64_t [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llama-bench\llama-bench.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llama-bench\llama-bench.cpp(72,13): warning C4244: ] [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llama-bench\llama-bench.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llama-bench\llama-bench.cpp(72,13):
the template instantiation context (the oldest one first) is
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llama-bench\llama-bench.cpp(531,18):
see reference to function template instantiation 'T stdev<uint64_t>(const std::vector<uint64_t,std::allocator<uint64_t>> &)' being compiled
with
[
T=uint64_t
]
llama-bench.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\llama-bench.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/llava/CMakeLists.txt
llava-cli.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\llava-cli.cpp(150,105): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava-cli.vcxproj]
llava-cli.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\llava-cli.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/llava/CMakeLists.txt
llava_static.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\Release\llava_static.lib
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/lookahead/CMakeLists.txt
lookahead.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\lookahead\lookahead.cpp(91,33): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\lookahead\lookahead.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\lookahead\lookahead.cpp(91,23): warning C4267: 'initializing': conversion from 'size_t' to 'const int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\lookahead\lookahead.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\lookahead\lookahead.cpp(108,16): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\lookahead\lookahead.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\lookahead\lookahead.cpp(365,129): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\lookahead\lookahead.vcxproj]
lookahead.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\lookahead.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/main/CMakeLists.txt
main.cpp
main.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\main.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/parallel/CMakeLists.txt
parallel.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\parallel\parallel.cpp(159,21): warning C4267: '=': conversion from 'size_t' to 'int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\parallel\parallel.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\parallel\parallel.cpp(165,55): warning C4267: 'initializing': conversion from 'size_t' to 'int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\parallel\parallel.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\parallel\parallel.cpp(165,35): warning C4267: 'initializing': conversion from 'size_t' to 'const int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\parallel\parallel.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\parallel\parallel.cpp(257,68): warning C4267: 'argument': conversion from 'size_t' to 'llama_pos', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\parallel\parallel.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\parallel\parallel.cpp(265,58): warning C4267: '=': conversion from 'size_t' to 'int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\parallel\parallel.vcxproj]
parallel.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\parallel.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/perplexity/CMakeLists.txt
perplexity.cpp
perplexity.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\perplexity.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/pocs/vdot/CMakeLists.txt
q8dot.cpp
q8dot.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\q8dot.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/quantize/CMakeLists.txt
quantize.cpp
quantize.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\quantize.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/quantize-stats/CMakeLists.txt
quantize-stats.cpp
quantize-stats.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\quantize-stats.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/save-load-state/CMakeLists.txt
save-load-state.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\save-load-state\save-load-state.cpp(42,69): warning C4267: 'argument': conversion from 'size_t' to 'int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\save-load-state\save-load-state.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\save-load-state\save-load-state.cpp(43,26): warning C4267: '+=': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\save-load-state\save-load-state.vcxproj]
save-load-state.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\save-load-state.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/server/CMakeLists.txt
server.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(88,16): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(102,52): warning C4267: '=': conversion from 'size_t' to 'uint8_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(126,48): warning C4267: '=': conversion from 'size_t' to 'uint8_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(818,49): warning C4267: '=': conversion from 'size_t' to 'int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(822,93): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(869,67): warning C4101: 'e': unreferenced local variable [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(924,67): warning C4267: 'argument': conversion from 'size_t' to 'llama_pos', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(997,53): warning C4267: '-=': conversion from 'size_t' to 'int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(1472,26): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(1652,71): warning C4267: 'argument': conversion from 'size_t' to 'llama_pos', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(1714,64): warning C4267: '=': conversion from 'size_t' to 'int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(1741,68): warning C4267: '=': conversion from 'size_t' to 'int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(1760,50): warning C4267: '=': conversion from 'size_t' to 'int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(1768,78): warning C4267: 'argument': conversion from 'size_t' to 'llama_pos', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(1791,96): warning C4267: 'argument': conversion from 'size_t' to 'llama_pos', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
server.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\server.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/simple/CMakeLists.txt
simple.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\simple\simple.cpp(71,45): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\simple\simple.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\simple\simple.cpp(71,24): warning C4267: 'initializing': conversion from 'size_t' to 'const int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\simple\simple.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\simple\simple.cpp(99,48): warning C4267: 'argument': conversion from 'size_t' to 'llama_pos', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\simple\simple.vcxproj]
simple.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\simple.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/speculative/CMakeLists.txt
speculative.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\speculative\speculative.cpp(130,33): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\speculative\speculative.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\speculative\speculative.cpp(130,23): warning C4267: 'initializing': conversion from 'size_t' to 'const int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\speculative\speculative.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\speculative\speculative.cpp(151,20): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\speculative\speculative.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\speculative\speculative.cpp(152,20): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\speculative\speculative.vcxproj]
speculative.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\speculative.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-c.c
test-c.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-c.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-grad0.cpp
test-grad0.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-grad0.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-grammar-parser.cpp
test-grammar-parser.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-grammar-parser.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-llama-grammar.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\llama.cpp(1207,31): warning C4305: 'initializing': truncation from 'double' to 'float' [D:\llama-model-data\llama.cpp\llama.cpp-master\build\tests\test-llama-grammar.vcxproj]
(compiling source file '../../tests/test-llama-grammar.cpp')
D:\llama-model-data\llama.cpp\llama.cpp-master\llama.cpp(2498,69): warning C4566: character represented by universal-character-name '\u010A' cannot be represented in the current code page (1252) [D:\llama-model-data\llama.cpp\llama.cpp-master\build\tests\test-llama-grammar.vcxproj]
(compiling source file '../../tests/test-llama-grammar.cpp')
D:\llama-model-data\llama.cpp\llama.cpp-master\llama.cpp(9814,28): warning C4146: unary minus operator applied to unsigned type, result still unsigned [D:\llama-model-data\llama.cpp\llama.cpp-master\build\tests\test-llama-grammar.vcxproj]
(compiling source file '../../tests/test-llama-grammar.cpp')
D:\llama-model-data\llama.cpp\llama.cpp-master\llama.cpp(9844,28): warning C4146: unary minus operator applied to unsigned type, result still unsigned [D:\llama-model-data\llama.cpp\llama.cpp-master\build\tests\test-llama-grammar.vcxproj]
(compiling source file '../../tests/test-llama-grammar.cpp')
test-llama-grammar.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-llama-grammar.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-quantize-fns.cpp
test-quantize-fns.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-quantize-fns.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-quantize-perf.cpp
test-quantize-perf.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-quantize-perf.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-rope.cpp
test-rope.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-rope.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-sampling.cpp
test-sampling.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-sampling.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-tokenizer-0-falcon.cpp
test-tokenizer-0-falcon.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-tokenizer-0-falcon.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-tokenizer-0-llama.cpp
test-tokenizer-0-llama.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-tokenizer-0-llama.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-tokenizer-1-bpe.cpp
test-tokenizer-1-bpe.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-tokenizer-1-bpe.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-tokenizer-1-llama.cpp
test-tokenizer-1-llama.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-tokenizer-1-llama.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/tokenize/CMakeLists.txt
tokenize.cpp
tokenize.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\tokenize.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/train-text-from-scratch/CMakeLists.txt
train-text-from-scratch.cpp
train-text-from-scratch.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\train-text-from-scratch.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/pocs/vdot/CMakeLists.txt
vdot.cpp
vdot.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\vdot.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/CMakeLists.txt
Below is the full nvcc command line excerpted from the upstream build's log; I suspect the absence of some of these parameters is what breaks llamafile's invocation:
D:\llama-model-data\llama.cpp\llama.cpp-master\build>"D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin\nvcc.exe" --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX64\x64" -x cu -I"D:\llama-model-data\llama.cpp\llama.cpp-master." -I"D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include" -I"D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include" --keep-dir x64\Release -use_fast_math -maxrregcount=0 --machine 64 --compile -cudart static --generate-code=arch=compute_52,code=[compute_52,sm_52] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=compute_70,code=[compute_70,sm_70] -Xcompiler="/EHsc -Ob2" -D_WINDOWS -DNDEBUG -DGGML_USE_CUBLAS -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -D_CRT_SECURE_NO_WARNINGS -D_XOPEN_SOURCE=600 -D"CMAKE_INTDIR="Release"" -D_MBCS -DWIN32 -D_WINDOWS -DNDEBUG -DGGML_USE_CUBLAS -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -D_CRT_SECURE_NO_WARNINGS -D_XOPEN_SOURCE=600 -D"CMAKE_INTDIR="Release"" -Xcompiler "/EHsc /W3 /nologo /O2 /FS /MD /GR" -Xcompiler "/Fdggml.dir\Release\ggml.pdb" -o ggml.dir\Release\ggml-cuda.obj "D:\llama-model-data\llama.cpp\llama.cpp-master\ggml-cuda.cu"
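To make the difference concrete, here is a hedged sketch that compares the option sets of the two nvcc invocations quoted in this thread (the cmake build's command above versus llamafile's command). The flag sets below are hand-transcribed subsets of the quoted commands, not parsed output, so treat the result as illustrative only:

```python
# Illustrative subset of options from the cmake build's nvcc command quoted above.
cmake_flags = {
    "--use-local-env", "-ccbin", "--keep-dir", "-use_fast_math",
    "-maxrregcount", "--machine", "--compile", "-cudart",
    "--generate-code", "-Xcompiler",
}

# Illustrative subset of options from llamafile's nvcc command (quoted in the
# reply below this comment in the thread).
llamafile_flags = {
    "-arch", "--shared", "--forward-unknown-to-host-compiler",
    "-use_fast_math", "--compiler-options",
}

# Options the cmake build passes that llamafile's invocation omits.
missing = sorted(cmake_flags - llamafile_flags)
print(missing)
```

Note that `-ccbin`, which pins the host compiler, is among the omitted options; that turns out to matter later in the thread.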
@CeruleanSky I'm using the same version of CUDA as you, on Windows, with a slightly different MSVC revision, and things work fine for me. The nvcc command run by llama.cpp's cmake build config is very different from the one its makefile config uses, and llamafile is based on the makefile config. The nvcc command that llamafile runs on my machine is:
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin\nvcc.exe"
-arch=native --shared --forward-unknown-to-host-compiler -use_fast_math
--compiler-options "-fPIC -O3 -march=native -mtune=native" -DNDEBUG
-DGGML_BUILD=1 -DGGML_SHARED=1 -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1
-DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128
-DGGML_USE_CUBLAS -o C:\Users\jtunn\.llamafile\ggml-cuda.dll
C:\Users\jtunn\.llamafile\ggml-cuda.cu -lcublas
I won't make changes to the build config unless I understand why they need to be made. Could you please troubleshoot things further and tell me how specifically the above command needs to be changed so that it'll work on your machine? Thanks!
I think the readme should change "MSVC x64 native command prompt"
to "x64 Native Tools Command Prompt for VS",
since what is likely happening is that we both just opened "Developer Command Prompt for VS"
and expected it to work.
The clue came from the error below, which appeared when I started adding parameters from the command above (specifically -ccbin):
nvcc fatal : cl.exe in PATH (C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/HostX86/x86) is different than one specified with -ccbin (C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/HostX64/x64)
This led me to discover that when nvcc is run from the Developer Command Prompt for VS 2022,
the 32-bit compiler at C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/HostX86/x86
is used. Instead, run llamafile-server from the x64 Native Tools Command Prompt for VS 2022
to ensure that C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX64\x64
is used.
I believe the reason the cmake build works is that it modifies the PATH variable and then explicitly selects the correct compiler using -ccbin.
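The failure mode described above comes down to a path comparison: nvcc refuses to proceed when the first cl.exe it finds on PATH lives in a different directory than the one named by -ccbin. A minimal sketch of that check (an assumption for illustration, not nvcc's actual source) using Python's `ntpath` for Windows-style path normalization:

```python
import ntpath

def nvcc_compiler_conflict(path_cl_dir: str, ccbin_dir: str) -> bool:
    """True when the directory of the cl.exe found on PATH differs from the
    -ccbin directory after Windows-style normalization (case-insensitive,
    '/' and '\\' treated alike) -- the "nvcc fatal" condition quoted above."""
    norm = lambda p: ntpath.normcase(ntpath.normpath(p))
    return norm(path_cl_dir) != norm(ccbin_dir)

# The exact pair from the error message: the Developer Command Prompt put the
# 32-bit HostX86/x86 cl.exe on PATH, while -ccbin asked for HostX64/x64.
path_dir  = r"C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/HostX86/x86"
ccbin_dir = r"C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/HostX64/x64"
print(nvcc_compiler_conflict(path_dir, ccbin_dir))
```

Opening the x64 Native Tools prompt makes the first cl.exe on PATH the HostX64\x64 one, so this comparison passes and the fatal error goes away.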
@savant117 I have the same question; have you found a solution?
Windows 11, CUDA 12.3