Comments (7)
Would it have made your life easier if I had something like:
#ifdef __i386__
#error "you need to use a 64-bit compiler for llamafile"
#endif
from llamafile.
@savant117 I know the solution: run
"C:\Program Files (x86)\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\Build\vcvarsall.bat" x64
in the x64_x86 Cross Tools Command Prompt for VS 2022, and then it works.
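If it helps anyone else, the full sequence looks roughly like this (the path assumes a default Enterprise install; adjust the edition and version for your machine):

```shell
:: Run inside the "x64_x86 Cross Tools Command Prompt for VS 2022"
call "C:\Program Files (x86)\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\Build\vcvarsall.bat" x64

:: Confirm the environment now targets x64 before building
echo %VSCMD_ARG_TGT_ARCH%

:: Then retry the build from the same prompt
nvcc --verbose ggml-cuda.cu
```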
You can ignore the warnings; I plan to remove those soon. As for the CUDA header error, I've tested the CUDA 12.3 SDK on Windows 10 and didn't encounter this issue. I don't currently have access to a Windows 11 system to try reproducing it there. Contributions are welcome. You may also want to try building llama.cpp on your machine, and if the error still happens, file an issue with the upstream project.
I got the same error with both CUDA 12.3 and 12.1, but when I build upstream llama.cpp with CMake, it works fine.
My specs:
Windows 10 64-bit
NVIDIA GTX 1070 notebook edition, 8 GB
Driver version 546.12, which was included in CUDA 12.3 Update 1
I ran [Guru3D.com] Display Driver Uninstaller and erased everything on my system related to NVIDIA, then did a clean install of CUDA 12.3; no difference.
My Visual Studio environment is the latest, but I had issues with v17.8.0 as well.
** Visual Studio 2022 Developer Command Prompt v17.8.2
I ran nvcc with --verbose to get a better idea of what was going on; here is the log:
D:\llama-model-data\.llamafile>nvcc --verbose ggml-cuda.cu
#$ C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/HostX86/x86/../../../../../../../VC/Auxiliary/Build/vcvars64.bat
D:\llama-model-data\.llamafile>call "C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/HostX86/x86/../../../../../../../VC/Auxiliary/Build/vcvars64.bat"
**********************************************************************
** Visual Studio 2022 Developer Command Prompt v17.8.2
** Copyright (c) 2022 Microsoft Corporation
**********************************************************************
[vcvarsall.bat] Environment initialized for: 'x64'
#$ CUDA_PATH=D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3
#$ CUDA_PATH_V12_3=D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3
#$ DevEnvDir=C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\
#$ ExtensionSdkDir=C:\Program Files (x86)\Microsoft SDKs\Windows Kits\10\ExtensionSDKs
#$ EXTERNAL_INCLUDE=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\include;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\ATLMFC\include;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include;C:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\um;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\shared;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\winrt;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\cppwinrt;C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\include;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\ATLMFC\include;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include;C:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\um;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\shared;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\winrt;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\cppwinrt;C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um
#$ HOMEDRIVE=C:
#$ IFCPATH=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\ifc\x86
#$ INCLUDE=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\include;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\ATLMFC\include;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include;C:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\um;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\shared;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\winrt;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\cppwinrt;C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\include;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\ATLMFC\include;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\VS\include;C:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\um;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\shared;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\winrt;C:\Program Files (x86)\Windows Kits\10\\include\10.0.22621.0\\cppwinrt;C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um
#$ is_x64_arch=true
#$ LIB=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\ATLMFC\lib\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\lib\x64;C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\lib\um\x64;C:\Program Files (x86)\Windows Kits\10\lib\10.0.22621.0\ucrt\x64;C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\ATLMFC\lib\x86;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\lib\x86;C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\lib\um\x86;C:\Program Files (x86)\Windows Kits\10\lib\10.0.22621.0\ucrt\x86;C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x86
#$ LIBPATH=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\ATLMFC\lib\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\lib\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\lib\x86\store\references;C:\Program Files (x86)\Windows Kits\10\UnionMetadata\10.0.22621.0;C:\Program Files (x86)\Windows Kits\10\References\10.0.22621.0;C:\WINDOWS\Microsoft.NET\Framework64\v4.0.30319;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\ATLMFC\lib\x86;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\lib\x86;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\lib\x86\store\references;C:\Program Files (x86)\Windows Kits\10\UnionMetadata\10.0.22621.0;C:\Program Files (x86)\Windows Kits\10\References\10.0.22621.0;C:\WINDOWS\Microsoft.NET\Framework\v4.0.30319
#$ MSBUILDDISABLENODEREUSE=1
#$ NUMBER_OF_PROCESSORS=8
#$ OS=Windows_NT
#$ Path=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX64\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\VC\VCPackages;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TestWindow;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TeamFoundation\Team Explorer;C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Current\bin\Roslyn;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\Performance Tools\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\Performance Tools;C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\x64\;C:\Program Files (x86)\HTML Help Workshop;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\FSharp\Tools;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\DiagnosticsHub\Collector;C:\Program Files (x86)\Windows Kits\10\bin\10.0.22621.0\\x64;C:\Program Files (x86)\Windows Kits\10\bin\\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\\MSBuild\Current\Bin\amd64;C:\WINDOWS\Microsoft.NET\Framework64\v4.0.30319;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\Tools\;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX86\x86;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\VC\VCPackages;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TestWindow;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TeamFoundation\Team Explorer;C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Current\bin\Roslyn;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\Performance Tools;C:\Program Files (x86)\Microsoft 
SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\;C:\Program Files (x86)\HTML Help Workshop;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\FSharp\Tools;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\DiagnosticsHub\Collector;C:\Program Files (x86)\Windows Kits\10\bin\10.0.22621.0\\x86;C:\Program Files (x86)\Windows Kits\10\bin\\x86;C:\Program Files\Microsoft Visual Studio\2022\Community\\MSBuild\Current\Bin\amd64;C:\WINDOWS\Microsoft.NET\Framework\v4.0.30319;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\Tools\;D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin;D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\libnvvp;;C:\windows\system32;C:\windows;C:\WINDOWS\System32\Wbem;C:\Program Files\CMake\bin;C:\Program Files\Java\jdk-17.0.6.10-hotspot\bin;C:\WINDOWS\System32\WindowsPowerShell\v1.0\
#$ PATHEXT=.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC;.PY;.PYW
#$ Platform=x64
#$ PROCESSOR_ARCHITECTURE=AMD64
#$ PROCESSOR_IDENTIFIER=Intel64 Family 6 Model 158 Stepping 9, GenuineIntel
#$ PROCESSOR_LEVEL=6
#$ PROCESSOR_REVISION=9e09
#$ ProgramData=C:\ProgramData
#$ ProgramFiles=C:\Program Files
#$ ProgramFiles(x86)=C:\Program Files (x86)
#$ ProgramW6432=C:\Program Files
#$ PROMPT=$P$G
#$ PUBLIC=C:\Users\Public
#$ PYTHONIOENCODING=UTF-8
#$ SESSIONNAME=Console
#$ SystemDrive=C:
#$ SystemRoot=C:\WINDOWS
#$ UCRTVersion=10.0.22621.0
#$ UniversalCRTSdkDir=C:\Program Files (x86)\Windows Kits\10\
#$ VCIDEInstallDir=C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\VC\
#$ VCINSTALLDIR=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\
#$ VCToolsInstallDir=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\
#$ VCToolsRedistDir=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Redist\MSVC\14.38.33130\
#$ VCToolsVersion=14.38.33130
#$ VisualStudioVersion=17.0
#$ VS140COMNTOOLS=C:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\Tools\
#$ VS170COMNTOOLS=C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\Tools\
#$ VSCMD_ARG_app_plat=Desktop
#$ VSCMD_ARG_HOST_ARCH=x64
#$ VSCMD_ARG_TGT_ARCH=x64
#$ VSCMD_VER=17.8.2
#$ VSINSTALLDIR=C:\Program Files\Microsoft Visual Studio\2022\Community\
#$ windir=C:\WINDOWS
#$ WindowsLibPath=C:\Program Files (x86)\Windows Kits\10\UnionMetadata\10.0.22621.0;C:\Program Files (x86)\Windows Kits\10\References\10.0.22621.0
#$ WindowsSdkBinPath=C:\Program Files (x86)\Windows Kits\10\bin\
#$ WindowsSdkDir=C:\Program Files (x86)\Windows Kits\10\
#$ WindowsSDKLibVersion=10.0.22621.0\
#$ WindowsSdkVerBinPath=C:\Program Files (x86)\Windows Kits\10\bin\10.0.22621.0\
#$ WindowsSDKVersion=10.0.22621.0\
#$ WindowsSDK_ExecutablePath_x64=C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\x64\
#$ WindowsSDK_ExecutablePath_x86=C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\
#$ ytdash=-N 10 --extractor-args "youtube:formats=dashy"
#$ _NT_DEBUGGER_EXTENSION_PATH=C:\programming\debuggers\windbg_extensions
#$ _NT_SYMBOL_PATH=cache*d:\symbols;srv*d:\symbols*http://msdl.microsoft.com/download/symbols;
#$ __DOTNET_ADD_32BIT=1
#$ __DOTNET_ADD_64BIT=1
#$ __DOTNET_PREFERRED_BITNESS=64
#$ __VSCMD_PREINIT_PATH=D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin;D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\libnvvp;;C:\windows\system32;C:\windows;C:\WINDOWS\System32\Wbem;C:\Program Files\CMake\bin;C:\Program Files\Java\jdk-17.0.6.10-hotspot\bin;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;
#$ __VSCMD_PREINIT_VCToolsVersion=14.38.33130
#$ __VSCMD_PREINIT_VS170COMNTOOLS=C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\Tools\
#$ PATH=C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX86\x86;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX64\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\VC\VCPackages;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TestWindow;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TeamFoundation\Team Explorer;C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Current\bin\Roslyn;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\Performance Tools\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\Performance Tools;C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\x64\;C:\Program Files (x86)\HTML Help Workshop;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\FSharp\Tools;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\DiagnosticsHub\Collector;C:\Program Files (x86)\Windows Kits\10\bin\10.0.22621.0\\x64;C:\Program Files (x86)\Windows Kits\10\bin\\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\\MSBuild\Current\Bin\amd64;C:\WINDOWS\Microsoft.NET\Framework64\v4.0.30319;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\Tools\;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX86\x86;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\VC\VCPackages;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TestWindow;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TeamFoundation\Team Explorer;C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Current\bin\Roslyn;C:\Program Files\Microsoft Visual 
Studio\2022\Community\Team Tools\Performance Tools;C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\;C:\Program Files (x86)\HTML Help Workshop;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\FSharp\Tools;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\DiagnosticsHub\Collector;C:\Program Files (x86)\Windows Kits\10\bin\10.0.22621.0\\x86;C:\Program Files (x86)\Windows Kits\10\bin\\x86;C:\Program Files\Microsoft Visual Studio\2022\Community\\MSBuild\Current\Bin\amd64;C:\WINDOWS\Microsoft.NET\Framework\v4.0.30319;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\Tools\;D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin;D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\libnvvp;;C:\windows\system32;C:\windows;C:\WINDOWS\System32\Wbem;C:\Program Files\CMake\bin;C:\Program Files\Java\jdk-17.0.6.10-hotspot\bin;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;
#$ _NVVM_BRANCH_=nvvm
#$ _SPACE_=
#$ _CUDART_=cudart
#$ _HERE_=D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin
#$ _THERE_=D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin
#$ _TARGET_SIZE_=
#$ _TARGET_DIR_=
#$ _TARGET_SIZE_=64
#$ _WIN_PLATFORM_=x64
#$ TOP=D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin/..
#$ NVVMIR_LIBRARY_DIR=D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin/../nvvm/libdevice
#$ PATH=D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin/../nvvm/bin;D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin;D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin/../lib;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX86\x86;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX64\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\VC\VCPackages;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TestWindow;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TeamFoundation\Team Explorer;C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Current\bin\Roslyn;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\Performance Tools\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\Performance Tools;C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\x64\;C:\Program Files (x86)\HTML Help Workshop;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\FSharp\Tools;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\DiagnosticsHub\Collector;C:\Program Files (x86)\Windows Kits\10\bin\10.0.22621.0\\x64;C:\Program Files (x86)\Windows Kits\10\bin\\x64;C:\Program Files\Microsoft Visual Studio\2022\Community\\MSBuild\Current\Bin\amd64;C:\WINDOWS\Microsoft.NET\Framework64\v4.0.30319;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\Tools\;C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX86\x86;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\VC\VCPackages;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TestWindow;C:\Program Files\Microsoft Visual 
Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\TeamFoundation\Team Explorer;C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Current\bin\Roslyn;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\Performance Tools;C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.8 Tools\;C:\Program Files (x86)\HTML Help Workshop;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\FSharp\Tools;C:\Program Files\Microsoft Visual Studio\2022\Community\Team Tools\DiagnosticsHub\Collector;C:\Program Files (x86)\Windows Kits\10\bin\10.0.22621.0\\x86;C:\Program Files (x86)\Windows Kits\10\bin\\x86;C:\Program Files\Microsoft Visual Studio\2022\Community\\MSBuild\Current\Bin\amd64;C:\WINDOWS\Microsoft.NET\Framework\v4.0.30319;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\;C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\Tools\;D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin;D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\libnvvp;;C:\windows\system32;C:\windows;C:\WINDOWS\System32\Wbem;C:\Program Files\CMake\bin;C:\Program Files\Java\jdk-17.0.6.10-hotspot\bin;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;
#$ INCLUDES="-ID:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin/../include"
#$ LIBRARIES= "/LIBPATH:D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin/../lib/x64"
#$ CUDAFE_FLAGS=
#$ PTXAS_FLAGS=
#$ erase D:/TEMP/tmpxft_000008a0_00000000-11_a_dlink.reg.c
ggml-cuda.cu
#$ resource file D:\TEMP/tmpxft_000008a0_00000000-13.res: [-D__CUDA_ARCH_LIST__=520 -nologo -E -TP -EHsc -D__CUDACC__ -D__NVCC__ "-ID:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin/../include" -D__CUDACC_VER_MAJOR__=12 -D__CUDACC_VER_MINOR__=3 -D__CUDACC_VER_BUILD__=103 -D__CUDA_API_VER_MAJOR__=12 -D__CUDA_API_VER_MINOR__=3 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -FI "cuda_runtime.h" "ggml-cuda.cu" ]
#$ cl.exe @"D:\TEMP/tmpxft_000008a0_00000000-13.res" > "D:/TEMP/tmpxft_000008a0_00000000-9_ggml-cuda.cpp4.ii"
ggml-cuda.cu
#$ erase D:\TEMP/tmpxft_000008a0_00000000-13.res
#$ cudafe++ --microsoft_version=1938 --msvc_target_version=1938 --compiler_bindir "C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/HostX86/x86/../../../../../../.." --sdk_dir "C:/Program Files (x86)/Windows Kits/10/" --display_error_number --orig_src_file_name "ggml-cuda.cu" --orig_src_path_name "D:\llama-model-data\.llamafile\ggml-cuda.cu" --allow_managed --m64 --parse_templates --gen_c_file_name "D:/TEMP/tmpxft_000008a0_00000000-10_ggml-cuda.cudafe1.cpp" --stub_file_name "tmpxft_000008a0_00000000-10_ggml-cuda.cudafe1.stub.c" --gen_module_id_file --module_id_file_name "D:/TEMP/tmpxft_000008a0_00000000-8_ggml-cuda.module_id" "D:/TEMP/tmpxft_000008a0_00000000-9_ggml-cuda.cpp4.ii"
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1906): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.nc.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1912): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.nc.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1918): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cg.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1924): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cg.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1930): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.ca.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1936): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.ca.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1942): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cs.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1948): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cs.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1954): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.lu.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1960): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.lu.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1966): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cv.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1972): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cv.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1977): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.wb.b32 [%0], %1;" :: "r"(ptr), "r"(*(reinterpret_cast<const unsigned int *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1981): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.wb.b16 [%0], %1;" :: "r"(ptr), "h"(*(reinterpret_cast<const unsigned short *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1985): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.cg.b32 [%0], %1;" :: "r"(ptr), "r"(*(reinterpret_cast<const unsigned int *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1989): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.cg.b16 [%0], %1;" :: "r"(ptr), "h"(*(reinterpret_cast<const unsigned short *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1993): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.cs.b32 [%0], %1;" :: "r"(ptr), "r"(*(reinterpret_cast<const unsigned int *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(1997): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.cs.b16 [%0], %1;" :: "r"(ptr), "h"(*(reinterpret_cast<const unsigned short *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(2001): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.wt.b32 [%0], %1;" :: "r"(ptr), "r"(*(reinterpret_cast<const unsigned int *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(2005): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.wt.b16 [%0], %1;" :: "r"(ptr), "h"(*(reinterpret_cast<const unsigned short *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_fp16.hpp(3428): error: asm operand type size(8) does not match type/size implied by constraint 'r'
: "r"(address), "h"(*(reinterpret_cast<const unsigned short *>(&(val))))
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1830): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.nc.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1836): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.nc.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1842): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cg.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1848): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cg.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1854): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.ca.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1860): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.ca.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1866): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cs.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1872): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cs.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr));
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1878): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.lu.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1884): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.lu.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1890): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cv.b32 %0, [%1];" : "=r"(*(reinterpret_cast<unsigned int *>(&(ret)))) : "r"(ptr) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1896): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("ld.global.cv.b16 %0, [%1];" : "=h"(*(reinterpret_cast<unsigned short *>(&(ret)))) : "r"(ptr) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1902): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.wb.b32 [%0], %1;" :: "r"(ptr), "r"(*(reinterpret_cast<const unsigned int *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1906): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.wb.b16 [%0], %1;" :: "r"(ptr), "h"(*(reinterpret_cast<const unsigned short *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1910): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.cg.b32 [%0], %1;" :: "r"(ptr), "r"(*(reinterpret_cast<const unsigned int *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1914): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.cg.b16 [%0], %1;" :: "r"(ptr), "h"(*(reinterpret_cast<const unsigned short *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1918): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.cs.b32 [%0], %1;" :: "r"(ptr), "r"(*(reinterpret_cast<const unsigned int *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1922): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.cs.b16 [%0], %1;" :: "r"(ptr), "h"(*(reinterpret_cast<const unsigned short *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1926): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.wt.b32 [%0], %1;" :: "r"(ptr), "r"(*(reinterpret_cast<const unsigned int *>(&(value)))) : "memory");
^
D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda_bf16.hpp(1930): error: asm operand type size(8) does not match type/size implied by constraint 'r'
asm ("st.global.wt.b16 [%0], %1;" :: "r"(ptr), "h"(*(reinterpret_cast<const unsigned short *>(&(value)))) : "memory");
^
ggml-cuda.cu(4137): error: identifier "__CUDA_ARCH__" is undefined
(void)( (!!((__CUDA_ARCH__, 0))) || (_wassert(L"(__CUDA_ARCH__, 0)", L"ggml-cuda.cu", (unsigned)(4137)), 0) );
^
ggml-cuda.cu(5957): warning #69-D: integer conversion resulted in truncation
size_t best_diff = 1ull << 36;
^
Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"
42 errors detected in the compilation of "ggml-cuda.cu".
# --error 0x2 --
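For what it's worth, the "asm operand type size(8) does not match type/size implied by constraint 'r'" errors look like what happens when the 64-bit pointer operands in cuda_bf16.hpp get bound to a 32-bit register constraint — note the verbose log above shows nvcc picking the HostX86\x86 host compiler, which fits the vcvarsall x64 workaround mentioned earlier. In PTX inline asm, "r" is a 32-bit register constraint, so an 8-byte pointer can't satisfy it; a 64-bit pointer needs "l". A hypothetical sketch (not the actual header code):

```cuda
// Hypothetical sketch: binding a pointer to a PTX inline-asm constraint.
// On a 64-bit build a pointer is 8 bytes, so the 32-bit "r" constraint
// triggers "asm operand type size(8) does not match ... constraint 'r'";
// the 64-bit "l" constraint accepts it.
__device__ void store_u32(unsigned int *ptr, unsigned int value) {
    asm("st.global.wb.b32 [%0], %1;" :: "l"(ptr), "r"(value) : "memory");
}
```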
The "__CUDA_ARCH__ is undefined" error only happens when running nvcc manually; otherwise the output is just about identical to what I see when llamafile-server invokes nvcc.
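To get a manual nvcc run closer to the working build, the flags can be lifted straight from the upstream CMake log below — something like this sketch (paths are from my machine; adjust for yours, and note it forces the 64-bit HostX64\x64 host compiler):

```shell
rem Sketch: mirror the flags the upstream CMake build passes to nvcc
rem (taken from the build log; paths are machine-specific).
"D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin\nvcc.exe" --use-local-env ^
  -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX64\x64" ^
  -x cu -I. --machine 64 --compile -cudart static -use_fast_math ^
  --generate-code=arch=compute_52,code=[compute_52,sm_52] ^
  --generate-code=arch=compute_61,code=[compute_61,sm_61] ^
  --generate-code=arch=compute_70,code=[compute_70,sm_70] ^
  -DGGML_USE_CUBLAS -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 ^
  -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 ^
  ggml-cuda.cu
```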
Upstream build:
cmake .. -DLLAMA_CUBLAS=ON
cmake --build . --config Release
Upstream log
D:\llama-model-data\llama.cpp\llama.cpp-master\build>cmake ..
-- Building for: Visual Studio 17 2022
-- Selecting Windows SDK version 10.0.22621.0 to target Windows 10.0.19045.
-- The C compiler identification is MSVC 19.38.33130.0
-- The CXX compiler identification is MSVC 19.38.33130.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/Hostx64/x64/cl.exe - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/Hostx64/x64/cl.exe - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: C:/msys64/usr/bin/git.exe (found version "2.42.1")
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - not found
-- Found Threads: TRUE
-- CMAKE_SYSTEM_PROCESSOR: AMD64
-- CMAKE_GENERATOR_PLATFORM:
-- x86 detected
-- Performing Test HAS_AVX_1
-- Performing Test HAS_AVX_1 - Success
-- Performing Test HAS_AVX2_1
-- Performing Test HAS_AVX2_1 - Success
-- Performing Test HAS_FMA_1
-- Performing Test HAS_FMA_1 - Success
-- Performing Test HAS_AVX512_1
-- Performing Test HAS_AVX512_1 - Failed
-- Performing Test HAS_AVX512_2
-- Performing Test HAS_AVX512_2 - Failed
CMake Warning at common/CMakeLists.txt:24 (message):
Git repository not found; to enable automatic generation of build info,
make sure Git is installed and the project is a Git repository.
-- Configuring done (14.3s)
-- Generating done (0.6s)
-- Build files have been written to: D:/llama-model-data/llama.cpp/llama.cpp-master/build
D:\llama-model-data\llama.cpp\llama.cpp-master\build>cmake .. -DLLAMA_CUBLAS=ON
-- Selecting Windows SDK version 10.0.22621.0 to target Windows 10.0.19045.
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
-- Found CUDAToolkit: D:/NVIDIA GPU Computing Toolkit/CUDA/v12.3/include (found version "12.3.103")
-- cuBLAS found
-- The CUDA compiler identification is NVIDIA 12.3.103
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: D:/NVIDIA GPU Computing Toolkit/CUDA/v12.3/bin/nvcc.exe - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Using CUDA architectures: 52;61;70
-- CMAKE_SYSTEM_PROCESSOR: AMD64
-- CMAKE_GENERATOR_PLATFORM:
-- x86 detected
CMake Warning at common/CMakeLists.txt:24 (message):
Git repository not found; to enable automatic generation of build info,
make sure Git is installed and the project is a Git repository.
-- Configuring done (18.0s)
-- Generating done (2.1s)
-- Build files have been written to: D:/llama-model-data/llama.cpp/llama.cpp-master/build
D:\llama-model-data\llama.cpp\llama.cpp-master\build>cmake --build . --config Release
MSBuild version 17.8.3+195e7f5a3 for .NET Framework
Checking Build System
Generating build details from Git
-- Found Git: C:/msys64/usr/bin/git.exe (found version "2.42.1")
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/common/CMakeLists.txt
build-info.cpp
build_info.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\common\build_info.dir\Release\build_info.lib
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/CMakeLists.txt
Compiling CUDA source file ..\ggml-cuda.cu...
D:\llama-model-data\llama.cpp\llama.cpp-master\build>"D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin\nvcc.exe" --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX64\x64" -x cu -I"D:\llama-model-data\llama.cpp\llama.cpp-master\." -I"D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include" -I"D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include" --keep-dir x64\Release -use_fast_math -maxrregcount=0 --machine 64 --compile -cudart static --generate-code=arch=compute_52,code=[compute_52,sm_52] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=compute_70,code=[compute_70,sm_70] -Xcompiler="/EHsc -Ob2" -D_WINDOWS -DNDEBUG -DGGML_USE_CUBLAS -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -D_CRT_SECURE_NO_WARNINGS -D_XOPEN_SOURCE=600 -D"CMAKE_INTDIR=\"Release\"" -D_MBCS -DWIN32 -D_WINDOWS -DNDEBUG -DGGML_USE_CUBLAS -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -D_CRT_SECURE_NO_WARNINGS -D_XOPEN_SOURCE=600 -D"CMAKE_INTDIR=\"Release\"" -Xcompiler "/EHsc /W3 /nologo /O2 /FS /MD /GR" -Xcompiler "/Fdggml.dir\Release\ggml.pdb" -o ggml.dir\Release\ggml-cuda.obj "D:\llama-model-data\llama.cpp\llama.cpp-master\ggml-cuda.cu"
ggml-cuda.cu
tmpxft_00002f9c_00000000-7_ggml-cuda.compute_70.cudafe1.cpp
ggml.c
ggml-alloc.c
ggml-backend.c
ggml-quants.c
D:\llama-model-data\llama.cpp\llama.cpp-master\ggml-backend.c(875,21): warning C4477: 'fprintf' : format string '%lu' requires an argument of type 'unsigned long', but variadic argument 1 has type 'unsigned __int64' [D:\llama-model-data\llama.cpp\llama.cpp-master\build\ggml.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\ggml-backend.c(875,21):
consider using '%llu' in the format string
D:\llama-model-data\llama.cpp\llama.cpp-master\ggml-backend.c(875,21):
consider using '%Iu' in the format string
D:\llama-model-data\llama.cpp\llama.cpp-master\ggml-backend.c(875,21):
consider using '%I64u' in the format string
D:\llama-model-data\llama.cpp\llama.cpp-master\ggml-quants.c(627,26): warning C4244: '=': conversion from 'float' to 'int8_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\ggml.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\ggml-quants.c(845,36): warning C4244: '=': conversion from 'float' to 'int8_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\ggml.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\ggml-quants.c(846,36): warning C4244: '=': conversion from 'float' to 'int8_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\ggml.vcxproj]
Generating Code...
ggml.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\ggml.dir\Release\ggml.lib
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/CMakeLists.txt
llama.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\llama.cpp(1207,31): warning C4305: 'initializing': truncation from 'double' to 'float' [D:\llama-model-data\llama.cpp\llama.cpp-master\build\llama.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\llama.cpp(2498,69): warning C4566: character represented by universal-character-name '\u010A' cannot be represented in the current code page (1252) [D:\llama-model-data\llama.cpp\llama.cpp-master\build\llama.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\llama.cpp(9814,28): warning C4146: unary minus operator applied to unsigned type, result still unsigned [D:\llama-model-data\llama.cpp\llama.cpp-master\build\llama.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\llama.cpp(9844,28): warning C4146: unary minus operator applied to unsigned type, result still unsigned [D:\llama-model-data\llama.cpp\llama.cpp-master\build\llama.vcxproj]
llama.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\Release\llama.lib
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/llava/CMakeLists.txt
llava.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\llava.cpp(32,24): warning C4244: 'initializing': conversion from 'double' to 'float', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
clip.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(251,20): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(465,9): warning C4297: 'clip_model_load': function assumed not to throw an exception but does [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(465,9):
__declspec(nothrow), throw(), noexcept(true), or noexcept was specified on the function
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(714,46): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(817,44): warning C4244: 'initializing': conversion from 'const _Ty' to 'uint8_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(817,44): warning C4244: with [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(817,44): warning C4244: [ [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(817,44): warning C4244: _Ty=float [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(817,44): warning C4244: ] [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(817,34): warning C4244: 'initializing': conversion from 'const _Ty' to 'const uint8_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(817,34): warning C4244: with [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(817,34): warning C4244: [ [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(817,34): warning C4244: _Ty=float [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(817,34): warning C4244: ] [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(855,20): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(999,88): warning C4244: 'argument': conversion from 'int64_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(999,71): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(1002,88): warning C4244: 'argument': conversion from 'int64_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(1002,71): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(1005,88): warning C4244: 'argument': conversion from 'int64_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(1005,71): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(1008,88): warning C4244: 'argument': conversion from 'int64_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(1008,71): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(1011,88): warning C4244: 'argument': conversion from 'int64_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(1011,71): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\clip.cpp(1073,42): warning C4244: 'return': conversion from 'int64_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.vcxproj]
Generating Code...
llava.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava.dir\Release\llava.lib
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/common/CMakeLists.txt
common.cpp
sampling.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\common\sampling.cpp(75,45): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\common\common.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\common\sampling.cpp(75,20): warning C4267: 'initializing': conversion from 'size_t' to 'const int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\common\common.vcxproj]
console.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\common\console.cpp(253,30): warning C4267: 'initializing': conversion from 'size_t' to 'DWORD', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\common\common.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\common\console.cpp(407,28): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\common\common.vcxproj]
grammar-parser.cpp
train.cpp
Generating Code...
D:\llama-model-data\llama.cpp\llama.cpp-master\common\common.cpp(887): warning C4715: 'gpt_random_prompt': not all control paths return a value [D:\llama-model-data\llama.cpp\llama.cpp-master\build\common\common.vcxproj]
common.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\common\Release\common.lib
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/baby-llama/CMakeLists.txt
baby-llama.cpp
baby-llama.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\baby-llama.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/batched/CMakeLists.txt
batched.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\batched\batched.cpp(72,45): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\batched\batched.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\batched\batched.cpp(72,24): warning C4267: 'initializing': conversion from 'size_t' to 'const int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\batched\batched.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\batched\batched.cpp(114,50): warning C4267: 'argument': conversion from 'size_t' to 'int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\batched\batched.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\batched\batched.cpp(118,48): warning C4267: 'argument': conversion from 'size_t' to 'llama_pos', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\batched\batched.vcxproj]
batched.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\batched.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/batched-bench/CMakeLists.txt
batched-bench.cpp
batched-bench.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\batched-bench.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/beam-search/CMakeLists.txt
beam-search.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\beam-search\beam-search.cpp(163,83): warning C4267: 'argument': conversion from 'size_t' to 'int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\beam-search\beam-search.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\beam-search\beam-search.cpp(168,31): warning C4267: '+=': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\beam-search\beam-search.vcxproj]
beam-search.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\beam-search.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/benchmark/CMakeLists.txt
benchmark-matmult.cpp
benchmark.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\benchmark.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/convert-llama2c-to-ggml/CMakeLists.txt
convert-llama2c-to-ggml.cpp
convert-llama2c-to-ggml.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\convert-llama2c-to-ggml.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/embedding/CMakeLists.txt
embedding.cpp
embedding.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\embedding.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/export-lora/CMakeLists.txt
export-lora.cpp
export-lora.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\export-lora.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/finetune/CMakeLists.txt
finetune.cpp
finetune.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\finetune.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/CMakeLists.txt
ggml_static.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\Release\ggml_static.lib
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/infill/CMakeLists.txt
infill.cpp
infill.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\infill.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/llama-bench/CMakeLists.txt
llama-bench.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llama-bench\llama-bench.cpp(72,13): warning C4244: 'initializing': conversion from 'double' to 'T', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llama-bench\llama-bench.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llama-bench\llama-bench.cpp(72,13): warning C4244: with [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llama-bench\llama-bench.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llama-bench\llama-bench.cpp(72,13): warning C4244: [ [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llama-bench\llama-bench.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llama-bench\llama-bench.cpp(72,13): warning C4244: T=uint64_t [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llama-bench\llama-bench.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llama-bench\llama-bench.cpp(72,13): warning C4244: ] [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llama-bench\llama-bench.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llama-bench\llama-bench.cpp(72,13):
the template instantiation context (the oldest one first) is
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llama-bench\llama-bench.cpp(531,18):
see reference to function template instantiation 'T stdev<uint64_t>(const std::vector<uint64_t,std::allocator<uint64_t>> &)' being compiled
with
[
T=uint64_t
]
llama-bench.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\llama-bench.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/llava/CMakeLists.txt
llava-cli.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\llava\llava-cli.cpp(150,105): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\llava-cli.vcxproj]
llava-cli.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\llava-cli.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/llava/CMakeLists.txt
llava_static.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\llava\Release\llava_static.lib
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/lookahead/CMakeLists.txt
lookahead.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\lookahead\lookahead.cpp(91,33): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\lookahead\lookahead.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\lookahead\lookahead.cpp(91,23): warning C4267: 'initializing': conversion from 'size_t' to 'const int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\lookahead\lookahead.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\lookahead\lookahead.cpp(108,16): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\lookahead\lookahead.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\lookahead\lookahead.cpp(365,129): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\lookahead\lookahead.vcxproj]
lookahead.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\lookahead.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/main/CMakeLists.txt
main.cpp
main.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\main.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/parallel/CMakeLists.txt
parallel.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\parallel\parallel.cpp(159,21): warning C4267: '=': conversion from 'size_t' to 'int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\parallel\parallel.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\parallel\parallel.cpp(165,55): warning C4267: 'initializing': conversion from 'size_t' to 'int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\parallel\parallel.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\parallel\parallel.cpp(165,35): warning C4267: 'initializing': conversion from 'size_t' to 'const int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\parallel\parallel.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\parallel\parallel.cpp(257,68): warning C4267: 'argument': conversion from 'size_t' to 'llama_pos', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\parallel\parallel.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\parallel\parallel.cpp(265,58): warning C4267: '=': conversion from 'size_t' to 'int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\parallel\parallel.vcxproj]
parallel.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\parallel.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/perplexity/CMakeLists.txt
perplexity.cpp
perplexity.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\perplexity.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/pocs/vdot/CMakeLists.txt
q8dot.cpp
q8dot.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\q8dot.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/quantize/CMakeLists.txt
quantize.cpp
quantize.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\quantize.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/quantize-stats/CMakeLists.txt
quantize-stats.cpp
quantize-stats.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\quantize-stats.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/save-load-state/CMakeLists.txt
save-load-state.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\save-load-state\save-load-state.cpp(42,69): warning C4267: 'argument': conversion from 'size_t' to 'int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\save-load-state\save-load-state.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\save-load-state\save-load-state.cpp(43,26): warning C4267: '+=': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\save-load-state\save-load-state.vcxproj]
save-load-state.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\save-load-state.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/server/CMakeLists.txt
server.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(88,16): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(102,52): warning C4267: '=': conversion from 'size_t' to 'uint8_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(126,48): warning C4267: '=': conversion from 'size_t' to 'uint8_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(818,49): warning C4267: '=': conversion from 'size_t' to 'int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(822,93): warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(869,67): warning C4101: 'e': unreferenced local variable [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(924,67): warning C4267: 'argument': conversion from 'size_t' to 'llama_pos', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(997,53): warning C4267: '-=': conversion from 'size_t' to 'int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(1472,26): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(1652,71): warning C4267: 'argument': conversion from 'size_t' to 'llama_pos', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(1714,64): warning C4267: '=': conversion from 'size_t' to 'int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(1741,68): warning C4267: '=': conversion from 'size_t' to 'int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(1760,50): warning C4267: '=': conversion from 'size_t' to 'int32_t', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(1768,78): warning C4267: 'argument': conversion from 'size_t' to 'llama_pos', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\server\server.cpp(1791,96): warning C4267: 'argument': conversion from 'size_t' to 'llama_pos', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\server\server.vcxproj]
server.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\server.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/simple/CMakeLists.txt
simple.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\simple\simple.cpp(71,45): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\simple\simple.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\simple\simple.cpp(71,24): warning C4267: 'initializing': conversion from 'size_t' to 'const int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\simple\simple.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\simple\simple.cpp(99,48): warning C4267: 'argument': conversion from 'size_t' to 'llama_pos', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\simple\simple.vcxproj]
simple.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\simple.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/speculative/CMakeLists.txt
speculative.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\speculative\speculative.cpp(130,33): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\speculative\speculative.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\speculative\speculative.cpp(130,23): warning C4267: 'initializing': conversion from 'size_t' to 'const int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\speculative\speculative.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\speculative\speculative.cpp(151,20): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\speculative\speculative.vcxproj]
D:\llama-model-data\llama.cpp\llama.cpp-master\examples\speculative\speculative.cpp(152,20): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data [D:\llama-model-data\llama.cpp\llama.cpp-master\build\examples\speculative\speculative.vcxproj]
speculative.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\speculative.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-c.c
test-c.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-c.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-grad0.cpp
test-grad0.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-grad0.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-grammar-parser.cpp
test-grammar-parser.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-grammar-parser.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-llama-grammar.cpp
D:\llama-model-data\llama.cpp\llama.cpp-master\llama.cpp(1207,31): warning C4305: 'initializing': truncation from 'double' to 'float' [D:\llama-model-data\llama.cpp\llama.cpp-master\build\tests\test-llama-grammar.vcxproj]
(compiling source file '../../tests/test-llama-grammar.cpp')
D:\llama-model-data\llama.cpp\llama.cpp-master\llama.cpp(2498,69): warning C4566: character represented by universal-character-name '\u010A' cannot be represented in the current code page (1252) [D:\llama-model-data\llama.cpp\llama.cpp-master\build\tests\test-llama-grammar.vcxproj]
(compiling source file '../../tests/test-llama-grammar.cpp')
D:\llama-model-data\llama.cpp\llama.cpp-master\llama.cpp(9814,28): warning C4146: unary minus operator applied to unsigned type, result still unsigned [D:\llama-model-data\llama.cpp\llama.cpp-master\build\tests\test-llama-grammar.vcxproj]
(compiling source file '../../tests/test-llama-grammar.cpp')
D:\llama-model-data\llama.cpp\llama.cpp-master\llama.cpp(9844,28): warning C4146: unary minus operator applied to unsigned type, result still unsigned [D:\llama-model-data\llama.cpp\llama.cpp-master\build\tests\test-llama-grammar.vcxproj]
(compiling source file '../../tests/test-llama-grammar.cpp')
test-llama-grammar.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-llama-grammar.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-quantize-fns.cpp
test-quantize-fns.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-quantize-fns.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-quantize-perf.cpp
test-quantize-perf.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-quantize-perf.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-rope.cpp
test-rope.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-rope.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-sampling.cpp
test-sampling.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-sampling.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-tokenizer-0-falcon.cpp
test-tokenizer-0-falcon.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-tokenizer-0-falcon.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-tokenizer-0-llama.cpp
test-tokenizer-0-llama.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-tokenizer-0-llama.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-tokenizer-1-bpe.cpp
test-tokenizer-1-bpe.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-tokenizer-1-bpe.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/tests/CMakeLists.txt
test-tokenizer-1-llama.cpp
test-tokenizer-1-llama.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\test-tokenizer-1-llama.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/tokenize/CMakeLists.txt
tokenize.cpp
tokenize.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\tokenize.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/examples/train-text-from-scratch/CMakeLists.txt
train-text-from-scratch.cpp
train-text-from-scratch.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\train-text-from-scratch.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/pocs/vdot/CMakeLists.txt
vdot.cpp
vdot.vcxproj -> D:\llama-model-data\llama.cpp\llama.cpp-master\build\bin\Release\vdot.exe
Building Custom Rule D:/llama-model-data/llama.cpp/llama.cpp-master/CMakeLists.txt
Below is the full nvcc command line excerpted from the upstream build's log; I suspect the absence of some of these parameters is what breaks llamafile's invocation:
D:\llama-model-data\llama.cpp\llama.cpp-master\build>"D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin\nvcc.exe" --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX64\x64" -x cu -I"D:\llama-model-data\llama.cpp\llama.cpp-master." -I"D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include" -I"D:\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include" --keep-dir x64\Release -use_fast_math -maxrregcount=0 --machine 64 --compile -cudart static --generate-code=arch=compute_52,code=[compute_52,sm_52] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=compute_70,code=[compute_70,sm_70] -Xcompiler="/EHsc -Ob2" -D_WINDOWS -DNDEBUG -DGGML_USE_CUBLAS -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -D_CRT_SECURE_NO_WARNINGS -D_XOPEN_SOURCE=600 -D"CMAKE_INTDIR="Release"" -D_MBCS -DWIN32 -D_WINDOWS -DNDEBUG -DGGML_USE_CUBLAS -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1 -DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128 -D_CRT_SECURE_NO_WARNINGS -D_XOPEN_SOURCE=600 -D"CMAKE_INTDIR="Release"" -Xcompiler "/EHsc /W3 /nologo /O2 /FS /MD /GR" -Xcompiler "/Fdggml.dir\Release\ggml.pdb" -o ggml.dir\Release\ggml-cuda.obj "D:\llama-model-data\llama.cpp\llama.cpp-master\ggml-cuda.cu"
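To make the difference concrete, here is a hedged sketch that compares the option sets of the two nvcc invocations quoted in this thread (the cmake build's command above versus llamafile's command). The flag sets below are hand-transcribed subsets of the quoted commands, not parsed output, so treat the result as illustrative only:

```python
# Illustrative subset of options from the cmake build's nvcc command quoted above.
cmake_flags = {
    "--use-local-env", "-ccbin", "--keep-dir", "-use_fast_math",
    "-maxrregcount", "--machine", "--compile", "-cudart",
    "--generate-code", "-Xcompiler",
}

# Illustrative subset of options from llamafile's nvcc command (quoted in the
# reply below this comment in the thread).
llamafile_flags = {
    "-arch", "--shared", "--forward-unknown-to-host-compiler",
    "-use_fast_math", "--compiler-options",
}

# Options the cmake build passes that llamafile's invocation omits.
missing = sorted(cmake_flags - llamafile_flags)
print(missing)
```

Note that `-ccbin`, which pins the host compiler, is among the omitted options; that turns out to matter later in the thread.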
@CeruleanSky I'm using the same version of CUDA as you, on Windows, with a slightly different MSVC revision, and things work fine for me. The nvcc command run by llama.cpp's cmake build config is very different from the one its makefile config uses, and llamafile is based on the makefile config. The nvcc command that llamafile runs on my machine is:
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\bin\nvcc.exe"
-arch=native --shared --forward-unknown-to-host-compiler -use_fast_math
--compiler-options "-fPIC -O3 -march=native -mtune=native" -DNDEBUG
-DGGML_BUILD=1 -DGGML_SHARED=1 -DGGML_CUDA_DMMV_X=32 -DGGML_CUDA_MMV_Y=1
-DK_QUANTS_PER_ITERATION=2 -DGGML_CUDA_PEER_MAX_BATCH_SIZE=128
-DGGML_USE_CUBLAS -o C:\Users\jtunn\.llamafile\ggml-cuda.dll
C:\Users\jtunn\.llamafile\ggml-cuda.cu -lcublas
I won't make changes to the build config unless I understand why they need to be made. Could you please troubleshoot things further and tell me how specifically the above command needs to be changed so that it'll work on your machine? Thanks!
I think the readme should change "MSVC x64 native command prompt"
to "x64 Native Tools Command Prompt for VS",
since what is likely happening is that we both just opened "Developer Command Prompt for VS"
and expected it to work.
The clue came from the error below, which appeared when I started adding parameters from the command above (specifically -ccbin):
nvcc fatal : cl.exe in PATH (C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/HostX86/x86) is different than one specified with -ccbin (C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/HostX64/x64)
This led me to discover that when nvcc is run from the Developer Command Prompt for VS 2022,
the 32-bit compiler at C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/HostX86/x86
is used. Instead, run llamafile-server from the x64 Native Tools Command Prompt for VS 2022
to ensure that C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\bin\HostX64\x64
is used.
I believe the reason the cmake build works is that it modifies the PATH variable and then explicitly selects the correct compiler using -ccbin.
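The failure mode described above comes down to a path comparison: nvcc refuses to proceed when the first cl.exe it finds on PATH lives in a different directory than the one named by -ccbin. A minimal sketch of that check (an assumption for illustration, not nvcc's actual source) using Python's `ntpath` for Windows-style path normalization:

```python
import ntpath

def nvcc_compiler_conflict(path_cl_dir: str, ccbin_dir: str) -> bool:
    """True when the directory of the cl.exe found on PATH differs from the
    -ccbin directory after Windows-style normalization (case-insensitive,
    '/' and '\\' treated alike) -- the "nvcc fatal" condition quoted above."""
    norm = lambda p: ntpath.normcase(ntpath.normpath(p))
    return norm(path_cl_dir) != norm(ccbin_dir)

# The exact pair from the error message: the Developer Command Prompt put the
# 32-bit HostX86/x86 cl.exe on PATH, while -ccbin asked for HostX64/x64.
path_dir  = r"C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/HostX86/x86"
ccbin_dir = r"C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/HostX64/x64"
print(nvcc_compiler_conflict(path_dir, ccbin_dir))
```

Opening the x64 Native Tools prompt makes the first cl.exe on PATH the HostX64\x64 one, so this comparison passes and the fatal error goes away.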
@savant117 I have the same question; have you found a solution?
Windows 11, CUDA 12.3