Comments (7)
Could you please take a look at the file structure of your output directory (for example, x64/debug/net6.0/runtimes)?
Then please use NativeLibraryConfig.Instance.WithLogs()
to see the details during backend selection.
from llamasharp.
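As a minimal sketch of the suggestion above (the model path is a placeholder, not from the original thread): the config call has to run before anything else in LLamaSharp, because the native library is loaded lazily on first use.

```csharp
using LLama;
using LLama.Common;
using LLama.Native;

// Enable native-loading logs; this must happen before any other
// LLamaSharp call, since the native backend is loaded lazily.
NativeLibraryConfig.Instance.WithLogs();

// "model.gguf" is a placeholder path for illustration.
var parameters = new ModelParams("model.gguf");
using var weights = LLamaWeights.LoadFromFile(parameters);
```

With logging enabled, the `[LLamaSharp Native]` lines shown in the tests below are printed during backend selection.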
Test 1: only CPU nuget
Time taken: 29 secs
<PackageReference Include="LLamaSharp" />
<PackageReference Include="LLamaSharp.Backend.Cpu" />
[LLamaSharp Native] [Info] NativeLibraryConfig Description:
- Path:
- PreferCuda: True
- PreferredAvxLevel: AVX2
- AllowFallback: True
- SkipCheck: False
- Logging: True
- SearchDirectories and Priorities: { ./ }
[LLamaSharp Native] [Info] Detected OS Platform: WINDOWS
[LLamaSharp Native] [Info] Detected cuda major version 12.
[LLamaSharp Native] [Info] Tried to load runtimes/win-x64/native/cuda12/libllama.dll but failed.
[LLamaSharp Native] [Info] ./runtimes/win-x64/native/avx2/libllama.dll is selected and loaded successfully.
Test 2: only CUDA 12 nuget
Time taken: 22 secs
<PackageReference Include="LLamaSharp" />
<PackageReference Include="LLamaSharp.Backend.Cuda12" />
[LLamaSharp Native] [Info] NativeLibraryConfig Description:
- Path:
- PreferCuda: True
- PreferredAvxLevel: AVX2
- AllowFallback: True
- SkipCheck: False
- Logging: True
- SearchDirectories and Priorities: { ./ }
[LLamaSharp Native] [Info] Detected OS Platform: WINDOWS
[LLamaSharp Native] [Info] Detected cuda major version 12.
[LLamaSharp Native] [Info] ./runtimes/win-x64/native/cuda12/libllama.dll is selected and loaded successfully.
Under "native" there is only one folder, with the DLL inside.
Test 3: both CPU and CUDA 12 nugets
Time taken: 29 secs
<PackageReference Include="LLamaSharp" />
<PackageReference Include="LLamaSharp.Backend.Cpu" />
<PackageReference Include="LLamaSharp.Backend.Cuda12" />
[LLamaSharp Native] [Info] NativeLibraryConfig Description:
- Path:
- PreferCuda: True
- PreferredAvxLevel: AVX2
- AllowFallback: True
- SkipCheck: False
- Logging: True
- SearchDirectories and Priorities: { ./ }
[LLamaSharp Native] [Info] Detected OS Platform: WINDOWS
[LLamaSharp Native] [Info] Detected cuda major version 12.
[LLamaSharp Native] [Info] ./runtimes/win-x64/native/cuda12/libllama.dll is selected and loaded successfully.
FunctionalTests\bin\Debug\net7.0\runtimes:
FunctionalTests\bin\Debug\net7.0\runtimes\win-x64\native:
from llamasharp.
@AsakusaRinne so everything looks OK from the logs, but as you can see, the times taken don't match.
- Test 1: only CPU nuget - Time taken: 29 secs
- Test 2: only CUDA 12 nuget - Time taken: 22 secs
- Test 3: both CPU and CUDA 12 nugets - Time taken: 29 secs
from llamasharp.
If you delete the libllama.dll at the root of the output path, does the time taken then behave as expected? Note that after removing the file, please don't re-compile the project.
from llamasharp.
Also, you could try adding a breakpoint here. From there you can see the exact set of binaries it's going to try (libraryTryLoadOrder) and which one it succeeds on.
from llamasharp.
I've created a new project to run these tests and manually deleted the DLL files. Interestingly, in this new project the performance timings between cuda12 and avx2 are too close to say definitively whether there's an issue. For the time being, I'm trusting the logs. If I can get a machine with a more powerful Nvidia card, I'll run further tests where the performance gap should be more evident.
from llamasharp.
@dluc If needed, you could set GpuLayerCount to a larger number to increase the time gap between the CPU version and the GPU version.
from llamasharp.
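A sketch of that suggestion (the model path and the layer count are placeholder values, not from the thread):

```csharp
using LLama.Common;

// Offloading more layers to the GPU widens the timing gap between
// the CUDA and CPU backends; 32 is an arbitrary example value, and
// "model.gguf" is a placeholder path.
var parameters = new ModelParams("model.gguf")
{
    GpuLayerCount = 32
};
```

With more layers offloaded, a CUDA run should be measurably faster than an AVX2 run, making it easier to confirm which backend was actually loaded.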