Comments (12)
Are you sure you're using the correct version? Each version of LLamaSharp is compatible with one version of llama.cpp - you must use exactly the right version when compiling the binaries!
See the table at the bottom of the readme for the versions: https://github.com/SciSharp/LLamaSharp?tab=readme-ov-file#map-of-llamasharp-and-llamacpp-versions
Thanks for your reply. Yes, I've compiled against the correct commit. The error I receive is:
An error occurred during startup
at LLama.Native.NativeApi.llama_max_devices()
at LLama.Abstractions.TensorSplitsCollection..ctor()
at LLama.Common.ModelParams..ctor(String modelPath)
at PSLLM.CompletionEngine.StartMessageProcessor() in my-app\CompletionEngine.cs:line 266
at PSLLM.CompletionEngine.Start(LLMConfig config) in my-app\CompletionEngine.cs:line 40
at Program.
I've verified that LLamaSharp is pointing to a valid DLL by adding a log callback (the 'Path' parameter shown is correct).
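For reference, that setup looks roughly like this - a minimal sketch, assuming a LLamaSharp version where NativeLibraryConfig exposes WithLibrary and WithLogs (exact method names and signatures vary between releases, and the path is illustrative):

```csharp
using LLama.Native;

// This must run before ANY other LLamaSharp call: the native library is
// loaded lazily on first use and the configuration is locked afterwards.
NativeLibraryConfig.Instance
    .WithLibrary(@"CustomBackends\IntelARC\llama.dll") // illustrative path
    .WithLogs(); // logs which library file the loader actually picks
```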
Is there more to that error dump? I'm hoping for a more specific message at the bottom of the stack trace.
Unfortunately not much; the inner exception is:
LLama.Exceptions.RuntimeError: Failed to load the native library [my-app/bin/Debug/net8.0/CustomBackends/IntelARC/llama.dll] you specified.
at LLama.Native.NativeApi.TryLoadLibraries(LibraryName lib)
at LLama.Native.NativeApi.<>c.b__60_0(String name, Assembly _, Nullable`1 _)
at System.Runtime.InteropServices.NativeLibrary.LoadLibraryCallbackStub(String libraryName, Assembly assembly, Boolean hasDllImportSearchPathFlags, UInt32 dllImportSearchPathFlags)
at LLama.Native.NativeApi.llama_max_devices()
at LLama.Native.NativeApi.llama_empty_call()
at LLama.Native.NativeApi..cctor()
I've hidden the path for anonymity, but other than that it's unchanged.
Ah, unfortunately that's a pretty generic error.
I don't really know anything about SYCL, but from a bit of Googling I think your guess about the setvars script looks likely to be the issue. If you run the script before launching your application, does that not fix the issue?
Maybe you could try using the dotnet CLI in the Intel oneAPI command prompt? Something like dotnet run --project xxx.csproj.
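Something like this, assuming the default oneAPI install location:

```
:: From a plain cmd window (adjust the path to your oneAPI installation):
call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"
dotnet run --project xxx.csproj
```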
I ran into this too; I think it's more an issue with llama.cpp than with LLamaSharp. You need to copy all the dependencies to wherever llama.dll is being loaded from - it's these files:
ggml_shared.dll
libmmd.dll
llava_shared.dll
mkl_core.2.dll
mkl_sycl_blas.4.dll
mkl_tbb_thread.2.dll
pi_win_proxy_loader.dll
svml_dispmd.dll
sycl7.dll
...which are all included in the SYCL package, but then that package doesn't include llama.dll 😂 so you have to build that yourself (if you do, then at least you'll already have all these dependencies on your system!). Here's an issue about including the shared libs in the SYCL package: ggerganov/llama.cpp#7361
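If it helps, the copy step can be a throwaway snippet like the sketch below; both paths are assumptions - point sourceDir at wherever your SYCL build of llama.cpp put its output:

```csharp
using System.IO;

// Copy the SYCL runtime dependencies next to llama.dll so the Windows
// loader can resolve them. Both directories are illustrative.
string sourceDir = @"C:\llama.cpp\build\bin";   // your SYCL build output
string targetDir = @"CustomBackends\IntelARC";  // the folder containing llama.dll

string[] deps =
{
    "ggml_shared.dll", "libmmd.dll", "llava_shared.dll",
    "mkl_core.2.dll", "mkl_sycl_blas.4.dll", "mkl_tbb_thread.2.dll",
    "pi_win_proxy_loader.dll", "svml_dispmd.dll", "sycl7.dll",
};

foreach (var dll in deps)
    File.Copy(Path.Combine(sourceDir, dll), Path.Combine(targetDir, dll), overwrite: true);
```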
I would love a backend nuget for SYCL!
Hi All,
Thank you for all your input. @gfody, I think that was it. Got the built dependencies sitting alongside llama.dll and it seemingly works. The backend loads and I'm able to start inference. However, I have noticed an interesting issue:
Inference is wildly unstable. Unlike the CPU backend, the SYCL-based backend just spits out garbage. Here's a copy of what my Llama 3 8B Instruct model responded with when I prompted it with "Hello, how are you?":
"I gravel ?? enced pm ["$ Equipment secondary initWithStyle O zcze / assistant equality standby Perkins .Is balances opak I or 378 juices Is shel )data"
And another run:
' I """""""""""""""""""""""""""""""""""""""" '
I believe this is something unique to my SYCL backend, though I'm not sure what yet.
@AsakusaRinne & @martindevans, I've added a Process call that runs the Intel oneAPI setvars.bat from within my app before the backend is loaded. I originally thought it wouldn't work, since the script would run in the context of a child process, but it does? I'm not really sure what happens under the hood here, so I don't know, but I will try to do more testing. Would LLamaSharp accept a SYCL NuGet package if we / I are able to create one? Apologies, I'm new here (and to collaborative open-source dev).
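For what it's worth, the usual way to make a child-process script affect the parent is to run it in a child cmd, dump the resulting environment with `set`, and copy each variable back into the current process. A sketch, assuming the default oneAPI install path:

```csharp
using System;
using System.Diagnostics;

var setvars = @"C:\Program Files (x86)\Intel\oneAPI\setvars.bat"; // default install path

// Run the script in a child cmd and print the resulting environment.
var psi = new ProcessStartInfo("cmd.exe", $"/c call \"{setvars}\" && set")
{
    RedirectStandardOutput = true,
    UseShellExecute = false,
};

using var proc = Process.Start(psi)!;
string? line;
while ((line = proc.StandardOutput.ReadLine()) != null)
{
    // Crude parse: banner lines printed by setvars have no '=' and are skipped.
    int eq = line.IndexOf('=');
    if (eq > 0)
        Environment.SetEnvironmentVariable(line[..eq], line[(eq + 1)..]);
}
proc.WaitForExit();
```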
Thank you again
Scratch that, a different model file works. Unsure why that particular model runs fine on CPU / NVIDIA but fails with SYCL.
@gfody @mrtopkid Thank you for your efforts here. :)
I would love a backend nuget for SYCL!
Would LLamaSharp accept a SYCL NuGet package if we / I are able to create one?
We certainly would not reject such a PR as long as it runs stably, and I'd appreciate it if you'd like to contribute. If you would like to add such a backend, here are some tips:
- Please add a new .nuspec file like what was done in #489 (a rough skeleton follows this list).
- Please include both the Windows and Linux library files.
- I'm not sure whether the files @gfody mentioned are always required to be in the same folder as llama.dll, or whether setting an env var is enough. If the latter is the case, please split the SYCL backend into LLamaSharp.Backend.SYCL and LLamaSharp.Backend.SYCL.Deps, because some users may only want the SYCL-version llama.dll.
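For reference, a rough skeleton of what such a .nuspec might look like, modelled loosely on the existing backend packages (the id, paths, and layout here are guesses, not a final spec):

```xml
<?xml version="1.0" encoding="utf-8"?>
<package>
  <metadata>
    <id>LLamaSharp.Backend.SYCL</id>
    <version>$version$</version>
    <description>SYCL native libraries for LLamaSharp (hypothetical sketch).</description>
    <!-- ...authors, license, icon, etc., copied from the other backend nuspecs... -->
  </metadata>
  <files>
    <!-- Hypothetical layout: one native library per runtime identifier -->
    <file src="runtimes/win-x64/native/sycl/llama.dll" target="runtimes\win-x64\native\sycl" />
    <file src="runtimes/linux-x64/native/sycl/libllama.so" target="runtimes\linux-x64\native\sycl" />
  </files>
</package>
```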
the SYCL-based backend just spits out garbage...
Could you please test the same model with a llama.cpp main.exe compiled with SYCL? I'm not sure if it's a LLamaSharp issue.
Note that some gguf files on Hugging Face are outdated. For example, a gguf file generated by llama.cpp 3 months ago probably won't work well with the latest llama.cpp version. LLamaSharp completed a binary update 2 weeks ago, so please avoid using gguf files uploaded several months ago.
Scratch that, a different model file works. Unsure why that particular model runs fine on CPU / NVIDIA but fails with SYCL.
@mrtopkid I've noticed this as well. Increasing ContextSize and GpuLayerCount helps (I'm using 4096, and 80 for llava). I think Intel ARCs are just not getting as much testing, considering all the hoops we had to jump through; maybe that will change once there's a LLamaSharp.Backend.SYCL nuget!
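In code those are just two properties on ModelParams, e.g. (the model path is illustrative):

```csharp
using LLama.Common;

// The settings from the comment above; the model file name is illustrative.
var parameters = new ModelParams(@"models\llava-v1.6-mistral-7b.Q4_K_M.gguf")
{
    ContextSize = 4096,  // a larger context seemed more stable on ARC
    GpuLayerCount = 80,  // offload (up to) all layers to the GPU
};
```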
@AsakusaRinne I've tested this, and you can indeed just put the SYCL dependencies on your PATH; they don't necessarily have to be in the same location as llama.dll. Presumably this could have unintended consequences, which may be why the oneAPI Toolkit doesn't just put these on our PATH, and why llama.cpp bundles them with their SYCL release package. I don't know.
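That is, something along these lines before the backend is loaded - a sketch; the directory is an assumption, use wherever your deps actually live:

```csharp
using System;

// Prepend the folder containing the SYCL runtime DLLs to PATH so the
// Windows loader can find them when llama.dll is loaded. Illustrative path.
var syclBin = @"C:\Program Files (x86)\Intel\oneAPI\compiler\latest\bin";
Environment.SetEnvironmentVariable(
    "PATH", syclBin + ";" + Environment.GetEnvironmentVariable("PATH"));
```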