Comments (11)
It's possible to run the GitHub action on a server with a GPU to compile CUDA. However, I only have my personal computer. I could help compile the CUDA version and test it manually until there's a solution.
from llamasharp.
I've managed to work out the Linux CUDA build; it turns out it was basically the same as the Windows one. Fortunately we don't need a GPU for compiling CUDA, although it might be a good idea to run some CI on a GPU server in the future.
Someone gave me some tips on how to build for Metal, so I will investigate adding that to the GitHub action. But that probably won't be until next weekend.
In my fork I have a different workflow and a self-hosted runner to pass the macOS tests on my computer.
WIP: https://github.com/martindevans/LLamaSharp/tree/cron_job
I'm late to see this good idea. Is there any help needed for the binary release? BTW, I remember that GitHub Actions only has Intel Mac runners, not M1/M2 Macs. Is there already a fix for that?
I'm late to see this good idea. Is there any help needed for the binary release?
I've done most of the work for compiling the binaries in the new GitHub action (currently only manually triggered). It pulls in the master branch of llama.cpp and builds for:
- Windows
- Windows+CUBLAS
- Linux
- MacOS (ARM64)
- MacOS (x86_64)
I have not yet done a Metal build for macOS. I have no idea how that works, so any help there would be really welcome.
BTW, I remember that GitHub Actions only has Intel Mac runners, not M1/M2 Macs. Is there already a fix for that?
Unfortunately not, Mac runners are still all x86_64. I've set up the new action to cross-compile, so it can create the ARM64 binaries; it just can't run them. That does mean we still have no way to test ARM64, though :(
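For reference, the builds above might be configured along these lines with CMake. This is a sketch, not the actual workflow: `LLAMA_CUBLAS` and `LLAMA_METAL` were the llama.cpp option names around that time and may have changed since.

```shell
# Sketch: configure llama.cpp builds for the targets listed above.
# Flag names reflect the llama.cpp options of that era.

# Windows/Linux + CUDA (cuBLAS)
cmake -B build -DBUILD_SHARED_LIBS=ON -DLLAMA_CUBLAS=ON

# macOS with Metal
cmake -B build -DBUILD_SHARED_LIBS=ON -DLLAMA_METAL=ON

# Cross-compile ARM64 binaries on an x86_64 Mac runner
# (the runner can produce these binaries but cannot execute them)
cmake -B build -DBUILD_SHARED_LIBS=ON -DCMAKE_OSX_ARCHITECTURES=arm64

cmake --build build --config Release
```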
I just noticed the GitHub action doesn't include the CUDA .so files either, so that needs doing as well.
@martindevans I posted about that on the issue covering the flags used to compile on macOS with and without Metal: #38 (comment)
@martindevans Once we are able to generate all the binaries from Actions, do we agree that these binaries should be committed to master/LLama/runtime on a regular schedule (daily, for example)? That way we can be sure we stay aligned with llama.cpp and that the binaries are generated without modifications.
Yeah, that's basically the idea. Rather than committing directly to master, I was thinking it would occasionally (e.g. weekly) open a PR with the new binaries. The CI would then run on that PR and tell us if anything has broken (since llama.cpp has such an unstable interface, I'd expect human intervention to be required most times).
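A minimal sketch of the decision logic such a scheduled job might use. The function names and the idea of a pinned-commit file are assumptions for illustration, not the actual LLamaSharp setup:

```python
# Sketch: decide whether a weekly binary-update PR is needed.
# Assumes the repo records the llama.cpp commit it was last built
# against; names here are hypothetical.

def needs_update(pinned_sha: str, upstream_sha: str) -> bool:
    """True when upstream llama.cpp has moved past the pinned commit."""
    return pinned_sha.strip() != upstream_sha.strip()

def pr_branch_name(upstream_sha: str) -> str:
    """Branch name for the automated PR, e.g. 'update-binaries/1a2b3c4'."""
    return f"update-binaries/{upstream_sha[:7]}"
```

The real job would fetch the upstream HEAD commit (e.g. via the GitHub API), run this check, and open the PR so CI can flag breakage before anything reaches master.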
I just noticed that one of the recent updates (https://github.com/ggerganov/llama.cpp/releases/tag/b1250) has enabled building the shared libs for all Windows builds. Maybe the GitHub action for building binaries should be updated to just download those directly?
That would save a lot of time in the build process (compiling Windows+CUDA takes about 20 minutes).
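GitHub release assets follow a standard download-URL pattern, so the action could fetch a prebuilt zip directly. A small sketch; the asset filename below is illustrative, not necessarily the exact name in the b1250 release:

```python
def release_asset_url(tag: str, asset: str) -> str:
    """Download URL for a named asset of a llama.cpp GitHub release."""
    return (
        "https://github.com/ggerganov/llama.cpp"
        f"/releases/download/{tag}/{asset}"
    )

# Example (hypothetical asset name):
url = release_asset_url("b1250", "llama-b1250-bin-win-cublas-x64.zip")
```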