
Comments (14)

m0nsky commented on August 27, 2024 2

Yes, indeed. I don't know how much or how fast the binaries on the llama.cpp side will grow, but it sounds like a matter of time until we run into the same issue. If the transition to the LLamaSharp-Binaries workflow goes smoothly, a temporary solution could probably be skipped entirely.

I think it's cleaner to split the binaries to LLamaSharp-Binaries, to avoid confusion in the LLamaSharp releases section.

So, when doing a binary update:

  • Update llama.cpp commit id in LLamaSharp-Binaries
  • Run new build action in LLamaSharp-Binaries
  • Publish new release in LLamaSharp-Binaries
  • Update LLamaSharp-Binaries release id in LLamaSharp
  • Proceed as usual (code changes, testing)

from llamasharp.

martindevans commented on August 27, 2024 1

Note: until this is resolved, there will be no new binary update :(


martindevans commented on August 27, 2024

I have pushed up a test of one potential solution (downloading directly from llama.cpp). You can see it here: https://github.com/SciSharp/LLamaSharp/blob/july-2024-binaries/LLama/LLamaSharp.csproj#L55

From a usability perspective this is pretty nice. Changing versions just requires updating that single LlamaCppReleaseTag property!

However, llama.cpp does not publish any shared objects for Linux!


AsakusaRinne commented on August 27, 2024

In my opinion, the better option is to drop the local binaries from git everywhere except for building the backend packages. It's similar to your option 2, and I'd like to describe it in detail.

All the binaries could be removed from the LLamaSharp git repo. Instead, we use the NuGet backend packages in our example project. Where the binaries live could be flexible, because users are not supposed to touch them. For example, we can put the binaries in another git repo with git-lfs and then reference it as a submodule.

In this case, when a binary update is merged into the master branch, we must publish a new release at once to make the new binaries available on NuGet.

To keep our CI working, we need to update the submodules before running the tests in the workflows. Besides, we need to copy the binaries to the output folder in LLamaSharp.unittest.csproj.

However, I have to say that unit test coverage would be a problem. What I said above assumes that unit tests are only for CI and examples are only for users. But actually, the example project is also responsible for test coverage now, which needs to be improved in the future.

I think this way is clearer for users because they only need to care about the NuGet packages. Any ideas? @martindevans


martindevans commented on August 27, 2024

What do you think of the potential solution I pushed up here? With this idea, the process would be:

  1. Run a build with the GitHub action (same as now)
  2. Create a Release somewhere on GitHub (either in this repo, or a dedicated LLamaSharp-Binaries repo). This is really just acting as file storage.
  3. Change the <LlamaCppReleaseTag>b3289</LlamaCppReleaseTag> to refer to the new release version
  4. msbuild will magically download all of the files into the folders where they are now

The example linked above is downloading directly from llama.cpp, but that isn't an option at the moment (they don't publish Linux shared objects). So it would be modified to download from our own release (e.g. https://github.com/martindevans/LLamaSharp/releases/tag/test-binaries), but the idea is the same.
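
As a rough illustration of the "download from our own release" idea, here is a minimal Python sketch (not the actual msbuild logic in the csproj). It relies on the standard GitHub URL pattern for release assets. The asset name `deps.zip` and the destination folder are hypothetical placeholders; the thread does not say how the files in the release are named or laid out.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import quote


def release_asset_url(owner: str, repo: str, tag: str, asset_name: str) -> str:
    """Build the public download URL for a file attached to a GitHub release."""
    return (
        f"https://github.com/{owner}/{repo}/releases/download/"
        f"{quote(tag, safe='')}/{quote(asset_name, safe='')}"
    )


def download_assets(owner, repo, tag, asset_names, dest_dir):
    """Download each named asset into dest_dir, skipping files already present."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for name in asset_names:
        target = dest / name
        if not target.exists():
            url = release_asset_url(owner, repo, tag, name)
            # Stream the response to disk so large binaries are not held in memory.
            with urllib.request.urlopen(url) as resp, open(target, "wb") as out:
                shutil.copyfileobj(resp, out)
        saved.append(target)
    return saved
```

With this sketch, switching versions only means changing the `tag` argument, e.g. `download_assets("martindevans", "LLamaSharp", "test-binaries", ["deps.zip"], "deps")`, which mirrors the single-property change described above.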


we can put the binaries in another git repo with git-lfs and then reference it as a submodule.

Just a note about git-lfs, unfortunately I don't think we can use it at all. The GitHub free limits on git-lfs are tiny - just 1GiB a month across your whole account! So downloading 5x CUDA binaries would totally exhaust your entire account allocation for the whole month. That's not something I want to risk happening by accident!
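
To make the bandwidth concern concrete, here is a small sketch of the arithmetic. The 1GiB monthly figure comes from the comment above; the per-binary size of 256 MiB is purely a placeholder, since the thread does not give the uncompressed size of each CUDA build.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def full_downloads(quota_bytes: int, download_bytes: int) -> int:
    """Number of complete downloads that fit inside a bandwidth quota."""
    if download_bytes <= 0:
        raise ValueError("download_bytes must be positive")
    return quota_bytes // download_bytes


def quota_fraction(download_bytes: int, count: int, quota_bytes: int) -> float:
    """Fraction of the quota consumed by `count` downloads (above 1.0 means exceeded)."""
    return download_bytes * count / quota_bytes


# Hypothetical size for one CUDA binary set; replace with a measured value.
assumed_binary_size = 256 * MIB
print(full_downloads(1 * GIB, assumed_binary_size))
print(quota_fraction(assumed_binary_size, 5, 1 * GIB))
```

Under that placeholder size, five downloads already exceed the monthly allowance, and every CI run or fresh clone would count against the same quota.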


AsakusaRinne commented on August 27, 2024

What do you think of the potential solution I pushed up here?

It's OK, but I think we should use the NuGet package directly in our example project, so that it is clearer for new users. The only thing we need to do is publish a new NuGet package every time we update the binaries. Since git-lfs is limited on GitHub, downloading from the release is a good option.

This approach raises another problem. To run the GitHub workflows, the unit test project needs to download the new binaries. The binaries are put in release. However, we shouldn't publish a release without passing all the workflows. The only way I can come up with is to delete the release if the workflow fails.


martindevans commented on August 27, 2024

The binaries are put in release. However, we shouldn't publish a release without passing all the workflows. The only way I can come up with is to delete the release if the workflow fails.

I think if we published the releases ourselves we'd basically end up with two types of "release".

  1. There would be the releases we currently have - actual releases with a change in version number, extensive release notes etc etc.
  2. Then there would also be these new "binary only" releases which are not really releases, they're just a way to store files for dev.

It's pretty messy :(

Splitting out the releases to another repo (which exists just for binary releases) might be a way to work around that, but it's a bit of a pain to have multiple repos.


I'm going to go and open an issue on the llama.cpp repo asking about shared objects in their releases. That way we would be able to skip the entire build step, our binaries would be the "official" ones, and we wouldn't have to mess around with any binary-only-releases.

That'll probably be slower than anything we do ourselves here, but it seems like the best overall solution.


martindevans commented on August 27, 2024

Ok I changed my mind, I was typing up the feature request and it would be a colossal increase in the number of binaries they would need to compile for every release. I don't think there's any chance they would do it!


martindevans commented on August 27, 2024

I've created a new release in this repo, just to test what it would look like. It's here: https://github.com/SciSharp/LLamaSharp/releases/tag/test-release-please-ignore. If we decide to go ahead with investigating this approach I'll attach some binaries to it.

If we went with a "binary only" release in this repo, we would do this:

  1. Run a GitHub action to generate new binaries
  2. Make a release with these binaries. It's not marked as the latest release, so the front page still points to 0.13.0
  3. Make necessary code changes to support new binaries, change LlamaCppReleaseTag in csproj to point to the release we just created
  4. Open PR, test on all platforms. Anyone opening this version will automatically download the binaries.
  5. Merge it. Anyone opening the project on master will auto download the new binaries.
  6. Make nuget packages, as normal. Publish a new "proper" release with release notes etc.
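
Step 3 of the list above is the only manual edit to the project file. A minimal sketch of automating it follows, assuming the csproj contains exactly one `LlamaCppReleaseTag` element as in the example earlier in this thread. The helper name `set_release_tag` is made up for illustration and is not part of LLamaSharp.

```python
import re

# Matches the opening tag, the current value, and the closing tag.
TAG_PATTERN = re.compile(r"(<LlamaCppReleaseTag>)(.*?)(</LlamaCppReleaseTag>)")


def set_release_tag(csproj_text: str, new_tag: str) -> str:
    """Return csproj_text with the LlamaCppReleaseTag value replaced by new_tag."""
    updated, count = TAG_PATTERN.subn(
        # A function replacement avoids backslash escapes in new_tag being interpreted.
        lambda m: m.group(1) + new_tag + m.group(3),
        csproj_text,
    )
    if count != 1:
        raise ValueError(
            f"expected exactly one LlamaCppReleaseTag element, found {count}"
        )
    return updated


sample = (
    "<Project><PropertyGroup>"
    "<LlamaCppReleaseTag>b3289</LlamaCppReleaseTag>"
    "</PropertyGroup></Project>"
)
print(set_release_tag(sample, "my-new-binaries-tag"))
```

Refusing to proceed unless exactly one element is found keeps the script from silently doing nothing if the property is ever renamed.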


m0nsky commented on August 27, 2024

Hmm, what about compressing the deps (at the end of the build action in compile.yml) and extracting when building the project? CUDA12 in the llama.cpp repo is ~95MB. I just did a quick test here and the compressed CUDA12 dep archive (containing llama.dll + ggml.dll) resulted in 91MB.


martindevans commented on August 27, 2024

The limit is 100MB and, according to an issue in the llama.cpp repo discussing the size, these binaries are going to grow (support for new GPUs, new kernels, etc.). Given how close we already are to the limit when zipped, that'd be a temporary solution. It is probably the easiest option though.
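
For anyone who wants to repeat the compression test, here is a minimal sketch that zips a set of files and checks the archive against the limit. It is not the real compile.yml step. Reading "100MB" as 100 MiB is an assumption, and the file names in the usage note are only examples taken from the comment above.

```python
import os
import zipfile

# Assumption: the 100MB limit discussed above is interpreted as 100 MiB.
LIMIT_BYTES = 100 * 1024 * 1024


def zip_files(paths, archive_path) -> int:
    """Compress the given files into one zip archive and return its size in bytes."""
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in paths:
            # Store only the file name so the archive has a flat layout.
            zf.write(path, arcname=os.path.basename(path))
    return os.path.getsize(archive_path)


def fits_under_limit(size_bytes: int, limit_bytes: int = LIMIT_BYTES) -> bool:
    """True if an archive of size_bytes is strictly below the limit."""
    return size_bytes < limit_bytes
```

A call such as `fits_under_limit(zip_files(["llama.dll", "ggml.dll"], "cuda12.zip"))` would report whether the archive still fits. Since compiled binaries compress poorly (95MB down to 91MB in the test above), the headroom is small.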


martindevans commented on August 27, 2024

Created a repo here for development: https://github.com/martindevans/LLamaSharpBinaries/releases/tag/1c5eba6f8e62

I'll put together a prototype downloading binaries from here, and will transfer ownership to SciSharp if we go ahead with this approach.


martindevans commented on August 27, 2024

Draft PR here: #833


martindevans commented on August 27, 2024

Released now with 0.14.0

