Comments (6)

EricLBuehler avatar EricLBuehler commented on July 18, 2024

Hi @niranjanakella!

Candle does not support LoRA adapters. candle-lora will not work here either, since you have adapter_model.bin and adapter_config.json, which probably means you trained with PEFT? Mistral.rs does support LoRA adapters from PEFT, though, and you can run them on a GGUF model (we don't have a T5 model yet, but I can add that soon if you are interested). It will merge the adapter weights into the base model at runtime, both to optimize performance and because training is not expected. Is there a reason why you do not want to use weight merging?
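For readers unfamiliar with weight merging, here is a minimal sketch of the general idea in plain Rust, with no Candle or mistral.rs APIs involved. It assumes the standard LoRA formulation, where the adapter holds two low-rank matrices A and B and the merged weight is W' = W + (alpha / r) * B * A. The `Matrix` type and `merge_lora` function are hypothetical names made up for illustration; the thread does not show how mistral.rs actually implements the merge.

```rust
/// Row-major dense matrix; a hypothetical stand-in for a real tensor type.
#[derive(Debug, Clone, PartialEq)]
struct Matrix {
    rows: usize,
    cols: usize,
    data: Vec<f32>,
}

impl Matrix {
    fn new(rows: usize, cols: usize, data: Vec<f32>) -> Self {
        assert_eq!(data.len(), rows * cols, "shape does not match data length");
        Matrix { rows, cols, data }
    }

    /// Naive matrix product: self (m x k) times rhs (k x n).
    fn matmul(&self, rhs: &Matrix) -> Matrix {
        assert_eq!(self.cols, rhs.rows, "inner dimensions must agree");
        let mut out = vec![0.0f32; self.rows * rhs.cols];
        for i in 0..self.rows {
            for k in 0..self.cols {
                let a = self.data[i * self.cols + k];
                for j in 0..rhs.cols {
                    out[i * rhs.cols + j] += a * rhs.data[k * rhs.cols + j];
                }
            }
        }
        Matrix::new(self.rows, rhs.cols, out)
    }
}

/// Fold a LoRA update into the base weight: W' = W + (alpha / r) * B * A.
/// Assumed shapes: W is (out, in), B is (out, r), A is (r, in).
fn merge_lora(w: &Matrix, a: &Matrix, b: &Matrix, alpha: f32) -> Matrix {
    let r = a.rows;
    assert_eq!(b.cols, r, "A and B must share the same rank");
    let scale = alpha / r as f32;
    let delta = b.matmul(a);
    assert_eq!((delta.rows, delta.cols), (w.rows, w.cols), "delta must match W");
    let data: Vec<f32> = w
        .data
        .iter()
        .zip(&delta.data)
        .map(|(base, d)| base + scale * d)
        .collect();
    Matrix::new(w.rows, w.cols, data)
}

fn main() {
    // Toy example: 2x2 identity base weight and a rank-1 adapter.
    let w = Matrix::new(2, 2, vec![1.0, 0.0, 0.0, 1.0]);
    let a = Matrix::new(1, 2, vec![1.0, 2.0]);
    let b = Matrix::new(2, 1, vec![0.5, -0.5]);
    let merged = merge_lora(&w, &a, &b, 2.0);
    println!("{:?}", merged.data); // prints [2.0, 2.0, -1.0, -1.0]
}
```

After merging, inference is a single matmul per layer, with no extra adapter pass. The trade-off is that the adapter can no longer be swapped out without restoring the original weights.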

from candle.

niranjanakella avatar niranjanakella commented on July 18, 2024

@EricLBuehler Yes, exactly! I trained using PEFT, and I want the adapter to be loaded alongside the GGUF model at runtime. T5 support would be highly appreciated, as I am planning to build major applications with T5. Please let me know how soon we can expect T5 support in mistral.rs.

EricLBuehler avatar EricLBuehler commented on July 18, 2024

@niranjanakella I can add it over the weekend. Can you please open an issue on mistral.rs, as this is a non-Candle discussion? Thanks!

EricLBuehler avatar EricLBuehler commented on July 18, 2024

@niranjanakella, would a T5 GGUF model be the best option?

niranjanakella avatar niranjanakella commented on July 18, 2024

@EricLBuehler Yes, the T5 architecture would be best for most downstream encoder-decoder tasks, given that the Flan version of the model is widely used across the industry. Sure, I will open an issue for T5 architecture support in mistral.rs. Awesome.

niranjanakella avatar niranjanakella commented on July 18, 2024

@EricLBuehler I have opened issue #384 in mistral.rs for integrating the T5 architecture. And BTW, T5 is a Seq2Seq language model; it doesn't fall under embedding models.
