
Comments (3)

Facico commented on August 15, 2024

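@greatewei This is not supported yet. Simply adding the weights of several LoRA models together will not give good results.

That said, it should be possible to combine multiple LoRA models the way MoE does. This is a promising architecture, and many researchers are probably already working on it. It is the same idea as AdapterFusion (AdapterFusion: Non-Destructive Task Composition for Transfer Learning), and the principle is simple. The Stable Diffusion community does this quite a lot.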

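If you want to implement it, here are some ideas to consider:
1. Hard MoE: for each sentence, dynamically select which LoRA weights to use.
2. Soft MoE: for each sentence, compute an attention weight for each LoRA, then fuse them.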

These are all interesting ideas, but we do not currently support them.
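
The two ideas listed above (hard MoE and soft MoE) can be sketched roughly as follows. This is a minimal NumPy illustration, not code from Chinese-Vicuna, which does not support this. The router `W_gate`, the mean-pooling over tokens, and all shapes and random weights are assumptions made for the example; the sentence-level gate is also a simplification of the per-token attention used in AdapterFusion.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank, n_experts = 8, 8, 2, 3

# Frozen base weight plus one (A, B) LoRA pair per expert.
# Random stand-ins for illustration, not trained weights.
W = rng.normal(size=(d_out, d_in))
loras = [
    (rng.normal(size=(rank, d_in)), rng.normal(size=(d_out, rank)))
    for _ in range(n_experts)
]
# Hypothetical router: a linear layer that scores each expert
# from the mean-pooled sentence representation.
W_gate = rng.normal(size=(n_experts, d_in))


def softmax(z):
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()


def lora_delta(x, A, B):
    """Low-rank update (B @ A) applied to each token; x is (seq_len, d_in)."""
    return x @ A.T @ B.T


def gate_scores(x):
    """One score per expert for the whole sentence."""
    return W_gate @ x.mean(axis=0)


def hard_moe_forward(x):
    """Hard MoE: route the whole sentence to the single best-scoring LoRA."""
    k = int(np.argmax(gate_scores(x)))
    A, B = loras[k]
    return x @ W.T + lora_delta(x, A, B), k


def soft_moe_forward(x):
    """Soft MoE: fuse all LoRA outputs using softmax weights."""
    weights = softmax(gate_scores(x))
    fused = sum(w * lora_delta(x, A, B) for w, (A, B) in zip(weights, loras))
    return x @ W.T + fused, weights


x = rng.normal(size=(5, d_in))  # one "sentence" of 5 token vectors
y_hard, chosen = hard_moe_forward(x)
y_soft, weights = soft_moe_forward(x)
print("hard MoE picked expert", chosen)
print("soft MoE weights", weights.round(3))
```

Note that the soft variant mixes the outputs of the LoRA branches per input rather than summing the LoRA weight matrices once and for all, which is the naive merge described above as not working well. In practice the router would have to be trained, for example with the base model and the individual LoRA weights frozen.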

from chinese-vicuna.

Facico commented on August 15, 2024

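@greatewei LoRA is not very good at learning new knowledge; see the answer in this issue.

If the small domain involves tasks that LLaMA's pretraining already covered, fine-tuning may help bring out that ability, so I cannot say for certain whether it will solve your small-domain problem.

As for data volume, in our setup more data is not necessarily better. What matters is whether the domain has high-quality instructions, and the instructions also need to be diverse (in InstructGPT, every task has different instructions). For how to write instructions, you can refer to the relevant papers, look at the data we use, or have ChatGPT help you generate similar instructions.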
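
As a rough way to check the instruction diversity mentioned above, one could count how many distinct instructions a small dataset contains. The sketch below assumes Alpaca-style records with `instruction`, `input`, and `output` fields; the field names, the sample records, and the `instruction_diversity` helper are all hypothetical and only illustrate the idea.

```python
from collections import Counter

# Hypothetical Alpaca-style records for a small legal domain.
samples = [
    {"instruction": "Summarize the clause in one sentence.",
     "input": "The tenant shall pay rent monthly.",
     "output": "Rent is due every month."},
    {"instruction": "Summarize the clause in one sentence.",
     "input": "The lease ends after one year.",
     "output": "The lease lasts one year."},
    {"instruction": "Classify the clause as obligation or right.",
     "input": "The tenant may sublet the room.",
     "output": "right"},
    {"instruction": "Rewrite the clause in plain language.",
     "input": "The lessee shall indemnify the lessor.",
     "output": "The tenant covers the landlord's losses."},
]


def instruction_diversity(records):
    """Count distinct instructions (case-insensitive) and report the ratio."""
    counts = Counter(r["instruction"].strip().lower() for r in records)
    return {
        "total": len(records),
        "unique": len(counts),
        "unique_ratio": len(counts) / len(records),
        "most_common": counts.most_common(1)[0],
    }


report = instruction_diversity(samples)
print(report)
```

A low `unique_ratio` suggests the dataset repeats the same few instruction templates. Exact string matching is a crude proxy, since paraphrased instructions count as distinct, so treat the number as a first sanity check rather than a quality measure.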


greatewei commented on August 15, 2024

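Is there any requirement on the amount of data for training a LoRA model? I want to train on data from a small domain, but the dataset will probably be fairly small. I am using BELLE as the base_model so that it understands Chinese. How well would a model trained this way perform, and would it be able to answer questions in that small domain correctly?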

