lora-torch's Issues

Problem modifying the structure of a pre-trained model when using loratorch

Hi, I am new to LoRA and am having trouble with your library.

I have a pre-trained model class with its individual layers defined.

What should I do if I want to use LoRA to fine-tune one of the embeddings?

Following the README, I changed the corresponding layer in model.py to self.emb_lora = lora.Embedding(l_dim, embed_dim, r=16, lora_alpha=32). But when loading the checkpoint, the corresponding keys are missing (lora_a and lora_b are missing because the checkpoint was saved with the original structure at training time). How should I fix this?
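
One possible way to handle the missing keys described above, sketched in plain PyTorch rather than with loratorch itself: load the old checkpoint with strict=False so the new LoRA factors keep their fresh initialisation, and remap the key prefix if the attribute was renamed. TinyLoRAEmbedding, Original, WithLoRA, and the lora_A / lora_B parameter names are hypothetical stand-ins, not the library's actual classes or parameter names.

```python
import torch
import torch.nn as nn


class TinyLoRAEmbedding(nn.Embedding):
    """Hypothetical stand-in for a LoRA embedding: base weight plus two low-rank factors."""

    def __init__(self, num_embeddings, embedding_dim, r=4):
        super().__init__(num_embeddings, embedding_dim)
        self.lora_A = nn.Parameter(torch.zeros(r, num_embeddings))
        self.lora_B = nn.Parameter(torch.randn(embedding_dim, r) * 0.01)


class Original(nn.Module):
    """The model structure that was used when the checkpoint was saved."""

    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(10, 8)


class WithLoRA(nn.Module):
    """The modified structure, with the attribute renamed as in the question."""

    def __init__(self):
        super().__init__()
        self.emb_lora = TinyLoRAEmbedding(10, 8)


# Pretend this state dict came from torch.load(...) on the pre-trained checkpoint.
checkpoint = Original().state_dict()

# The attribute was renamed, so map the old key prefix onto the new one.
remapped = {k.replace("emb.", "emb_lora.", 1): v for k, v in checkpoint.items()}

model = WithLoRA()
# strict=False tolerates keys that exist in the model but not in the checkpoint.
result = model.load_state_dict(remapped, strict=False)

# Only the freshly initialised LoRA factors should be reported as missing.
assert all("lora_" in k for k in result.missing_keys), result.missing_keys
assert not result.unexpected_keys, result.unexpected_keys
print(sorted(result.missing_keys))
```

Checking missing_keys and unexpected_keys after the load guards against strict=False silently hiding a genuine mismatch, such as a renamed attribute whose base weight was never loaded.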

Torch .compile() vs. implementation

Line 71 fails under .compile(): it raises a NameError because self is not defined.

Offhand, I would recommend one of the following:

  • getattr(self, f' ... ')
  • cytoolz.get on self
  • Some sort of cytoolz.keyfilter on a dict representation of self
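
The first suggestion can be sketched as follows. The issue does not quote line 71, so this assumes the failing line resolves an attribute whose name is built at runtime. The Adapter class and the w_lora_A / w_lora_B attribute names are hypothetical and only illustrate the getattr pattern.

```python
class Adapter:
    """Hypothetical object holding attributes whose names are built at runtime."""

    def __init__(self):
        self.w_lora_A = [1.0, 2.0]
        self.w_lora_B = [3.0, 4.0]

    def factors(self, param_name):
        # getattr resolves the attribute directly on the instance, so no name
        # lookup for `self` inside a dynamically evaluated string is needed.
        a = getattr(self, f"{param_name}_lora_A")
        b = getattr(self, f"{param_name}_lora_B")
        return a, b


adapter = Adapter()
pair = adapter.factors("w")
print(pair)
```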

MultiheadAttention out_projection

Hello,

thanks for this implementation - very useful.

I have a question about the MultiheadAttention class: it seems that out_proj.weight is not updated. Am I missing something?

Thanks!
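
A generic way to check which weights actually change during a training step, sketched with the stock nn.MultiheadAttention rather than the library's own class. Here out_proj.weight is frozen on purpose to mimic a weight that is not being trained; whether loratorch behaves this way is not confirmed by the issue.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
attn = nn.MultiheadAttention(embed_dim=8, num_heads=2)

# Assumption for illustration: freeze the output projection weight.
attn.out_proj.weight.requires_grad_(False)

# Snapshot every parameter before the update.
before = {name: p.detach().clone() for name, p in attn.named_parameters()}

x = torch.randn(5, 1, 8)  # (sequence, batch, embed_dim)
out, _ = attn(x, x, x)

optimizer = torch.optim.SGD([p for p in attn.parameters() if p.requires_grad], lr=0.1)
out.sum().backward()
optimizer.step()

# True means the parameter differs from its snapshot after one step.
changed = {name: not torch.equal(before[name], p) for name, p in attn.named_parameters()}
for name, flag in changed.items():
    print(name, "changed" if flag else "unchanged")
```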

A question about nn.MultiheadAttention

Hello there, thank you for this amazing work!

From your README, it seems that MultiheadAttention is supported. Is there any documentation on this?

In my understanding, in_proj_weight holds the combined Q/K/V weights (w_qkv), similar to the linear layer in other implementations.

However, when I try to use lora-torch at this step, it results in the following error: "TypeError: cannot assign 'loralib.layers.Embedding' as parameter 'in_proj_weight' (torch.nn.Parameter or None expected)."

Please let me know if it is incorrect to replace w_qkv with a LoRA layer; any guidance is much appreciated.

w_qkv = blk.attn.in_proj_weight
w_qkv_shape = w_qkv.shape
lora_embedding = loralib.Embedding(w_qkv_shape[0], w_qkv_shape[1], r=r)
blk.attn.in_proj_weight = lora_embedding
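
The error in the snippet above can be reproduced with the stock nn.MultiheadAttention: in_proj_weight is registered as an nn.Parameter, so PyTorch refuses to replace it with a module. The sketch below shows that, plus a low-rank product with the same shape as the parameter. The names lora_A, lora_B, and delta are illustrative assumptions, not the library's API.

```python
import torch
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=16, num_heads=4)

# in_proj_weight stacks the Q, K and V projections: shape (3 * embed_dim, embed_dim).
print(type(attn.in_proj_weight).__name__, tuple(attn.in_proj_weight.shape))

# Assigning a module to a registered parameter name raises TypeError.
try:
    attn.in_proj_weight = nn.Embedding(48, 16)
except TypeError as err:
    print("TypeError:", err)

# A low-rank update has the same shape as the parameter, so it can be added
# to the weight tensor instead of replacing the parameter with a layer.
r = 2
lora_A = nn.Parameter(torch.randn(r, 16) * 0.01)
lora_B = nn.Parameter(torch.zeros(48, r))
delta = lora_B @ lora_A
assert delta.shape == attn.in_proj_weight.shape
```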

About Code Efficiency Improvement

Hello, thanks for such useful code! However, when I use LoRA in a multi-head attention layer, inference is slow.
Looking at the source code, I suspect this is because the LoRA parameters are re-merged and then restored on every call to forward (the corresponding functions are self.merge_lora_param() and self.sub_lora_data()).

So, can I remove these two calls from forward and move them to the end of __init__ instead? I am not sure whether this would have any side effects.
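
One alternative to merging in __init__ is to merge once when the module switches to eval mode and undo the merge when it returns to training, so later updates to the LoRA factors are still reflected. Below is a minimal sketch of that idea on a plain linear layer. MergeOnEvalLinear and its attribute names are hypothetical and do not correspond to loratorch's merge_lora_param or sub_lora_data.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MergeOnEvalLinear(nn.Linear):
    """Hypothetical LoRA-style linear layer that merges its update only in eval mode."""

    def __init__(self, in_features, out_features, r=4, scaling=1.0):
        super().__init__(in_features, out_features)
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.randn(out_features, r) * 0.01)
        self.scaling = scaling
        self.merged = False

    def _delta(self):
        # Low-rank weight update with the same shape as self.weight.
        return (self.lora_B @ self.lora_A) * self.scaling

    def train(self, mode=True):
        super().train(mode)
        with torch.no_grad():
            if mode and self.merged:
                self.weight.sub_(self._delta())  # undo the merge for training
                self.merged = False
            elif not mode and not self.merged:
                self.weight.add_(self._delta())  # merge once for inference
                self.merged = True
        return self

    def forward(self, x):
        if self.merged:
            return F.linear(x, self.weight, self.bias)
        return F.linear(x, self.weight + self._delta(), self.bias)


torch.manual_seed(0)
layer = MergeOnEvalLinear(8, 8)
x = torch.randn(2, 8)
base_weight = layer.weight.detach().clone()

y_train = layer(x)       # unmerged path
layer.eval()             # merges the update into the weight once
with torch.no_grad():
    y_eval = layer(x)    # merged path, no per-call merge cost
layer.train()            # restores the original weight

assert torch.allclose(y_train, y_eval, atol=1e-6)
```

Merging only at construction time would bake in the initial values of the factors, so any training done afterwards would not show up in the merged weight.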
