baijiong-lin / lora-torch Goto Github PK

Stars: 36 · Watchers: 2 · Forks: 3 · Size: 30 KB

PyTorch Reimplementation of LoRA

License: MIT License

Python 100.00%

lora fine-tuning finetuning peft

lora-torch's Issues

Problem loading a pre-trained checkpoint after modifying the model structure with loratorch

Hi, I am new to LoRA and am having trouble using your library.

I have a pre-trained model class with its individual layers defined.

How can I use LoRA to fine-tune one of the embedding layers?

Following the README, I changed the corresponding layer in model.py to self.emb_lora = lora.Embedding(l_dim, embed_dim, r=16, lora_alpha=32). However, when I load the checkpoint, the corresponding keys are missing: lora_a and lora_b are absent because the checkpoint was saved from the original structure. How should I fix this?
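One common way to handle this situation in plain PyTorch is to load the checkpoint with strict=False and then check which keys were reported as missing. The sketch below uses only torch; the classes Base and WithLoRA and the parameter names lora_a and lora_b are stand-ins for illustration, not the library's actual layer or key names.

```python
import torch
import torch.nn as nn


class Base(nn.Module):
    """Stand-in for the original pre-trained model."""

    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(10, 4)


class WithLoRA(nn.Module):
    """Stand-in for the modified model: same embedding plus two low-rank factors."""

    def __init__(self, r=2):
        super().__init__()
        # Same attribute name as in Base, so the checkpoint key "emb.weight" still matches.
        self.emb = nn.Embedding(10, 4)
        # Hypothetical names for the new trainable factors.
        self.lora_a = nn.Parameter(torch.zeros(r, 10))
        self.lora_b = nn.Parameter(torch.zeros(4, r))


checkpoint = Base().state_dict()  # plays the role of the saved pre-trained weights
model = WithLoRA()

# strict=False loads every matching key and reports the rest instead of raising.
result = model.load_state_dict(checkpoint, strict=False)
missing = sorted(result.missing_keys)
unexpected = list(result.unexpected_keys)

print("missing:", missing)
print("unexpected:", unexpected)
```

In this sketch only the newly added factors should show up as missing. If the attribute is also renamed (for example from emb to emb_lora), the pre-trained weight key no longer matches and would be skipped silently under strict=False, so it is worth inspecting both lists after loading.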

torch.compile() vs. implementation

Line 71 fails under torch.compile(): it raises a NameError because self is not defined.

My offhand recommendation would be to use one of the following instead:

getattr(self, f' ... ')
cytoolz.get on self
Some sort of cytoolz.keyfilter on a dict representation of self
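The issue does not quote line 71, so the following is only a sketch under the assumption that the line looks up an attribute on self by building its name as a string. The class Layer and its attribute names are hypothetical. It shows the getattr approach, plus a plain dict comprehension that plays the same role as filtering a dict representation of self.

```python
class Layer:
    """Hypothetical layer that stores the names of the attributes it manages."""

    def __init__(self):
        self.weight = [1.0, 2.0]
        self.bias = [0.5]
        self.param_names = ["weight", "bias"]

    def lookup(self, name):
        # Explicit attribute lookup: no string is evaluated as code,
        # so nothing depends on `self` being visible in some other scope.
        return getattr(self, name)

    def collect(self):
        # Keep only the managed attributes, keyed by name.
        return {name: getattr(self, name) for name in self.param_names}


layer = Layer()
print(layer.lookup("weight"))
print(layer.collect())
```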

MultiheadAttention out_projection

Hello,

thanks for this implementation - very useful.

I have a question about the MultiheadAttention class: it seems that out_proj.weight is not updated. Am I missing something?

Thanks!

A question about nn.MultiheadAttention

Hello there, thank you for this amazing work!

From the README, it seems that MultiheadAttention is supported. Is there any documentation on how to use it?

As I understand it, in_proj_weight holds the combined query, key, and value weights (w_qkv), similar to the linear layer in other implementations.

However, when I try to use lora-torch at this step, I get the following error: "TypeError: cannot assign 'loralib.layers.Embedding' as parameter 'in_proj_weight' (torch.nn.Parameter or None expected)."

Please let me know whether it is incorrect to replace w_qkv with a LoRA layer. Any guidance is much appreciated.

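# in_proj_weight is a parameter (a tensor) of the attention block, not a submodule
w_qkv = blk.attn.in_proj_weight
w_qkv_shape = w_qkv.shape
lora_embedding = loralib.Embedding(w_qkv_shape[0], w_qkv_shape[1], r=r)
# This assignment raises the TypeError quoted above
blk.attn.in_proj_weight = lora_embedding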
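The error can be reproduced with plain PyTorch, which may help explain it. In the sketch below (sizes chosen arbitrarily, and nn.Embedding used in place of the LoRA layer), in_proj_weight is an nn.Parameter of shape (3 * embed_dim, embed_dim), and nn.Module refuses to bind a module object to a name that is registered as a parameter.

```python
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=8, num_heads=2)

# The stacked q/k/v projection is a single parameter tensor, not a layer.
is_param = isinstance(attn.in_proj_weight, nn.Parameter)
shape = tuple(attn.in_proj_weight.shape)

try:
    # Assigning a module to a registered parameter name is rejected.
    attn.in_proj_weight = nn.Embedding(shape[0], shape[1])
    error = None
except TypeError as exc:
    error = str(exc)

print(is_param, shape)
print(error)
```

Because the weight is a tensor rather than a layer, a single attribute cannot be swapped for a LoRA module. A more likely route is to replace the whole attention module with the library's MultiheadAttention layer as described in its README; the exact constructor arguments should be taken from there rather than from this note.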

About Code Efficiency Improvement

Hello, thanks for such useful code! However, when I use LoRA in a multi-head attention layer, inference is slow.
I looked at the source code and suspect the cause is that the LoRA parameters are merged and then restored on every call to forward (the corresponding functions are self.merge_lora_param() and self.sub_lora_data()).

So, could I remove those two calls from forward and call them once at the end of __init__ instead? I'm not sure whether this change would have any side effects.
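One general pattern for this trade-off is to recompute the merged weight on every training step but cache it while the module is in eval mode. The sketch below is not the library's implementation: TinyLoRALinear, lora_a, lora_b and _merged are hypothetical names, and a plain linear layer stands in for the attention layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyLoRALinear(nn.Module):
    """Hypothetical linear layer with a low-rank update, for illustration only."""

    def __init__(self, in_features, out_features, r=4, alpha=8):
        super().__init__()
        # Frozen base weight plus two trainable low-rank factors.
        self.weight = nn.Parameter(torch.randn(out_features, in_features), requires_grad=False)
        self.lora_a = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.randn(out_features, r) * 0.01)
        self.scaling = alpha / r
        self._merged = None  # cached merged weight, used only in eval mode

    def train(self, mode=True):
        # Switching mode (train() or eval()) invalidates the cache.
        self._merged = None
        return super().train(mode)

    def forward(self, x):
        if self.training:
            # Recompute each step so gradients reach lora_a and lora_b.
            w = self.weight + self.scaling * (self.lora_b @ self.lora_a)
            return F.linear(x, w)
        if self._merged is None:
            # Merge once, on the first eval-mode call.
            with torch.no_grad():
                self._merged = self.weight + self.scaling * (self.lora_b @ self.lora_a)
        return F.linear(x, self._merged)


torch.manual_seed(0)
layer = TinyLoRALinear(6, 3)
x = torch.randn(2, 6)

layer.train()
y_train = layer(x).detach()

layer.eval()
y_first = layer(x)
cached = layer._merged
y_second = layer(x)
```

In a sketch like this, merging once in __init__ would freeze the merged weight at its initial value, so later updates to the low-rank factors would not be reflected. Tying the cache to eval mode keeps training correct while avoiding the repeated merge at inference time.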
