baijiong-lin / lora-torch
PyTorch Reimplementation of LoRA
License: MIT License
Hi, I am having problems with your library as I am new to LoRA.
I have a pre-trained model class with the individual layers defined.
What should I do if I want to use LoRA to fine-tune one of the embeddings?
Following the README, I changed the corresponding layer in model.py to self.emb_lora = lora.Embedding(l_dim, embed_dim, r=16, lora_alpha=32).
But when loading the checkpoint, I found that the corresponding keys are missing (lora_A and lora_B do not exist, because the model had its original structure at training time). How should I fix this?
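A minimal sketch of the usual workaround, assuming the checkpoint holds only the original (non-LoRA) weights; MyModel and the file name below are placeholders. Loading non-strictly keeps the pretrained weights and leaves the newly added LoRA parameters at their initialisation, which corresponds to a zero update:
import torch
import loralib as lora  # import name taken from the other issues in this thread
model = MyModel()  # placeholder for your model class, with self.emb_lora = lora.Embedding(...)
checkpoint = torch.load("pretrained.pt", map_location="cpu")
missing, unexpected = model.load_state_dict(checkpoint, strict=False)
print("missing keys (should only be LoRA parameters):", missing)
# Freeze everything except the LoRA parameters before fine-tuning.
# mark_only_lora_as_trainable comes from Microsoft's loralib; check whether
# this reimplementation exposes the same helper.
lora.mark_only_lora_as_trainable(model)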
Line 71 ends up failing under .compile(), raising a NameError because self is missing.
My offhand recommendation would be to use:
getattr(self, f' ... ')
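For illustration only, a tiny sketch of the getattr pattern being suggested; the attribute name, the module, and the surrounding code are hypothetical, since the actual expression on line 71 is elided above:
import torch

class Demo(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # hypothetical dynamically named LoRA parameter
        self.lora_A_q = torch.nn.Parameter(torch.zeros(4, 8))

    def forward(self, x):
        name = "q"  # in the real code this would vary per projection
        # getattr with a format string is the straightforward, traceable way
        # to read a dynamically named attribute under torch.compile
        weight = getattr(self, f"lora_A_{name}")
        return x @ weight.t()

out = torch.compile(Demo())(torch.randn(2, 8))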
Hello,
thanks for this implementation - very useful.
I had a question regarding the MultiheadAttention class - it seems like out_proj.weight is not updated, or am I missing something?
Thanks!
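One quick way to check is a generic PyTorch snippet (not specific to this library): after wrapping the attention layer and freezing the non-LoRA parameters, list which parameters are still trainable and see whether out_proj.weight, or a LoRA factor attached to it, appears:
# `model` is assumed to already contain the LoRA-wrapped MultiheadAttention
for name, param in model.named_parameters():
    if param.requires_grad:
        print(name, tuple(param.shape))
# If neither out_proj.weight nor any LoRA parameter associated with the output
# projection shows up, the output projection is effectively frozen.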
Hello there, thank you for this amazing work!
Your README suggests that MultiheadAttention is supported. Are there any documents describing this?
In my understanding, in_proj_weight holds the stacked qkv weights, similar to the linear layer in other implementations.
However, when I try to use lora-torch at this step, it results in the following error: "TypeError: cannot assign 'loralib.layers.Embedding' as parameter 'in_proj_weight' (torch.nn.Parameter or None expected)."
Please let me know whether it is incorrect to replace the w_qkv with LoRA this way; any instruction is much appreciated.
w_qkv = blk.attn.in_proj_weight
w_qkv_shape = w_qkv.shape
lora_embedding = loralib.Embedding(w_qkv_shape[0], w_qkv_shape[1], r=r)
blk.attn.in_proj_weight = lora_embedding
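Two things stand out in the snippet above; a sketch of a likely fix follows, noting that the exact constructor of the library's MultiheadAttention wrapper is an assumption on my part, so check the README for the real signature. First, lora.Embedding is meant to replace nn.Embedding layers, not a projection weight matrix. Second, in_proj_weight is an nn.Parameter, so assigning a Module to it raises exactly the TypeError shown; the whole attention block has to be swapped instead:
import loralib

embed_dim = blk.attn.embed_dim
num_heads = blk.attn.num_heads

# hypothetical constructor: mirrors nn.MultiheadAttention plus the usual
# LoRA arguments used elsewhere in the README
lora_attn = loralib.MultiheadAttention(embed_dim, num_heads, r=r, lora_alpha=2 * r)

# copy the pretrained attention weights; strict=False ignores the newly
# added LoRA parameters that the old state_dict does not contain
lora_attn.load_state_dict(blk.attn.state_dict(), strict=False)
blk.attn = lora_attn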
Hello, thanks for such useful code! However, when I use LoRA in a multi-head attention layer, I find that inference is slow.
I looked at the source code and suspect this is because the LoRA parameters have to be re-merged and then restored every time the forward function is called (the corresponding functions are self.merge_lora_param() and self.sub_lora_data()).
So, can I simply remove those two calls from forward and move them to the end of the __init__ function instead? I'm not sure whether this would have any bad effects.
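For reference, the common way LoRA implementations avoid this overhead is to merge the low-rank update into the base weight once after training, rather than inside every forward call (merging in __init__ would only bake in the initial zero update and would also stop the LoRA factors from being trained). A rough sketch of the idea for a single linear-style weight, assuming loralib-style attribute names (lora_A, lora_B, scaling), which may differ in this reimplementation:
import torch

@torch.no_grad()
def merge_lora_weight(layer):
    # fold the low-rank update into the frozen weight once, for inference;
    # lora_A is (r, in_features), lora_B is (out_features, r), and
    # scaling = lora_alpha / r under loralib conventions
    layer.weight += (layer.lora_B @ layer.lora_A) * layer.scaling

# after training: merge once, then run inference without per-step merging
# merge_lora_weight(model.some_lora_layer)   # hypothetical layer name
# model.eval()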