YJiangcm / LTE
Code for "Learning to Edit: Aligning LLMs with Knowledge Editing (ACL 2024)"
Home Page: https://arxiv.org/abs/2402.11905
License: Apache License 2.0
Code for "Learning to Edit: Aligning LLMs with Knowledge Editing (ACL 2024)"
Home Page: https://arxiv.org/abs/2402.11905
License: Apache License 2.0
Excellent work! But I ran into an OOM issue when running the LoRA method for the Llama-2-7b-chat model.
Specifically, I requested 4xA100 (80 GB) on a Slurm system and then ran the command below:
bash FastChat/lora_train.sh
The following error messages were displayed:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 GiB (GPU 0; 79.15 GiB total capacity; 77.13 GiB already allocated; 307.31 MiB free; 78.21 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 GiB (GPU 3; 79.15 GiB total capacity; 77.13 GiB already allocated; 355.31 MiB free; 78.21 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
I'm confused by this: LoRA is supposed to be a PEFT method, so why does this OOM occur even with 4xA100 (80 GB)?
What could be the reason?
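For reference, the traceback itself suggests one mitigation, which I plan to try; a minimal sketch, where max_split_size_mb:128 is just my guess and the variable must be set before torch initializes CUDA:

```python
# Sketch of the allocator tweak the OOM traceback suggests; it only
# helps when reserved memory is much larger than allocated memory.
import os

# 128 is an assumed value, not one taken from the repo; tune as needed.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

import torch  # imported only after the allocator config is in place
```

(The same setting can also be exported in the shell before running bash FastChat/lora_train.sh.)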
Hi @YJiangcm, may I ask whether the code for parallel data construction is included in this repository? If not, could you please add it? Thank you so much.
Hello👋
Thank you for your great work for the community!
I've been trying to reproduce your results, but I'm a bit stuck. Due to limited computing resources, I can only train Llama2-7b using LoRA, yet the repository only provides an evaluation script for full FT, so I'm not sure how to use that script to evaluate a LoRA model. One workaround I'm considering is sketched below.
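A minimal sketch of merging the adapter into the base weights so the full-FT evaluation script can load the result as an ordinary model; this assumes the adapter was saved with Hugging Face PEFT, and all paths are placeholders:

```python
# Fold LoRA deltas into the base model, then save a plain checkpoint
# that a full-FT evaluation script should be able to load directly.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
model = PeftModel.from_pretrained(base, "outputs/lora_checkpoint")  # placeholder path
merged = model.merge_and_unload()  # merge adapter weights into the base model

merged.save_pretrained("outputs/merged_model")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
tokenizer.save_pretrained("outputs/merged_model")
```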
Thank you in advance for your time and support!
Hi @YJiangcm, the performance of a model trained with the code in this repository differs from that reported in the paper.
At your convenience, would you please provide the trained models for evaluation? Thank you so much.
LTE adopts IKE as its base method, but the batch_editor file shows that IKE does not support batch editing. How, then, was the batch-editing experiment in the paper carried out?
Hi there,
I am running the code in the repo and noticed that prompt and target_new are used directly as the updated information, without retrieval. I'm curious how you approached the ablation study on the retrieval number k: are you following the IKE method? Also, would it be possible to share more code to shed light on this process? A generic sketch of the kind of retrieval I have in mind follows below.
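To make the question concrete, here is a rough sketch of top-k retrieval over a stored edit memory; the encoder choice and the memory layout are my own assumptions, not necessarily what the paper does:

```python
# Generic top-k retrieval over stored edits using sentence embeddings.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder

# Edit memory: each entry is one stored (prompt, target_new) pair, flattened.
edit_memory = [
    "Who is the president of the US? Joe Biden",
    "What is the capital of France? Paris",
]
memory_embeddings = encoder.encode(edit_memory, convert_to_tensor=True)

def retrieve_edits(query: str, k: int = 3):
    """Return the k stored edits most similar to the query."""
    query_embedding = encoder.encode(query, convert_to_tensor=True)
    hits = util.semantic_search(query_embedding, memory_embeddings, top_k=k)[0]
    return [edit_memory[hit["corpus_id"]] for hit in hits]
```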
Thanks for the fantastic work you're doing! I can't wait to hear back from you.