liuqidong07 / moelora-peft
[SIGIR'24] The official implementation code of MOELoRA.
Home Page: https://arxiv.org/abs/2310.18339
License: MIT License
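Could the author provide an image to make environment setup easier? With only the requirements file, it is still hard to get the code running.
Hello, how can I fix the error below? My environment was set up according to the requirements file.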
Traceback (most recent call last):
  File "run_mlora.py", line 6, in <module>
    from src.MLoRA.main import main
  File "/cognitive_comp/chen/projects/MOELoRA-peft/src/MLoRA/main.py", line 42, in <module>
    from src.MLoRA.trainer_seq2seq import Seq2SeqTrainer
  File "/cognitive_comp/chen/projects/MOELoRA-peft/src/MLoRA/trainer_seq2seq.py", line 27, in <module>
    from .trainer import Trainer
  File "/cognitive_comp/chen/projects/MOELoRA-peft/src/MLoRA/trainer.py", line 41, in <module>
    from transformers.integrations import (
  File "/home/chen/miniconda3/envs/moet12/lib/python3.8/site-packages/transformers/integrations.py", line 71, in <module>
    from .trainer_callback import ProgressCallback, TrainerCallback # noqa: E402
  File "/home/chen/miniconda3/envs/moet12/lib/python3.8/site-packages/transformers/trainer_callback.py", line 27, in <module>
    from .training_args import TrainingArguments
  File "/home/ochen/miniconda3/envs/moet12/lib/python3.8/site-packages/transformers/training_args.py", line 69, in <module>
    import torch_xla.core.xla_model as xm
  File "/home/hen/miniconda3/envs/moet12/lib/python3.8/site-packages/torch_xla/__init__.py", line 114, in <module>
    import _XLAC
ImportError: /home/chen/miniconda3/envs/moet12/lib/python3.8/site-packages/_XLAC.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZNK5torch4lazy4Node16nullable_operandEm
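The last frames show `transformers/training_args.py` importing `torch_xla`, and that import failing on an undefined symbol inside `_XLAC`. An error of this kind usually points to a `torch_xla` binary built against a different `torch` build than the one installed. A minimal sketch for checking whether `torch_xla` is present at all, without importing it (the helper name `find_optional_package` is hypothetical; Python 3.8+ is assumed):

```python
import importlib.util
from importlib import metadata


def find_optional_package(name):
    """Return the installed version of `name`, or None if it is not installed.

    The package itself is never imported, so a broken binary extension
    (such as the failing _XLAC module in the traceback) cannot crash this check.
    """
    if importlib.util.find_spec(name) is None:
        return None
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata under this exact name.
        return "unknown"


if __name__ == "__main__":
    for pkg in ("torch", "torch_xla"):
        print(pkg, find_optional_package(pkg))
```

If this reports a `torch_xla` version on a machine without a TPU, uninstalling that package is one commonly tried way to keep `transformers` from importing it. That is a suggestion, not something the repository documents.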
`
import jsonlines
import os
import numpy as np
import pandas as pd
from utils import read_data, extract_data, partition
from evaluation import calculate_score, process_CTC
task_list=('CMeIE', 'CHIP-CDN', 'CHIP-CDEE', 'CHIP-MDCFNPC',
'CHIP-CTC', 'KUAKE-QIC',
'IMCS-V2-MRG', 'MedDG',)
pred_path = "pred"
true_path = "true"
target_list = ["test_predictions.json"]
score_dict = {target: [] for target in target_list}
score_dict
score_dict = {target: [] for target in target_list}
label_dict = {}
for target in target_list:
    target_path = os.path.join(pred_path, target)  # pred/task_predictions.json
    if not os.path.exists(os.path.join(pred_path, task_list[0])):  # needs partition
        # all_data = read_data(os.path.join(os.path.join(pred_path, "test_predictions.json"))
        all_data = read_data(target_path)
        partition(extract_data(all_data), task_list, pred_path)
    for task in task_list:
        pp = os.path.join(target_path, task)
        tp = os.path.join(true_path, task)
        # The two assignments below override the per-task paths above.
        pp = os.path.join("pred", "test_predictions.json")  # pred_path
        tp = os.path.join("pred", "test.json")  # truth_path
        if task == "CHIP-CTC":  # CTC needs post process
            post_process_function = process_CTC
        else:
            post_process_function = None
        score, labels, _ = calculate_score(task, pp, tp, post_process_function)
        score_dict[target].append(score)
        label_dict[task] = labels
import tqdm

def read_data(data_path):
    # Read a JSON Lines file into a list of records.
    with jsonlines.open(data_path, "r") as f:
        data = [meta_data for meta_data in f]
    return data

def save_data(data_path, data):
    # Write a list of records as JSON Lines.
    with jsonlines.open(data_path, "w") as w:
        for meta_data in data:
            w.write(meta_data)

def extract_data(data):
    # Group records by their 'task_dataset' field.
    data_dict = {}
    for meta_data in data:
        if meta_data['task_dataset'] not in data_dict.keys():
            data_dict[meta_data['task_dataset']] = []
        data_dict[meta_data['task_dataset']].append(meta_data)
    print("extract completion")
    return data_dict

def partition(data_dict, task_list, output_path):
    # Save each task's records to <output_path>/<task>/dev.json.
    for task in task_list:
        task_path = os.path.join(output_path, task)
        if not os.path.exists(task_path):
            os.makedirs(task_path)
        save_data(os.path.join(task_path, "dev.json"), data_dict[task])

target_path = "test.json"  # path to the prediction results file
all_data = read_data(target_path)
extracted_data = extract_data(all_data)
partition(extracted_data, task_list, pred_path)
res_data, res_key = [], []
for key, value in score_dict.items():
    res_data.append(value)
    res_key.append(key)
res_df = pd.DataFrame(columns=task_list,
                      index=res_key,
                      data=res_data)
res_df["average"] = res_df.mean(axis=1)
res_df.head(20)
try:
    new_res_df = res_df.drop(columns=["CHIP-STS", "KUAKE-IR", "average"])
except KeyError:  # raised when any of these columns is absent
    new_res_df = res_df
new_res_df["average"] = new_res_df.mean(axis=1)
new_res_df.head(50)
for st in ["CHIP-CDN", "CHIP-MDCFNPC", "IMCS-V2-MRG", "KUAKE-QIC"]:
    # post_process_function keeps the value left over from the loop above.
    score, _, _ = calculate_score(st,
                                  "pred/%s/test_predictions.json" % st,
                                  "pred/%s/dev.json" % st,
                                  post_process_function)
    print("The score for task %s is: %.5f" % (st, score))
`
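The above is evaluate.ipynb modified according to my own understanding. Roughly, I replaced the project's 'dev.json' with 'test.json' as the ground truth. I am not sure whether this is what the author intended.
The final results are shown in the figures below:
I am a bit confused by the result in the second figure, because the target in test_predictions.json seems to be exactly the same as the target in test.json? I checked the code, and the predictions file does contain the answers generated by the MoE model.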
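One way to check the suspicion above is to count how many records carry the same `target` value in both files. A minimal sketch, assuming both files are JSON Lines with one object per line, that records are aligned by position, and that each record has a `target` field as described in the question (the helper names `load_jsonl` and `count_identical_targets` are hypothetical):

```python
import json


def load_jsonl(path):
    """Read a JSON Lines file into a list of dicts, skipping blank lines."""
    with open(path, "r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def count_identical_targets(pred_records, true_records, field="target"):
    """Return (identical, compared) for records paired by position."""
    pairs = list(zip(pred_records, true_records))
    identical = sum(1 for p, t in pairs if p.get(field) == t.get(field))
    return identical, len(pairs)


# Example usage with the paths from the question (assumed to exist locally):
#   pred = load_jsonl("pred/test_predictions.json")
#   true = load_jsonl("pred/test.json")
#   print(count_identical_targets(pred, true))
```

If every pair turns out identical, it may be that `target` in the predictions file is a copy of the reference label and the generated text is stored under a different field. Inspecting the keys of one record would confirm or rule that out.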
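I have recently been learning how to use Hugging Face's peft library. Did the author modify GLM by adding LoRA modules directly, or fine-tune through the peft library? I would like to learn both approaches. If possible, please update the repository or send the code to my email [email protected]. Many thanks!
Stuck on environment setup... torch_xla fails with a compilation error.
Are other open-source models supported, such as Qwen1.5-32B and other mainstream open-source models? Thanks!
What does the task_num parameter do?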
This file has not been uploaded.
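Hello, I would also like to run experiments on the remaining 8 tasks, but the competition linked in your paper has ended. Where else can I obtain this dataset?
Hello, I am very interested in the direction of your paper. While reproducing it I found many bugs in the code, presumably because a lot of it was deleted or changed, so the code does not run. Could you share a complete version? After fixing the bugs myself I got it running and noticed a few problems: the paper says the work can be reproduced on a V100, but on my side a V100 runs out of GPU memory immediately, and even on an 80 GB A100 the batch size can only be set to 1-2. The paper says a batch size of 4 works on a V100, which confuses me. I hope to hear back from you, thanks!
Sorry, I was busy earlier and have only now come back to this code...
It is still this error. I really cannot find this variable in the code.
I tried modifying it, since they all seem to be assigned the same value.
Then I ran into the following error. Is it because I overlooked something here when modifying the code?
Finally, PeftModelForCausalLMSharedM is not in the peft_model file.
Also, I really cannot get torch_xla installed. Could someone kindly share a Docker or conda image? Email: [email protected]
Could you share the environment in which the code runs: the versions of accelerate, deepspeed, transformers and torch?
Following the requirements file, I cannot get it to run.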
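For anyone who does have a working setup, the versions asked about above can be collected without importing the packages. A minimal sketch (the function name `report_versions` is hypothetical; the package list is the one from the request):

```python
from importlib import metadata

# Packages named in the request above.
PACKAGES = ("accelerate", "deepspeed", "transformers", "torch")


def report_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        if version is None:
            print(f"# {name} is not installed")
        else:
            print(f"{name}=={version}")
```

The output uses the `name==version` form so it can be pasted into a reply or a requirements file.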