Comments (5)
One moment, we'll look into it.
from paddlenlp.
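Please update the batchify_fn function accordingly: it previously batched three fields, but now there are only two, so comment out the third one.
From our experiments, you should load a set of already-trained parameters; that way the error will not occur.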
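To make the two-field change concrete, here is a minimal plain-Python sketch of what such a collate function does: it pads the first field (token ids) to the longest sequence in the batch and gathers the second field (sequence lengths) as-is. It does not use the paddlenlp `Tuple`, `Pad`, or `Stack` classes; the helper names `pad_sequences` and `collate_two_fields` and the sample data are hypothetical and only imitate that behaviour.

```python
def pad_sequences(seqs, pad_val):
    """Right-pad every sequence with pad_val up to the longest length in the batch."""
    max_len = max(len(s) for s in seqs)
    return [list(s) + [pad_val] * (max_len - len(s)) for s in seqs]


def collate_two_fields(samples, pad_val):
    """Batch (token_ids, length) pairs; there is no label field at prediction time."""
    token_ids, lengths = zip(*samples)
    return pad_sequences(token_ids, pad_val), list(lengths)


# Hypothetical samples: (token ids, sequence length)
samples = [([5, 8, 2], 3), ([7], 1)]
batch_ids, batch_lens = collate_two_fields(samples, pad_val=20941)
print(batch_ids)
print(batch_lens)
```

The padding value 20941 mirrors the `pad_val` used for the token field in this thread; a third padded field for labels is only needed when gold labels are available, as in training.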
from paddlenlp.
OK, thank you very much. However, prediction always fails with a KeyError. Here is the main part of the code:
paddle.set_device('gpu')

data = "./data/userinput.txt"
modelparams = "./modelspath/final.pdparams"
train_ds_word_num = 20941
train_ds_label_num = 13
label_keys = ['P-B', 'P-I', 'T-B', 'T-I', 'A1-B', 'A1-I', 'A2-B', 'A2-I', 'A3-B', 'A3-I', 'A4-B', 'A4-I', 'O']

data_ds = ExpressDatasettest(data)

# Only two fields are batched at prediction time; the label field is commented out.
batchify_fn = lambda samples, fn=Tuple(
    Pad(axis=0, pad_val=20941),
    Stack()
    # Pad(axis=0, pad_val=13)
): fn(samples)

data_loader = paddle.io.DataLoader(
    dataset=data_ds,
    batch_size=100,
    drop_last=False,
    return_list=True,
    collate_fn=batchify_fn)

# Load the trained parameters into the network before predicting.
weights = paddle.load(modelparams)
network = BiGRUWithCRF(300, 300, train_ds_word_num, train_ds_label_num)
network.set_state_dict(weights, use_structured_name=False)

model = paddle.Model(network)
model.prepare(
    optimizer=paddle.optimizer.Adam(learning_rate=0.001, parameters=model.parameters()),
    loss=LinearChainCrfLoss(network.crf),
    metrics=ChunkEvaluator(label_list=label_keys, suffix=True))

outputs, lens, decodes = model.predict(data_loader)
preds = parse_decodes(data_ds, decodes, lens)
# print("preds: ", preds)
print('\n'.join(preds[:10]))
The error:
Predict begin...
step 5/5 [==============================] - 116ms/step
Predict samples: 424
Traceback (most recent call last):
  File "predict.py", line 174, in <module>
    preds = parse_decodes(data_ds, decodes, lens)
  File "predict.py", line 38, in parse_decodes
    tags = [id_label[x] for x in decodes[idx][:end]]
  File "predict.py", line 38, in <listcomp>
    tags = [id_label[x] for x in decodes[idx][:end]]
KeyError: 13
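The original training script does not raise this error, but its predictions on my own dataset are poor.
Could you help look into these two issues? Thanks.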
from paddlenlp.
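Yes, for this problem you just need to load trained parameters, and then the issue above will not occur. The way the code reports this error is admittedly poor, and we will improve it.
We will discuss internally whether to add prediction code, and will get back to you.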
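For readers hitting the same traceback: `KeyError: 13` means the decoder returned a tag id outside the 13 known labels (ids 0 to 12). One possible cause, which is an assumption and not confirmed in this thread, is that the CRF layer reserves extra ids beyond the real labels and untrained weights can select them. The sketch below reproduces the lookup failure in plain Python and shows a defensive variant; `ids_to_tags` and the `decoded` ids are hypothetical, and only `label_keys` comes from the code above.

```python
label_keys = ['P-B', 'P-I', 'T-B', 'T-I', 'A1-B', 'A1-I', 'A2-B',
              'A2-I', 'A3-B', 'A3-I', 'A4-B', 'A4-I', 'O']

# Same kind of id -> label mapping that parse_decodes indexes into: keys 0..12.
id_label = dict(enumerate(label_keys))


def ids_to_tags(ids, id_label, unknown="O"):
    """Map tag ids to labels, falling back to `unknown` for ids outside the label set."""
    return [id_label.get(i, unknown) for i in ids]


# Hypothetical decoder output containing the out-of-range id 13.
decoded = [0, 1, 13, 12]

try:
    [id_label[i] for i in decoded]  # plain indexing, as in the traceback
except KeyError as err:
    print("KeyError:", err)

tags = ids_to_tags(decoded, id_label)
print(tags)
```

The fallback only keeps the script from crashing and hides the underlying problem. As noted above, the fix that worked in this thread was loading trained parameters.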
from paddlenlp.
OK, it works now. Thank you.
from paddlenlp.