In express_ner, the prediction step raises SystemError: (Fatal) Blocking queue is killed because the data reader raises an exception (paddlenlp issue, closed)

paddlepaddle commented on May 15, 2024
In express_ner, the prediction step raises SystemError: (Fatal) Blocking queue is killed because the data reader raises an exception

from paddlenlp.

Comments (5)

wawltor commented on May 15, 2024

One moment, we will look into it.

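wawltor commented on May 15, 2024

[image]
Please update the batchify_fn function accordingly: it used to batch three fields, but at prediction time there are only two, so comment out the third.
In our experiments, it is also best to load a set of already-trained parameters; that avoids the error.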
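
To illustrate the point about field counts, here is a minimal pure-Python sketch of what a two-field collate function does: it pads the token-id field to a common length and keeps the lengths as a flat list. It is only a stand-in for the Tuple/Pad/Stack combination used in the thread, not the paddlenlp implementation, and the names pad_field and batchify_two_fields are hypothetical.

```python
def pad_field(seqs, pad_val):
    """Right-pad every sequence to the length of the longest one."""
    max_len = max(len(s) for s in seqs)
    return [list(s) + [pad_val] * (max_len - len(s)) for s in seqs]


def batchify_two_fields(samples, pad_val):
    """Collate (token_ids, length) samples into a padded batch.

    Each sample must have exactly two fields. A sample with a third
    field (for example label ids) makes the unpacking below fail,
    which mirrors the field-count mismatch discussed above.
    """
    token_ids, lengths = zip(*samples)
    return pad_field(token_ids, pad_val), list(lengths)


# Toy data: two sentences of different lengths, no label field.
samples = [([5, 8, 2], 3), ([7, 1], 2)]
padded, lengths = batchify_two_fields(samples, pad_val=20941)
print(padded)
print(lengths)
```

If the dataset yields three fields but the collate function expects two (or the reverse), the failure happens inside the data reader, which is consistent with the "data reader raises an exception" message in the title.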

xiaomeng654321 commented on May 15, 2024

OK, thank you very much. However, prediction always fails with a KeyError. Here is the main part of the code:
paddle.set_device('gpu')

data = "./data/userinput.txt"

modelparams = "./modelspath/final.pdparams"

train_ds_word_num = 20941
train_ds_label_num = 13

label_keys = ['P-B', 'P-I', 'T-B', 'T-I', 'A1-B', 'A1-I', 'A2-B', 'A2-I', 'A3-B', 'A3-I', 'A4-B', 'A4-I', 'O']

data_ds = ExpressDatasettest(data)

batchify_fn = lambda samples, fn=Tuple(
    Pad(axis=0, pad_val=20941),
    Stack()
    # Pad(axis=0, pad_val=13)
): fn(samples)

data_loader = paddle.io.DataLoader(
    dataset=data_ds,
    batch_size=100,
    drop_last=False,
    return_list=True,
    collate_fn=batchify_fn)


weights = paddle.load(modelparams)

network = BiGRUWithCRF(300, 300, train_ds_word_num, train_ds_label_num)

network.set_state_dict(weights, use_structured_name=False)
model = paddle.Model(network)
model.prepare(optimizer=paddle.optimizer.Adam(learning_rate=0.001, parameters=model.parameters()),
              loss=LinearChainCrfLoss(network.crf),
              metrics=ChunkEvaluator(label_list=label_keys, suffix=True)
              )

outputs, lens, decodes = model.predict(data_loader)
preds = parse_decodes(data_ds, decodes, lens)
# print("preds: ", preds)

print('\n'.join(preds[:10]))

The error:
Predict begin...
step 5/5 [==============================] - 116ms/step
Predict samples: 424
Traceback (most recent call last):
  File "predict.py", line 174, in <module>
    preds = parse_decodes(data_ds, decodes, lens)
  File "predict.py", line 38, in parse_decodes
    tags = [id_label[x] for x in decodes[idx][:end]]
  File "predict.py", line 38, in <listcomp>
    tags = [id_label[x] for x in decodes[idx][:end]]
KeyError: 13

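The original training script does not raise this error, but its predictions on my own dataset are poor.
Could you help look into these two problems? Thanks.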
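
The KeyError means the decoder produced tag id 13, while the 13 labels in label_keys only cover ids 0 to 12. A small check like the following can list such ids before parse_decodes is called, so the failure is easier to read. This is a sketch under assumptions: find_unknown_tag_ids is a hypothetical helper, the id-to-label mapping is assumed to follow the order of label_keys, and decodes and lens are assumed to be already flattened to one entry per sentence.

```python
def find_unknown_tag_ids(decodes, lens, id_label):
    """Return the sorted tag ids that occur in the unpadded part of
    each decoded sequence but have no entry in id_label."""
    unknown = set()
    for seq, end in zip(decodes, lens):
        unknown.update(t for t in seq[:end] if t not in id_label)
    return sorted(unknown)


label_keys = ['P-B', 'P-I', 'T-B', 'T-I', 'A1-B', 'A1-I', 'A2-B',
              'A2-I', 'A3-B', 'A3-I', 'A4-B', 'A4-I', 'O']
# Assumption: label ids are 0..12 in the order of label_keys.
id_label = dict(enumerate(label_keys))

# Toy decoder output: the second sentence contains the invalid id 13.
decodes = [[0, 1, 12], [13, 12, 12]]
lens = [3, 2]

unknown_ids = find_unknown_tag_ids(decodes, lens, id_label)
print(unknown_ids)
```

A non-empty result suggests the model is emitting tags outside the label vocabulary, which fits the advice in this thread to load properly trained parameters before predicting.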

wawltor commented on May 15, 2024

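Right. This problem only requires loading trained parameters; once they are loaded, the error above no longer occurs. The error message in this case is indeed unhelpful, and we will improve it.

  1. Leave the example code unchanged for now (remember to save the changes you have already made) and first train the best model you can; the model path is results/final.pdparams
  2. Add the code shown in the image below
    [image]
  3. Run prediction

We will discuss internally whether prediction code should be added to the example and will get back to you.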
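
Since the fix hinges on the trained parameters actually being loaded, one way to check this is to compare the parameter names the model expects with the names stored in the checkpoint. The sketch below works on plain collections of names; compare_state_dict_keys and the example parameter names are hypothetical, and the comparison only reflects name-based matching on structured keys, which is the default behaviour of set_state_dict.

```python
def compare_state_dict_keys(model_keys, checkpoint_keys):
    """Report parameter names present on one side but not the other."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    return {
        "missing_in_checkpoint": sorted(model_keys - checkpoint_keys),
        "unexpected_in_checkpoint": sorted(checkpoint_keys - model_keys),
    }


# Hypothetical parameter names, for illustration only.
model_keys = ["embedding.weight", "gru.weight_ih", "fc.weight", "crf.transitions"]
checkpoint_keys = ["embedding.weight", "gru.weight_ih", "fc.weight"]

report = compare_state_dict_keys(model_keys, checkpoint_keys)
print(report)
```

With the code from this thread, the inputs would typically be network.state_dict().keys() and weights.keys(). Any name listed as missing keeps its initial value after loading, which could lead to meaningless predictions.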

xiaomeng654321 commented on May 15, 2024

OK, it works now. Thank you.
