
bert-bilstm-crf-pytorch's Introduction

Bert-BiLSTM-CRF-pytorch

A BiLSTM-CRF sequence labeling model that uses Google's pretrained BERT for character embeddings.

This model uses Google's pretrained BERT model (https://github.com/google-research/bert). The pytorch-pretrained-BERT project (https://github.com/huggingface/pytorch-pretrained-BERT) is used to load the BERT model and convert it to PyTorch parameters. The CRF code is based on SLTK (https://github.com/liu-nlper/SLTK).

For the expected data format, see data.

Model parameters can be set in config.

To run the code:

python main.py train --use_cuda=False --batch_size=10

pytorch.bin Baidu Netdisk link: https://pan.baidu.com/s/160cvZXyR_qdAv801bDY2mQ (extraction code: q67r)

The author is also a beginner and would very much welcome feedback, so that we can learn together.

bert-bilstm-crf-pytorch's People

Contributors

chenxiaoyouyou


bert-bilstm-crf-pytorch's Issues

A few questions about the CRF

https://github.com/CLUEbenchmark/CLUENER2020/blob/da6631c21d050309117ea28640c757bb46a1255e/pytorch_version/models/crf.py#L41-L42

Shouldn't this be written as follows?

self.transitions.detach()[:, :self.tag_dictionary[self.START_TAG]] = -10000
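Two PyTorch behaviors are relevant when reading this suggestion, and the sketch below illustrates both on a toy tensor. First, `detach()` returns a tensor that shares storage with the original, so an in-place write through it also changes the parameter. Second, `[:, k]` selects the single column `k`, while `[:, :k]` selects every column before `k`. The tag count and the `START` index are assumptions made for illustration; they are not taken from the linked repository.

```python
import torch
from torch import nn

NUM_TAGS = 4   # hypothetical tag-set size
START = 2      # hypothetical index of the start tag

# Write to one column through detach(): the parameter itself changes,
# because the detached tensor shares storage with it.
single = nn.Parameter(torch.zeros(NUM_TAGS, NUM_TAGS))
single.detach()[:, START] = -10000.0

# Write to a slice of columns: every column with index < START changes,
# and column START itself is left untouched.
sliced = nn.Parameter(torch.zeros(NUM_TAGS, NUM_TAGS))
sliced.detach()[:, :START] = -10000.0

print(single.data)
print(sliced.data)
```

Which of the two forms is intended depends on how the transition matrix is indexed in the surrounding code, so this only shows what each expression does.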

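Also, here: since tags is already a label sequence that includes [CLS] and [SEP], why are [CLS] and [SEP] concatenated again on the left and on the right? I find this a bit puzzling.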

https://github.com/CLUEbenchmark/CLUENER2020/blob/da6631c21d050309117ea28640c757bb46a1255e/pytorch_version/models/crf.py#L133-L150

    def _score_sentence(self, feats, tags, lens_):
        start = torch.LongTensor([self.tag_dictionary[self.START_TAG]]).to(self.device)
        start = start[None, :].repeat(tags.shape[0], 1)
        stop = torch.LongTensor([self.tag_dictionary[self.STOP_TAG]]).to(self.device)
        stop = stop[None, :].repeat(tags.shape[0], 1)
        pad_start_tags = torch.cat([start, tags], 1)
        pad_stop_tags = torch.cat([tags, stop], 1)
        for i in range(len(lens_)):
            pad_stop_tags[i, lens_[i] :] = self.tag_dictionary[self.STOP_TAG]
        score = torch.FloatTensor(feats.shape[0]).to(self.device)
        for i in range(feats.shape[0]):
            r = torch.LongTensor(range(lens_[i])).to(self.device)
            score[i] = torch.sum(
                self.transitions[
                    pad_stop_tags[i, : lens_[i] + 1], pad_start_tags[i, : lens_[i] + 1]
                ]
            ) + torch.sum(feats[i, r, tags[i, : lens_[i]]])
        return score
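To make the role of the two concatenations easier to follow, here is a pure-Python sketch of what the quoted function computes for a single sentence. Prepending the start tag and appending the stop tag produces two sequences shifted by one position, so that zipping them yields every (previous tag, next tag) pair, including the boundary transitions. The function name `score_one_sentence` and the toy numbers are hypothetical. The `transitions[next][prev]` indexing follows the quoted code.

```python
def score_one_sentence(feats, tags, length, transitions, start_tag, stop_tag):
    """Score of one tag path: transition scores plus emission scores."""
    path = list(tags[:length])          # ignore padding positions
    prev_tags = [start_tag] + path      # plays the role of pad_start_tags
    next_tags = path + [stop_tag]       # plays the role of pad_stop_tags
    # One transition per adjacent pair, indexed as transitions[next][prev].
    transition_score = sum(
        transitions[nxt][prv] for prv, nxt in zip(prev_tags, next_tags)
    )
    # One emission per real token: feats[t][tag at position t].
    emission_score = sum(feats[t][tag] for t, tag in enumerate(path))
    return transition_score + emission_score


# Toy setup: tags 0 and 1 are labels, 2 is the start tag, 3 is the stop tag.
transitions = [[10 * to + frm for frm in range(4)] for to in range(4)]
feats = [
    [0.5, 0.0, 0.0, 0.0],
    [0.0, 1.5, 0.0, 0.0],
    [9.0, 9.0, 9.0, 9.0],   # padding position, not scored
]
score = score_one_sentence(feats, [0, 1, 0], 2, transitions, 2, 3)
print(score)
```

For a path of `length` tokens this sums `length + 1` transitions and `length` emissions. Whether the start and stop tags duplicate labels already present in `tags` depends on how the dataset encodes them, which the sketch does not settle.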

Error: OSError 22

OSError: [Errno 22] Invalid argument: 'result\\2022-02-10#11:07:43--epoch:0'
I ran the code, and it stops at this step.
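The path in the message uses a backslash separator and contains `:` characters in the timestamp. Windows does not allow `:` in file names, so this is a likely cause. A minimal sketch of building a result path without colons follows. The helper name `safe_result_path` and the exact name layout are assumptions for illustration, not the project's own code.

```python
import os
from datetime import datetime


def safe_result_path(result_dir, epoch, now=None):
    """Build a timestamped result path that avoids ':' in the file name."""
    now = now or datetime.now()
    # Use '-' between time fields instead of ':' so the name is valid on Windows.
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    return os.path.join(result_dir, f"{stamp}--epoch-{epoch}")


path = safe_result_path("result", 0, datetime(2022, 2, 10, 11, 7, 43))
print(path)
```

Passing `now` explicitly keeps the function easy to test; in normal use it can be omitted.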

How to avoid BERT's excessive memory usage

    self.embed = BertModel.from_pretrained('./bert-base-uncased')  # pretrained BERT model

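This appears to treat the entire BERT model as the embedding layer. During training I used BERT's 768-dimensional word vectors, which led to very high memory usage (50 GB+). Is there a way to avoid using this much memory, for example by using the word embeddings directly instead of embedding the whole model?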
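One commonly used way to reduce training memory is to freeze the encoder and run it under `torch.no_grad()`, so that no gradients or backward activations are kept for it and only the layers on top are trained. The sketch below shows the pattern with a small `nn.Embedding` as a stand-in for BERT. The class name `FrozenEncoderTagger` and all sizes are hypothetical, and whether this resolves the 50 GB+ usage described above is not confirmed in the thread.

```python
import torch
from torch import nn


class FrozenEncoderTagger(nn.Module):
    """Trainable classifier on top of an encoder whose weights stay fixed."""

    def __init__(self, encoder, hidden_size, num_tags):
        super().__init__()
        self.encoder = encoder
        # Freeze the encoder: no gradients are computed or stored for it.
        for param in self.encoder.parameters():
            param.requires_grad = False
        self.classifier = nn.Linear(hidden_size, num_tags)

    def forward(self, token_ids):
        # no_grad avoids keeping encoder activations for the backward pass.
        with torch.no_grad():
            hidden = self.encoder(token_ids)
        return self.classifier(hidden)


# Stand-in encoder (assumption): a real setup would pass the BERT model here.
encoder = nn.Embedding(100, 16)
model = FrozenEncoderTagger(encoder, hidden_size=16, num_tags=3)
logits = model(torch.randint(0, 100, (2, 5)))
logits.sum().backward()
```

The trade-off is that the encoder is no longer fine-tuned. Only parameters with `requires_grad=True` should then be passed to the optimizer.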

TypeError: 'generator' object is not subscriptable when running on GPU?

Hello, I get the following error when running your code on a GPU. My PyTorch version is 0.4.1:
self.check_forward_args(input, hx, batch_sizes)
File "/home/nlp/anaconda2/envs/Bert-BiLSTM-CRF-pytorch/lib/python3.5/site-packages/torch/nn/modules/rnn.py", line 146, in check_forward_args
check_hidden_size(hidden[0], expected_hidden_size,
TypeError: 'generator' object is not subscriptable

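Model prediction accuracy

Roughly what accuracy does the model reach on the test set? And roughly what accuracy can it reach for part-of-speech tagging? Thanks!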

TypeError: 'NoneType' object is not callable

The error occurs at this step:
File "/data/Bert-BiLSTM-CRF-pytorch/model/bert_lstm_crf.py", line 47, in forward
embeds, _ = self.word_embeds(sentence, attention_mask=attention_mask, output_all_encoded_layers=False)
TypeError: 'NoneType' object is not callable
What is the cause?
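The message itself means that `self.word_embeds` was `None` when `forward` tried to call it. The thread does not say why. One possibility, stated here as an assumption, is that loading the pretrained model did not return a model object. A minimal sketch of a guard that fails early with a clearer message is given below. The helper `require_loaded` is hypothetical and not part of any library.

```python
def require_loaded(model, source):
    """Return the model, or raise immediately if loading produced None."""
    if model is None:
        raise RuntimeError(
            f"Loading the pretrained model from {source!r} returned None; "
            "check that the path exists and contains the expected files."
        )
    return model


# Usage with stand-in values: a real call would wrap the model-loading call.
loaded = require_loaded(object(), "./bert-base-uncased")

try:
    require_loaded(None, "./missing-dir")
    failed_early = False
except RuntimeError:
    failed_early = True
```

Checking at construction time moves the failure from the first forward pass to the point where the model is loaded, which makes the cause easier to locate.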
