GithubHelp home page GithubHelp logo

ner's Introduction

命名实体识别实践与探索

完整文档:https://zhuanlan.zhihu.com/p/166496466

最近在做命名实体识别(Named Entity Recognition, NER)的工作,也就是序列标注(Sequence Tagging),老 NLP task 了,虽然之前也做过但是想细致地捋一下,看一下自从有了 LSTM+CRF 之后,NER 在做些什么,顺便记录一下最近的工作,中间有些经验和想法,有什么就记点什么

还是先放结论

命名实体识别虽然是一个历史悠久的老任务了,但是自从2015年有人使用了BI-LSTM-CRF模型之后,这个模型和这个任务简直是郎才女貌,天造地设,轮不到任何妖怪来反对。直到后来出现了BERT。在这里放两个问题:

  1. 2015-2019年,BERT出现之前4年的时间,命名实体识别就只有 BI-LSTM-CRF 了吗?
  2. 2019年BERT出现之后,命名实体识别就只有 BERT-CRF(或者 BERT-LSTM-CRF)了吗?

经过我不完善也不成熟的调研之后,好像的确是的,一个能打的都没有

既然模型打不动了,然后我找了找 ACL2020 做NER的论文,看看现在的NER还在做哪些事情,主要分几个方面

  • 多特征:实体识别不是一个特别复杂的任务,不需要太深入的模型,那么就是加特征,特征越多效果越好,所以字特征、词特征、词性特征、句法特征、KG表征等等的就一个个加吧,甚至有些中文NER任务里还加入了拼音特征、笔画特征。。?心有多大,特征就有多多
  • 多任务:很多时候做NER的目的并不仅是为了NER,而是一个大任务下的子任务,比如信息抽取、问答系统等等的,如果要做一个端到端的模型,那么就需要根据自己的需求和场景,做成一个多任务模型,把NER作为其中一个子任务;另外,单独的NER本身也可以做成多任务,比如一个用来识别实体,一个用来判断实体类型
  • 时令大杂烩:把当下比较流行的深度学习话题或方法跟NER结合一下,比如结合强化学习的NER、结合 few-shot learning 的NER、结合多模态信息的NER、结合跨语种学习的NER等等的,具体就不提了

所以沿着上述思路,就在一个中文NER任务上做一些实践,写一些模型。都列在下面了,首先是 LSTM-CRF 和 BERT-CRF,然后 Cascade 开头的是几个多任务模型(因为实体类型比较多,把NER拆成两个任务,一个用来识别实体,另一个用来判断实体类型),后面的几个模型里,WLF 指的是 Word Level Feature(即在原本字级别的序列标注任务上加入词级别的表征),WOL 指的是 Weight of Loss(即在loss函数方面通过设置权重来权衡Precision与Recall,以达到提高F1的目的)

ner's People

Contributors

wavewangyue avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

ner's Issues

使用train_multitask_bert.py时的路径报错

从知乎一路追过来的,您给出的代码源文件里定义的路径是

bert_vocab_path = "./bert_model/vocab.txt" bert_config_path = "./bert_model/bert_config.json" bert_ckpt_path = "./bert_model/chinese_L-12_H-768_A-12.ckpt"

但bert文件解压出来以后没有这个ckpt文件,请问应该怎么解决
试过更改文件等方法,也没有查找到解决策略,多谢解答!

chinese_L-12_H-768_A-12.ckpt文件是哪一个?

你好,因为是小白所以诸多不懂,想问一下chinese_L-12_H-768_A-12.ckpt文件是从哪里得到的?从bert项目中下载下来的是一个压缩包,包含如下文件,请问作者您使用的是哪个呢?
image

关于词汇信息的融入的问题

看了一下WLF的代码,词向量和字符向量是直接拼接在一起的,想知道是怎么保证分词后的序列与字符序列的长度是一样的呢?直接拼接是否能把词信息融合到对应的字符中去呢,毕竟NER是基于字符的标注。

how to define the words inside the vocab_attr.txt?

Hi ,

Thanks for your amazing works. I agree with your point very much : For ner task with too many ner classes, we should use multi-task to identify BIOS and NER classes.

In your codes, I appreciate your vocab_attr.txt very much. If i have my own dataset, how to define the words to be put into this txt file?

I focus on extracting financial related entities with my dataset.

Thanks a lot.

Marcus

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.