
This project is forked from zihanwangki/crossweigh


CrossWeigh: Training Named Entity Tagger from Imperfect Annotations

Home Page: https://arxiv.org/abs/1909.01441

License: Apache License 2.0

Python 90.98% Shell 9.02%

Motivation

Label annotation mistakes made by human annotators pose two challenges for NER:

  • mistakes in the test set can interfere with the evaluation results and even lead to an inaccurate assessment of model performance.
  • mistakes in the training set can hurt NER model training.

We address these two problems by:

  • manually correcting the mistakes in the test set to form a cleaner benchmark.
  • developing a framework, CrossWeigh, to handle the mistakes in the training set.

CrossWeigh works with any NER algorithm that accepts weighted training instances. It is composed of two modules: (1) mistake estimation, where potential mistakes in the training data are identified through a cross-checking process; and (2) mistake re-weighing, where the weights of those potential mistakes are lowered when training the final NER model.
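
The two modules can be illustrated with a minimal sketch. It is not the repository's implementation: it assumes that a sentence counts as a potential mistake each time a cross-checking round predicts tags that differ from its annotation, and that every such disagreement scales the sentence weight by a constant factor. The names `count_mismatches`, `reweigh`, and `epsilon`, as well as the value 0.7, are illustrative assumptions.

```python
from typing import List, Sequence

Tags = Sequence[str]


def count_mismatches(gold: Sequence[Tags], rounds: Sequence[Sequence[Tags]]) -> List[int]:
    """Mistake estimation: per sentence, count the rounds whose predicted
    tags differ from the annotated tags."""
    counts = [0] * len(gold)
    for predicted in rounds:
        for i, (gold_tags, pred_tags) in enumerate(zip(gold, predicted)):
            if list(gold_tags) != list(pred_tags):
                counts[i] += 1
    return counts


def reweigh(counts: Sequence[int], epsilon: float = 0.7) -> List[float]:
    """Mistake re-weighing (assumed scheme): each disagreement multiplies
    the sentence weight by epsilon, so clean sentences keep weight 1.0."""
    return [epsilon ** c for c in counts]


# Toy annotations and two rounds of held-out predictions (made-up data).
gold = [["B-PER", "O"], ["B-LOC", "O"], ["O", "O"]]
rounds = [
    [["B-PER", "O"], ["B-ORG", "O"], ["O", "O"]],
    [["B-PER", "O"], ["B-ORG", "O"], ["O", "B-MISC"]],
]
counts = count_mismatches(gold, rounds)
weights = reweigh(counts)
print(counts)   # [0, 2, 1]
print(weights)  # the second sentence gets the lowest weight
```

The resulting weights would then be passed to any tagger that accepts per-sentence weights in its loss.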

Data

We name our corrected dataset CoNLL++.
/data/conllpp_test.txt is the manually corrected test set. Exactly 186 of its sentences should differ from the original test set.
/data/conllpp_train.txt and /data/conllpp_dev.txt are the original CoNLL03 training and development sets, taken from Named-Entity-Recognition-NER-Papers.
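
One way to check how many sentences differ between two test sets is sketched below. It assumes the usual CoNLL layout (one token per line, whitespace-separated columns with the NER tag last, blank lines between sentences) and that both files are sentence-aligned. The helper names `read_conll` and `count_differing_sentences` are hypothetical and not part of this repository.

```python
from typing import List, Tuple


def read_conll(text: str) -> List[List[Tuple[str, str]]]:
    """Parse CoNLL-style text into sentences of (token, tag) pairs."""
    sentences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        # Blank lines and document markers end the current sentence.
        if not line or line.startswith("-DOCSTART-"):
            if current:
                sentences.append(current)
                current = []
            continue
        columns = line.split()
        current.append((columns[0], columns[-1]))  # assumes the tag is the last column
    if current:
        sentences.append(current)
    return sentences


def count_differing_sentences(original_text: str, corrected_text: str) -> int:
    """Count aligned sentences whose tokens or tags differ."""
    original = read_conll(original_text)
    corrected = read_conll(corrected_text)
    if len(original) != len(corrected):
        raise ValueError("files are not sentence-aligned")
    return sum(a != b for a, b in zip(original, corrected))


# Tiny made-up sample: only the second sentence changes its label.
original_sample = "EU B-ORG\nrejects O\n\nJapan B-PER\nwins O\n"
corrected_sample = "EU B-ORG\nrejects O\n\nJapan B-LOC\nwins O\n"
print(count_differing_sentences(original_sample, corrected_sample))  # 1
```

To run it on real data, read /data/conllpp_test.txt and a local copy of the original CoNLL03 test file with `open(path).read()` and pass both strings to `count_differing_sentences`.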

Scripts

split.py can be used to generate a k-fold, entity-disjoint dataset from a list of datasets (usually both the training and development sets).
flair_scripts/flair_ner.py can be used to train a weighted version of Flair.
collect.py can be used to collect all the predictions on the k held-out test folds.
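
The idea of an entity-disjoint k-fold split can be sketched as follows. This is an illustration, not the logic of split.py: it assumes B-/I- style tags, treats an entity as its surface string, and drops from each fold's training portion every sentence that shares an entity with the held-out portion. The function names are hypothetical.

```python
import random
from typing import List, Sequence, Set, Tuple

Sentence = Tuple[Sequence[str], Sequence[str]]  # (tokens, tags)


def extract_entities(tokens: Sequence[str], tags: Sequence[str]) -> Set[str]:
    """Collect entity surface forms from B-/I- style tags (types are ignored)."""
    entities, current = set(), []
    for token, tag in zip(tokens, tags):
        if tag.startswith("I-") and current:
            current.append(token)  # continue the open entity
            continue
        if current:  # any other tag closes the open entity
            entities.add(" ".join(current))
            current = []
        if tag.startswith(("B-", "I-")):
            current = [token]  # start a new entity
    if current:
        entities.add(" ".join(current))
    return entities


def entity_disjoint_folds(sentences: Sequence[Sentence], k: int, seed: int = 0):
    """Return k (train_indices, held_out_indices) pairs in which no training
    sentence contains an entity that appears in the held-out fold."""
    indices = list(range(len(sentences)))
    random.Random(seed).shuffle(indices)
    entity_sets = [extract_entities(tokens, tags) for tokens, tags in sentences]
    splits = []
    for fold in range(k):
        held_out = indices[fold::k]
        held_out_set = set(held_out)
        held_out_entities = set().union(*(entity_sets[i] for i in held_out))
        train = [
            i for i in indices
            if i not in held_out_set and entity_sets[i].isdisjoint(held_out_entities)
        ]
        splits.append((train, held_out))
    return splits


# Made-up sentences; "Paris" occurs in two of them.
sentences = [
    (["Paris", "is", "nice"], ["B-LOC", "O", "O"]),
    (["I", "like", "Paris"], ["O", "O", "B-LOC"]),
    (["John", "Smith", "left"], ["B-PER", "I-PER", "O"]),
    (["It", "rained"], ["O", "O"]),
]
for train, held_out in entity_disjoint_folds(sentences, k=2):
    print(sorted(train), sorted(held_out))
```

Because sentences that share entities with the held-out fold are removed, each training portion is usually smaller than the (k-1)/k share of a plain k-fold split.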

Steps to reproduce

Make sure you are in a Python 3.6+ environment.
See example.sh to reproduce the results.
Using Flair (non-pooled version) with CrossWeigh, the final result should be around 93.19 F1 on the original test set and 94.18 F1 on the corrected test set. Using Flair without CrossWeigh gives around 92.9 F1 on the original test set.

Results

All results are F1 scores averaged across 5 runs; the standard deviation is reported in parentheses.

| Model | w/o CrossWeigh (original) | w/ CrossWeigh (original) | w/o CrossWeigh (corrected) | w/ CrossWeigh (corrected) |
|---|---|---|---|---|
| VanillaNER | 91.44(±0.16) | 91.78(±0.06) | 92.32(±0.16) | 92.64(±0.08) |
| Flair | 92.87(±0.08) | 93.19(±0.09) | 93.89(±0.06) | 94.18(±0.06) |
| Pooled-Flair | 93.14(±0.14) | 93.43(±0.06) | 94.13(±0.11) | 94.28(±0.05) |
| GCDT | 93.33(±0.14) | 93.43(±0.05) | 94.58(±0.15) | 94.65(±0.06) |
| LSTM-CRF | 90.64(±0.23) | - | 91.47(±0.15) | - |
| LSTM-CNNs-CRF | 90.65(±0.57) | - | 91.87(±0.50) | - |
| ELMo | 92.28(±0.19) | - | 93.42(±0.15) | - |

For all models, we use their suggested parameter settings.
For GCDT, we used the weights estimated from Pooled-Flair for efficiency purposes.
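
Entries in the mean(±std) form used above can be produced from per-run scores as in the sketch below. The scores are placeholders rather than results from this project, and the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics
from typing import Sequence


def summarize(scores: Sequence[float]) -> str:
    """Format run scores as mean(±std) using the sample standard deviation."""
    mean = statistics.mean(scores)
    std = statistics.stdev(scores)
    return f"{mean:.2f}(±{std:.2f})"


# Placeholder F1 scores for five hypothetical runs.
placeholder_scores = [90.0, 90.5, 91.0, 91.5, 92.0]
print(summarize(placeholder_scores))  # 91.00(±0.79)
```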

Citation

Please cite the following paper if you find our dataset or framework useful. Thanks!

Zihan Wang, Jingbo Shang, Liyuan Liu, Lihao Lu, Jiacheng Liu, and Jiawei Han. "CrossWeigh: Training Named Entity Tagger from Imperfect Annotations." arXiv preprint arXiv:1909.01441 (2019).

@article{wang2019cross,
  title={CrossWeigh: Training Named Entity Tagger from Imperfect Annotations},
  author={Wang, Zihan and Shang, Jingbo and Liu, Liyuan and Lu, Lihao and Liu, Jiacheng and Han, Jiawei},
  journal={arXiv preprint arXiv:1909.01441},
  year={2019}
}
