GithubHelp home page GithubHelp logo

mylv1222 / arns Goto Github PK

View Code? Open in Web Editor NEW

This project forked from sanmusunrise/arns

0.0 1.0 0.0 350 KB

Sequence-to-Nuggets: Nested Entity Mention Detection via Anchor-Region Networks

License: GNU General Public License v3.0

Python 100.00%

arns's Introduction

Sequence-to-Nuggets: Nested Entity Mention Detection via Anchor-Region Networks

This is the source code for paper "Sequence-to-Nuggets: Nested Entity Mention Detection via Anchor-Region Networks" in ACL2019.

Requirements

  • Pytorch >= 0.4.0
  • tabulate == 0.8.3

Usage

First, please unzip the datasets in "data/"

  • tar zxvf ACE2005.tgz
  • tar zxvf GENIA.tgz
  • tar zxvf KBP2017.tgz

Then enter "src/workspace_*" dir, run the program like

  • python main_seq2nugget.py

Hyperparameters in our paper are saved in "main_seq2nugget.py" files.

Additional Notes on KBP2017 Evaluation

KBP2017 contains forum data and the original annoatations marked the author fields of each post as either "PER" or "ORG". This can result in an undervalued recall rate if we did not take this into consideration because the data used in our model removed all author fields.

Following all previous works, we annotate all post author field in original data as "PER" in "kbp2017_evaluator/kbp2017_test_eval_files/authors.dat". Then for correct evaluation, we need to combine the "authors.dat" with the system output, and then use the official evalution toolkit to obtain the accurate evaluation result.

For this, we should first install the offical evaluation toolkit with command

And then transform our system output into the required format (also combine it with the author fields) by entering "kbp2017_evaluator" dir and then use the command:

  • python generate_eval_format.py [test_result_file] sys_output_for_eval.dat

Finally, we obtain the offical evalution result with command

  • neleval evaluate -m strong_typed_mention_match -f tab -g kbp2017_test_golden.dat sys_output_for_eval.dat

The "test_result_file" can be found in the Model dir in workspace. We also provide a sample test_result in "kbp2017_test_eval_files/sample_sys_test_output.dat" which is produced by the model reported in our paper.

Citation

Please cite:

  • Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun. Sequence-to-Nuggets: Nested Entity Mention Detection via Anchor-Region Networks. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.
@InProceedings{lin-Etal:2019:ACL2019sequence,
  author    = {Lin, Hongyu and Lu, Yaojie and Han, Xianpei and Sun, Le},
  title     = {Sequence-to-Nuggets: Nested Entity Mention Detection via Anchor-Region Networks},
  booktitle = {Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics},
  month     = {July},
  year      = {2019},
  publisher = {Association for Computational Linguistics}
}

Contact

If you have any question or want to request for the used datasets (ACE2005, GENIA and KBP2017 if you have the license), please contact me by

arns's People

Contributors

sanmusunrise avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.