sheng-z / figet

Fine-grained Entity Typing / Fine-grained Entity Classification

License: MIT License

Python 91.86% Shell 8.14%
natural-language-processing machine-learning pytorch fine-grained-entity-typing fine-grained-entity-classification figer

figet's Introduction

Fine-grained Entity Typing through Increased Discourse Context and Adaptive Classification Thresholds

Source code and data for the *SEM 2018 paper Fine-grained Entity Typing through Increased Discourse Context and Adaptive Classification Thresholds.

Citation

The source code and data in this repository aim to facilitate the study of fine-grained entity typing. If you use the code or data, please cite the paper as follows:

@InProceedings{zhang-EtAl:2018:starSEM,
  author    = {Zhang, Sheng  and  Duh, Kevin  and  {Van Durme}, Benjamin},
  title     = {{Fine-grained Entity Typing through Increased Discourse Context and Adaptive Classification Thresholds}},
  booktitle = {Proceedings of the 7th Joint Conference on Lexical and Computational Semantics (*SEM 2018)},
  month     = {June},
  year      = {2018}
}

Benchmark Performance

1. OntoNotes (Gillick et al., 2014)

Approach                           Strict F1   Macro F1   Micro F1
Our Approach                       55.52       73.33      67.61
    w/o Adaptive thresholds        53.49       73.11      66.78
    w/o Document-level contexts    53.17       72.14      66.51

Approach                           Strict F1   Macro F1   Micro F1
Our Approach                       60.23       78.67      75.52
    w/o Adaptive thresholds        60.05       78.50      75.39

Approach                           Strict F1   Macro F1   Micro F1
Our Approach                       60.87       77.75      76.94
    w/o Adaptive thresholds        58.47       75.84      75.03
    w/o Document-level contexts    58.12       75.65      75.11
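
For readers unfamiliar with the three columns, the sketch below shows one common way to compute strict, loose macro, and loose micro F1 for multi-label entity typing. It is an illustration only: the function names are hypothetical, the definitions follow the usual convention in the entity-typing literature, and the repository's own evaluation code may differ in details such as how mentions with no predicted type are handled.

```python
from typing import List, Set


def _f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total > 0 else 0.0


def strict_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """A mention is correct only if its predicted type set equals the gold set."""
    correct = sum(1 for g, p in zip(gold, pred) if g == p)
    accuracy = correct / len(gold)
    # Precision and recall are both taken over all mentions in this convention.
    return _f1(accuracy, accuracy)


def macro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Average per-mention precision and recall, then combine."""
    n = len(gold)
    precision = sum(len(g & p) / len(p) for g, p in zip(gold, pred) if p) / n
    recall = sum(len(g & p) / len(g) for g, p in zip(gold, pred) if g) / n
    return _f1(precision, recall)


def micro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Pool type counts over all mentions before computing precision and recall."""
    overlap = sum(len(g & p) for g, p in zip(gold, pred))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)
    precision = overlap / n_pred if n_pred else 0.0
    recall = overlap / n_gold if n_gold else 0.0
    return _f1(precision, recall)


if __name__ == "__main__":
    # Toy data, not taken from any of the benchmark corpora.
    gold = [{"/person", "/person/athlete"}, {"/other"}]
    pred = [{"/person"}, {"/other"}]
    print(strict_f1(gold, pred), macro_f1(gold, pred), micro_f1(gold, pred))
```

Strict F1 rewards only exact matches of the whole type set, which is why it is consistently the lowest of the three columns.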

Prerequisites

  • Python 2.7
  • PyTorch 0.2.0 (w/ CUDA support)
  • NumPy
  • tqdm

Running

Once the prerequisites are installed, the whole pipeline runs through a single script per corpus. Taking the OntoNotes corpus as an example:

Step 1: Download the data

./scripts/ontonotes.sh get_data

Step 2: Preprocess the data

./scripts/ontonotes.sh preprocess

Step 3: Train the model

./scripts/ontonotes.sh train

Step 4: Tune the threshold

./scripts/ontonotes.sh adaptive-thres

Step 5: Do inference

./scripts/ontonotes.sh inference
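
To give an intuition for what Step 4 does, the following is a minimal sketch of per-type threshold tuning on a development set. It is not the repository's implementation: the function names, the candidate grid, and the choice of per-type F1 as the selection criterion are all assumptions made for illustration. The idea is simply that each type gets its own decision threshold instead of a single global cut-off such as 0.5.

```python
from typing import List, Sequence

# Hypothetical candidate grid; the actual search space may differ.
CANDIDATES = (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)


def tune_thresholds(
    scores: List[List[float]],
    gold: List[List[int]],
    candidates: Sequence[float] = CANDIDATES,
) -> List[float]:
    """Pick, for every type, the candidate threshold with the best dev-set F1.

    scores[i][t] is the model's score for type t on mention i;
    gold[i][t] is 1 if type t is a gold label of mention i, else 0.
    """
    n_types = len(scores[0])
    thresholds = []
    for t in range(n_types):
        best_thr, best_f1 = candidates[0], -1.0
        for thr in candidates:
            tp = fp = fn = 0
            for score_row, gold_row in zip(scores, gold):
                predicted = score_row[t] >= thr
                if predicted and gold_row[t]:
                    tp += 1
                elif predicted:
                    fp += 1
                elif gold_row[t]:
                    fn += 1
            f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
            # Keep the lowest threshold among ties by requiring strict improvement.
            if f1 > best_f1:
                best_thr, best_f1 = thr, f1
        thresholds.append(best_thr)
    return thresholds


def predict(score_row: Sequence[float], thresholds: Sequence[float]) -> List[int]:
    """Apply the tuned per-type thresholds to one mention's scores."""
    return [int(s >= thr) for s, thr in zip(score_row, thresholds)]


if __name__ == "__main__":
    # Toy dev set with two types; the second type is systematically under-scored.
    dev_scores = [[0.9, 0.3], [0.8, 0.35], [0.2, 0.1]]
    dev_gold = [[1, 1], [1, 1], [0, 0]]
    thresholds = tune_thresholds(dev_scores, dev_gold)
    print(thresholds, predict([0.9, 0.3], thresholds))
```

In the toy data a fixed threshold of 0.5 would miss every instance of the second type, while a lower type-specific threshold recovers them.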

Acknowledgements

The datasets (Wiki and OntoNotes) are copied from Sonse Shimaoka's repository.

License

MIT

figet's Issues

strange training data

In train.txt I see a lot of annotations that don't make any sense to me.
For example
0 1 Oh , after I shave it I rub on a little oil . /person/athlete /person
So the mention "Oh" is supposed to be categorized as an athlete???
0 1 Do you wear a hat ? /other /other/religion
Here "Do" is a religion?
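
For reference, the quoted lines appear to consist of a mention start index, an end index, the sentence tokens, and then the type labels. The sketch below reads a line under that assumption. It is hypothetical and based only on the lines as displayed here: it treats fields as whitespace-separated, treats the end index as exclusive, and takes every trailing field beginning with "/" as a label. The actual file may use a different delimiter or carry extra fields.

```python
from typing import List, NamedTuple


class Mention(NamedTuple):
    start: int          # index of the first mention token
    end: int            # index one past the last mention token (assumed exclusive)
    tokens: List[str]   # sentence tokens
    types: List[str]    # type labels such as "/person/athlete"


def parse_line(line: str) -> Mention:
    """Parse one line of the form 'start end tok tok ... /type /type'."""
    fields = line.split()
    start, end = int(fields[0]), int(fields[1])
    rest = fields[2:]
    # Walk back from the end: trailing fields that start with "/" are labels.
    split_at = len(rest)
    while split_at > 0 and rest[split_at - 1].startswith("/"):
        split_at -= 1
    return Mention(start, end, rest[:split_at], rest[split_at:])


def mention_text(mention: Mention) -> str:
    """Return the surface string of the annotated mention."""
    return " ".join(mention.tokens[mention.start:mention.end])


if __name__ == "__main__":
    example = "0 1 Oh , after I shave it I rub on a little oil . /person/athlete /person"
    m = parse_line(example)
    print(mention_text(m), m.types)
```

Printing the mention text next to its labels in this way makes it easy to scan the training file for questionable annotations like the two above.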

Results on ontonotes

Hi, I'm trying to replicate your numbers on OntoNotes, but running for the default 15 epochs gives me 52.98, 69.58, 62.95, which is much lower than the numbers in the paper. Can you tell me if I'm doing something wrong? Thanks very much.

PyTorch version

Hello, I am sorry to bother you. Can I use PyTorch 1.1 to run your code? When I try to download PyTorch 0.2.0, it tells me that I must use CUDA 7.5, but my CUDA version is 10.0. I look forward to your reply. Thanks a lot.
