Astro-mT5

This repository contains the code and the paper for "Astro-mT5: Entity Extraction from Astrophysics Literature using mT5 Language Model", accepted at the AACL-IJCNLP 2022 Workshop.

Abstract

Scientific research requires reading existing scientific literature and extracting relevant information from it effectively. To gain insights from a collection of such scientific documents, extracting entities and recognizing their types is considered one of the important tasks. Numerous studies have been conducted in this area of research. In our study, we introduce a framework for entity recognition and identification on the NASA astrophysics dataset, which was published as part of the DEAL SharedTask. We use a pre-trained multilingual model, based on a natural language processing framework, for the given sequence labeling tasks. Experiments show that our model, Astro-mT5, outperforms the existing baseline in astrophysics-related information extraction. Our paper is available at work.

Setup

Install Package Dependencies

git clone https://github.com/flairNLP/flair.git
cd flair
git checkout add-t5-encoder-support
pip3 install -e .
To run the experiments, run_ner.py and test.py must be placed inside the flair directory.
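
As a quick sanity check before training, the required layout can be verified with a few lines of Python. This is a minimal sketch and not part of the repository: the helper name `missing_scripts` is hypothetical, and it only assumes that `run_ner.py` and `test.py` must sit directly inside the cloned `flair` directory.

```python
from pathlib import Path

# Scripts that must live inside the flair checkout before running experiments.
REQUIRED_SCRIPTS = ("run_ner.py", "test.py")


def missing_scripts(flair_dir):
    """Return the required script names that are not present in flair_dir."""
    root = Path(flair_dir)
    return [name for name in REQUIRED_SCRIPTS if not (root / name).is_file()]


if __name__ == "__main__":
    # "flair" is the directory created by the git clone step.
    missing = missing_scripts("flair")
    if missing:
        print("Copy these files into flair/:", ", ".join(missing))
    else:
        print("Both scripts are in place.")
```

Run it from the directory that contains the `flair` checkout; an empty result means both scripts were found.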

Training

The main training command is:
python3 run_ner.py --dataset_name NER_MASAKHANE \
    --model_name_or_path google/mt5-large \
    --layers -1 \
    --subtoken_pooling first_last \
    --hidden_size 256 \
    --batch_size 4 \
    --learning_rate 5e-05 \
    --num_epochs 100 \
    --use_crf True \
    --output_dir /content/mt5-large
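
A long multi-line shell command is easy to break, because a missing space before a trailing backslash glues two arguments together. One way to sidestep this is to assemble the argument list in Python and either print it or launch it from there. The sketch below is illustrative only: the flag names and values are copied from the training command, while `TRAIN_ARGS` and `build_command` are hypothetical names that do not exist in the repository.

```python
import shlex

# Hyperparameters copied from the training command in this README.
TRAIN_ARGS = {
    "dataset_name": "NER_MASAKHANE",
    "model_name_or_path": "google/mt5-large",
    "layers": "-1",
    "subtoken_pooling": "first_last",
    "hidden_size": "256",
    "batch_size": "4",
    "learning_rate": "5e-05",
    "num_epochs": "100",
    "use_crf": "True",
    "output_dir": "/content/mt5-large",
}


def build_command(args, script="run_ner.py"):
    """Turn a flag -> value mapping into an argv list for the training script."""
    argv = ["python3", script]
    for flag, value in args.items():
        argv += [f"--{flag}", str(value)]
    return argv


if __name__ == "__main__":
    argv = build_command(TRAIN_ARGS)
    # shlex.join quotes each token, so the printed line can be pasted into a shell.
    print(shlex.join(argv))
    # To launch training from Python instead (run from inside the flair directory):
    # import subprocess; subprocess.run(argv, check=True)
```

Keeping the hyperparameters in one mapping also makes it simple to change a single value, such as the batch size, without retyping the whole command.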

Testing

After training, you can identify the best checkpoint on the dev set from the evaluation results. To do this, run:
python3 test.py
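
Choosing the best checkpoint by dev-set score can be sketched as follows. This is an illustrative example only, not the logic of test.py: it assumes the training run produced a tab-separated log with one row per epoch, and the column names `EPOCH` and `DEV_F1` as well as the sample values are hypothetical.

```python
import csv
import io


def best_epoch(tsv_text, score_column="DEV_F1"):
    """Return (epoch, score) for the row with the highest dev score."""
    rows = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    best = max(rows, key=lambda row: float(row[score_column]))
    return int(best["EPOCH"]), float(best[score_column])


# Made-up log contents, for illustration only.
SAMPLE_LOG = "EPOCH\tDEV_F1\n1\t0.61\n2\t0.74\n3\t0.70\n"

if __name__ == "__main__":
    epoch, score = best_epoch(SAMPLE_LOG)
    print(f"best epoch: {epoch} (dev F1 = {score})")
```

To use it on a real run, read the actual log file into a string and pass the column name that holds the dev-set metric.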
