🌠 Naïve natural language inference system based on NABERT+, retrained with a BERT-Large encoder for an 8.2% EM and 7.4% F1 improvement.

NABERT-Large+

Getting Started | Results

This repository provides:

  • Reproduction guides and training results for:
    • NAQANet
    • NABERT
    • NABERT+
  • NABERT+ retrained with a BERT-Large encoder (NABERT-Large+), which improves EM by 8.2% and F1 by 7.4% on the dev set of DROP (Discrete Reasoning Over the content of Paragraphs).
  • A detailed report
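
For readers unfamiliar with the two metrics reported throughout, the following is a minimal sketch of exact match (EM) and token-level F1 for a single prediction. It is a simplified, SQuAD-style illustration and not the official DROP evaluator, which additionally handles numbers and multi-span answers. The function names `normalize`, `exact_match`, and `token_f1` are hypothetical and do not come from this repository.

```python
import re
import string
from collections import Counter
from typing import List


def normalize(text: str) -> List[str]:
    """Lowercase, strip punctuation and English articles, split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized answers are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token precision and recall (bag-of-words overlap)."""
    pred_tokens, gold_tokens = normalize(prediction), normalize(gold)
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

With this sketch, a prediction of "Denver Broncos" against a gold answer of "Broncos" scores 0 on EM but gets partial credit on F1, which is why F1 is always at least as high as EM in the results table.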

The code and training configs are based on work by @raylin1000 (the NABERT model) and AI2.

Built with PyTorch; configs use AllenNLP.

Getting Started

Install dependencies

# clone project
git clone https://github.com/BC-Li/nabert-large
cd nabert-large

# [OPTIONAL] create conda environment
conda create -n myenv python=3.7
conda activate myenv

# install pytorch according to instructions
# https://pytorch.org/get-started/

# install requirements
pip install -r requirements.txt

Running NAQANet Baseline

This requires about 8 GB of GPU memory and about 20 hours on an RTX 3090 for the model to converge.

allennlp train /nabert-large/src/baseline/config/naqanet.jsonnet -s /nabert-large/src/baseline/storage --include-package baseline
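
Before launching a long run, it can help to confirm that the GPU meets the memory requirement. The following is a rough sketch, assuming PyTorch is installed and that device 0 is the GPU you intend to train on. The helper `fits_in_gpu` is a hypothetical name, not part of this repository.

```python
def fits_in_gpu(total_bytes: int, required_gb: float) -> bool:
    """Return True if the device's total memory covers the requirement (GiB)."""
    return total_bytes >= required_gb * 1024 ** 3


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        props = torch.cuda.get_device_properties(0)
        # 8 is the approximate NAQANet requirement stated above.
        print(props.name, "OK" if fits_in_gpu(props.total_memory, 8) else "too small")
    else:
        print("No CUDA device visible to PyTorch")
```

This only compares total device memory. Memory already in use by other processes is not accounted for.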

Running NABERT or NABERT+ Baseline

allennlp train /nabert/src/nabert/config/nabert.json -s /nabert/src/nabert/storage --include-package nabert

Train NABERT-Large+

Use config from src/nabert-large/config.

Make sure your GPU has more than 22 GB of memory. I trained the model on a single RTX 3090 for about 20 hours, using early stopping to avoid overfitting.

allennlp train /nabert/src/nabert/config/nabert-large.json -s /nabert-large/src/nabert-large/storage --include-package nabert-large
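
The early stopping mentioned above can be sketched as a patience rule: training stops once the validation metric has not improved for a fixed number of epochs, and the best checkpoint is kept. AllenNLP trainer configs usually expose this as a patience setting, but the value used for this run is not stated here, so the function and numbers below are purely illustrative assumptions.

```python
from typing import Iterable, Tuple


def best_epoch_with_patience(scores: Iterable[float], patience: int) -> Tuple[int, int]:
    """Return (best_epoch, last_epoch_run) for a higher-is-better validation metric.

    Training stops at the first epoch that is `patience` or more epochs past
    the best epoch seen so far without improving on it.
    """
    best_score = float("-inf")
    best_epoch = -1
    last_epoch = -1
    for epoch, score in enumerate(scores):
        last_epoch = epoch
        if score > best_score:
            best_score, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            break
    return best_epoch, last_epoch


# Made-up validation F1 values, not taken from the actual training logs.
hypothetical_f1 = [60.0, 65.0, 66.0, 65.5, 65.8, 65.9, 70.0]
print(best_epoch_with_patience(hypothetical_f1, patience=2))
```

With a patience of 2, the run in this example stops at epoch 4 and keeps the checkpoint from epoch 2, so the later jump at epoch 6 is never reached. A larger patience trades extra training time for a lower risk of stopping too early.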

TensorBoard Support

AllenNLP also supports TensorBoard. To open it, point the --logdir parameter at your storage directory and run the following command.

tensorboard --logdir="/nabert-large/src/nabert-large/storage"

Results

Model/Human                                 EM     F1
NAQANet                                     46.20  49.24
NABERT                                      54.67  57.64
NABERT+                                     62.67  66.29
NumNet                                      64.92  68.31
NABERT-Large+ (dropout=0.11)                67.82  71.25
OPERA (Current Rank 1 on DROP Leaderboard)  86.79  89.41
Human                                       94.90  96.42
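
The table can be related to the headline improvement figures. One reading consistent with these numbers is that the quoted percentages are relative gains of NABERT-Large+ over NABERT+; this is an interpretation, not something stated explicitly in the README. A small sketch of the arithmetic, using only values from the table:

```python
def relative_gain(new: float, base: float) -> float:
    """Relative improvement of `new` over `base`, in percent."""
    return (new - base) / base * 100.0


# Dev-set scores copied from the results table.
nabert_plus = {"em": 62.67, "f1": 66.29}
nabert_large_plus = {"em": 67.82, "f1": 71.25}

for metric in ("em", "f1"):
    absolute = nabert_large_plus[metric] - nabert_plus[metric]
    relative = relative_gain(nabert_large_plus[metric], nabert_plus[metric])
    print(f"{metric.upper()}: +{absolute:.2f} points absolute, +{relative:.2f}% relative")
```

In absolute terms the gains are about 5.15 EM points and 4.96 F1 points. In relative terms they come to about 8.2% for EM and just under 7.5% for F1, close to the 8.2% and 7.4% quoted in the introduction.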

Training Results

NABERT-Large+

[Figures: Train Batch EM, Train Batch F1, Validation/Train EM, Validation/Train F1]

NAQANet Baseline

[Figures: Train Batch EM, Train Batch F1, Train EM, Train F1, Train Loss, Validation EM, Validation F1]

NABERT

[Figures: Train Batch EM, Train Batch F1, Validation/Train EM, Validation/Train F1]

NABERT+

[Figures: Train Batch EM, Train Batch F1, Validation/Train EM, Validation/Train F1]
