self-explaining-NLP

Code, models, and datasets for the paper "Self-Explaining Structures Improve NLP Models".

Installation

pip install -r requirements.txt

Prepare Datasets and Models

  • Download the IMDB dataset. The official corpus can be found HERE. We also provide processed raw text, which you can download HERE. Save the processed raw text at [IMDB_PATA_PATH].
  • Download the SST-5 dataset. The official corpus can be found HERE. We also provide processed raw text, which you can download HERE. Save the processed raw text at [SST_PATA_PATH].
  • Download the SNLI dataset. The official corpus can be found HERE. Save the SNLI dataset at [SNLI_PATA_PATH].
  • Download the vanilla RoBERTa-base model released by HuggingFace, which can be found HERE. Save the model at [ROBERTA_BASE_PATH].

Reproduce paper results step by step

In this paper, we apply self-explaining structures to several NLP tasks. This repo contains all training and evaluation code, but here we only provide commands for the SST-5 task as an example. For other tasks, you can reproduce the results by modifying the commands accordingly.
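If you plan to run several tasks, it can help to generate the training command instead of editing it by hand. The sketch below only reuses the flags and hyperparameter values from the SST-5 training command in step 1. The helper names `build_train_command` and `as_shell` are hypothetical and not part of this repo, the valid `--task` values other than `sst5` are not listed here (check `trainer.py`), and the SST-5 hyperparameters may not be the right ones for other tasks.

```python
import shlex


def build_train_command(bert_path, data_dir, task, save_path, gpus="0,1,2,3"):
    """Assemble the trainer.py argument list using the flags from the SST-5 example.

    Hypothetical helper: the hyperparameter values are copied from the SST-5
    command and are an assumption for any other task.
    """
    return [
        "python", "trainer.py",
        "--bert_path", bert_path,
        "--data_dir", data_dir,
        "--task", task,
        "--save_path", save_path,
        f"--gpus={gpus}",
        "--precision", "16",
        "--lr=2e-5",
        "--batch_size=10",
        "--lamb=1.0",
        "--workers=4",
        "--max_epoch=20",
    ]


def as_shell(command):
    """Render the argument list as a single shell-safe command line."""
    return " ".join(shlex.quote(part) for part in command)
```

Print `as_shell(build_train_command(...))` to inspect the command, or pass the list to `subprocess.run(..., cwd="explain")` to launch it from the `explain` directory.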

1. Train the self-explaining model

SST-5 is a five-class task, so the RoBERTa-base config file must be modified. Open [ROBERTA_BASE_PATH]/config.json and set num_labels=5. Then run the following commands.

cd explain
python trainer.py \
--bert_path [ROBERTA_BASE_PATH] \
--data_dir [SST_PATA_PATH] \
--task sst5 \
--save_path [SELF_EXPLAINING_MODEL_CHECKPOINTS] \
--gpus=0,1,2,3  \
--precision 16 \
--lr=2e-5 \
--batch_size=10 \
--lamb=1.0 \
--workers=4 \
--max_epoch=20

After training, the checkpoints and training log will be saved at [SELF_EXPLAINING_MODEL_CHECKPOINTS].
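The `num_labels` change required before training can also be scripted, which is convenient when switching between tasks with different numbers of classes. This is a minimal sketch that assumes `config.json` is a plain JSON object, as in the HuggingFace release. The function name `set_num_labels` is hypothetical and not part of this repo.

```python
import json
from pathlib import Path


def set_num_labels(config_path, num_labels):
    """Set the `num_labels` field of a JSON config file in place.

    All other fields are preserved. Returns the updated config as a dict.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["num_labels"] = num_labels  # 5 for SST-5
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For SST-5, call `set_num_labels("[ROBERTA_BASE_PATH]/config.json", 5)` with the placeholder replaced by your real model path.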

2. Evaluate the self-explaining model

Run the following evaluation command to get the performance on the test set. After evaluation, you will get two output files at [SPAN_SAVE_PATH]: output.txt and test.txt. output.txt records the extracted spans for visualization together with the prediction results. test.txt records only the top-ranked span of each example and serves as span-based test data for the next stage.

cd explain
python trainer.py \
--bert_path [ROBERTA_BASE_PATH] \
--data_dir [SST_PATA_PATH] \
--task sst5 \
--checkpoint_path [SELF_EXPLAINING_MODEL_CHECKPOINTS]/***.ckpt \
--save_path [SPAN_SAVE_PATH] \
--gpus=0, \
--mode eval
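The `***.ckpt` in the command above stands for a checkpoint file of your choice. If you just want to list what training produced, a small helper like the one below can locate the files. It is only a sketch: `latest_checkpoint` is a hypothetical name, it searches subdirectories because the exact layout under [SELF_EXPLAINING_MODEL_CHECKPOINTS] is not specified here, and the most recently written checkpoint is not necessarily the best one, so consult the training log before picking.

```python
from pathlib import Path


def latest_checkpoint(checkpoint_dir):
    """Return the most recently modified .ckpt file under checkpoint_dir.

    Searches recursively. Raises FileNotFoundError if no checkpoint exists.
    """
    candidates = sorted(
        Path(checkpoint_dir).rglob("*.ckpt"),
        key=lambda p: p.stat().st_mtime,
    )
    if not candidates:
        raise FileNotFoundError(f"no .ckpt files found under {checkpoint_dir}")
    return candidates[-1]
```

Pass the returned path as the value of `--checkpoint_path`.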

3. Check the extracted span

In the previous stage, we obtained span-based test data. You can use the same method to get span-based train data.
To check the extracted spans, we set up four experiments: full-full mode, full-span mode, span-full mode, and span-span mode. For example, full-span mode means we train on the original SST-5 train data and test on the span-based test data.
Save the original SST-5 train data and the span-based test data at [FULL_SPAN_PATH]:

scp [SST_PATA_PATH]/train.txt [FULL_SPAN_PATH]
scp [SPAN_SAVE_PATH]/test.txt [FULL_SPAN_PATH]
cd check
python trainer.py \
--bert_path [ROBERTA_BASE_PATH] \
--data_dir [FULL_SPAN_PATH] \
--task sst5 \
--save_path [CHECK_MODEL_CHECKPOINTS] \
--gpus=0,1,2,3  \
--precision 16 \
--lr=2e-5 \
--batch_size=10 \
--lamb=1.0 \
--workers=4 \
--max_epoch=20
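The other three modes follow the same pattern with different train/test sources. The sketch below prepares one data directory per mode. It assumes the checking trainer reads `train.txt` and `test.txt` from `--data_dir`, as the copy commands above suggest. `build_mode_dir` and its arguments are hypothetical names, and you must supply the source paths yourself, in particular the span-based train file, whose location depends on where you saved it.

```python
import shutil
from pathlib import Path

# The four experiment modes, written as "<train data>-<test data>".
MODES = ("full-full", "full-span", "span-full", "span-span")


def build_mode_dir(mode, train_files, test_files, out_root):
    """Create out_root/<mode>/ containing train.txt and test.txt.

    train_files and test_files each map "full" and "span" to a source file.
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode {mode!r}, expected one of {MODES}")
    train_kind, test_kind = mode.split("-")
    out_dir = Path(out_root) / mode
    out_dir.mkdir(parents=True, exist_ok=True)
    # Copy under the fixed names assumed to be expected by the trainer.
    shutil.copy(train_files[train_kind], out_dir / "train.txt")
    shutil.copy(test_files[test_kind], out_dir / "test.txt")
    return out_dir
```

Point `--data_dir` at the returned directory and use a separate `--save_path` for each mode so the checkpoints do not overwrite each other.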
