
Refactoring-Summarization

Code for our paper: "RefSum: Refactoring Neural Summarization", NAACL 2021.

We present a model, Refactor, which can be used either as a base system or a meta system for text summarization.

Outline

1. How to Install

Requirements

  • python3
  • conda create --name env --file spec-file.txt
  • pip3 install -r requirements.txt

Description of Codes

  • main.py -> training and evaluation procedure
  • model.py -> Refactor model
  • data_utils.py -> dataloader
  • utils.py -> utility functions
  • demo.py -> off-the-shelf refactoring

2. How to Run

Hyper-parameter Setting

You may specify the hyper-parameters in main.py.

Train

python main.py --cuda --gpuid [list of gpuid] -l

Fine-tune

python main.py --cuda --gpuid [list of gpuid] -l --model_pt [model path]

Evaluate

python main.py --cuda --gpuid [single gpu] -e --model_pt [model path] --model_name [model name]

3. Off-the-shelf Refactoring

You may use our model with your own data by running

python demo.py DATA_PATH MODEL_PATH RESULT_PATH

DATA_PATH is the path to your data: a file in which each line is a JSON object of the form {"article": str, "summary": str, "candidates": [str]}.

RESULT_PATH is the path of the output file, each line of which is a selected candidate summary.
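The input format above can be sketched as follows. This is a minimal, hypothetical example of preparing a DATA_PATH file for demo.py; the file name demo_input.jsonl and the toy texts are assumptions, not part of the repository.

```python
import json

# Hypothetical example record: each line of DATA_PATH must be a JSON object
# with the source article, its reference summary, and a list of candidate
# summaries for the Refactor model to choose from.
record = {
    "article": "The quick brown fox jumps over the lazy dog.",
    "summary": "A fox jumps over a dog.",
    "candidates": [
        "A quick fox jumps over a lazy dog.",
        "The dog is jumped over by the fox.",
    ],
}

# Write one JSON object per line (JSON Lines format), as demo.py expects.
with open("demo_input.jsonl", "w") as f:
    f.write(json.dumps(record) + "\n")

# Sanity-check: every line must parse and contain the three required keys.
with open("demo_input.jsonl") as f:
    for line in f:
        data = json.loads(line)
        assert {"article", "summary", "candidates"} <= data.keys()
        assert isinstance(data["candidates"], list)
```

A file built this way can then be passed as the first argument: python demo.py demo_input.jsonl MODEL_PATH RESULT_PATH.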

4. Data

We use four datasets for our experiments.

You can find the processed data for all of our experiments here. After downloading, you should put the data in the ./data directory.

| Dataset | Experiment | Link |
|---|---|---|
| CNNDM | Pre-train | Download |
| CNNDM | BART Reranking | Download |
| CNNDM | GSum Reranking | Download |
| CNNDM | Two-system Combination (System-level) | Download |
| CNNDM | Two-system Combination (Sentence-level) | Download |
| CNNDM | Three-system Combination (System-level) | Download |
| XSum | Pre-train | Download |
| XSum | PEGASUS Reranking | Download |
| PubMed | Pre-train | Download |
| PubMed | BART Reranking | Download |
| WikiHow | Pre-train | Download |
| WikiHow | BART Reranking | Download |

5. Results

CNNDM

Reranking BART

| System | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| BART | 44.26 | 21.12 | 41.16 |
| Refactor | 45.15 | 21.70 | 42.00 |

Reranking GSum

| System | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| GSum | 45.93 | 22.30 | 42.68 |
| Refactor | 46.18 | 22.36 | 42.91 |

System-Combination (BART and pre-trained Refactor)

| System | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| BART | 44.26 | 21.12 | 41.16 |
| pre-trained Refactor | 44.13 | 20.51 | 40.29 |
| Summary-Level Combination | 45.04 | 21.61 | 41.72 |
| Sentence-Level Combination | 44.93 | 21.48 | 41.42 |

System-Combination (BART, pre-trained Refactor and GSum)

| System | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| BART | 44.26 | 21.12 | 41.16 |
| pre-trained Refactor | 44.13 | 20.51 | 40.29 |
| GSum | 45.93 | 22.30 | 42.68 |
| Summary-Level Combination | 46.12 | 22.46 | 42.92 |

XSum

Reranking PEGASUS

| System | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| PEGASUS | 47.12 | 24.46 | 39.04 |
| Refactor | 47.45 | 24.55 | 39.41 |

PubMed

Reranking BART

| System | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| BART | 43.42 | 15.32 | 39.21 |
| Refactor | 43.72 | 15.41 | 39.51 |

WikiHow

Reranking BART

| System | ROUGE-1 | ROUGE-2 | ROUGE-L |
|---|---|---|---|
| BART | 41.98 | 18.09 | 40.53 |
| Refactor | 42.12 | 18.13 | 40.66 |

refactoring-summarization's People

Contributors

pfliu-nlp, yixinl7


refactoring-summarization's Issues

Question on Paper

Dear Yixin Liu,

I really enjoyed your paper on the 2-stage learning framework that utilizes the complementarity of different text summarization systems.

I have some questions about the paper.

In your paper, you state the following:
sharing a set of parameters between the base and meta systems can alleviate the "Base-Meta learning gap",
and the pretrain-then-finetune paradigm can mitigate the "Train-Test distribution gap".

However, I think there is no experiment indicating that sharing parameters between the base and meta models boosts performance.
In the §5.5 and §5.8 experiments, the pre-trained Refactor is only used as the meta learner (the base models are BART, GSum and PEGASUS).
In the §5.6 experiment, the pre-trained Refactor is also used as a base model, but there is no parameter sharing between the other base models and the meta model.

Is there any reason that the pre-trained Refactor is not suitable as a base model in reranking?
And is there a performance difference between sharing no parameters with the meta model (i.e., using Refactor only as the meta model in a multi-system setting) and sharing parameters between the meta model and a single base model (as in §5.6)?

Please correct me if I got it wrong.

Thanks in advance.

Question about the data

The links provided in Section 4 (Data) of the README are all broken; could you please re-share them?

Some questions about the paper

Hi! I'm really interested in your 2-stage work; it's exciting! But I'm a little confused about the following:

  1. What is the meaning of "Min, Max and Random"? Does it refer to the ROUGE scores of the candidates, i.e., the candidate with the minimum ROUGE score, the one with the maximum, and one selected at random?
  2. Why does the pre-trained Refactor get different ROUGE scores on the same CNNDM dataset? Since both choose combinations of sentences from the source documents, shouldn't the scores be equal?

Please let me know if I got it wrong. Thank you very much!
