tencent-ailab / season

[EMNLP 2022] Salience Allocation as Guidance for Abstractive Summarization

Home Page: https://arxiv.org/pdf/2210.12330.pdf

License: Apache License 2.0

Python 98.78% Shell 1.22%
nlp summarization summarization-model

season's Introduction

Salience Allocation Guided Abstractive Summarization

Code and model weights for our paper "Salience Allocation as Guidance for Abstractive Summarization" accepted at EMNLP 2022. If you find the code useful, please cite the following paper.

@inproceedings{wang2022salience,
  title={Salience Allocation as Guidance for Abstractive Summarization},
  author={Wang, Fei and Song, Kaiqiang and Zhang, Hongming and Jin, Lifeng and Cho, Sangwoo and Yao, Wenlin and Wang, Xiaoyang and Chen, Muhao and Yu, Dong},
  booktitle={Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing},
  year={2022}
}

Overview

We propose the allocation of salience expectation as flexible and reliable guidance for abstractive summarization. To estimate and incorporate the salience allocation, we propose a salience-aware cross-attention that can be plugged into any Transformer-based encoder-decoder model. It consists of three steps:

  1. Estimate the salience degree of each sentence.
  2. Map salience degrees to embeddings.
  3. Add salience embeddings to key states of cross-attention.
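
The three steps above can be sketched in a few lines of NumPy. This is a minimal single-head illustration, not the implementation in this repository: the threshold-based bucketing in `salience_degrees`, the function names, the tensor shapes, and the random weights are all assumptions made for the example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def salience_degrees(scores, thresholds):
    """Step 1 (assumed form): bucket per-sentence scores into discrete degrees."""
    return np.digitize(scores, thresholds)


def salience_aware_cross_attention(queries, enc_states, token_sent_ids,
                                   degrees, w_k, w_v, sal_emb):
    """Single-head cross-attention with salience embeddings added to the keys.

    queries:        (T, d) decoder states
    enc_states:     (S, d) encoder states, one per source token
    token_sent_ids: (S,)   index of the sentence each source token belongs to
    degrees:        (N,)   integer salience degree of each sentence
    sal_emb:        (num_degrees, d) salience embedding table
    """
    keys = enc_states @ w_k
    values = enc_states @ w_v
    # Step 2: map each token's sentence-level degree to an embedding.
    token_sal = sal_emb[degrees[token_sent_ids]]
    # Step 3: add the salience embeddings to the key states only.
    keys = keys + token_sal
    attn = softmax(queries @ keys.T / np.sqrt(keys.shape[-1]), axis=-1)
    return attn @ values, attn


# Toy input: 3 sentences of 2 tokens each, hidden size 8 (hypothetical values).
rng = np.random.default_rng(0)
d = 8
enc_states = rng.normal(size=(6, d))
queries = rng.normal(size=(3, d))
token_sent_ids = np.array([0, 0, 1, 1, 2, 2])
degrees = salience_degrees(np.array([0.05, 0.4, 0.8]), thresholds=[0.2, 0.6])
w_k = rng.normal(size=(d, d))
w_v = rng.normal(size=(d, d))
sal_emb = rng.normal(size=(3, d))

context, attn = salience_aware_cross_attention(
    queries, enc_states, token_sent_ids, degrees, w_k, w_v, sal_emb
)
print(degrees, context.shape, attn.shape)
```

Because the salience embeddings touch only the keys, the value states and the rest of the decoder are unchanged, which is what lets the mechanism be added to an existing encoder-decoder model.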

Environment

Create the environment with conda and pip.

conda env create -f environment.yml
conda activate season
pip install -r requirements.txt

Install the NLTK "punkt" package.

python -c "import nltk; nltk.download('punkt');"

We've tested this environment with Python 3.8 and CUDA 10.2. For other CUDA versions, please install the corresponding packages.
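
To compare your setup against the tested versions, a small check like the one below may help. It is only a sketch: it assumes the environment is PyTorch-based (not stated above), and `describe_environment` is a hypothetical helper, not part of this repository.

```python
import importlib.util
import platform


def describe_environment():
    """Report the Python version and, if PyTorch is installed, its CUDA build."""
    info = {"python": platform.python_version(), "torch": None, "cuda": None}
    # Assumption: the project depends on PyTorch; skip quietly if it is absent.
    if importlib.util.find_spec("torch") is not None:
        import torch

        info["torch"] = torch.__version__
        info["cuda"] = torch.version.cuda  # None for CPU-only builds
    return info


print(describe_environment())
```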

Data Preprocessing

Run the following commands to download the CNN/DM dataset, preprocess it, and save it locally.

mkdir data
python preprocess.py

Train

Run the script below:

bash run_train.sh

The trained model parameters and training logs are saved in the outputs/train folder.

Note that the evaluation of each checkpoint during training is simplified for efficiency, so the results are lower than the final evaluation results. You can change the setting according to this post. To evaluate the trained model further, follow the inference steps.

Inference

You can use our trained model weights to generate summaries for your data.

Step 1. Download the trained model weights to the checkpoints directory.

mkdir checkpoints
cd checkpoints
unzip season_cnndm.zip

Step 2. Generate summaries for the CNN/DM test set.

bash run_inference.sh

After running the script, you will find the results in the outputs/inference folder, including the predicted summaries in generated_predictions.txt and the ROUGE results in predict_results.json.
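
The two output files can then be loaded for inspection. The sketch below assumes that generated_predictions.txt holds one summary per line and that predict_results.json is a flat JSON object of metrics; neither format is documented here, and `load_inference_outputs` is a hypothetical helper.

```python
import json
from pathlib import Path


def load_inference_outputs(output_dir="outputs/inference"):
    """Load predicted summaries and ROUGE metrics from the inference folder."""
    out = Path(output_dir)
    # Assumption: one generated summary per line.
    with open(out / "generated_predictions.txt", encoding="utf-8") as f:
        summaries = [line.rstrip("\n") for line in f]
    # Assumption: a JSON object mapping metric names to values.
    with open(out / "predict_results.json", encoding="utf-8") as f:
        metrics = json.load(f)
    return summaries, metrics
```

Calling `load_inference_outputs()` after run_inference.sh finishes returns the list of summaries and the metrics dictionary, which you can print or compare across runs.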

License

Copyright 2022 Tencent

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Disclaimer

This repo is for research purposes only. It is not an officially supported Tencent product.

season's Issues

json data

Hello,

Thank you very much for sharing your work.
Could we have the data train.json, validation.json, and test.json for CNN DailyMail dataset?

Thanks.

Training duration

Hello, does model training require eight GPUs? I am using five GPUs to reproduce the CNN/DM results, and the progress bar shows about 60 hours. Is that expected?

code

where is the code?
