Comments (1)
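This paper proposes a single-document summarization method based on CRFs.
The motivation, roughly:
1. Among supervised models at the time, most generated a summary by classifying each sentence of the source document independently as a binary decision, so relations between sentences could not be taken into account.
2. Unsupervised methods were mostly rule-based and did not generalize well.
The claim is that using a CRF solves these problems.
As I understand it, the idea is this: casting summarization as a sequence labeling problem with a CRF lets the model take relations between sentences into account, and if the rules (features) used in earlier work are simply thrown in as CRF features, a summarization model can be learned.
The CRF features include basic ones such as sentence position, length, sentence likelihood, and thematic words, plus the LSA and HITS scores.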
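To make the "sequence labeling instead of independent binary classification" point concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the function names, the tiny feature set (position and length only), and the weights are all hypothetical and hand-set for illustration, whereas a real CRF would learn the weights from labeled summaries. It only shows how a transition score between adjacent labels can change which sentences get picked.

```python
from typing import Dict, List, Tuple

Features = Dict[str, float]


def sentence_features(sentences: List[str]) -> List[Features]:
    """Toy per-sentence features (hypothetical subset: position and length)."""
    n = len(sentences)
    return [
        {
            "bias": 1.0,
            "is_first": 1.0 if i == 0 else 0.0,
            "rel_position": i / max(n - 1, 1),
            "length": float(len(s.split())),  # whitespace token count
        }
        for i, s in enumerate(sentences)
    ]


def state_score(f: Features, y: int, state_w: Dict[int, Features]) -> float:
    """Linear score of label y (1 = in summary, 0 = not) for one sentence."""
    return sum(state_w[y].get(name, 0.0) * value for name, value in f.items())


def independent_labels(feats: List[Features], state_w: Dict[int, Features]) -> List[int]:
    """Baseline: label each sentence on its own, ignoring its neighbours."""
    return [1 if state_score(f, 1, state_w) > state_score(f, 0, state_w) else 0
            for f in feats]


def viterbi_labels(feats: List[Features],
                   state_w: Dict[int, Features],
                   trans_w: Dict[Tuple[int, int], float]) -> List[int]:
    """Best label sequence under state scores plus label-to-label transition scores."""
    if not feats:
        return []
    labels = (0, 1)
    score = {y: state_score(feats[0], y, state_w) for y in labels}
    backpointers = []
    for f in feats[1:]:
        new_score, ptr = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: score[p] + trans_w[(p, y)])
            new_score[y] = score[prev] + trans_w[(prev, y)] + state_score(f, y, state_w)
            ptr[y] = prev
        score = new_score
        backpointers.append(ptr)
    # Trace back from the best final label.
    y = max(labels, key=lambda l: score[l])
    path = [y]
    for ptr in reversed(backpointers):
        y = ptr[y]
        path.append(y)
    return path[::-1]


# Hand-set, hypothetical weights (NOT learned, NOT from the paper).
STATE_W = {0: {}, 1: {"bias": -1.0, "is_first": 2.0, "length": 0.1}}
# Penalise picking two adjacent sentences; all other transitions are neutral.
TRANS_W = {(0, 0): 0.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): -1.0}
```

With these toy weights, the independent baseline picks every sentence whose own score is positive, while the Viterbi decoder may drop a sentence because its neighbour was already selected. That is the kind of inter-sentence dependency the note is talking about.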
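Evaluated on the DUC2001 data, the CRF with only the basic features outperforms the unsupervised baselines (Random, Lead, LSA, HITS) as well as the supervised baselines (NaiveBayes, SVM, Logistic Regression, HMM).
Adding features such as LSA and HITS significantly improves the ROUGE scores over the basic features alone, and the proposed method is again the best.
It is cited fairly often, so it is probably worth knowing.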
from summarization_papers.
Related Issues (20)
- Recent Advances in Document Summarization, Yao+, Knowledge and Information Systems'17
- Distraction-Based Neural Networks for Modeling Documents, Chen+, IJCAI'16.
- Get To The Point: Summarization with Pointer-Generator Networks, See+, ACL'17
- Incorporating Copying Mechanism in Sequence-to-Sequence Learning, Gu+, ACL'16
- Neural Summarization by Extracting Sentences and Words, Cheng+, ACL'16
- Cutting-off redundant repeating generations for neural abstractive summarization, Suzuki+, EACL'17
- A Deep Reinforced Model for Abstractive Summarization, Paulus (and Socher)+, arXiv'17
- A Neural Attention Model for Sentence Summarization, Rush+, EMNLP'15
- A Trainable Document Summarizer, Kupiec+, SIGIR'95
- Text Summarization using a trainable summarizer and latent semantic analysis, Yeh+, Information Processing and Management 2005.
- Learning from Numerous Untailored Summaries, Kikuchi+, PRICAI'16
- Learning to Generate Coherent Summary with Discriminative Hidden Semi-Markov Model, Nishikawa+, COLING'14
- CTSUM: Extracting More Certain Summaries for News Articles, Wan+, SIGIR'14
- Sentence Compression by Deletion with LSTMs, Filippova+, EMNLP'15
- Japanese Sentence Compression with a Large Training Dataset, Hasegawa+, ACL2017
- Summarizing Lengthy Questions, Ishigaki+, IJCNLP2017
- Coarse-to-Fine Attention Models for Document Summarization, Ling+ (with Rush), ACL'17
- Graph-based Neural Multi-Document Summarization, Yasunaga+, CoNLL'17
- Generating Sentences by Editing Prototypes, Guu+, arXiv'17