
Paper List of Grammatical Error Correction

Introduction

This repo lists papers on Grammatical Error Correction (GEC) and related topics such as Grammatical Error Detection (GED) and Spoken Grammatical Error Correction (SGEC).

Update Notes

2022/7/28: Added tag symbols. {D}: papers on GEC/GED datasets. {LOTE}: papers on GEC/GED for languages other than English.

2022/5/18: Updating. Papers will be organized by publication year. Note that the key words are not taken from the papers themselves; they are added by the repo author.

GEC Papers 2022

  1. Ensembling and Knowledge Distilling of Large Sequence Taggers for Grammatical Error Correction
  • Authors: Maksym Tarnavskyi, Artem Chernodub, Kostiantyn Omelianchuk
  • Conference: ACL
  • Link: https://aclanthology.org/2022.acl-long.266/
  • Code: https://github.com/MaksTarnavskyi/gector-large
  • Abstract In this paper, we investigate improvements to the GEC sequence tagging architecture with a focus on ensembling of recent cutting-edge Transformer-based encoders in Large configurations. We encourage ensembling models by majority votes on span-level edits because this approach is tolerant to the model architecture and vocabulary size. Our best ensemble achieves a new SOTA result with an F0.5 score of 76.05 on BEA-2019 (test), even without pre-training on synthetic datasets. In addition, we perform knowledge distillation with a trained ensemble to generate new synthetic training datasets, “Troy-Blogs” and “Troy-1BW”. Our best single sequence tagging model that is pretrained on the generated Troy- datasets in combination with the publicly available synthetic PIE dataset achieves a near-SOTA result with an F0.5 score of 73.21 on BEA-2019 (test). The code, datasets, and trained models are publicly available.
  • Key Words: Empirical Study; Bigger PLMs; Ensembling Comparison; Knowledge Distilling
  2. Interpretability for Language Learners Using Example-Based Grammatical Error Correction
  • Authors: Masahiro Kaneko, Sho Takase, Ayana Niwa, Naoaki Okazaki
  • Conference: ACL
  • Link: https://aclanthology.org/2022.acl-long.496/
  • Code: https://github.com/kanekomasahiro/eb-gec
  • Abstract Grammatical Error Correction (GEC) should not focus only on high accuracy of corrections but also on interpretability for language learning. However, existing neural-based GEC models mainly aim at improving accuracy, and their interpretability has not been explored. A promising approach for improving interpretability is an example-based method, which uses similar retrieved examples to generate corrections. In addition, examples are beneficial in language learning, helping learners understand the basis of grammatically incorrect/correct texts and improve their confidence in writing. Therefore, we hypothesize that incorporating an example-based method into GEC can improve interpretability as well as support language learners. In this study, we introduce an Example-Based GEC (EB-GEC) that presents examples to language learners as a basis for a correction result. The examples consist of pairs of correct and incorrect sentences similar to a given input and its predicted correction. Experiments demonstrate that the examples presented by EB-GEC help language learners decide to accept or refuse suggestions from the GEC output. Furthermore, the experiments also show that retrieved examples improve the accuracy of corrections.
  • Key Words: Interpretability; kNN-MT; Seq2Seq; Application Oriented
  3. Adjusting the Precision-Recall Trade-Off with Align-and-Predict Decoding for Grammatical Error Correction
  • Authors: Xin Sun, Houfeng Wang
  • Conference: ACL
  • Link: https://aclanthology.org/2022.acl-short.77/
  • Code: https://github.com/AutoTemp/Align-and-Predict
  • Abstract Modern writing assistance applications are always equipped with a Grammatical Error Correction (GEC) model to correct errors in user-entered sentences. Different scenarios have varying requirements for correction behavior, e.g., performing more precise corrections (high precision) or providing more candidates for users (high recall). However, previous works adjust such trade-off only for sequence labeling approaches. In this paper, we propose a simple yet effective counterpart – Align-and-Predict Decoding (APD) for the most popular sequence-to-sequence models to offer more flexibility for the precision-recall trade-off. During inference, APD aligns the already generated sequence with input and adjusts scores of the following tokens. Experiments in both English and Chinese GEC benchmarks show that our approach not only adapts a single model to precision-oriented and recall-oriented inference, but also maximizes its potential to achieve state-of-the-art results. Our code is available at https://github.com/AutoTemp/Align-and-Predict.
  • Key Words: Precision-Recall Trade-Off; Beam Search; Seq2Seq; Application Oriented
  4. {LOTE} “Is Whole Word Masking Always Better for Chinese BERT?”: Probing on Chinese Grammatical Error Correction
  • Authors: Yong Dai, Linyang Li, Cong Zhou, Zhangyin Feng, Enbo Zhao, Xipeng Qiu, Piji Li, Duyu Tang
  • Conference: ACL Findings
  • Link: https://aclanthology.org/2022.findings-acl.1/
  • Abstract Whole word masking (WWM), which masks all subwords corresponding to a word at once, makes a better English BERT model. For the Chinese language, however, there is no subword because each token is an atomic character. The meaning of a word in Chinese is different in that a word is a compositional unit consisting of multiple characters. Such difference motivates us to investigate whether WWM leads to better context understanding ability for Chinese BERT. To achieve this, we introduce two probing tasks related to grammatical error correction and ask pretrained models to revise or insert tokens in a masked language modeling manner. We construct a dataset including labels for 19,075 tokens in 10,448 sentences. We train three Chinese BERT models with standard character-level masking (CLM), WWM, and a combination of CLM and WWM, respectively. Our major findings are as follows: First, when one character needs to be inserted or replaced, the model trained with CLM performs the best. Second, when more than one character needs to be handled, WWM is the key to better performance. Finally, when being fine-tuned on sentence-level downstream tasks, models trained with different masking strategies perform comparably.
  5. Type-Driven Multi-Turn Corrections for Grammatical Error Correction
  • Authors: Shaopeng Lai, Qingyu Zhou, Jiali Zeng, Zhongli Li, Chao Li, Yunbo Cao, Jinsong Su
  • Conference: ACL Findings
  • Link: https://aclanthology.org/2022.findings-acl.254/
  • Code: https://github.com/DeepLearnXMU/TMTC
  • Abstract Grammatical Error Correction (GEC) aims to automatically detect and correct grammatical errors. In this aspect, dominant models are trained by one-iteration learning while performing multiple iterations of corrections during inference. Previous studies mainly focus on the data augmentation approach to combat the exposure bias, which suffers from two drawbacks. First, they simply mix additionally-constructed training instances and original ones to train models, which fails to help models be explicitly aware of the procedure of gradual corrections. Second, they ignore the interdependence between different types of corrections. In this paper, we propose a Type-Driven Multi-Turn Corrections approach for GEC. Using this approach, from each training instance, we additionally construct multiple training instances, each of which involves the correction of a specific type of errors. Then, we use these additionally-constructed training instances and the original one to train the model in turn. Experimental results and in-depth analysis show that our approach significantly benefits the model training. Particularly, our enhanced model achieves state-of-the-art single-model performance on English GEC benchmarks. We release our code at Github.
  • Key Words: Iterative Correction; Edit Operation; Sequence Labeling
  6. Frustratingly Easy System Combination for Grammatical Error Correction
  • Authors: Muhammad Qorib, Seung-Hoon Na, Hwee Tou Ng
  • Conference: NAACL
  • Link: https://aclanthology.org/2022.naacl-main.143/
  • Code: https://github.com/nusnlp/esc
  • Abstract In this paper, we formulate system combination for grammatical error correction (GEC) as a simple machine learning task: binary classification. We demonstrate that with the right problem formulation, a simple logistic regression algorithm can be highly effective for combining GEC models. Our method successfully increases the F0.5 score from the highest base GEC system by 4.2 points on the CoNLL-2014 test set and 7.2 points on the BEA-2019 test set. Furthermore, our method outperforms the state of the art by 4.0 points on the BEA-2019 test set, 1.2 points on the CoNLL-2014 test set with original annotation, and 3.4 points on the CoNLL-2014 test set with alternative annotation. We also show that our system combination generates better corrections with higher F0.5 scores than the conventional ensemble.
  • Key Words: Ensembling; Edit Type; Logistic Regression; Application Oriented
  7. {D} ErAConD: Error Annotated Conversational Dialog Dataset for Grammatical Error Correction
  • Authors: Xun Yuan, Derek Pham, Sam Davidson, Zhou Yu
  • Conference: NAACL
  • Link: https://aclanthology.org/2022.naacl-main.5/
  • Code: https://github.com/yuanxun-yx/eracond
  • Abstract Currently available grammatical error correction (GEC) datasets are compiled using essays or other long-form text written by language learners, limiting the applicability of these datasets to other domains such as informal writing and conversational dialog. In this paper, we present a novel GEC dataset consisting of parallel original and corrected utterances drawn from open-domain chatbot conversations; this dataset is, to our knowledge, the first GEC dataset targeted to a human-machine conversational setting. We also present a detailed annotation scheme which ranks errors by perceived impact on comprehension, making our dataset more representative of real-world language learning applications. To demonstrate the utility of the dataset, we use our annotated data to fine-tune a state-of-the-art GEC model. Experimental results show the effectiveness of our data in improving GEC model performance in a conversational scenario.
  8. {D, LOTE} MuCGEC: a Multi-Reference Multi-Source Evaluation Dataset for Chinese Grammatical Error Correction
  • Authors: Yue Zhang, Zhenghua Li, Zuyi Bao, Jiacheng Li, Bo Zhang, Chen Li, Fei Huang, Min Zhang
  • Conference: NAACL
  • Link: https://aclanthology.org/2022.naacl-main.227/
  • Code: https://github.com/HillZhang1999/MuCGEC
  • Abstract This paper presents MuCGEC, a multi-reference multi-source evaluation dataset for Chinese Grammatical Error Correction (CGEC), consisting of 7,063 sentences collected from three Chinese-as-a-Second-Language (CSL) learner sources. Each sentence is corrected by three annotators, and their corrections are carefully reviewed by a senior annotator, resulting in 2.3 references per sentence. We conduct experiments with two mainstream CGEC models, i.e., the sequence-to-sequence model and the sequence-to-edit model, both enhanced with large pretrained language models, achieving competitive benchmark performance on previous and our datasets. We also discuss CGEC evaluation methodologies, including the effect of multiple references and using a char-based metric. Our annotation guidelines, data, and code are available at https://github.com/HillZhang1999/MuCGEC.
  9. {D, LOTE} Czech Grammar Error Correction with a Large and Diverse Corpus
  • Authors: Jakub Náplava, Milan Straka, Jana Straková, Alexandr Rosen
  • Conference: TACL
  • Link: https://aclanthology.org/2022.tacl-1.26/
  • Code: https://github.com/ufal/errant_czech
  • Abstract We introduce a large and diverse Czech corpus annotated for grammatical error correction (GEC) with the aim to contribute to the still scarce data resources in this domain for languages other than English. The Grammar Error Correction Corpus for Czech (GECCC) offers a variety of four domains, covering error distributions ranging from high error density essays written by non-native speakers, to website texts, where errors are expected to be much less common. We compare several Czech GEC systems, including several Transformer-based ones, setting a strong baseline to future research. Finally, we meta-evaluate common GEC metrics against human judgments on our data. We make the new Czech GEC corpus publicly available under the CC BY-SA 4.0 license at http://hdl.handle.net/11234/1-4639.
  10. Grammatical Error Correction: Are We There Yet?
  • Authors: Muhammad Reza Qorib, Hwee Tou Ng
  • Conference: COLING
  • Link: https://aclanthology.org/2022.coling-1.246/
  • Abstract There has been much recent progress in natural language processing, and grammatical error correction (GEC) is no exception. We found that state-of-the-art GEC systems (T5 and GECToR) outperform humans by a wide margin on the CoNLL-2014 test set, a benchmark GEC test corpus, as measured by the standard F0.5 evaluation metric. However, a careful examination of their outputs reveals that there are still classes of errors that they fail to correct. This suggests that creating new test data that more accurately measure the true performance of GEC systems constitutes important future work.
  11. {LOTE} Position Offset Label Prediction for Grammatical Error Correction
  • Authors: Xiuyu Wu, Jingsong Yu, Xu Sun, Yunfang Wu
  • Conference: COLING
  • Link: https://aclanthology.org/2022.coling-1.480/
  • Code: Not released yet.
  • Abstract We introduce a novel position offset label prediction subtask to the encoder-decoder architecture for grammatical error correction (GEC) task. To keep the meaning of the input sentence unchanged, only a few words should be inserted or deleted during correction, and most of tokens in the erroneous sentence appear in the paired correct sentence with limited position movement. Inspired by this observation, we design an auxiliary task to predict position offset label (POL) of tokens, which is naturally capable of integrating different correction editing operations into a unified framework. Based on the predicted POL, we further propose a new copy mechanism (P-copy) to replace the vanilla copy module. Experimental results on Chinese, English and Japanese datasets demonstrate that our proposed POL-Pc framework obviously improves the performance of baseline models. Moreover, our model yields consistent performance gain over various data augmentation methods. Especially, after incorporating synthetic data, our model achieves a 38.95 F-0.5 score on Chinese GEC dataset, which outperforms the previous state-of-the-art by a wide margin of 1.98 points.
  12. {LOTE} String Editing Based Chinese Grammatical Error Diagnosis
  • Authors: Haihua Xie, Xiaoqing Lyu, Xuefei Chen
  • Conference: COLING
  • Link: https://aclanthology.org/2022.coling-1.474/
  • Code: https://github.com/xiebimsa/se-cged
  • Abstract Chinese Grammatical Error Diagnosis (CGED) suffers the problems of numerous types of grammatical errors and insufficiency of training data. In this paper, we propose a string editing based CGED model that requires less training data by using a unified workflow to handle various types of grammatical errors. Two measures are proposed in our model to enhance the performance of CGED. First, the detection and correction of grammatical errors are divided into different stages. In the stage of error detection, the model only outputs the types of grammatical errors so that the tag vocabulary size is significantly reduced compared with other string editing based models. Secondly, the correction of some grammatical errors is converted to the task of masked character inference, which has plenty of training data and mature solutions. Experiments on datasets of NLPTEA-CGED demonstrate that our model outperforms other CGED models in many aspects.
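The last entry above converts some corrections into masked character inference. As a rough illustration of that idea (not the paper's actual model), the sketch below masks a character flagged by a detection stage and lets a generic HuggingFace fill-mask pipeline propose replacements; the model name, sentence, and error position are illustrative assumptions.

```python
# Minimal sketch of correction via masked character inference.
# Assumes a generic HuggingFace fill-mask pipeline; the paper's model and decoding differ.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-chinese")

def correct_by_mask(sentence: str, error_pos: int, top_k: int = 5):
    """Mask the character flagged by the detection stage and let the MLM propose replacements."""
    masked = sentence[:error_pos] + fill_mask.tokenizer.mask_token + sentence[error_pos + 1:]
    candidates = fill_mask(masked, top_k=top_k)
    # Each candidate is a dict containing the predicted token and its score.
    return [(c["token_str"], c["score"]) for c in candidates]

# Hypothetical usage: index 5 ("心") was flagged as a wrong-character error.
# print(correct_by_mask("我今天很高心", 5))
```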

GED Papers 2022

  1. {LOTE} Improving Chinese Grammatical Error Detection via Data augmentation by Conditional Error Generation
  • Authors: Tianchi Yue, Shulin Liu, Huihui Cai, Tao Yang, Shengkang Song, TingHao Yu
  • Conference: ACL Findings
  • Link: https://aclanthology.org/2022.findings-acl.233/
  • Code: https://github.com/tc-yue/DA_CGED
  • Abstract Chinese Grammatical Error Detection (CGED) aims at detecting grammatical errors in Chinese texts. One of the main challenges for CGED is the lack of annotated data. To alleviate this problem, previous studies proposed various methods to automatically generate more training samples, which can be roughly categorized into rule-based methods and model-based methods. The rule-based methods construct erroneous sentences by directly introducing noises into original sentences. However, the introduced noises are usually context-independent, which are quite different from those made by humans. The model-based methods utilize generative models to imitate human errors. The generative model may bring too many changes to the original sentences and generate semantically ambiguous sentences, so it is difficult to detect grammatical errors in these generated sentences. In addition, generated sentences may be error-free and thus become noisy data. To handle these problems, we propose CNEG, a novel Conditional Non-Autoregressive Error Generation model for generating Chinese grammatical errors. Specifically, in order to generate a context-dependent error, we first mask a span in a correct text, then predict an erroneous span conditioned on both the masked text and the correct span. Furthermore, we filter out error-free spans by measuring their perplexities in the original sentences. Experimental results show that our proposed method achieves better performance than all compared data augmentation methods on the CGED-2018 and CGED-2020 benchmarks.
  • Key Words: Generative CGED; BERT Masking; Conditional Error Generation
  2. Exploring the Capacity of a Large-scale Masked Language Model to Recognize Grammatical Errors
  • Authors: Ryo Nagata, Manabu Kimura, Kazuaki Hanawa
  • Conference: ACL Findings
  • Link: https://aclanthology.org/2022.findings-acl.324/
  • Abstract In this paper, we explore the capacity of a language model-based method for grammatical error detection in detail. We first show that 5 to 10% of training data are enough for a BERT-based error detection method to achieve performance equivalent to what a non-language model-based method can achieve with the full training data; recall improves much faster with respect to training data size in the BERT-based method than in the non-language model method. This suggests that (i) the BERT-based method should have a good knowledge of the grammar required to recognize certain types of error and that (ii) it can transform the knowledge into error detection rules by fine-tuning with few training samples, which explains its high generalization ability in grammatical error detection. We further show with pseudo error data that it actually exhibits such nice properties in learning rules for recognizing various types of error. Finally, based on these findings, we discuss a cost-effective method for detecting grammatical errors with feedback comments explaining relevant grammatical rules to learners.

SGED Papers 2022

  1. On Assessing and Developing Spoken ‘Grammatical Error Correction’ Systems
  • Authors: Yiting Lu, Stefano Bannò, Mark Gales
  • Conference: BEA
  • Link: https://aclanthology.org/2022.bea-1.9/
  • Abstract Spoken ‘grammatical error correction’ (SGEC) is an important process to provide feedback for second language learning. Due to a lack of end-to-end training data, SGEC is often implemented as a cascaded, modular system, consisting of speech recognition, disfluency removal, and grammatical error correction (GEC). This cascaded structure enables efficient use of training data for each module. It is, however, difficult to compare and evaluate the performance of individual modules as preceding modules may introduce errors. For example, the GEC module input depends on the output of non-native speech recognition and disfluency detection, both challenging tasks for learner data. This paper focuses on the assessment and development of SGEC systems. We first discuss metrics for evaluating SGEC, both individual modules and the overall system. The system-level metrics enable tuning for optimal system performance. A known issue in cascaded systems is error propagation between modules. To mitigate this problem, semi-supervised approaches and self-distillation are investigated. Lastly, when an SGEC system gets deployed it is important to give accurate feedback to users. Thus, we apply filtering to remove edits with low confidence, aiming to improve overall feedback precision. The performance metrics are examined on a Linguaskill multi-level data set, which includes the original non-native speech, manual transcriptions and reference grammatical error corrections, to enable system analysis and development.
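To make the cascaded SGEC structure and the confidence-based edit filtering concrete, here is an illustrative end-to-end sketch. Every component function and number below is a placeholder assumption, not the paper's actual system or thresholds.

```python
# Illustrative sketch of a cascaded SGEC pipeline with confidence-based edit filtering.
# All component functions are stubs standing in for ASR, disfluency removal, and GEC.
from typing import List, Tuple

def asr(audio) -> str:
    """Placeholder for non-native speech recognition."""
    return "i goes to school yesterday um yesterday"

def remove_disfluencies(transcript: str) -> str:
    """Placeholder disfluency removal (e.g., dropping filled pauses)."""
    return " ".join(tok for tok in transcript.split() if tok not in {"um", "uh"})

def gec_with_confidence(sentence: str) -> List[Tuple[int, str, str, float]]:
    """Placeholder GEC: returns (position, source_token, correction, confidence) edits."""
    return [(1, "goes", "went", 0.93), (3, "school", "the school", 0.41)]

def apply_confident_edits(sentence: str, edits, threshold: float = 0.8) -> str:
    """Keep only high-confidence edits to improve feedback precision."""
    tokens = sentence.split()
    for pos, src, hyp, conf in edits:
        if conf >= threshold and tokens[pos] == src:
            tokens[pos] = hyp
    return " ".join(tokens)

fluent = remove_disfluencies(asr(None))
print(apply_confident_edits(fluent, gec_with_confidence(fluent)))
```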

GEC

| Index | Date | Paper | Conference | Code | Note |
|---|---|---|---|---|---|
| 1* | 21/1/6 | Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction (Kaneko et al.) | ACL-2020 | Code | Note |
| 2* | 21/1/6 | GECToR - Grammatical Error Correction: Tag, Not Rewrite (Omelianchuk et al.) | ACL-2020 | Code | Note |
| 3* | 21/1/7 | MaskGEC: Improving Neural Grammatical Error Correction via Dynamic Masking (Zhao and Wang) | AAAI-2020 | | Note |
| 4 | 21/1/7 | Towards Minimal Supervision BERT-Based Grammar Error Correction (Student Abstract) (Li et al.) | AAAI-2020 | | Note |
| 5* | 21/1/7 | Stronger Baselines for Grammatical Error Correction Using a Pretrained Encoder-Decoder Model (Katsumata and Komachi) | AACL-2020 | Code | Note |
| 6 | 21/1/9 | Chinese Grammatical Correction Using BERT-based Pre-trained Model (Wang et al.) | IJCNLP-2020 | | Note |
| 7* | 21/1/10 | Improving the Efficiency of Grammatical Error Correction with Erroneous Span Detection and Correction (Chen et al.) | EMNLP-2020 | | Note |
| 8* | 21/1/10 | Heterogeneous Recycle Generation for Chinese Grammatical Correction (Hinson et al.) | COLING-2020 | | Note |
| 9 | 21/1/10 | TMU-NLP System Using BERT-based Pre-trained Model to the NLP-TEA CGED Shared Task 2020 (Wang and Komachi) | AACL-2020 | | Note |
| 10 | 21/1/11 | Generating Diverse Corrections with Local Beam Search for Grammatical Error Correction (Hotate et al.) | COLING-2020 | | Note |
| 11 | 21/1/12 | Seq2Edits: Sequence Transduction Using Span-level Edit Operations (Stahlberg and Kumar) | EMNLP-2020 | Code | Note |
| 12 | 21/1/12 | Adversarial Grammatical Error Correction (Raheja and Alikaniotis) | EMNLP-2020 | | Note |
| 13* | 21/1/17 | Pseudo-Bidirectional Decoding for Local Sequence Transduction (Zhou et al.) | EMNLP-2020 | | |
| 14 | 21/1/18 | Neural Grammatical Error Correction Systems with Unsupervised Pre-training on Synthetic Data (Grundkiewicz et al.) | ACL-2019 | | |
| 15 | 21/1/18 | An Empirical Study of Incorporating Pseudo Data into Grammatical Error Correction (Kiyono et al.) | ACL-2019 | | |
| 16 | 21/1/19 | Parallel Iterative Edit Models for Local Sequence Transduction (Awasthi et al.) | EMNLP-2019 | | |
| 17 | 21/1/19 | Improving Grammatical Error Correction via Pre-Training a Copy-Augmented Architecture with Unlabeled Data (Zhao et al.) | NAACL-2019 | | |
| 18 | 21/1/20 | A Neural Grammatical Error Correction System Built On Better Pre-training and Sequential Transfer Learning (Choe et al.) | ACL-2019 | | |
| 19 | 21/1/20 | The Unreasonable Effectiveness of Transformer Language Models in Grammatical Error Correction (Alikaniotis and Raheja) | ACL-2019 | | |
| 20 | 21/1/20 | TMU Transformer System Using BERT for Re-ranking at BEA 2019 Grammatical Error Correction on Restricted Track (Kaneko et al.) | ACL-2019 | | |
| 21 | 21/1/21 | Noisy Channel for Low Resource Grammatical Error Correction (Flachs et al.) | ACL-2019 | | |
| 22 | 21/1/22 | The BLCU System in the BEA 2019 Shared Task (Yang et al.) | ACL-2019 | | |
| 23 | 21/1/22 | The AIP-Tohoku System at the BEA-2019 Shared Task (Asano et al.) | ACL-2019 | | |
| 24 | 21/1/22 | CUNI System for the Building Educational Applications 2019 Shared Task: Grammatical Error Correction (Náplava and Straka) | ACL-2019 | | |
| 25 | 21/1/27 | Cross-Sentence Grammatical Error Correction (Chollampatt et al.) | ACL-2019 | | |
GED

| Index | Date | Paper | Conference | Code | Note |
|---|---|---|---|---|---|
| 1* | 21/1/6 | 基于数据增强和多任务特征学习的中文语法错误检测方法 (A Chinese Grammatical Error Detection Method Based on Data Augmentation and Multi-task Feature Learning) (Xie et al.) | CCL-2020 | | Note |
| 2 | 21/1/11 | Integrating BERT and Score-based Feature Gates for Chinese Grammatical Error Diagnosis (Cao et al.) | AACL-2020 | | Note |
| 3 | 21/1/11 | CYUT Team Chinese Grammatical Error Diagnosis System Report in NLPTEA-2020 CGED Shared Task (Wu and Wang) | AACL-2020 | | Note |
| 4 | 21/1/11 | Combining ResNet and Transformer for Chinese Grammatical Error Diagnosis (Wang et al.) | AACL-2020 | | Note |
| 5 | 21/1/11 | Chinese Grammatical Errors Diagnosis System Based on BERT at NLPTEA-2020 CGED Shared Task (Zan et al.) | AACL-2020 | | Note |
| 6 | 21/1/11 | Chinese Grammatical Error Detection Based on BERT Model (Cheng and Duan) | AACL-2020 | | Note |
| 7 | 21/1/21 | Multi-Head Multi-Layer Attention to Deep Language Representations for Grammatical Error Detection (Kaneko et al.) | CICLING-2019 | | |

DA

| Index | Date | Paper | Conference | Code | Note |
|---|---|---|---|---|---|
| 1 | 21/1/11 | A Self-Refinement Strategy for Noise Reduction in Grammatical Error Correction (Mita et al.) | EMNLP-2020 | | |

Related

| Index | Date | Paper | Conference | Code | Note |
|---|---|---|---|---|---|
| 1 | 21/1/5 | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Devlin et al.) | NAACL-2019 | | |
| 2* | 21/1/5 | Incorporating BERT into Neural Machine Translation (Zhu et al.) | ICLR-2020 | Code | Note |
| 3 | 21/1/17 | Agreement on Target-Bidirectional LSTMs for Sequence-to-Sequence Learning (Liu et al.) | AAAI-2016 | | |
| 4 | 21/1/17 | Agreement on Target-bidirectional Neural Machine Translation (Liu et al.) | NAACL-2016 | | |
| 5* | 21/1/17 | Edinburgh Neural Machine Translation Systems for WMT 16 (Sennrich et al.) | WMT-2016 | | |
| 6 | 21/1/22 | LIMIT-BERT: Linguistic Informed Multi-Task BERT (Zhou et al.) | EMNLP-2020 | | |
| 7 | 21/1/23 | Distilling Knowledge Learned in BERT for Text Generation (Chen et al.) | ACL-2020 | | |
| 8 | 21/1/23 | Towards Making the Most of BERT in Neural Machine Translation (Yang et al.) | AAAI-2020 | | |
| 9 | 21/1/23 | Acquiring Knowledge from Pre-Trained Model to Neural Machine Translation (Weng et al.) | AAAI-2020 | | |
| 10 | 21/1/26 | Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting (Zhou et al.) | - | | |

Seq2Seq

  1. [ACL-2020] Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction
    Applies the BERT-fused model to GEC. BERT is first fine-tuned with MLM and GED objectives to reduce the mismatch between the raw data used for BERT pre-training and the GEC data. Pseudo-data and right-to-left (R2L) re-ranking models are also used to boost performance.
    https://github.com/kanekomasahiro/bert-gec

  2. [IJCNLP-2020] Chinese Grammatical Correction Using BERT-based Pre-trained Model
    Tries BERT-init (called BERT-encoder in the paper) and BERT-fused for Chinese GEC. Essentially the Chinese GEC version of Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction, though with fewer techniques applied.

  3. [AACL-2020] Stronger Baselines for Grammatical Error Correction Using a Pretrained Encoder-Decoder Model
    Uses BART for GEC and argues that BART can serve as a strong baseline: it reaches high performance through simple fine-tuning on GEC data instead of pseudo-data pretraining.
    https://github.com/Katsumata420/generic-pretrained-GEC

  4. [EMNLP-2020] Improving the Efficiency of Grammatical Error Correction with Erroneous Span Detection and Correction
    Combines a sequence tagging model for erroneous span detection (ESD) with a seq2seq model for erroneous span correction (ESC) to make the GEC process more efficient. The sequence tagging model (BERT-like) finds spans that need correction by outputting binary tags, and the seq2seq model receives the input annotated with the detected spans and only generates outputs for those spans. Pseudo-data is used for pre-training both the ESD and ESC models. A toy sketch of this detect-then-correct workflow follows this list.
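The sketch below illustrates the detect-then-correct idea from item 4 with stubbed-out components; both the tagger and the span "corrector" are placeholders, not the paper's Transformer models.

```python
# Minimal sketch of the ESD/ESC workflow: a tagger marks erroneous spans,
# and a correction model only rewrites those spans.
from typing import List, Tuple

def detect_spans(tokens: List[str]) -> List[int]:
    """Placeholder erroneous-span detection: 1 = token inside an erroneous span."""
    return [1 if tok in {"have", "eated"} else 0 for tok in tokens]

def spans_from_tags(tags: List[int]) -> List[Tuple[int, int]]:
    """Group consecutive 1-tags into (start, end) spans."""
    spans, start = [], None
    for i, t in enumerate(tags + [0]):
        if t == 1 and start is None:
            start = i
        elif t == 0 and start is not None:
            spans.append((start, i))
            start = None
    return spans

def correct_span(span_tokens: List[str]) -> List[str]:
    """Placeholder span correction (the ESC seq2seq model in the paper)."""
    return {"have eated": ["ate"]}.get(" ".join(span_tokens), span_tokens)

def esd_esc(sentence: str) -> str:
    tokens = sentence.split()
    out, prev = [], 0
    for start, end in spans_from_tags(detect_spans(tokens)):
        out += tokens[prev:start] + correct_span(tokens[start:end])
        prev = end
    return " ".join(out + tokens[prev:])

print(esd_esc("I have eated an apple yesterday"))  # -> "I ate an apple yesterday"
```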

Seq2Edits

  1. [EMNLP-2020] Seq2Edits: Sequence Transduction Using Span-level Edit Operations
    Proposes a method for tasks whose outputs largely overlap with their inputs, such as GEC. Uses a Transformer with a modified decoder. The model receives a source sentence and, at each decoding time step, outputs a 3-tuple corresponding to an edit operation (error tag, source span end position, replacement); the error tag makes each edit explainable. The paper runs experiments on five NLP tasks with large input-output overlap, both with and without pretraining. A sketch of how such span-level edits can be applied to a source sentence appears below.
    (Not very clear about the modified decoder.)
    https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/models/research/transformer_seq2edits.py
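The following sketch shows how a sequence of (error tag, span end, replacement) tuples could be applied to a source sentence. The tag names and tuples are illustrative assumptions, not the paper's exact tag set or decoder output format.

```python
# Sketch of applying Seq2Edits-style span-level edits.
# SELF means "copy the source span unchanged"; other tags replace the span.
from typing import List, Tuple

def apply_edits(src_tokens: List[str], edits: List[Tuple[str, int, str]]) -> List[str]:
    out, start = [], 0
    for tag, span_end, replacement in edits:
        if tag == "SELF":
            out += src_tokens[start:span_end]   # copy source span
        elif replacement:
            out += replacement.split()          # substitution / insertion
        # a non-SELF tag with an empty replacement deletes the span
        start = span_end
    return out

src = "She go to school".split()
edits = [("SELF", 1, ""), ("SVA", 2, "goes"), ("SELF", 4, "")]
print(" ".join(apply_edits(src, edits)))  # -> "She goes to school"
```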

Seq Labeling

  1. [ACL-2020] GECToR - Grammatical Error Correction: Tag, Not Rewrite
    Uses a BERT-based sequence tagger. Develops custom task-specific g-transformations such as CASE, MERGE, and so on. Since each source token can be mapped to only one edit per pass, iterative correction may be required. A 3-stage training strategy is used: pretraining on augmented data, fine-tuning on erroneous data, then fine-tuning on both erroneous and error-free data. A toy version of the iterative tag-and-apply loop is sketched after this list.
    https://github.com/grammarly/gector

  2. [AAAI-2020] Towards Minimal Supervision BERT-Based Grammar Error Correction (Student Abstract)
    Divides the GEC task into two stages: error identification and error correction. The first stage is a sequence tagging task (remain, substitution, ...), and a BERT is used for the second stage (correction).
    (Not very clear about the method proposed by the paper.)
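As referenced in item 1, here is a toy version of GECToR-style iterative correction. The rule-based "tagger" is a stand-in for the trained BERT tagger, and the tag inventory shown is a small illustrative subset (real GECToR also has g-transformations).

```python
# Sketch of iterative tag-and-apply correction: predict one edit tag per token,
# apply the tags, and repeat until no more edits are predicted.
from typing import List

def tag(tokens: List[str]) -> List[str]:
    """Toy tagger producing KEEP, DELETE, or REPLACE_<word> tags."""
    tags = []
    for i, tok in enumerate(tokens):
        if tok == "goed":
            tags.append("REPLACE_went")
        elif tok == "the" and i > 0 and tokens[i - 1] == "the":
            tags.append("DELETE")
        else:
            tags.append("KEEP")
    return tags

def apply_tags(tokens: List[str], tags: List[str]) -> List[str]:
    out = []
    for tok, t in zip(tokens, tags):
        if t == "DELETE":
            continue
        out.append(t.split("_", 1)[1] if t.startswith("REPLACE_") else tok)
    return out

def iterative_correct(sentence: str, max_iters: int = 5) -> str:
    tokens = sentence.split()
    for _ in range(max_iters):
        tags = tag(tokens)
        if all(t == "KEEP" for t in tags):
            break
        tokens = apply_tags(tokens, tags)
    return " ".join(tokens)

print(iterative_correct("He goed to the the store"))  # -> "He went to the store"
```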

Pipeline

  1. [COLING-2020] Heterogeneous Recycle Generation for Chinese Grammatical Correction
    Makes use of a sequence editing model, a seq2seq model, and a spell checker to correct different kinds of errors (small-scale errors, large-scale errors, and spelling errors, respectively). Iterative decoding is applied over the sequence editing model and the seq2seq model. The proposed method needs no data augmentation yet still achieves comparable performance. A simplified sketch of the pipeline follows.
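The sketch below only conveys the control flow of such a heterogeneous pipeline: spell checking first, then alternating editing and rewriting until the output stops changing. All three components are string-replacement stubs (shown with English text for readability), not the paper's Chinese models.

```python
# Sketch of a heterogeneous correction pipeline with iterative decoding.
def spell_check(s: str) -> str:
    return s.replace("recieve", "receive")

def sequence_edit(s: str) -> str:          # stands in for small-scale (local) corrections
    return s.replace("a information", "information")

def seq2seq_rewrite(s: str) -> str:        # stands in for larger rewrites
    return s.replace("will received", "will receive")

def correct(sentence: str, max_rounds: int = 3) -> str:
    out = spell_check(sentence)
    for _ in range(max_rounds):
        new = seq2seq_rewrite(sequence_edit(out))
        if new == out:                      # iterate until convergence
            break
        out = new
    return out

print(correct("He will recieve a information soon"))
```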

Multi-Task Learning

  1. [GED] [CCL-2020] 基于数据增强和多任务特征学习的中文语法错误检测方法 (A Chinese Grammatical Error Detection Method Based on Data Augmentation and Multi-task Feature Learning)
    Implements Chinese GED with data augmentation and a pretrained BERT fine-tuned via multi-task learning. The data augmentation method applied here is simple, consisting of manipulations such as insertions, deletions, and so on; some rules are designed to preserve sentence meaning. A Chinese BERT with a CRF layer on top is used for GED and is fine-tuned through multi-task learning on POS tagging, parsing, and grammatical error detection. A minimal multi-task fine-tuning sketch follows.
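The sketch below shows the general shape of multi-task fine-tuning: one shared encoder, per-task heads, and a summed loss. To stay self-contained it uses a toy BiLSTM in place of Chinese BERT, keeps only GED and POS heads, and omits the CRF and parsing components from the paper.

```python
# Minimal multi-task token-tagging sketch (toy encoder, no CRF).
import torch
import torch.nn as nn

class MultiTaskGED(nn.Module):
    def __init__(self, vocab=1000, dim=64, n_ged=2, n_pos=20):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.encoder = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
        self.ged_head = nn.Linear(2 * dim, n_ged)   # correct / erroneous per token
        self.pos_head = nn.Linear(2 * dim, n_pos)   # auxiliary POS tagging head

    def forward(self, x):
        h, _ = self.encoder(self.emb(x))
        return self.ged_head(h), self.pos_head(h)

model = MultiTaskGED()
x = torch.randint(0, 1000, (4, 12))                  # toy batch of token ids
ged_gold = torch.randint(0, 2, (4, 12))
pos_gold = torch.randint(0, 20, (4, 12))
ged_logits, pos_logits = model(x)
loss = nn.CrossEntropyLoss()(ged_logits.reshape(-1, 2), ged_gold.reshape(-1)) \
     + nn.CrossEntropyLoss()(pos_logits.reshape(-1, 20), pos_gold.reshape(-1))
loss.backward()                                      # joint multi-task update
```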

Beam Search

  1. [COLING-2020] Generating Diverse Corrections with Local Beam Search for Grammatical Error Correction
    Proposes a local beam search method to generate diverse corrections. The proposed method produces more diverse outputs than plain beam search, and only modifies the positions that need correction rather than changing the whole sequence as global beam search does. The copy factor of the copy-augmented Transformer is used as a penalty score. A toy illustration of the intuition follows.
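This is only a loose illustration of the intuition: positions where a (stubbed) copy probability is high are copied verbatim, and the beam only branches at the remaining positions. The scorer, copy probabilities, and threshold are all invented for the example; the actual method scores full hypotheses with a copy-augmented Transformer.

```python
# Toy "local" beam search: branch only at suspicious positions.
import heapq
from typing import List, Tuple

def candidates(src_tok: str) -> List[Tuple[str, float]]:
    """Stub next-token scorer: (token, log-prob) alternatives for one source token."""
    table = {"go": [("went", -0.2), ("goes", -0.9), ("go", -1.5)]}
    return table.get(src_tok, [(src_tok, -0.05)])

def copy_prob(src_tok: str) -> float:
    """Stub copy factor: high means the token is probably already correct."""
    return 0.3 if src_tok == "go" else 0.95

def local_beam_search(src: List[str], beam: int = 2, threshold: float = 0.5):
    hyps = [(0.0, [])]                                  # (score, tokens so far)
    for tok in src:
        if copy_prob(tok) >= threshold:                 # confident: just copy
            new = [(s, h + [tok]) for s, h in hyps]
        else:                                           # suspicious: branch locally
            new = [(s + lp, h + [cand])
                   for s, h in hyps for cand, lp in candidates(tok)]
        hyps = heapq.nlargest(beam, new, key=lambda x: x[0])
    return [" ".join(h) for _, h in hyps]

print(local_beam_search("She go to school yesterday".split()))
```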

Adversarial Training

  1. [EMNLP-2020] Adversarial Grammatical Error Correction
    The first approach to use adversarial training for GEC. Uses a seq2seq model as the generator and a sentence-pair classification model as the discriminator. The discriminator essentially acts as a learned evaluation method for the generator's outputs, directly modeling the task. No other techniques such as data augmentation are used. A rough sketch of the generator-discriminator setup is given below.
    (Not very clear about the adversarial training.)
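The sketch below only shows the moving parts of such a setup: a sentence-pair discriminator trained to separate human corrections from generated ones, and its score reused as a reward signal for the generator. Every model here is a toy stand-in (bag-of-words encoder, random "generator outputs"); the paper's actual architectures and training schedule are not reproduced.

```python
# Rough sketch of adversarial training for GEC: discriminator step + reward signal.
import torch
import torch.nn as nn

dim, vocab = 32, 100
embed = nn.EmbeddingBag(vocab, dim)                 # toy mean-pooled sentence encoder

def encode(batch_ids):
    return embed(batch_ids)

discriminator = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))
d_opt = torch.optim.Adam(list(discriminator.parameters()) + list(embed.parameters()), lr=1e-3)

src = torch.randint(0, vocab, (8, 10))              # toy batches of token ids
gold = torch.randint(0, vocab, (8, 10))             # human-corrected references
fake = torch.randint(0, vocab, (8, 10))             # generator outputs (stubbed)

# Discriminator step: real (src, gold) pairs -> 1, generated (src, fake) pairs -> 0.
logits_real = discriminator(torch.cat([encode(src), encode(gold)], dim=-1))
logits_fake = discriminator(torch.cat([encode(src), encode(fake)], dim=-1))
d_loss = nn.BCEWithLogitsLoss()(logits_real, torch.ones_like(logits_real)) + \
         nn.BCEWithLogitsLoss()(logits_fake, torch.zeros_like(logits_fake))
d_opt.zero_grad()
d_loss.backward()
d_opt.step()

# Generator step (conceptual): the discriminator's probability for (src, fake)
# pairs can serve as a sentence-level reward for the seq2seq generator.
reward = torch.sigmoid(discriminator(torch.cat([encode(src), encode(fake)], dim=-1))).detach()
print(reward.squeeze(-1))
```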

Dynamic Masking

  1. [AAAI-2020] MaskGEC: Improving Neural Grammatical Error Correction via Dynamic Masking
    Proposes a dynamic masking method for data augmentation and better generalization. In each epoch, noise is introduced into each sentence with some probability via manipulations including padding substitution, random substitution, word-frequency-based substitution, and so on. A small sketch of the per-epoch noising follows.
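The sketch below shows the "dynamic" part of the idea: the corruption is re-sampled every epoch, so the model never sees the same noisy input twice. The noise operations and probability are a simplified subset chosen for illustration (word-frequency-based substitution is omitted).

```python
# Sketch of MaskGEC-style dynamic noising applied on the fly each epoch.
import random

PAD = "<pad>"
VOCAB = ["the", "a", "to", "of", "apple", "school"]   # toy substitution vocabulary

def noise_sentence(tokens, prob=0.3):
    noisy = []
    for tok in tokens:
        if random.random() < prob:
            op = random.choice(["pad_sub", "random_sub", "keep"])
            if op == "pad_sub":
                noisy.append(PAD)                      # padding substitution
            elif op == "random_sub":
                noisy.append(random.choice(VOCAB))     # random word substitution
            else:
                noisy.append(tok)
        else:
            noisy.append(tok)
    return noisy

src = "She went to school".split()
for epoch in range(3):                                 # a fresh corruption every epoch
    print(epoch, noise_sentence(src))
```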

NLPTEA

  1. [AACL-2020] TMU-NLP System Using BERT-based Pre-trained Model to the NLP-TEA CGED Shared Task 2020
    Uses BERT-init as in Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction, which is also the same as the BERT-encoder in Chinese Grammatical Correction Using BERT-based Pre-trained Model.

  2. [GED] [AACL-2020] Integrating BERT and Score-based Feature Gates for Chinese Grammatical Error Diagnosis
    Uses a BiLSTM-CRF for GED whose input is a concatenation of features: BERT output, POS, POS score, and PMI score. The scores are incorporated via a gating mechanism to avoid losing partial-order relationships when embedding continuous feature values.
    (Not very clear about the features used and the purpose of the gating mechanism.)

  3. [GED] [AACL-2020] CYUT Team Chinese Grammatical Error Diagnosis System Report in NLPTEA-2020 CGED Shared Task
    Uses BERT + CRF. (A generic sketch of this kind of BERT token-tagging setup appears after this list.)

  4. [GED] [AACL-2020] Combining ResNet and Transformer for Chinese Grammatical Error Diagnosis
    Applies residual connections to BERT for GED: the encoded hidden representation is added to the embedding and fed into the output layer.
    (Also related to GEC, but the GEC part is not detailed, so it is categorized as GED.)

  5. [GED] [AACL-2020] Chinese Grammatical Errors Diagnosis System Based on BERT at NLPTEA-2020 CGED Shared Task
    Uses BERT-BiLSTM-CRF for GED, and a hybrid system containing a 3-gram model and a seq2seq model for GEC.

  6. [GED] [AACL-2020] Chinese Grammatical Error Detection Based on BERT Model
    Uses BERT finetuned on GEC datasets.
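Most of the systems above share the same basic recipe: encode the sentence with a Chinese BERT and predict an error label per token. The sketch below uses the HuggingFace token-classification API to show that skeleton; the CRF/BiLSTM layers used by several of the papers are omitted, and the model would of course need fine-tuning on CGED data before its predictions mean anything.

```python
# Generic BERT-as-token-tagger sketch for Chinese GED.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# CGED error types: R = redundant, M = missing, S = selection, W = word order.
LABELS = ["O", "R", "M", "S", "W"]
tok = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModelForTokenClassification.from_pretrained("bert-base-chinese", num_labels=len(LABELS))

inputs = tok("他明天去了北京", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits                  # (1, seq_len, num_labels)
pred = logits.argmax(-1)[0].tolist()
print([LABELS[i] for i in pred])                     # untrained output; fine-tune on CGED data first
```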

Related

  1. [NMT] [ICLR-2020] Incorporating BERT into Neural Machine Translation
    Proposes the BERT-fused model. Compared with the vanilla Transformer, the proposed model additionally has a BERT-Enc attention module in the encoder and a BERT-Dec attention module in the decoder. Both additional modules incorporate features extracted by BERT, whose weights stay fixed. A vanilla Transformer is trained in the first training stage, and in the second stage the additional BERT-related modules are trained together with it. A simplified encoder-layer sketch follows.
    https://github.com/bert-nmt/bert-nmt
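The sketch below shows one simplified BERT-fused encoder layer: ordinary self-attention plus an extra attention over pre-computed (frozen) BERT features, with the two streams combined before the feed-forward block. Dimensions, the averaging scheme, and the missing dropout/masking are simplifying assumptions, not a faithful reimplementation of the paper.

```python
# Rough sketch of one BERT-fused encoder layer.
import torch
import torch.nn as nn

class BertFusedEncoderLayer(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # queries come from the NMT encoder, keys/values from fixed BERT features
        self.bert_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.ReLU(), nn.Linear(4 * dim, dim))
        self.norm1, self.norm2 = nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, x, bert_feats):
        s, _ = self.self_attn(x, x, x)
        b, _ = self.bert_attn(x, bert_feats, bert_feats)   # BERT weights frozen upstream
        x = self.norm1(x + 0.5 * (s + b))                  # combine the two attention streams
        return self.norm2(x + self.ffn(x))

layer = BertFusedEncoderLayer()
x = torch.randn(2, 10, 256)            # NMT encoder states
bert_feats = torch.randn(2, 12, 256)   # pre-computed, fixed BERT features (length may differ)
print(layer(x, bert_feats).shape)      # torch.Size([2, 10, 256])
```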
