
awasthiabhijeet / pie

226 stars · 9 watchers · 40 forks · 2.43 MB

Fast + Non-Autoregressive Grammatical Error Correction using BERT. Code and Pre-trained models for paper "Parallel Iterative Edit Models for Local Sequence Transduction": www.aclweb.org/anthology/D19-1435.pdf (EMNLP-IJCNLP 2019)

License: MIT License

Python 33.13% Shell 1.36% Macaulay2 65.51%
grammatical-error-correction sequence-transduction bert bert-model bert-models natural-language-processing sequence-editing sequence-labeling nlp post-editing

pie's People

Contributors

awasthiabhijeet, gurunathparasaram


pie's Issues

About Synthetic training

Hi, I am working on PIE-BERT-Base. Using the released synthetic data (specifically, the a2 set), I performed 2 epochs of training on a2 and then fine-tuned on Lang-8 + FCE + NUCLE for 2 epochs, using PIE-Base. I found that the synthetic training did not improve the model.
During the synthetic training stage, I observed that the loss started to fall and then jumped between about 10 and 14 until the 2 epochs of training finished. I used the pickles you mentioned in another issue during both synthetic training and fine-tuning. Hyperparameters were copied from Appendix A.5 for PIE-Base.

  • I wonder if you have experimental results showing that synthetic training boosts PIE-Base, and by how many F-0.5 points? In the ablation study part of the paper, there is only one result for PIE-Base (56.6). I would like to know by how many F-0.5 points that result was boosted by synthetic training, and whether you observed the same loss behavior during synthetic training.

  • Is there anything else to note about synthetic training?

Thank you so much.

Which source of correct sentences did you use to make the errorful sentences?

Hi, you mentioned in the README that in order to construct errorful sentences we need to specify the path to a file of correct sentences along with an output path. My question is: from which source did you extract the correct sentences used to form the erroneous dataset provided in the repository? I also want to construct an erroneous dataset of preposition errors, but first I need a correct dataset for that. Kindly also share your suggestions on how I can proceed in constructing a dataset with just preposition errors.
Thanks in advance.
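One simple way to synthesize such a corpus, sketched here as an assumption rather than as the repository's actual method, is to corrupt correct sentences by swapping prepositions (all names below are illustrative):

    import random

    # Hypothetical helper, not part of the pie repository.
    PREPOSITIONS = ["in", "on", "at", "for", "of", "to", "with", "by"]

    def add_preposition_error(sentence, p=0.5):
        # With probability p, replace one preposition in the sentence
        # with a different preposition from the list above.
        tokens = sentence.split()
        idxs = [i for i, t in enumerate(tokens) if t.lower() in PREPOSITIONS]
        if idxs and random.random() < p:
            i = random.choice(idxs)
            choices = [q for q in PREPOSITIONS if q != tokens[i].lower()]
            tokens[i] = random.choice(choices)
        return " ".join(tokens)

    print(add_preposition_error("She waited at the station for an hour.", p=1.0))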

What is the use case for last_dot_first_capital?

What is the use case for the following check:

def last_dot_first_capital(text):

If the sentence ends in a capitalized word followed by a dot, the two are treated as a single token, and another dot is appended:
"My name is John." -> "My", "name", "is", "John.", "."

The strange thing is that this is only done for ".", but not for "!" or "?".

@awasthiabhijeet, can you tell me what the purpose of this was?
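For reference, a minimal reconstruction of what such a check could look like (only the function name comes from the repository; the body below is an assumption):

    import re

    def last_dot_first_capital(text):
        # Assumed behavior: True when the text ends in a capitalized
        # word immediately followed by a period, e.g. "... John."
        words = text.split()
        return bool(words) and re.match(r"[A-Z]\w*\.$", words[-1]) is not None

    print(last_dot_first_capital("My name is John."))  # True
    print(last_dot_first_capital("my name is john."))  # False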

Try to export estimator for more efficient deployment

Hi,
I noticed that using estimator.evaluate is a very inefficient way to run this model on new data points. The best method I know of is to export the model with estimator.export_savedmodel.

I tried adding a few flags and these lines of code:

    if FLAGS.do_export:
        estimator._export_to_tpu = False
        estimator.export_savedmodel(FLAGS.export_dir, serving_input_fn)

where serving_input_fn is:

    def serving_input_fn():
        edit_sequence = tf.placeholder(tf.int32, [None, FLAGS.max_seq_length], name='edit_sequence')
        input_ids = tf.placeholder(tf.int32, [None, FLAGS.max_seq_length], name='input_ids')
        input_mask = tf.placeholder(tf.int32, [None, FLAGS.max_seq_length], name='input_mask')
        segment_ids = tf.placeholder(tf.int32, [None, FLAGS.max_seq_length], name='segment_ids')
        input_fn = tf.estimator.export.build_raw_serving_input_receiver_fn({
            'edit_sequence': edit_sequence,
            'input_ids': input_ids,
            'input_mask': input_mask,
            'segment_ids': segment_ids,
        })()
        return input_fn

However, I was unable to do so. I think this would be a very good addition to your code, but I get an error: ValueError: Couldn't find trained model at PIE_ckpt.

I'm very confused by this error.
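For context, once an export like this succeeds, the resulting SavedModel can typically be loaded for fast inference with the TF 1.x predictor utility. A minimal sketch under that assumption (the path and batch shapes are illustrative, and max_seq_length is assumed to be 128):

    import numpy as np
    from tensorflow.contrib import predictor

    # Load the SavedModel produced by estimator.export_savedmodel.
    predict_fn = predictor.from_saved_model("exported_model_dir")

    # Feed the four placeholders defined in serving_input_fn above.
    batch = np.zeros((1, 128), dtype=np.int32)
    outputs = predict_fn({
        "edit_sequence": batch,
        "input_ids": batch,
        "input_mask": batch,
        "segment_ids": batch,
    })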

Bad results trying to train using end_to_end.sh

Hi,

I am trying to run end_to_end.sh, following the instructions in the example_scripts README; however, I am getting bad predictions and therefore cannot reproduce anywhere near the 26.6 F-score mentioned.

I have corrected the typo in end_to_end.sh (./m2eval.sh --> ./m2_eval.sh) and am using https://storage.googleapis.com/bert_models/2018_10_18/cased_L-24_H-1024_A-16.zip on my end, so the code runs, but for example I see results like this:

Test data: Keeping the Secret of Genetic Testing

Prediction round 1: . . . . . . . keeping on a on for the Good , and Secret . . So . . , and of , the , and the , and genetic , and , and for the Testing . . . . . . . . [SEP] . . . . . . . [SEP] .

Any tips on where I'm going wrong?

Could you please release the script for producing the synthetic data?

Hi dear authors of PIE, I am a student from Peking University, currently studying the NAR-GEC task. This is excellent and impressive work. To reproduce your results, I would like to know how many sentence pairs from the One-Billion-Word corpus were used to pre-train the model. I would also be very grateful if you would release the script for producing the synthetic data. My email is [email protected]. Thank you so much.

Attention mask for computation of replace and append operation

Hi, you mentioned in the paper that r_i^l is computed over h_j^l for all j except i, while a_i^l is computed over h_j^l for all j including i.
Why is there such a difference, i.e. why can't we use information about the current token x_i for the replace operation, while the append operation does have access to the current token?
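As a concrete illustration of the masking being asked about, here is a minimal sketch based purely on the description above (it is not code from the repository):

    import numpy as np

    def replace_mask(seq_len):
        # Replace logits r_i^l attend to h_j^l for all j != i:
        # zero out the diagonal so position i cannot see itself.
        mask = np.ones((seq_len, seq_len), dtype=np.float32)
        np.fill_diagonal(mask, 0.0)
        return mask

    def append_mask(seq_len):
        # Append logits a_i^l attend to h_j^l for all j, including j == i.
        return np.ones((seq_len, seq_len), dtype=np.float32)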

Releasing weights

Hi authors,
Great work with the paper! I'm interested in whether you're going to release the weights, not of the single best model, but of the one mentioned in example_scripts/README.md with an F_{0.5} score close to 26.6.
Thank you

Small bug in errorify/error.py

Hi, first of all, thank you for releasing your code.

I just found a small bug in errorify/error.py: in the function readn at line 53, the extra "yield clist" outside the for loop will yield a duplicate batch if the number of lines in the file is divisible by the batch size.

(So for a source file of 1,000 lines with batch size 200, corr_sentences.txt and incorr_sentences.txt would each end up with 1,200 lines.)

A simple fix is to remove the boolean "start" checks and set clist = [] after every "yield clist" inside the for loop; then, if the number of lines in the file is divisible by the batch size, the yield outside the loop will just return an empty list. A sketch of the fixed generator is below.
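Here is a minimal sketch of a batching generator with that fix applied (the real signature of readn in the repository may differ; the final "if clist:" guard is a common variant that also skips the trailing empty batch):

    def readn(f, n):
        # Yield lines from the file object f in batches of size n.
        clist = []
        for line in f:
            clist.append(line)
            if len(clist) == n:
                yield clist
                clist = []  # reset inside the loop, so no batch is duplicated
        if clist:  # yield only a trailing partial batch, never a duplicate
            yield clist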

Running pretrained PIE model gets stuck at INFO:tensorflow:Done running local_init_op.

Hi there, I am trying to run the pretrained PIE model as per the instructions in the repo. However, it gets stuck at the point where the output says "INFO:tensorflow:Done running local_init_op." after just the first call of the pie_infer.sh file. I have tried solutions such as commenting out the d.repeat() in word_edit_model.py, but I am still encountering the problem. I would really appreciate some help with this issue, thank you!

EDIT: I have managed to get past that line, but it takes quite long (about 5 minutes). After that, it takes about 3 minutes per 10 iterations when enumerating wem_utils.timer(result). Is there any way to speed this up? Thank you!

Trying to use the pretrained model

Hi, thank you for contributing this model. I am currently trying to use it with your pretrained PIE. Is it correct to download the pretrained PIE into a directory which I then pass as the "output_dir" argument? Thank you in advance.

About the Edit-factorized BERT architecture


For replace, when we calculate the attention score at position i, we don't consider the token w(i).

At the first layer, I think this is not a problem, but at the second and higher layers we use the information of w(i) indirectly.

Is that OK?

Pretrained model bad correction performance?

Hi, I am testing out your model and I noticed that if I run your pretrained model on the conll_test.txt file you provide, it gets very poor performance. The output is something of this sort:

Day I think I think I think remember I think I think I
Day What I think is here and I think and I
Day I think large refers every day large your every day chance every day of I think every daying every day large I think I am disease every day every day

which bears no resemblance to the input. Do you maybe know what might be going on? I am just running multi_round_infer.sh; the only flag I have changed is "use_tpu" to false (because I don't have a TPU and am running on a GPU).
