
nusnlp / geccl


Grammatical Error Correction with Contrastive Learning in Low Error Density Domains

Shell 1.44% Makefile 0.02% Python 73.15% Batchfile 0.03% C++ 0.62% Cuda 1.43% Cython 0.35% Lua 0.16% Macaulay2 22.79%
deep-learning gec pytorch grammatical-error-correction

geccl's People

Contributors

michaelcaohn


Forkers

michaelcaohn

geccl's Issues

What's the meaning of "n_list"?

Hello.

Thank you for releasing the code of the paper.

I have a question about the variable n_list found in label_smoothed_cross_entropy.py and new_max_margin_loss.py of GEC-PD (gec-pseudo).

It seems that each element e_i in n_list means that only the first e_i negative targets are used in the loss calculation. But what is its purpose?
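For what it's worth, here is how I read it, as a minimal sketch (hypothetical names, not the repo's actual code): n_list[i] caps how many of sample i's negative targets enter a max-margin loss.

```python
import torch

def truncated_neg_loss(pos_score, neg_scores, n_list, margin=1.0):
    # pos_score: (B,) score of the gold target per sample.
    # neg_scores: (B, N) scores of up to N negative targets per sample.
    # n_list[i] caps how many negatives sample i contributes (assumes n >= 1).
    losses = []
    for i, n in enumerate(n_list):
        neg = neg_scores[i, :n]  # only the first n negatives are used
        # hinge: the gold score should beat each kept negative by `margin`
        losses.append(torch.clamp(margin - pos_score[i] + neg, min=0).mean())
    return torch.stack(losses).mean()
```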

Fine-tuning after post-training

Hi : )

I'm trying to introduce CL into the CGEC (Chinese GEC) task. If I remember correctly, you used a post-trained model (trained on non-native learner data) and then fine-tuned it (on native learner data) with two strategies (NLL and CL).

I followed almost the same steps using the NLL strategy, but the fine-tuned model got a lower score than the post-trained model on the test set (their $F_{0.5}$ scores were 5 and 25, respectively).

I think the reason might be the different data distributions, and I want to know how you made NLL outperform DI (in the paper).
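To make sure we are talking about the same setup, here is a minimal sketch of the two objectives as I understand them (hypothetical names and `lam`, not the repo's actual code): NLL is plain cross-entropy on the gold target, while CL adds a max-margin term over negative targets.

```python
import torch
import torch.nn.functional as F

def nll_loss(logits, target, pad_idx=1):
    # NLL strategy: plain token-level cross-entropy on the gold target.
    return F.cross_entropy(logits.view(-1, logits.size(-1)),
                           target.view(-1), ignore_index=pad_idx)

def cl_term(pos_score, neg_scores, margin=1.0):
    # Max-margin contrastive term: the gold sequence score (B,) should
    # exceed each negative target's score (B, N) by at least `margin`.
    return torch.clamp(margin - pos_score.unsqueeze(1) + neg_scores, min=0).mean()

def cl_loss(logits, target, pos_score, neg_scores, lam=1.0):
    # CL strategy: NLL plus the weighted contrastive term.
    return nll_loss(logits, target) + lam * cl_term(pos_score, neg_scores)
```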

I hope I made myself clear.

thanks

Rerun results lower than what's reported

Hello. I reran the GEC-PD experiment with the data and code provided in the repo. However, the results I got were lower than what is reported in the repo.

Results of the repo (precision | recall | $F_{0.5}$):

S0: 41.48 | 21.44 | 34.94
S1: 31.11 | 19.37 | 27.74
G0: 42.41 | 23.01 | 36.29
G1: 32.00 | 23.28 | 29.77

S avg: 36.30 | 20.40 | 31.34
G avg: 37.21 | 23.15 | 33.03

Rerun results (precision | recall | $F_{0.5}$):

S0: 38.54 | 19.10 | 31.99
S1: 30.33 | 18.09 | 26.69
G0: 42.38 | 21.19 | 35.30
G1: 32.06 | 21.50 | 29.17

S avg: 34.43 | 18.60 | 29.34
G avg: 37.22 | 21.35 | 32.24
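As a sanity check, assuming the three columns are precision | recall | $F_{0.5}$, the last column is consistent with $F_{0.5} = 1.25PR / (0.25P + R)$ up to rounding:

```python
def f05(p, r):
    # F_{0.5} weighs precision twice as heavily as recall.
    return 1.25 * p * r / (0.25 * p + r)

print(round(f05(41.48, 21.44), 2))  # ~34.95 vs reported 34.94 (rounding)
print(round(f05(38.54, 19.10), 2))  # ~32.02 vs rerun 31.99 (rounding)
print((41.48 + 31.11) / 2)          # 36.295 -> matches the S avg of 36.30
```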

Environment:

  • OS: Ubuntu 18.04.1 64 bits
  • Python version 3.7.11
  • Pytorch version 1.7.1
  • CUDA Version 11.2

Here are several possible reasons that I suspect led to the performance gap:

  1. Choice of the checkpoint used to generate predictions on the test sets and for evaluation (calculating precision / recall / $F_{0.5}$). I used the best checkpoint from training (checkpoint_best.pt, generated by fairseq), but the sample code in the repo uses checkpoint3.pt. Why is that?

  2. ERRANT version. I used errant==2.3.0.

  3. Random seeds. I used [10, 20, 30] and took the average.

Since the evaluation script was not released in the repo, I am not sure how the trained models in the paper were evaluated. Could you kindly provide more details, for example by releasing the evaluation script?
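For reference, here is a minimal sketch of the evaluation I assumed, built on ERRANT's public Python API (errant.load / parse / annotate); this is only a plausible reconstruction, not the authors' script:

```python
import errant

annotator = errant.load("en")  # requires the spaCy English model

def edit_set(orig, cor):
    # Extract (span, correction) edit tuples between an original
    # sentence and a corrected one.
    o, c = annotator.parse(orig), annotator.parse(cor)
    return {(e.o_start, e.o_end, e.c_str) for e in annotator.annotate(o, c)}

def score(orig_sents, hyp_sents, ref_sents):
    # Span-based P / R / F_{0.5}: hypothesis edits count as true
    # positives only if an identical edit appears in the reference.
    tp = fp = fn = 0
    for orig, hyp, ref in zip(orig_sents, hyp_sents, ref_sents):
        h, r = edit_set(orig, hyp), edit_set(orig, ref)
        tp += len(h & r); fp += len(h - r); fn += len(r - h)
    p = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f = 1.25 * p * rec / (0.25 * p + rec) if p + rec else 0.0
    return p, rec, f
```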

Thank you very much.

Training script for NLL

Hello. Thanks for your work and for kindly releasing the code. Could you also provide the fine-tuning script for NLL, in addition to the CL- and CL scripts?
