This project forked from baosongyang/wrd

This task studies how well different neural networks learn word order information.

Word Reordering Detection

The word reordering detection (WRD) task is based on the following paper:

Baosong Yang, Longyue Wang, Derek F. Wong, Lidia S. Chao and Zhaopeng Tu. Assessing the Ability of Self-Attention Networks to Learn Word Order. In ACL 2019.

Introduction

The main purpose is to study how well different neural networks learn word order information. Specifically, we randomly move one word to another position and examine whether a trained model can detect both the original and the inserted positions. Our code is built upon THUMT-MT. We compare self-attention networks (SAN, Vaswani et al., 2017) with a re-implemented RNN (Chen et al., 2018), as well as directional SAN (DiSAN, Shen et al., 2018), which augments SAN with recurrence modeling.

Citation

Please cite the following paper:

@inproceedings{yang2019assessing,
  author    = {Baosong Yang  and  Longyue Wang  and  Derek F. Wong  and Lidia S. Chao and Zhaopeng Tu},
  title     = {Assessing the Ability of Self-Attention Networks to Learn Word Order},
  booktitle = {ACL},
  year      = {2019}
}

Data

We conduct this task on English sentences extracted from the WMT14 En⇒De data, with a maximum length of 80. For each sentence in each set (i.e. the training, validation, and test sets), we construct an instance by randomly moving one word to another position. In total we construct 7M, 10K and 10K samples for training, validation and testing, respectively. Note that a sentence can be sampled multiple times, so each dataset in the WRD data contains more instances than the corresponding machine translation data.

For other languages, we provide a script for generating this kind of corpus, "./script/reorder_word.py", and a script for recovering it to the original format, "./script/recover_order.py".
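The reordering step described above can be sketched as follows. This is a minimal illustration, not the contents of "./script/reorder_word.py" or "./script/recover_order.py": the function names `move_word` and `recover_order`, the 0-based indexing, and the convention that the inserted position is the word's index in the reordered sentence are all assumptions made for this example.

```python
import random


def move_word(tokens, rng):
    """Move one randomly chosen token to a different position.

    Hypothetical helper (not the repository's script). Returns
    (reordered_tokens, original_index, inserted_index), where
    original_index is the token's index in the input sentence and
    inserted_index is its index in the reordered sentence.
    """
    if len(tokens) < 2:
        raise ValueError("need at least two tokens to reorder")
    original_index = rng.randrange(len(tokens))
    # Choose a destination index that differs from the source index.
    candidates = [i for i in range(len(tokens)) if i != original_index]
    inserted_index = rng.choice(candidates)
    reordered = list(tokens)
    word = reordered.pop(original_index)
    reordered.insert(inserted_index, word)
    return reordered, original_index, inserted_index


def recover_order(reordered, original_index, inserted_index):
    """Undo move_word: put the moved token back at its original index."""
    restored = list(reordered)
    word = restored.pop(inserted_index)
    restored.insert(original_index, word)
    return restored


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the example is repeatable
    sentence = "the cat sat on the mat".split()
    shuffled, src, dst = move_word(sentence, rng)
    print(" ".join(shuffled), src, dst)
    print(" ".join(recover_order(shuffled, src, dst)))
```

Because the same sentence can be passed to `move_word` several times with different random draws, a sketch like this would also produce more WRD instances than source sentences, as noted above.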

Usage

  • This program is based on THUMT-MT. We add options for running RNN- and DiSAN-based models, which are named "rnnp" and "transformer_di", respectively. To run machine translation models, please read the documentation of the original implementation.
  • To examine pre-trained MT encoders on the WRD task: 1. put your model checkpoint files under the "eval" folder; 2. use the example script "word_order_MT.sh", which assesses the ability of SAN to learn word order. You can evaluate other models by modifying the example script.
  • To examine randomly initialized encoders on the WRD task: 1. put your well-trained MT models under the "eval" folder (only their word embeddings are used; you can also choose other well-trained word embeddings); 2. use the example script "word_order_MT.sh", which assesses the ability of SAN to learn word order. You can evaluate other models by modifying the example script. Note that if you use the word embeddings of pre-trained MT models, you must rename the scope name in the model file, so that the WRD model fails to find the existing encoder parameters and initializes new ones instead. For example, modify ./thumt/models/transformer.py:
Line 48: "encoder" => "encoder2"
  • To assess the accuracy of models, use the scripts released in ./scripts/
  • Effect of wrong word order noise: we introduce erroneous word order into the WMT14 En-De development set by moving one word to another position, and evaluate the drop in translation quality for each model. The data and script can be found in "./robustness"
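As a rough illustration of what assessing accuracy on this task could look like, the sketch below scores predicted positions against gold positions. It is not the released evaluation script: the function name `position_accuracy`, the representation of each instance as an `(original_position, inserted_position)` pair, and the three reported metrics are assumptions made for this example.

```python
def position_accuracy(gold, predicted):
    """Compare predicted (original, inserted) position pairs with gold pairs.

    Hypothetical helper (not the repository's script). Both arguments are
    equal-length sequences of (original_position, inserted_position) pairs.
    Returns the fraction of instances where the original position, the
    inserted position, and both positions are predicted correctly.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("need at least one instance")
    total = len(gold)
    original_hits = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    inserted_hits = sum(g[1] == p[1] for g, p in zip(gold, predicted))
    both_hits = sum(tuple(g) == tuple(p) for g, p in zip(gold, predicted))
    return {
        "original": original_hits / total,
        "inserted": inserted_hits / total,
        "both": both_hits / total,
    }


if __name__ == "__main__":
    gold = [(1, 3), (0, 2), (4, 1)]
    predicted = [(1, 3), (0, 1), (2, 1)]
    print(position_accuracy(gold, predicted))
```

Reporting the two positions separately as well as jointly makes it visible whether a model finds one of the two positions more easily than the other.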
