threelittlemonkeys / rnn-encoder-decoder-pytorch

RNN Encoder-Decoder in PyTorch

A minimal PyTorch implementation of an RNN Encoder-Decoder for sequence-to-sequence learning.

Supported features:

  • Mini-batch training with CUDA
  • Lookup, CNNs, RNNs and/or self-attentive encoding in the embedding layer
  • Input feeding (Luong et al. 2015)
  • Attention mechanism (Bahdanau et al. 2014, Luong et al. 2015)
  • CopyNet, copying mechanism (Gu et al. 2016)
  • Beam search decoding
  • Attention visualization
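
To illustrate what beam search decoding does, here is a minimal pure-Python sketch over a toy next-token table. It is not this repository's implementation: `beam_search`, `toy_step`, and `TABLE` are hypothetical names, and a real decoder would query the RNN for next-token probabilities instead of a lookup table.

```python
import math

def beam_search(step_fn, start, eos, beam_size=3, max_len=10):
    """Return the highest-scoring (tokens, log_prob) found by beam search.

    step_fn(prefix) must return a dict mapping each next token to a
    positive probability (assumption made for this sketch).
    """
    beams = [([start], 0.0)]  # (tokens, cumulative log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            for tok, p in step_fn(tokens).items():
                candidates.append((tokens + [tok], score + math.log(p)))
        # Keep only the beam_size best hypotheses at this step.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            # Hypotheses ending in eos are set aside; the rest are extended.
            (finished if tokens[-1] == eos else beams).append((tokens, score))
        if not beams:
            break
    return max(finished + beams, key=lambda c: c[1])

# Hypothetical toy model: the next-token distribution depends only on the last token.
TABLE = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.5, "y": 0.5},
    "b": {"x": 0.9, "y": 0.1},
    "x": {"</s>": 1.0},
    "y": {"</s>": 1.0},
}

def toy_step(prefix):
    return TABLE[prefix[-1]]

best_tokens, best_score = beam_search(toy_step, "<s>", "</s>", beam_size=2)
greedy_tokens, greedy_score = beam_search(toy_step, "<s>", "</s>", beam_size=1)
```

With a beam of size 1 the search is greedy and commits to the locally best first token; a wider beam keeps the alternative alive and can end with a higher overall probability.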

Usage

Each line of the training data should hold one source sequence and one target sequence, separated by a tab:

source_sequence \t target_sequence
source_sequence \t target_sequence
...
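
As a minimal sketch of that format, the following writes a few source/target pairs to a tab-separated file. The file name `training_data` matches the commands in this README, but the helper `write_pairs` and the example sentences are hypothetical, and whether the sequences must be pre-tokenized is not specified here.

```python
import os
import tempfile

def write_pairs(path, pairs):
    """Write (source, target) pairs, one per line, separated by a single tab."""
    with open(path, "w", encoding="utf-8") as f:
        for src, tgt in pairs:
            # A tab or newline inside a field would break the one-pair-per-line layout.
            if any(c in s for s in (src, tgt) for c in "\t\n"):
                raise ValueError("fields must not contain tabs or newlines")
            f.write(f"{src}\t{tgt}\n")

# Hypothetical toy pairs, for illustration only.
pairs = [("how are you", "comment allez-vous"), ("thank you", "merci")]
path = os.path.join(tempfile.mkdtemp(), "training_data")
write_pairs(path, pairs)

# Read the file back to confirm the layout.
with open(path, encoding="utf-8") as f:
    lines = f.read().splitlines()
```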

To prepare data:

python3 prepare.py training_data

To train:

python3 train.py model vocab.src vocab.tgt training_data.csv num_epoch

To predict:

python3 predict.py model.epochN vocab.src vocab.tgt test_data

References

Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio. 2015. Neural Machine Translation by Jointly Learning to Align and Translate. arXiv:1409.0473.

Denny Britz, Anna Goldie, Minh-Thang Luong, Quoc Le. 2017. Massive Exploration of Neural Machine Translation Architectures. arXiv:1703.03906.

Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio. 2014. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. arXiv:1406.1078.

Jiatao Gu, Zhengdong Lu, Hang Li, Victor O.K. Li. 2016. Incorporating Copying Mechanism in Sequence-to-Sequence Learning. arXiv:1603.06393.

Jiwei Li. 2017. Teaching Machines to Converse. Doctoral dissertation. Stanford University.

Junyang Lin, Xu Sun, Xuancheng Ren, Muyu Li, Qi Su. 2018. Learning When to Concentrate or Divert Attention: Self-Adaptive Attention Temperature for Neural Machine Translation. arXiv:1808.07374.

Minh-Thang Luong, Hieu Pham, Christopher D. Manning. 2015. Effective Approaches to Attention-based Neural Machine Translation. arXiv:1508.04025.

Chan Young Park, Yulia Tsvetkov. 2019. Learning to Generate Word- and Phrase-Embeddings for Efficient Phrase-Based Neural Machine Translation. In Proceedings of the 3rd Workshop on Neural Generation and Translation.

Sam Wiseman, Alexander M. Rush. 2016. Sequence-to-Sequence Learning as Beam-Search Optimization. arXiv:1606.02960.

Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Łukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Greg Corrado, Macduff Hughes, Jeffrey Dean. 2016. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. arXiv:1609.08144.

rnn-encoder-decoder-pytorch's Issues

Issue with line.split

I don't think the way line.split is implemented in prepare.py is correct. Is there a sample data set that you have that works?
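
For anyone checking their data against the tab-separated format described under Usage, here is a minimal sketch of splitting one line into its source and target fields. It is not the code in prepare.py, which this page does not show; `split_pair` is a hypothetical helper, and stripping surrounding whitespace from each field is an assumption.

```python
def split_pair(line):
    """Split one training line into (source, target); require exactly one tab."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 2:
        raise ValueError(
            f"expected 2 tab-separated fields, got {len(fields)}: {line!r}"
        )
    # Strip stray spaces around each field (assumed harmless for this format).
    src, tgt = (field.strip() for field in fields)
    return src, tgt

pair = split_pair("how are you\tcomment allez-vous\n")
```

A line that uses spaces instead of a tab between the two sequences raises `ValueError` here, which is one quick way to find malformed rows before running the preparation step.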
