This project forked from buriburisuri/bytenet

A TensorFlow implementation of French-to-English machine translation using DeepMind's ByteNet.

License: MIT License

ByteNet - Fast Neural Machine Translation

A TensorFlow implementation of French-to-English machine translation using DeepMind's ByteNet, from Kalchbrenner et al.'s paper Neural Machine Translation in Linear Time. The paper replaces traditional RNNs with dilated 1-D convolutions and causal 1-D convolutions, which gives fast training and state-of-the-art performance on character-level translation.

The architecture (from the paper)
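
To make the idea of a causal, dilated 1-D convolution concrete, here is a minimal pure-Python sketch. It is not the code used in this repository: the function names, the kernel size, and the dilation schedule are illustrative assumptions. It only shows that each output step looks at the current and earlier inputs, spaced `dilation` steps apart, and how stacking layers grows the receptive field.

```python
def causal_dilated_conv1d(x, kernel, dilation=1):
    """Causal dilated 1-D convolution over a list of floats.

    out[t] depends only on x[t], x[t - dilation], x[t - 2 * dilation], ...
    Positions before the start of the sequence are treated as zeros.
    """
    k = len(kernel)
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j in range(k):
            # kernel[-1] is applied to the current step, kernel[0] to the oldest one.
            idx = t - (k - 1 - j) * dilation
            if idx >= 0:
                acc += kernel[j] * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Number of input steps one output can see after stacking the given layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
print(causal_dilated_conv1d(signal, [1.0, 1.0], dilation=2))  # [1.0, 2.0, 4.0, 6.0, 8.0]
print(receptive_field(3, [1, 2, 4, 8, 16]))  # 63
```

Because no output step reads a future input, a decoder built from such layers can be trained on whole sequences in parallel while still generating one character at a time.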

Dependencies

  1. tensorflow >= rc0.11
  2. sugartensor >= 0.0.1.7
  3. nltk >= 3.0

Datasets

I've used NLTK's comtrans English-French parallel corpus for convenience. You can easily download it as follows:


python
>>> import nltk
>>> nltk.download_shell()
NLTK Downloader
---------------------------------------------------------------------------
    d) Download   l) List    u) Update   c) Config   h) Help   q) Quit
---------------------------------------------------------------------------
Downloader> d

Download which package (l=list; x=cancel)?
  Identifier> comtrans
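
If you prefer a non-interactive download, the sketch below may help. `nltk.download` and `nltk.corpus.comtrans.aligned_sents` are real NLTK calls, but the helper names are hypothetical, and the sketch assumes that for the 'alignment-en-fr.txt' file `.words` holds the English tokens and `.mots` the French ones. It is not necessarily how this repository loads its data.

```python
def french_english_pairs(aligned_sents):
    """Turn aligned sentences into (french, english) string pairs.

    Assumes each item exposes `.words` (English tokens) and `.mots`
    (French tokens), as expected for comtrans's 'alignment-en-fr.txt'.
    """
    return [(" ".join(s.mots), " ".join(s.words)) for s in aligned_sents]


def load_comtrans_pairs(fileid="alignment-en-fr.txt"):
    """Download comtrans if needed (requires network access) and return pairs."""
    # Imported lazily so french_english_pairs can be used without NLTK installed.
    import nltk
    from nltk.corpus import comtrans

    nltk.download("comtrans", quiet=True)
    return french_english_pairs(comtrans.aligned_sents(fileid))
```

Calling `load_comtrans_pairs()` once should fetch the corpus and return a list of sentence pairs ready for character-level preprocessing.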
  

Implementation differences from the paper.

  1. I've replaced Sub-Batch Normalization with Layer Normalization for convenience.
  2. Bags of characters are not used, for simplicity.
  3. The latent dimension is 400 because the Comtrans corpus in NLTK is small (892 in the paper).
  4. The generation code is not optimized.
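
For readers unfamiliar with layer normalization, here is a minimal sketch of the operation on a single feature vector. It is a simplification and not the implementation used in this repository: the function name is hypothetical, and the gain and bias are scalars here, whereas they are usually learned per feature. The point is that the statistics are computed over the features of one position, so the result does not depend on the other examples in the batch.

```python
import math


def layer_norm(features, gain=1.0, bias=0.0, eps=1e-8):
    """Normalize one feature vector to zero mean and unit variance.

    `gain` and `bias` are scalar stand-ins for the learned scale and shift.
    """
    mean = sum(features) / len(features)
    var = sum((f - mean) ** 2 for f in features) / len(features)
    return [gain * (f - mean) / math.sqrt(var + eps) + bias for f in features]


normalized = layer_norm([1.0, 2.0, 3.0, 4.0])
print(normalized)
```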

Training the network

Execute


python train.py

to train the network. The resulting checkpoint (ckpt) files and log files are written to the 'asset/train' directory. Launch tensorboard --logdir asset/train/log to monitor the training process.

I trained this model on a single Titan X GPU for 10 hours, until it reached 50 epochs. If you don't have a Titan X GPU, reduce batch_size in the train.py file from 16 to 8.

Translate sample French sentences

Execute


python translate.py

to translate sample French sentences to English. The result will be printed on the console.

Sample translations

The results look messy but promising. Although the Comtrans corpus in NLTK is very small (only 17,163 pairs were used in my experiment), the model has learned English word structure and syntax at the character level.
I think the translation accuracy will improve with a larger corpus.

| French (sources) | English (translated by ByteNet) | English (translated by Google translator) |
| --- | --- | --- |
| Et pareil phénomène ne devrait pas occuper nos débats ? | And there is no need to play in this is in this way , " ) | And such a phenomenon should not occupy our debates? |
| Mais nous devons les aider sur la question de la formation . | However , we must reach an agreement with the points of our of the Union . | But we need help on the issue of training. |
| Les videurs de sociétés sont punis . | The van on life this situation regarding the third juricum . | Corporate bouncers are punished. |
| Après cette période , ces échantillons ont été analysés et les résultats illustrent bien la quantité de dioxine émise au cours des mois écoulés . | After this period , this sample is meant been knowledge basid only in the initiative which should drime life that has been played . | After this period, the samples were analyzed and the results illustrate the amount of dioxins emitted during the past months. |
| Merci beaucoup , Madame la Commissaire . . | Thank you very much to balancinity that I development is a police on Ferelation . | Thank you very much, Commissioner. |
| Le Zimbabwe a beaucoup à gagner de l ' accord de partenariat et a un urgent besoin d ' aide et d ' allégement de la dette . | There are many language for the getting which offence in this area the internal here , which it is something that is to say . | Zimbabwe has much to gain from the Partnership Agreement and urgently needs aid and debt relief. |
| Le gouvernement travailliste de Grande-Bretagne a également des raisons d ' être fier de ses performances . | The Structural Funds Repulation also so doing up in respect for human rights and democratic principles . | The Labour government in Britain also has reason to be proud of its performance. |
| La plupart d' entre nous n' a pas l' intention de se vanter des 3 millions d' euros . | Most of us here we need to give the main in the next player aid . | Most of us do not have the intention to boast of 3 million euros. |
| Si le Conseil avait travaillé aussi vite que ne l' a fait M. Brok , nous serions effectivement bien plus avancés . | If the Council fails to add the fact that there is nothing else what the European Union is a facing here . | If the Council had worked as quickly as did the did Mr Brok, we would indeed well advanced. |
| Le deuxième thème important concerne la question de la gestion des contingents tarifaires . | The second important area is the issue of managing taking place . | The second important issue concerns the question of the management of tariff quotas. |
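
Since the model works at the character level, every sentence has to be turned into a sequence of character ids before it reaches the network. The sketch below shows one simple way to do that. The helper names and the choice of 0 as the padding id are assumptions made for illustration, not the preprocessing used in this repository.

```python
def build_vocab(sentences):
    """Map every distinct character to an integer id; 0 is reserved for padding."""
    chars = sorted(set("".join(sentences)))
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode(sentence, vocab, max_len):
    """Convert a sentence to a fixed-length list of ids, truncating or padding with 0."""
    ids = [vocab[ch] for ch in sentence[:max_len]]
    return ids + [0] * (max_len - len(ids))


def decode(ids, vocab):
    """Convert ids back to text, skipping padding."""
    inverse = {i: ch for ch, i in vocab.items()}
    return "".join(inverse[i] for i in ids if i != 0)


vocab = build_vocab(["Merci beaucoup ."])
ids = encode("Merci .", vocab, max_len=10)
print(ids)
print(decode(ids, vocab))  # Merci .
```

With this kind of encoding the vocabulary stays tiny (one entry per character), which is why the model has to learn word structure on its own.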

Pre-trained models

You can translate French sentences into English with the model pre-trained on the Comtrans corpus in NLTK. Extract the following zip file into 'asset/train/ckpt', then try other sample French sentences in the 'translate.py' file.

Other resources

  1. ByteNet language model tensorflow implementation

My other repositories

  1. SugarTensor
  2. EBGAN tensorflow implementation
  3. Timeseries gan tensorflow implementation
  4. Supervised InfoGAN tensorflow implementation
  5. AC-GAN tensorflow implementation
  6. SRGAN tensorflow implementation

Authors

Namju Kim ([email protected]) at Jamonglabs Co., Ltd.
