takuma1229 / en_ja_translator_pytorch

This project forked from yadayuki/en_ja_translator_pytorch

English to Japanese Translator by PyTorch 🙊 (Transformer from scratch)

Home Page: https://zenn.dev/yukiyada/articles/59f3b820c52571

License: MIT License

Python 100.00%

Overview

  • English-to-Japanese translator built with PyTorch.
  • The neural network architecture is the Transformer.
  • The Transformer layers are implemented from scratch in PyTorch (you can find them under layers/transformer/).
  • The parallel corpus (dataset) is KFTT, the Kyoto Free Translation Task.

Transformer

  • The Transformer is a neural network model proposed in the paper ‘Attention Is All You Need’.

  • As the paper's title suggests, the Transformer is built on the attention mechanism. Unlike RNNs and LSTMs, it does not rely on recurrent computation, so all positions of a sequence can be processed in parallel during training.

  • Many models that have achieved high accuracy on NLP tasks in recent years, such as BERT, GPT-3, and XLNet, are based on the Transformer.
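
To make the attention mechanism mentioned above concrete, here is a minimal, dependency-free sketch of scaled dot-product attention as defined in the paper. It is an illustration only and not the code in layers/transformer/ScaledDotProductAttention.py, which works on PyTorch tensors. The function names and the toy inputs are assumptions chosen for this example, and masking is left out.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) for lists of float vectors.

    queries: n_q vectors of size d_k
    keys:    n_k vectors of size d_k
    values:  n_k vectors of size d_v
    (Hypothetical helper for illustration; no masking or batching.)
    """
    d_k = len(keys[0])
    d_v = len(values[0])
    outputs, weights = [], []
    for q in queries:
        # Dot product of this query with every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        w = softmax(scores)
        # Each output is the attention-weighted sum of the value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(d_v)]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


# Toy inputs (assumed values, not taken from the project).
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
outputs, weights = scaled_dot_product_attention(queries, keys, values)
```

Every query is scored against all keys in a single pass, with no dependence on a previous time step. That is the property that removes the recurrent computation RNNs and LSTMs need.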

Requirements

Setup

Install the dependencies and create a virtual environment in the project by running:

$ poetry install

Set PYTHONPATH to the project root by running:

$ export PYTHONPATH="$(pwd)"

Download and unzip the parallel corpus (KFTT) by running:

$ poetry run python ./utils/download.py

Directories

The directory structure is as follows.

.
├── const
│   └── path.py
├── corpus
│   └── kftt-data-1.0
├── figure
├── layers
│   └── transformer
│       ├── Embedding.py
│       ├── FFN.py
│       ├── MultiHeadAttention.py
│       ├── PositionalEncoding.py
│       ├── ScaledDotProductAttention.py
│       ├── TransformerDecoder.py
│       └── TransformerEncoder.py
├── models
│   ├── Transformer.py
│   └── __init__.py
├── mypy.ini
├── pickles
│   └── nn/
├── poetry.lock
├── poetry.toml
├── pyproject.toml
├── tests
│   ├── conftest.py
│   ├── layers/
│   ├── models/
│   └── utils/
├── train.py
└── utils
    ├── dataset/
    ├── download.py
    ├── evaluation/
    └── text/
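
The files under layers/transformer/ are named after the building blocks of the original paper. As one example of what such a layer computes, the sketch below builds the sinusoidal positional encoding table from ‘Attention Is All You Need’ in plain Python. It is an assumption that PositionalEncoding.py uses this sinusoidal variant; the function name and the small sizes here are hypothetical, and the repository's version operates on PyTorch tensors.

```python
import math


def positional_encoding(max_len, d_model):
    """Return a max_len x d_model table of sinusoidal position encodings.

    PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
    PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))
    (Hypothetical helper for illustration only.)
    """
    table = []
    for pos in range(max_len):
        row = []
        for dim in range(d_model):
            # Dimensions 2i and 2i+1 share the same frequency.
            even_dim = dim - dim % 2
            angle = pos / (10000 ** (even_dim / d_model))
            row.append(math.sin(angle) if dim % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


# Small assumed sizes so the table is easy to inspect.
pe = positional_encoding(max_len=4, d_model=6)
```

The table is added to the token embeddings so that the model, which has no recurrence, can still tell positions apart.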

How to run

You can train the model by running:

$ poetry run python train.py

epoch: 1
--------------------Train--------------------

train loss: 10.104473114013672, bleu score: 0.0,iter: 1/4403

train loss: 9.551202774047852, bleu score: 0.0,iter: 2/4403

train loss: 8.950608253479004, bleu score: 0.0,iter: 3/4403

train loss: 8.688143730163574, bleu score: 0.0,iter: 4/4403

train loss: 8.4220552444458, bleu score: 0.0,iter: 5/4403

train loss: 8.243291854858398, bleu score: 0.0,iter: 6/4403

train loss: 8.187620162963867, bleu score: 0.0,iter: 7/4403

train loss: 7.6360859870910645, bleu score: 0.0,iter: 8/4403

....
  • After each epoch, the current model is saved under pickles/nn/.
  • When training finishes, loss.png is saved under figure/.
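
If you redirect the training output to a file, the per-iteration lines shown above can be parsed for your own analysis. The sketch below is one possible approach and is not part of the repository. It assumes the log format stays exactly as printed above, and the names LOG_LINE and parse_train_log are hypothetical.

```python
import re

# Matches lines such as:
#   train loss: 10.104473114013672, bleu score: 0.0,iter: 1/4403
LOG_LINE = re.compile(
    r"train loss: (?P<loss>[0-9.]+), "
    r"bleu score: (?P<bleu>[0-9.]+),\s*"
    r"iter: (?P<step>\d+)/(?P<total>\d+)"
)


def parse_train_log(text):
    """Return one dict per training-log line found in text."""
    records = []
    for line in text.splitlines():
        match = LOG_LINE.search(line)
        if match is None:
            continue  # Skip headers such as "epoch: 1" and blank lines.
        records.append(
            {
                "loss": float(match["loss"]),
                "bleu": float(match["bleu"]),
                "step": int(match["step"]),
                "total": int(match["total"]),
            }
        )
    return records


# Sample lines copied from the output above.
sample = """\
epoch: 1
--------------------Train--------------------
train loss: 10.104473114013672, bleu score: 0.0,iter: 1/4403
train loss: 9.551202774047852, bleu score: 0.0,iter: 2/4403
train loss: 8.950608253479004, bleu score: 0.0,iter: 3/4403
"""
records = parse_train_log(sample)
```

The resulting list of dicts can be passed to any plotting or statistics tool, for example to compare loss curves between runs.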

Reference

Licence

MIT

Contributors

yadayuki, lwisteria
