lixiangnlp / softalignments

This project forked from m4t1ss/softalignments

Soft alignment visualisations for command line

Home Page: http://attention.lielakeda.lv/

License: MIT License

Python 25.35% Shell 0.34% JavaScript 43.92% PHP 27.59% CSS 2.80%

NMT Attention Alignment Visualizations

An attention alignment visualization tool for command line and web.

Part of the web version was borrowed from the Nematus utils.

Usage

  • Train a neural MT system (like Neural Monkey or Nematus)
  • Translate text and get word or subword level alignments
  • Visualize the alignments
    • in the command line standard output
    • in a web browser (PHP required)

Requirements

  • Python 2 or 3

  • PHP 5.4 or newer (for web visualization)

How to get alignment files from NMT systems

  • Nematus

  • Neural Monkey

    • In the training.ini file, add

     [alignment_saver]
     class=runners.word_alignment_runner.WordAlignmentRunner
     output_series="ali"
     encoder=<encoder>
     decoder=<decoder>

      and add alignment_saver to the runners in main:

     runners=[<runner_greedy>, <alignment_saver>]

    • In the translation.ini file, add the following to eval_data:

     s_ali_out="out.alignment"

  • AmuNMT

    • In the config.yml file, add

     return-alignment: yes

  • OpenNMT

    • Run translate.lua with the -save_attention parameter to save attentions to a file.

  • Sockeye

    • Run sockeye.translate with the --output-type parameter set to translation_with_alignment_matrix to save attentions to a file.

Publications

If you use this tool, please cite the following paper:

Matīss Rikters, Mark Fishel, Ondřej Bojar (2017). "Visualizing Neural Machine Translation Attention and Confidence." In The Prague Bulletin of Mathematical Linguistics volume 109 (2017).

@inproceedings{Rikters-EtAl2017PBML,
	author = {Rikters, Matīss and Fishel, Mark and Bojar, Ond\v{r}ej},
	journal={The Prague Bulletin of Mathematical Linguistics},
	volume={109},
	pages = {1--12},
	title = {{Visualizing Neural Machine Translation Attention and Confidence}},
	address={Lisbon, Portugal},
	year = {2017}
}

Examples

  • in the command line as shaded blocks. Example with Neural Monkey alignments (separate source and target subword unit files are required)

    python process_alignments.py \
    -i test_data/neuralmonkey/alignments.npy  \
    -o color \
    -s test_data/neuralmonkey/src.en.bpe \
    -t test_data/neuralmonkey/out.lv.bpe \
    -f NeuralMonkey
  • the same with Nematus alignments (source and target subword units are in the same file)

    python process_alignments.py \
    -i test_data/nematus/alignments.txt \
    -o color \
    -f Nematus
  • in a text file as Unicode block elements

    python process_alignments.py \
    -i test_data/neuralmonkey/alignments.npy  \
    -o block \
    -s test_data/neuralmonkey/src.en.bpe \
    -t test_data/neuralmonkey/out.lv.bpe \
    -f NeuralMonkey

    or

    python process_alignments.py \
    -i test_data/neuralmonkey/alignments.npy  \
    -o block2 \
    -s test_data/neuralmonkey/src.en.bpe \
    -t test_data/neuralmonkey/out.lv.bpe \
    -f NeuralMonkey
    
  • in the browser as links between words (demo here)

    python process_alignments.py \
    -i test_data/amunmt/amu.out.en \
    -s test_data/amunmt/amu.src.de \
    -o web \
    -f AmuNMT
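
To give a feel for what the block output modes in the examples above do, here is a minimal Python sketch that turns a small attention matrix into rows of Unicode block elements. It is not taken from process_alignments.py: the names `SHADES`, `shade`, and `render_blocks`, the five-level shading scale, and the toy matrix are all assumptions made for illustration, and the real tool may lay out its output differently.

```python
# Hypothetical illustration only; not the code used by process_alignments.py.
SHADES = " ░▒▓█"  # assumed five-level scale, lightest to darkest


def shade(weight):
    """Map an attention weight in [0, 1] to one block character."""
    clipped = min(max(weight, 0.0), 1.0)
    index = min(int(clipped * len(SHADES)), len(SHADES) - 1)
    return SHADES[index]


def render_blocks(matrix, source_tokens, target_tokens):
    """Return one line per target token: shaded cells, then the token.

    Rows of `matrix` correspond to target tokens and columns to source
    tokens. The source tokens are listed below the grid by column index.
    """
    lines = []
    for token, row in zip(target_tokens, matrix):
        cells = "".join(shade(weight) for weight in row)
        lines.append(cells + " " + token)
    for index, token in enumerate(source_tokens):
        lines.append("{}: {}".format(index, token))
    return "\n".join(lines)


if __name__ == "__main__":
    # Toy weights with placeholder tokens; not real system output.
    attention = [
        [0.90, 0.05, 0.05],
        [0.10, 0.70, 0.20],
        [0.00, 0.25, 0.75],
    ]
    print(render_blocks(attention, ["s0", "s1", "s2"], ["t0", "t1", "t2"]))
```

Darker cells mark source tokens that received more attention when the target token on that row was produced, which is the same reading as the shaded matrices the tool prints.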

Parameters for process_alignments.py

| Option | Description | Required | Possible Values | Default Value |
|--------|-------------|----------|-----------------|---------------|
| -i | Input alignment file | Yes | Path to file | |
| -o | Output alignment matrix type | No | 'web', 'color', 'block', 'block2' | 'web' |
| -s | Source sentence subword units | For Neural Monkey | Path to file | |
| -t | Target sentence subword units | For Neural Monkey | Path to file | |
| -f | Where are the alignments from | No | 'NeuralMonkey', 'Nematus', 'AmuNMT', 'OpenNMT', 'Sockeye' | 'NeuralMonkey' |
| -n | Number of a specific sentence | No | Integer | -1 (show all) |
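
The options and defaults listed above can be summarised as an argparse parser. This is only a sketch under stated assumptions: the real script may parse its arguments differently, and the names `build_parser`, `parse`, and the `dest` attribute names are hypothetical. Only the flags, allowed values, and defaults come from the parameter list.

```python
import argparse

# Hypothetical mirror of the documented options; not the script's own parser.
SYSTEMS = ["NeuralMonkey", "Nematus", "AmuNMT", "OpenNMT", "Sockeye"]
OUTPUT_TYPES = ["web", "color", "block", "block2"]


def build_parser():
    parser = argparse.ArgumentParser(prog="process_alignments.py")
    parser.add_argument("-i", dest="input", required=True,
                        help="input alignment file")
    parser.add_argument("-o", dest="output_type", choices=OUTPUT_TYPES,
                        default="web", help="output alignment matrix type")
    parser.add_argument("-s", dest="source",
                        help="source sentence subword units")
    parser.add_argument("-t", dest="target",
                        help="target sentence subword units")
    parser.add_argument("-f", dest="from_system", choices=SYSTEMS,
                        default="NeuralMonkey",
                        help="where the alignments are from")
    parser.add_argument("-n", dest="sentence", type=int, default=-1,
                        help="number of a specific sentence (-1 shows all)")
    return parser


def parse(argv):
    """Parse argv and enforce that Neural Monkey input needs -s and -t."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.from_system == "NeuralMonkey" and not (args.source and args.target):
        # parser.error prints a usage message and exits with status 2.
        parser.error("-s and -t are required for Neural Monkey alignments")
    return args


if __name__ == "__main__":
    args = parse(["-i", "test_data/nematus/alignments.txt", "-f", "Nematus"])
    print(args.output_type, args.sentence)
```

Because -f defaults to 'NeuralMonkey', omitting it means the source and target subword files become mandatory in this sketch, which matches the "For Neural Monkey" entries in the Required column.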

Screenshots

[Screenshots of the Color, Block, Block2, and Web output modes]
