An attention alignment visualization tool for the command line and the web.
Part of the web version was borrowed from the Nematus utils.
- Train a neural MT system (like Neural Monkey or Nematus)
- Translate text and get word or subword level alignments
- Visualize the alignments
  - in the command line standard output
  - in a web browser (PHP required)

Requirements:

- Python 2 or 3
- PHP 5.4 or newer (for web visualization)

Obtaining alignment files from the supported NMT systems:

- Nematus: run `nematus/translate.py` with the `--output_alignment` or `-a` parameter
- Neural Monkey:
  - In the `training.ini` file, add

        [alignment_saver]
        class=runners.word_alignment_runner.WordAlignmentRunner
        output_series="ali"
        encoder=<encoder>
        decoder=<decoder>

    and add `alignment_saver` to the runners in `main`

        runners=[<runner_greedy>, <alignment_saver>]

  - In the `translation.ini` file, in `eval_data`, add

        s_ali_out="out.alignment"
- In the `config.yml` file, add

      return-alignment: yes
- OpenNMT: run `translate.lua` with the `-save_attention` parameter to save attentions to a file
- Sockeye: run `sockeye.translate` with the `--output-type` parameter set to `translation_with_alignment_matrix` to save attentions to a file

If you use this tool, please cite the following paper:
Matīss Rikters, Mark Fishel, Ondřej Bojar (2017). "Visualizing Neural Machine Translation Attention and Confidence." In The Prague Bulletin of Mathematical Linguistics volume 109 (2017).

    @inproceedings{Rikters-EtAl2017PBML,
      author = {Rikters, Matīss and Fishel, Mark and Bojar, Ond\v{r}ej},
      journal = {The Prague Bulletin of Mathematical Linguistics},
      volume = {109},
      pages = {1--12},
      title = {{Visualizing Neural Machine Translation Attention and Confidence}},
      address = {Lisbon, Portugal},
      year = {2017}
    }

- In the command line as shaded blocks. Example with Neural Monkey alignments (separate source and target subword unit files are required):

      python process_alignments.py \
        -i test_data/neuralmonkey/alignments.npy \
        -o color \
        -s test_data/neuralmonkey/src.en.bpe \
        -t test_data/neuralmonkey/out.lv.bpe \
        -f NeuralMonkey

- The same with Nematus alignments (source and target subword units are in the same file):

      python process_alignments.py \
        -i test_data/nematus/alignments.txt \
        -o color \
        -f Nematus

- In a text file as Unicode block elements:

      python process_alignments.py \
        -i test_data/neuralmonkey/alignments.npy \
        -o block \
        -s test_data/neuralmonkey/src.en.bpe \
        -t test_data/neuralmonkey/out.lv.bpe \
        -f NeuralMonkey

  or

      python process_alignments.py \
        -i test_data/neuralmonkey/alignments.npy \
        -o block2 \
        -s test_data/neuralmonkey/src.en.bpe \
        -t test_data/neuralmonkey/out.lv.bpe \
        -f NeuralMonkey

- In the browser as links between words:

      python process_alignments.py \
        -i test_data/amunmt/amu.out.en \
        -s test_data/amunmt/amu.src.de \
        -o web \
        -f AmuNMT

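To give a rough idea of what rendering an alignment matrix as Unicode block elements involves, here is a minimal sketch. It is not the code used by `process_alignments.py`; the function names `shade` and `render_blocks`, the five-level shade scale, and the layout (one row per target token, one cell per source token) are all assumptions made for illustration. It assumes Python 3.

```python
# Hypothetical illustration, not taken from process_alignments.py.
# Shade characters ordered from weakest to strongest attention.
SHADES = " ░▒▓█"


def shade(weight):
    """Map an attention weight in [0, 1] to one shade character."""
    clipped = min(max(weight, 0.0), 1.0)
    # Scale to an index 0..4; a weight of 1.0 lands on the last character.
    return SHADES[min(int(clipped * len(SHADES)), len(SHADES) - 1)]


def render_blocks(matrix, target_tokens):
    """Return one text row per target token, one shaded cell per source token."""
    width = max(len(tok) for tok in target_tokens)
    lines = []
    for tok, row in zip(target_tokens, matrix):
        # Each cell is two characters wide so the blocks look roughly square.
        cells = "".join(shade(w) * 2 for w in row)
        lines.append(tok.rjust(width) + " " + cells)
    return "\n".join(lines)


if __name__ == "__main__":
    # Toy attention matrix: rows are target tokens, columns are source tokens.
    attention = [[0.9, 0.1, 0.0], [0.05, 0.5, 0.45]]
    print(render_blocks(attention, ["x", "yy"]))
```

Darker cells mark source tokens that received more attention when the corresponding target token was produced.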
Option | Description | Required | Possible Values | Default Value |
---|---|---|---|---|
-i | Input alignment file | Yes | Path to file | |
-o | Output alignment matrix type | No | 'web', 'color', 'block', 'block2' | 'web' |
-s | Source sentence subword units | For Neural Monkey | Path to file | |
-t | Target sentence subword units | For Neural Monkey | Path to file | |
-f | NMT system that produced the alignments | No | 'NeuralMonkey', 'Nematus', 'AmuNMT', 'OpenNMT', 'Sockeye' | 'NeuralMonkey' |
-n | Number of a specific sentence | No | Integer | -1 (show all) |
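
When the tool is called from another script, the options in the table can be assembled programmatically. The sketch below is one possible wrapper, not part of the tool: `build_command` is a hypothetical helper, and its checks only mirror what the table states (allowed values, defaults, and `-s`/`-t` being required for Neural Monkey).

```python
import sys

# Allowed values copied from the options table.
FORMATS = ("NeuralMonkey", "Nematus", "AmuNMT", "OpenNMT", "Sockeye")
OUTPUTS = ("web", "color", "block", "block2")


def build_command(input_path, output="web", source=None, target=None,
                  fmt="NeuralMonkey", sentence=-1):
    """Build an argument list for process_alignments.py (hypothetical helper)."""
    if fmt not in FORMATS:
        raise ValueError("unknown format: %s" % fmt)
    if output not in OUTPUTS:
        raise ValueError("unknown output type: %s" % output)
    # Per the table, -s and -t are required for Neural Monkey alignments.
    if fmt == "NeuralMonkey" and not (source and target):
        raise ValueError("NeuralMonkey alignments need both source and target files")

    cmd = [sys.executable, "process_alignments.py",
           "-i", input_path, "-o", output, "-f", fmt]
    if source:
        cmd += ["-s", source]
    if target:
        cmd += ["-t", target]
    if sentence != -1:
        # -1 is the documented default and means "show all sentences".
        cmd += ["-n", str(sentence)]
    return cmd


if __name__ == "__main__":
    print(" ".join(build_command("test_data/nematus/alignments.txt",
                                 output="color", fmt="Nematus")))
```

The returned list can be passed to `subprocess.call` from the directory that contains `process_alignments.py`.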