This is the source code for JMaxAlign, an open-source maximum entropy parallel sentence alignment tool written in Java. You can read the original research paper here: http://aclweb.org/anthology//C/C12/C12-3035.pdf
JMaxAlign was originally developed by Joseph Kaufmann at Decisive Analytics (http://www.dac.us/).
JMaxAlign depends on:
- Berkeley Aligner (https://code.google.com/p/berkeleyaligner/)
- Stanford CoreNLP (http://nlp.stanford.edu/software/corenlp.shtml)
- Stanford Classifier (http://nlp.stanford.edu/software/classifier.shtml)
Set up the dependencies as follows:
- Of the three dependencies, only Stanford CoreNLP is available from the Maven Central repository, so add it to your pom.xml as a regular Maven dependency. CoreNLP needs its models to run (for most components beyond the tokenizer), so you must declare both the code jar and the models jar:

    <dependency>
      <groupId>edu.stanford.nlp</groupId>
      <artifactId>stanford-corenlp</artifactId>
      <version>3.3.1</version>
    </dependency>
    <dependency>
      <groupId>edu.stanford.nlp</groupId>
      <artifactId>stanford-corenlp</artifactId>
      <version>3.3.1</version>
      <classifier>models</classifier>
    </dependency>

- Download the Stanford Classifier and Berkeley Aligner jar files to your local machine.
- Install the local jar files into your local Maven repository as follows:

    mvn install:install-file -Dfile={path/to/file}.jar -DgroupId={put.groupid.here} -DartifactId={artifactname} -Dversion={version} -Dpackaging=jar

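- Add the new dependencies to your pom file, one entry per installed jar, using the same coordinates you passed to install:install-file:

    <dependency>
      <groupId>{put.groupid.here}</groupId>
      <artifactId>{artifactname}</artifactId>
      <version>{version}</version>
    </dependency>

  Do not add a classifier element here unless you also passed -Dclassifier when installing the jar; otherwise Maven will look for a file that was never installed.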
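The install command above has to be run once per jar, so a small wrapper script can save some retyping. The following is a minimal sketch and not part of JMaxAlign: the function name `install_jar`, the `DRY_RUN` variable, the jar paths, and the Maven coordinates are all hypothetical placeholders. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: every path and coordinate below is a hypothetical placeholder.
# DRY_RUN=1 (the default) prints each command; DRY_RUN=0 actually invokes mvn.
DRY_RUN="${DRY_RUN:-1}"

install_jar() {
    # $1 = path to jar, $2 = groupId, $3 = artifactId, $4 = version
    # Rebuild the argument list as the full mvn command line.
    set -- mvn install:install-file \
        "-Dfile=$1" "-DgroupId=$2" "-DartifactId=$3" "-Dversion=$4" \
        -Dpackaging=jar
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Replace these with the real jar locations and the coordinates you choose.
install_jar "$HOME/Downloads/berkeleyaligner.jar" local.berkeley berkeleyaligner 1.0
install_jar "$HOME/Downloads/stanford-classifier.jar" local.stanford stanford-classifier 1.0
```

Whatever coordinates you pick here must be repeated exactly in the corresponding pom.xml dependency entries.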
You are now done.
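If a build later fails to resolve one of the locally installed jars, it can help to check that the file sits where Maven expects it. The sketch below assumes the default local repository location (`~/.m2/repository`, unless your settings.xml overrides it) and the standard layout of group path, artifact id, and version. The `M2_REPO` variable, the function names, and the coordinates are hypothetical and only for illustration.

```shell
#!/bin/sh
# Sketch only: the coordinates are hypothetical and must match what you installed.
# M2_REPO is an illustrative override for the local repository location.

artifact_path() {
    # $1 = groupId, $2 = artifactId, $3 = version
    # Dots in the groupId become directory separators in the repository.
    group_dir=$(printf '%s' "$1" | tr '.' '/')
    printf '%s/%s/%s/%s/%s-%s.jar\n' \
        "${M2_REPO:-$HOME/.m2/repository}" "$group_dir" "$2" "$3" "$2" "$3"
}

check_artifact() {
    # Report whether the jar for the given coordinates exists on disk.
    jar=$(artifact_path "$1" "$2" "$3")
    if [ -f "$jar" ]; then
        echo "found: $jar"
    else
        echo "missing: $jar"
    fi
}

check_artifact local.berkeley berkeleyaligner 1.0
check_artifact local.stanford stanford-classifier 1.0
```

A "missing" line usually means the install step was skipped for that jar or was run with different coordinates than the ones in the pom file.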