stanfordnlp / phrasal

A large-scale statistical machine translation system written in Java.

Home Page: http://nlp.stanford.edu/

License: GNU General Public License v3.0

HTML 0.55% Shell 1.89% Python 5.71% PLpgSQL 0.13% CSS 0.11% JavaScript 0.62% Perl 3.12% Ruby 0.01% C++ 22.67% XSLT 0.02% C 17.10% Makefile 0.03% Java 46.91% Batchfile 0.39% Yacc 0.38% CMake 0.35%
java-nlp statistical-machine-translation java natural-language-processing

phrasal's Issues

gradle installDist error

I am running Ubuntu 14.04.

When I run gradle installDist, I get the following error:

FAILURE: Build failed with an exception.

  • Where:
    Build file '/pathToDirectory/phrasal-master/build.gradle' line: 151

  • What went wrong:
    A problem occurred evaluating root project 'phrasal-master'.

Could not find method jcenter() for arguments [] on repository container.

  • Try:
    Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output.

BUILD FAILED

Total time: 4.773 secs

java.lang.Boolean cannot be cast to java.lang.String -- Phrasal.java:1213

Hi all, thanks for the awesome work!

I was trying to run the web service, which I managed to do after changing 759bb65 back to what it was before. The problem stems from this line, where the property is set as a Boolean. Is there some other reason you changed it, or is there a different way to run the web service?

The full error:

java.lang.ClassCastException: java.lang.Boolean cannot be cast to java.lang.String
        at edu.stanford.nlp.mt.Phrasal.decode(Phrasal.java:1213) ~[phrasal-3.4.1.jar:3.4.1]
        at edu.stanford.nlp.mt.service.handlers.TranslationRequestHandler$DecoderService.process(TranslationRequestHandler.java:177) [phrasal-3.4.1.jar:3.4.1]
        at edu.stanford.nlp.mt.service.handlers.TranslationRequestHandler$DecoderService.process(TranslationRequestHandler.java:117) [phrasal-3.4.1.jar:3.4.1]
        at edu.stanford.nlp.util.concurrent.MulticoreWrapper$CallableJob.call(MulticoreWrapper.java:249) [stanford-corenlp-3.5.2.jar:3.5.2]
        at edu.stanford.nlp.util.concurrent.MulticoreWrapper$CallableJob.call(MulticoreWrapper.java:230) [stanford-corenlp-3.5.2.jar:3.5.2]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_45]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_45]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_45]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_45]
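
For readers unfamiliar with this kind of failure, here is a minimal sketch of how a Boolean stored in an untyped property map breaks a later (String) cast, and one defensive way to read such a value. The map, the key name hypotheticalFlag and the readFlag helper are illustrative assumptions; this is not Phrasal's InputProperties API and not necessarily how the project fixed it.

```java
import java.util.HashMap;
import java.util.Map;

public class PropertyCastSketch {

  /** Reads a flag that may have been stored either as a Boolean or as a String. */
  static boolean readFlag(Map<String, Object> props, String key) {
    Object value = props.get(key);
    if (value == null) {
      return false;
    }
    // String.valueOf handles both Boolean.TRUE and "true"; a direct (String) cast does not.
    return Boolean.parseBoolean(String.valueOf(value));
  }

  public static void main(String[] args) {
    Map<String, Object> props = new HashMap<>();
    props.put("hypotheticalFlag", Boolean.TRUE); // hypothetical key, stored as a Boolean

    try {
      // Compiles, but throws ClassCastException at run time.
      String asString = (String) props.get("hypotheticalFlag");
      System.out.println(asString);
    } catch (ClassCastException e) {
      System.out.println("Direct cast failed: " + e.getMessage());
    }

    System.out.println("Defensive read: " + readFlag(props, "hypotheticalFlag"));
  }
}
```

Reading through String.valueOf tolerates both representations, so callers that set the property as a Boolean and callers that set it as a String can coexist.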

My test query, which works once I revert 759bb65 (I built a Spanish --> English system):

http://127.0.0.1:8017/x?tReq={"src":"ES", "tgt":"EN", "text": "el parlamento de ucrania", "tgtPrefix":"ukraine", "n": 6}
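
The query above puts raw JSON (quotes, braces, spaces) directly in the URL. Browsers usually encode that automatically, but other HTTP clients may not. Below is a minimal sketch of percent-encoding the tReq value with the JDK only. The class and method names are hypothetical, and it assumes the service decodes standard form encoding (where a space becomes '+'), which is not confirmed in this thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class TranslationRequestUrlSketch {

  /** Appends the JSON request as a form-encoded tReq query parameter. */
  static String buildUrl(String base, String json) {
    try {
      return base + "?tReq=" + URLEncoder.encode(json, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // UTF-8 is guaranteed to exist on every JVM, so this should never happen.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // Same request as in the issue, written as a Java string literal.
    String json = "{\"src\":\"ES\", \"tgt\":\"EN\", \"text\": \"el parlamento de ucrania\","
        + " \"tgtPrefix\":\"ukraine\", \"n\": 6}";
    System.out.println(buildUrl("http://127.0.0.1:8017/x", json));
  }
}
```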

[gradle] Task.leftShift(Closure) method has been deprecated

My current Gradle version:


------------------------------------------------------------
Gradle 4.3.1
------------------------------------------------------------

Build time:   2017-11-08 08:59:45 UTC
Revision:     e4f4804807ef7c2829da51877861ff06e07e006d

Groovy:       2.4.12
Ant:          Apache Ant(TM) version 1.9.6 compiled on June 29 2015
JVM:          1.8.0_151 (Oracle Corporation 25.151-b12)
OS:           Linux 4.10.0-42-generic amd64

When I build with

gradle compileKenLMtools

I got a message:

> Configure project : 
The Task.leftShift(Closure) method has been deprecated and is scheduled to be removed in Gradle 5.0. Please use Task.doLast(Action) instead.
        at build_8pp33np0p9752ezmkum94m96i.run(/home/cpu11453local/workspace/study/phrasal/build.gradle:96)
        (Run with --stacktrace to get the full stack trace of this deprecation warning.)

Need help setting up in Eclipse

I want to run this project in Eclipse. I read the instructions but did not understand them. It would be great if someone could help me run this project in Eclipse.

build src-extra failed

I would like to use web-service.sh. However, when I try to compile the web service with gradle compileExtraJava, the build fails.
Here is the full log from Gradle:


> Configure project : 
The Task.leftShift(Closure) method has been deprecated and is scheduled to be removed in Gradle 5.0. Please use Task.doLast(Action) instead.
        at build_61jfp5nokcncjri2p7rblqs0e.run(/home/phrasal/build.gradle:96)
        (Run with --stacktrace to get the full stack trace of this deprecation warning.)

> Task :compileExtraJava FAILED
/home/phrasal/src-extra/edu/stanford/nlp/mt/service/handlers/RuleQueryRequestHandler.java:99: error: method getRules in interface TranslationModel<TK,FV> cannot be applied to given types;
            .getRules(source, inputProperties, null, qId.incrementAndGet(), scorer);
            ^
  required: Sequence<IString>,InputProperties,int,Scorer<String>
  found: Sequence<IString>,InputProperties,<null>,int,Scorer<String>
  reason: actual and formal argument lists differ in length
  where TK,FV are type-variables:
    TK extends Object declared in interface TranslationModel
    FV extends Object declared in interface TranslationModel
/home/phrasal/src-extra/edu/stanford/nlp/mt/service/handlers/RuleQueryRequestHandler.java:100: error: incompatible types: boolean cannot be converted to int
        RuleGrid<IString,String> ruleGrid = new RuleGrid<IString,String>(ruleList, source, true);
                                                                                           ^
/home/phrasal/src-extra/edu/stanford/nlp/mt/service/handlers/RuleQueryRequestHandler.java:104: error: cannot find symbol
        Sequence<IString> queryString = Sequences.concatenate(sourceContext, source);
                                                 ^
  symbol:   method concatenate(Sequence<IString>,Sequence<IString>)
  location: class Sequences
/home/phrasal/src-extra/edu/stanford/nlp/mt/service/handlers/RuleQueryRequestHandler.java:106: error: method getRules in interface TranslationModel<TK,FV> cannot be applied to given types;
            .getRules(queryString, inputProperties, null, qId.incrementAndGet(), scorer);
            ^
  required: Sequence<IString>,InputProperties,int,Scorer<String>
  found: Sequence<IString>,InputProperties,<null>,int,Scorer<String>
  reason: actual and formal argument lists differ in length
  where TK,FV are type-variables:
    TK extends Object declared in interface TranslationModel
    FV extends Object declared in interface TranslationModel
/home/phrasal/src-extra/edu/stanford/nlp/mt/service/handlers/RuleQueryRequestHandler.java:107: error: incompatible types: boolean cannot be converted to int
        RuleGrid<IString,String> ruleGrid = new RuleGrid<IString,String>(ruleList, queryString, true);
                                                                                                ^
/home/phrasal/src-extra/edu/stanford/nlp/mt/service/handlers/RuleQueryRequestHandler.java:134: error: cannot find symbol
          target = Sequences.concatenate(bestLeftContext.abstractRule.target, target);
                            ^
  symbol:   method concatenate(Sequence<IString>,Sequence<IString>)
  location: class Sequences
/home/phrasal/src-extra/edu/stanford/nlp/mt/tools/TranslationModelComparator.java:57: error: constructor TranslationModelFeaturizer in class TranslationModelFeaturizer cannot be applied to given types;
    RuleFeaturizer<IString,String> feat = new TranslationModelFeaturizer(6);
                                          ^
  required: no arguments
  found: int
  reason: actual and formal argument lists differ in length
/home/phrasal/src-extra/edu/stanford/nlp/mt/tools/TranslationModelComparator.java:65: error: cannot find symbol
      RuleGrid<IString,String> dynRules = dynTM.getRuleGrid(source, null, null, sourceId, scorer);
                                               ^
  symbol:   method getRuleGrid(Sequence<IString>,<null>,<null>,int,Scorer<String>)
  location: variable dynTM of type TranslationModel<IString,String>
/home/phrasal/src-extra/edu/stanford/nlp/mt/tools/TranslationModelComparator.java:66: error: cannot find symbol
      RuleGrid<IString,String> compRules = compiledTM.getRuleGrid(source, null, null, sourceId, scorerComp);
                                                     ^
  symbol:   method getRuleGrid(Sequence<IString>,<null>,<null>,int,Scorer<String>)
  location: variable compiledTM of type TranslationModel<IString,String>
Note: Some messages have been simplified; recompile with -Xdiags:verbose to get full output
9 errors


FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':compileExtraJava'.
> Compilation failed; see the compiler error output for details.

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.

* Get more help at https://help.gradle.org

BUILD FAILED in 1s
5 actionable tasks: 1 executed, 4 up-to-date

Bug in README.md

The first step of the Linux install instructions is:

Switch to the root of the Phrasal repository and execute: gradle installDist

But there is no such target in build.gradle.

[gradle] How to build src-extra ?

I need to run web-service.sh, which needs the edu.stanford.nlp.mt.service.PhrasalService class.
That class is not built into build/libs/phrasal-3.6.0.jar when I run gradle installDist.

I see that src-extra is declared as a source set in build.gradle:

// Configure build targets
sourceSets {
  main {
    java.srcDirs = ['src/' ]
    resources.srcDirs = ['resources/']
  }
  test {
    java.srcDirs = ['test/']
    resources.srcDirs = ['test-resources/','src-cc']
  }
  extra {
    java.srcDirs = ['src-extra/']
    resources.srcDirs = ['resources/']
  }
}

However, I am new to Gradle.
What is the command to build with src-extra included?

Missing RuleGrid Constructor

RuleGrid<IString,String> ruleGrid = new RuleGrid<IString,String>(ruleList, queryString, true);
These arguments do not match any of the constructors that RuleGrid provides.

TranslationModelComparator bug

Change:
List<ConcreteRule<IString,String>> dynRules = dynTM.getRules(source, null, sourceId, scorer);
List<ConcreteRule<IString,String>> compRules = compiledTM.getRules(source, null, sourceId, scorerComp);
To:
List<ConcreteRule<IString,String>> dynRulesList = dynTM.getRules(source, null, sourceId, scorer);
List<ConcreteRule<IString,String>> compRulesList = compiledTM.getRules(source, null, sourceId, scorerComp);
RuleGrid<IString,String> dynRules = new RuleGrid<IString,String>(dynRulesList, source);
RuleGrid<IString,String> compRules = new RuleGrid<IString,String>(compRulesList, source);

Gradle reports success despite compilation actually failing

gradle compileKenLM was failing, but Gradle's "BUILD SUCCESSFUL" output led me to think it was working.

I solved this on my Ubuntu-based system by exporting JAVA_HOME prior to building: export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

This then resulted in a clean compileKenLM build with no errors.
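
As a quick pre-build check for the situation described above, the sketch below reports whether JAVA_HOME is set and whether it contains include/jni.h. That the KenLM loader build needs the JNI headers under JAVA_HOME is an assumption based on the fix reported here, and the class and method names are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class JavaHomeCheckSketch {

  /** Returns true if the given JDK directory contains include/jni.h. */
  static boolean hasJniHeader(Path javaHome) {
    return Files.isRegularFile(javaHome.resolve("include").resolve("jni.h"));
  }

  public static void main(String[] args) {
    String javaHome = System.getenv("JAVA_HOME");
    if (javaHome == null || javaHome.isEmpty()) {
      System.out.println("JAVA_HOME is not set; export it before building.");
      return;
    }
    System.out.println("JAVA_HOME=" + javaHome);
    System.out.println("jni.h present: " + hasJniHeader(Paths.get(javaHome)));
  }
}
```

If jni.h is reported missing, JAVA_HOME may point at a JRE rather than a full JDK.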

error in gradle compileKenLM

My setup:

$ gradle --version

------------------------------------------------------------
Gradle 4.3.1
------------------------------------------------------------

Build time:   2017-11-08 08:59:45 UTC
Revision:     e4f4804807ef7c2829da51877861ff06e07e006d

Groovy:       2.4.12
Ant:          Apache Ant(TM) version 1.9.6 compiled on June 29 2015
JVM:          1.8.0_151 (Oracle Corporation 25.151-b12)
OS:           Linux 4.10.0-40-generic amd64

$ java -version
openjdk version "1.8.0_151"
OpenJDK Runtime Environment (build 1.8.0_151-8u151-b12-0ubuntu0.17.04.2-b12)
OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)

$ javac -version
javac 1.8.0_151

I am trying to build the KenLM loader (with the latest commit, be69585, on the master branch):

$ gradle compileKenLM
Starting a Gradle Daemon, 1 incompatible Daemon could not be reused, use --status for details

> Configure project : 
The Task.leftShift(Closure) method has been deprecated and is scheduled to be removed in Gradle 5.0. Please use Task.doLast(Action) instead.
        at build_8pp33np0p9752ezmkum94m96i.run(/home/cpu11453local/workspace/study/phrasal/build.gradle:96)
        (Run with --stacktrace to get the full stack trace of this deprecation warning.)

> Task :compileKenLM 
You must use ./bjam if you want language model estimation, filtering, or support for compressed files (.gz, .bz2, .xz)
Compiling with g++ -DNDEBUG -O3 -fPIC -DHAVE_ZLIB -I. -O3 -DNDEBUG -DKENLM_MAX_ORDER=7
In file included from /usr/include/c++/6/stdlib.h:36:0,
                 from /usr/lib/gcc/x86_64-linux-gnu/6/include/mm_malloc.h:27,
                 from /usr/lib/gcc/x86_64-linux-gnu/6/include/xmmintrin.h:34,
                 from /usr/lib/gcc/x86_64-linux-gnu/6/include/emmintrin.h:31,
                 from util/integer_to_string.cc:72:
/usr/include/c++/6/cstdlib:124:11: error: ‘::div_t’ has not been declared
   using ::div_t;
           ^~~~~
/usr/include/c++/6/cstdlib:125:11: error: ‘::ldiv_t’ has not been declared
   using ::ldiv_t;
           ^~~~~~
/usr/include/c++/6/cstdlib:127:11: error: ‘::abort’ has not been declared
   using ::abort;
           ^~~~~
/usr/include/c++/6/cstdlib:128:11: error: ‘::abs’ has not been declared
   using ::abs;
           ^~~
/usr/include/c++/6/cstdlib:129:11: error: ‘::atexit’ has not been declared
   using ::atexit;
           ^~~~~~
/usr/include/c++/6/cstdlib:132:11: error: ‘::at_quick_exit’ has not been declared
   using ::at_quick_exit;
           ^~~~~~~~~~~~~
/usr/include/c++/6/cstdlib:135:11: error: ‘::atof’ has not been declared
   using ::atof;
           ^~~~
/usr/include/c++/6/cstdlib:136:11: error: ‘::atoi’ has not been declared
   using ::atoi;
           ^~~~
/usr/include/c++/6/cstdlib:137:11: error: ‘::atol’ has not been declared
   using ::atol;
           ^~~~
/usr/include/c++/6/cstdlib:138:11: error: ‘::bsearch’ has not been declared
   using ::bsearch;
           ^~~~~~~
/usr/include/c++/6/cstdlib:139:11: error: ‘::calloc’ has not been declared
   using ::calloc;
           ^~~~~~
/usr/include/c++/6/cstdlib:140:11: error: ‘::div’ has not been declared
   using ::div;
           ^~~
/usr/include/c++/6/cstdlib:141:11: error: ‘::exit’ has not been declared
   using ::exit;
           ^~~~
/usr/include/c++/6/cstdlib:142:11: error: ‘::free’ has not been declared
   using ::free;
           ^~~~
/usr/include/c++/6/cstdlib:143:11: error: ‘::getenv’ has not been declared
   using ::getenv;
           ^~~~~~
/usr/include/c++/6/cstdlib:144:11: error: ‘::labs’ has not been declared
   using ::labs;
           ^~~~
/usr/include/c++/6/cstdlib:145:11: error: ‘::ldiv’ has not been declared
   using ::ldiv;
           ^~~~
/usr/include/c++/6/cstdlib:146:11: error: ‘::malloc’ has not been declared
   using ::malloc;
           ^~~~~~
/usr/include/c++/6/cstdlib:148:11: error: ‘::mblen’ has not been declared
   using ::mblen;
           ^~~~~
/usr/include/c++/6/cstdlib:149:11: error: ‘::mbstowcs’ has not been declared
   using ::mbstowcs;
           ^~~~~~~~
/usr/include/c++/6/cstdlib:150:11: error: ‘::mbtowc’ has not been declared
   using ::mbtowc;
           ^~~~~~
/usr/include/c++/6/cstdlib:152:11: error: ‘::qsort’ has not been declared
   using ::qsort;
           ^~~~~
/usr/include/c++/6/cstdlib:155:11: error: ‘::quick_exit’ has not been declared
   using ::quick_exit;
           ^~~~~~~~~~
/usr/include/c++/6/cstdlib:158:11: error: ‘::rand’ has not been declared
   using ::rand;
           ^~~~
/usr/include/c++/6/cstdlib:159:11: error: ‘::realloc’ has not been declared
   using ::realloc;
           ^~~~~~~
/usr/include/c++/6/cstdlib:160:11: error: ‘::srand’ has not been declared
   using ::srand;
           ^~~~~
/usr/include/c++/6/cstdlib:161:11: error: ‘::strtod’ has not been declared
   using ::strtod;
           ^~~~~~
/usr/include/c++/6/cstdlib:162:11: error: ‘::strtol’ has not been declared
   using ::strtol;
           ^~~~~~
/usr/include/c++/6/cstdlib:163:11: error: ‘::strtoul’ has not been declared
   using ::strtoul;
           ^~~~~~~
/usr/include/c++/6/cstdlib:164:11: error: ‘::system’ has not been declared
   using ::system;
           ^~~~~~
/usr/include/c++/6/cstdlib:166:11: error: ‘::wcstombs’ has not been declared
   using ::wcstombs;
           ^~~~~~~~
/usr/include/c++/6/cstdlib:167:11: error: ‘::wctomb’ has not been declared
   using ::wctomb;
           ^~~~~~
/usr/include/c++/6/cstdlib:220:11: error: ‘::lldiv_t’ has not been declared
   using ::lldiv_t;
           ^~~~~~~
/usr/include/c++/6/cstdlib:226:11: error: ‘::_Exit’ has not been declared
   using ::_Exit;
           ^~~~~
/usr/include/c++/6/cstdlib:230:11: error: ‘::llabs’ has not been declared
   using ::llabs;
           ^~~~~
/usr/include/c++/6/cstdlib:236:11: error: ‘::lldiv’ has not been declared
   using ::lldiv;
           ^~~~~
/usr/include/c++/6/cstdlib:247:11: error: ‘::atoll’ has not been declared
   using ::atoll;
           ^~~~~
/usr/include/c++/6/cstdlib:248:11: error: ‘::strtoll’ has not been declared
   using ::strtoll;
           ^~~~~~~
/usr/include/c++/6/cstdlib:249:11: error: ‘::strtoull’ has not been declared
   using ::strtoull;
           ^~~~~~~~
/usr/include/c++/6/cstdlib:251:11: error: ‘::strtof’ has not been declared
   using ::strtof;
           ^~~~~~
/usr/include/c++/6/cstdlib:252:11: error: ‘::strtold’ has not been declared
   using ::strtold;
           ^~~~~~~
/usr/include/c++/6/cstdlib:260:22: error: ‘__gnu_cxx::lldiv_t’ has not been declared
   using ::__gnu_cxx::lldiv_t;
                      ^~~~~~~
/usr/include/c++/6/cstdlib:262:22: error: ‘__gnu_cxx::_Exit’ has not been declared
   using ::__gnu_cxx::_Exit;
                      ^~~~~
/usr/include/c++/6/cstdlib:264:22: error: ‘__gnu_cxx::llabs’ has not been declared
   using ::__gnu_cxx::llabs;
                      ^~~~~
/usr/include/c++/6/cstdlib:265:22: error: ‘__gnu_cxx::div’ has not been declared
   using ::__gnu_cxx::div;
                      ^~~
/usr/include/c++/6/cstdlib:266:22: error: ‘__gnu_cxx::lldiv’ has not been declared
   using ::__gnu_cxx::lldiv;
                      ^~~~~
/usr/include/c++/6/cstdlib:268:22: error: ‘__gnu_cxx::atoll’ has not been declared
   using ::__gnu_cxx::atoll;
                      ^~~~~
/usr/include/c++/6/cstdlib:269:22: error: ‘__gnu_cxx::strtof’ has not been declared
   using ::__gnu_cxx::strtof;
                      ^~~~~~
/usr/include/c++/6/cstdlib:270:22: error: ‘__gnu_cxx::strtoll’ has not been declared
   using ::__gnu_cxx::strtoll;
                      ^~~~~~~
/usr/include/c++/6/cstdlib:271:22: error: ‘__gnu_cxx::strtoull’ has not been declared
   using ::__gnu_cxx::strtoull;
                      ^~~~~~~~
/usr/include/c++/6/cstdlib:272:22: error: ‘__gnu_cxx::strtold’ has not been declared
   using ::__gnu_cxx::strtold;
                      ^~~~~~~
In file included from /usr/lib/gcc/x86_64-linux-gnu/6/include/mm_malloc.h:27:0,
                 from /usr/lib/gcc/x86_64-linux-gnu/6/include/xmmintrin.h:34,
                 from /usr/lib/gcc/x86_64-linux-gnu/6/include/emmintrin.h:31,
                 from util/integer_to_string.cc:72:
/usr/include/c++/6/stdlib.h:38:12: error: ‘util::std::abort’ has not been declared
 using std::abort;
            ^~~~~
/usr/include/c++/6/stdlib.h:39:12: error: ‘util::std::atexit’ has not been declared
 using std::atexit;
            ^~~~~~
/usr/include/c++/6/stdlib.h:40:12: error: ‘util::std::exit’ has not been declared
 using std::exit;
            ^~~~
/usr/include/c++/6/stdlib.h:43:14: error: ‘util::std::at_quick_exit’ has not been declared
   using std::at_quick_exit;
              ^~~~~~~~~~~~~
/usr/include/c++/6/stdlib.h:46:14: error: ‘util::std::quick_exit’ has not been declared
   using std::quick_exit;
              ^~~~~~~~~~
/usr/include/c++/6/stdlib.h:51:12: error: ‘util::std::div_t’ has not been declared
 using std::div_t;
            ^~~~~
/usr/include/c++/6/stdlib.h:52:12: error: ‘util::std::ldiv_t’ has not been declared
 using std::ldiv_t;
            ^~~~~~
/usr/include/c++/6/stdlib.h:55:12: error: ‘util::std::atof’ has not been declared
 using std::atof;
            ^~~~
/usr/include/c++/6/stdlib.h:56:12: error: ‘util::std::atoi’ has not been declared
 using std::atoi;
            ^~~~
/usr/include/c++/6/stdlib.h:57:12: error: ‘util::std::atol’ has not been declared
 using std::atol;
            ^~~~
/usr/include/c++/6/stdlib.h:58:12: error: ‘util::std::bsearch’ has not been declared
 using std::bsearch;
            ^~~~~~~
/usr/include/c++/6/stdlib.h:59:12: error: ‘util::std::calloc’ has not been declared
 using std::calloc;
            ^~~~~~
/usr/include/c++/6/stdlib.h:61:12: error: ‘util::std::free’ has not been declared
 using std::free;
            ^~~~
/usr/include/c++/6/stdlib.h:62:12: error: ‘util::std::getenv’ has not been declared
 using std::getenv;
            ^~~~~~
/usr/include/c++/6/stdlib.h:63:12: error: ‘util::std::labs’ has not been declared
 using std::labs;
            ^~~~
/usr/include/c++/6/stdlib.h:64:12: error: ‘util::std::ldiv’ has not been declared
 using std::ldiv;
            ^~~~
/usr/include/c++/6/stdlib.h:65:12: error: ‘util::std::malloc’ has not been declared
 using std::malloc;
            ^~~~~~
/usr/include/c++/6/stdlib.h:67:12: error: ‘util::std::mblen’ has not been declared
 using std::mblen;
            ^~~~~
/usr/include/c++/6/stdlib.h:68:12: error: ‘util::std::mbstowcs’ has not been declared
 using std::mbstowcs;
            ^~~~~~~~
/usr/include/c++/6/stdlib.h:69:12: error: ‘util::std::mbtowc’ has not been declared
 using std::mbtowc;
            ^~~~~~
/usr/include/c++/6/stdlib.h:71:12: error: ‘util::std::qsort’ has not been declared
 using std::qsort;
            ^~~~~
/usr/include/c++/6/stdlib.h:72:12: error: ‘util::std::rand’ has not been declared
 using std::rand;
            ^~~~
/usr/include/c++/6/stdlib.h:73:12: error: ‘util::std::realloc’ has not been declared
 using std::realloc;
            ^~~~~~~
/usr/include/c++/6/stdlib.h:74:12: error: ‘util::std::srand’ has not been declared
 using std::srand;
            ^~~~~
/usr/include/c++/6/stdlib.h:75:12: error: ‘util::std::strtod’ has not been declared
 using std::strtod;
            ^~~~~~
/usr/include/c++/6/stdlib.h:76:12: error: ‘util::std::strtol’ has not been declared
 using std::strtol;
            ^~~~~~
/usr/include/c++/6/stdlib.h:77:12: error: ‘util::std::strtoul’ has not been declared
 using std::strtoul;
            ^~~~~~~
/usr/include/c++/6/stdlib.h:78:12: error: ‘util::std::system’ has not been declared
 using std::system;
            ^~~~~~
/usr/include/c++/6/stdlib.h:80:12: error: ‘util::std::wcstombs’ has not been declared
 using std::wcstombs;
            ^~~~~~~~
/usr/include/c++/6/stdlib.h:81:12: error: ‘util::std::wctomb’ has not been declared
 using std::wctomb;
            ^~~~~~
g++: error: kenlm/lm/*.o: No such file or directory


How to debug: Tuning step (step 2 in phrasal.sh) running too long (~10 hours) but empty .binwts file

Step 2 ran for more than 10 hours, so I cancelled it before it finished. It did not generate a usable .binwts file: the only .binwts file is the one I copied over from the example folder, and it is still empty. I did not see any other .binwts files.

The process used all CPU cores and about 7 GB of RAM (almost all I have) until I cancelled it. The logs below are everything it produced; there is no .binwts output.

CONFIG

.vars

#
# Online parameter tuning with phrasal-train-tune.sh
#

# General parameters
#
HOST=`hostname -s`
MEM=7g
JAVA_OPTS="-server -ea -Xmx${MEM} -Xms${MEM} -XX:+UseParallelGC -XX:+UseParallelOldGC"
DECODER_OPTS="-Djava.library.path=/home/me/phrasal.ver/src-cc"

# Set if you want to receive an email when a run completes.
# Assumes that the 'mail' unix program is installed and
# configured on your system.
[email protected]

# Resource locations
#
REFDIR=/data/refdir
CORPUSDIR=/data/corpusdir
CORPUS_SRC=${CORPUSDIR}/train.src.filt.gz
CORPUS_TGT=${CORPUSDIR}/train.dest.filt.gz
CORPUS_EF=${CORPUSDIR}/dest_src.A3.final.merge
CORPUS_FE=${CORPUSDIR}/src_dest.A3.final.merge



# Directory for reporting system.
#REPORTING_DIR=
#RESULTS_FILE=$REPORTING_DIR/results.html

#
# Phrase extraction parameters
#

# Mandatory extraction set format. See Usage of mt.train.PhraseExtract
# for the several different extraction set formats
EXTRACT_SET="-fCorpus $CORPUS_SRC -eCorpus $CORPUS_TGT -feAlign $CORPUS_FE -efAlign $CORPUS_EF -symmetrization grow-diag"
THREADS_EXTRACT=8
MAX_PHRASE_LEN=5
# DEBUG_PROPERTY=true
# DETAILED_DEBUG_PROPERTY=true
OTHER_EXTRACT_OPTS="-phiFilter 1e-4 -maxELen $MAX_PHRASE_LEN"

# Feature extractors
EXTRACTORS=edu.stanford.nlp.mt.train.MosesPharoahFeatureExtractor=phrase-table.gz:edu.stanford.nlp.mt.train.CountFeatureExtractor=phrase-table.gz:edu.stanford.nlp.mt.train.LexicalReorderingFeatureExtractor=lo-hier.msd2-bidirectional-fe.gz
EXTRACTOR_OPTS=""

# Lexicalized re-ordering models
LO_ARGS="-hierarchicalOrientationModel true -orientationModelType msd2-bidirectional-fe"

# Online tuning parameters
TUNE_MODE=online
TUNE_SET_NAME=dev_data
TUNE_SET=$CORPUSDIR/$TUNE_SET_NAME.dest
TUNE_REF=$REFDIR/$TUNE_SET_NAME/ref0
INITIAL_WTS=20171212.binwts
TUNE_NBEST=100

#Options to pass directly to OnlineTuner
METRIC=bleu-smooth
# default
# ONLINE_OPTS="-e 8 -ef 20 -b 20 -uw -m $METRIC -o pro-sgd -of 1,5000,50,0.5,Infinity,0.02,adagradl1f,0.1"
ONLINE_OPTS="-e 1 -ef 10 -b 20 -uw -m $METRIC -o pro-sgd -of 1,5000,50,0.5,Infinity,0.02,adagradl1f,0.1"



# Decoding parameters for dev/test set
DECODE_SET_NAME=test_data
DECODE_SET=$CORPUSDIR/$DECODE_SET_NAME.dest
NBEST=1

.ini

# Example Phrasal ini file
# These options are described by the usage statement
# that is shown on the command line (use the "-help" option).
#
# phrasal.sh will modify this template depending on the steps
# selected to run.
#

# phrasal.sh replaces the token SETID with the
# dev or test set name.
[ttable-file]
SETID.tables/phrase-table.gz

# The 'kenlm:' enables the KenLM loader. Remove the
# prefix for the standard Java ARPA loader.
[lmodel-file]
/data/kenlm.arpa

[ttable-limit]
20

[distortion-limit]
5


# The dense Moses feature set is loaded by default.
# Also load the hierarchical re-ordering model of Galley and Manning (2008)
[reordering-model]
hierarchical
SETID.tables/lo-hier.msd2-bidirectional-fe.gz
msd2-bidirectional-fe
hierarchical
hierarchical
bin

# Number of decoding threads
[threads]
3

LOG

.online.stdout log

Done loading phrase table: /data/dev_data.tables/phrase-table.gz (mem used: 465 MiB time: 0.737 s)
Longest foreign phrase: 5
Loading extended Moses Lexical Reordering Table: dev_data.tables/lo-hier.msd2-bidirectional-fe.gz
Done loading reordering table: dev_data.tables/lo-hier.msd2-bidirectional-fe.gz (mem used: 573 MiB time: 0.716s)
Hierarchical reordering model:
Distinguish between left and right discontinuous: true
Use containment orientation: false
Forward orientation: hierarchical
Backward orientation: hierarchical
Reading 262144 1-grams...
Reading 8388608 2-grams...
Reading 67108864 3-grams...
Reading 134217728 4-grams...
Reading 134217728 5-grams...
Done loading arpa lm: /data/kenlm.arpa (order: 5) (mem used: 4595 MiB time: 172.626 s)

phrasal.log

[INFO ] 2017-12-13 16:37:07.993 [main] OnlineTuner - Phrasal Online Tuner
[INFO ] 2017-12-13 16:37:08.143 [main] OnlineTuner - Options:  /data/dev_data.src /data/refdir/dev_data/ref0 dev_data.20171212baseline.ini 20171212.binwts b 20 e 1 ef 10 m bleu-smooth n dev_data.20171212baseline o pro-sgd of 1,5000,50,0.5,Infinity,0.02,adagradl1f,0.1 uw true
[INFO ] 2017-12-13 16:37:08.174 [main] Phrasal - Number of threads: 8
[INFO ] 2017-12-13 16:37:08.174 [main] Phrasal - Phrase table rule query limit: 20
[INFO ] 2017-12-13 16:37:08.174 [main] Phrasal - Translation model options []
[INFO ] 2017-12-13 16:37:08.949 [main] Phrasal - Translation model mode: static
[INFO ] 2017-12-13 16:37:09.684 [main] Phrasal - Language model: /data/kenlm.arpa

Lattice generation

Phrasal can consume the PLF lattice format, but how do we produce a lattice in this format from a list of possible sentences?
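
One simple possibility, sketched below, is to write each candidate sentence as its own path between a shared start node and a shared end node, using the PLF convention of one tuple per node whose edges have the form ('word',score,distance-to-next-node). This is only a sketch under assumptions: the class name, the uniform path score and the lack of any merging of shared words are illustrative choices, and whether Phrasal's lattice reader accepts this exact output has not been verified here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PlfLatticeSketch {

  /** Escapes backslashes and single quotes so a token can sit inside '...'. */
  static String escape(String token) {
    return token.replace("\\", "\\\\").replace("'", "\\'");
  }

  /** Serializes the sentences as parallel paths from a shared start to a shared end node. */
  static String toPlf(List<String[]> sentences) {
    if (sentences.isEmpty()) {
      throw new IllegalArgumentException("need at least one sentence");
    }
    int finalNode = 1; // index of the shared end node
    for (String[] s : sentences) {
      if (s.length == 0) {
        throw new IllegalArgumentException("empty sentence");
      }
      finalNode += s.length - 1; // each sentence adds length-1 private nodes
    }

    // outgoing.get(i) collects the serialized edges leaving node i
    List<StringBuilder> outgoing = new ArrayList<>();
    for (int i = 0; i < finalNode; i++) {
      outgoing.add(new StringBuilder());
    }

    double pathScore = 1.0 / sentences.size(); // uniform weight, an arbitrary choice
    int firstPrivateNode = 1;
    for (String[] s : sentences) {
      int from = 0;
      for (int k = 0; k < s.length; k++) {
        int to;
        if (k == s.length - 1) {
          to = finalNode;        // last word jumps to the shared end node
        } else if (k == 0) {
          to = firstPrivateNode; // first word jumps to this path's private nodes
        } else {
          to = from + 1;         // middle words advance by one node
        }
        double score = (k == 0) ? pathScore : 1.0;
        outgoing.get(from)
            .append("('").append(escape(s[k])).append("',")
            .append(score).append(',').append(to - from).append("),");
        from = to;
      }
      firstPrivateNode += s.length - 1;
    }

    StringBuilder plf = new StringBuilder("(");
    for (StringBuilder edges : outgoing) {
      plf.append('(').append(edges).append("),");
    }
    return plf.append(')').toString();
  }

  public static void main(String[] args) {
    // Illustrative candidates only.
    List<String[]> sentences = Arrays.asList(
        "el parlamento".split(" "),
        "el congreso".split(" "));
    System.out.println(toPlf(sentences));
  }
}
```

Because the paths are kept separate, the lattice grows linearly with the total number of words. Merging shared prefixes or suffixes would make it more compact but needs an alignment step that this sketch leaves out.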

kenlm State length mis-match

I swapped in the newest KenLM (cloned from their GitHub repository) and rebuilt with gradle compileKenLM.

When I run step 2 with KenLM, .online.stdout shows:


Done loading phrase table: /data/20171214/config/dev.tables/phrase-table.gz (mem used: 71 MiB time: 0.253 s)
Longest foreign phrase: 5
Loading extended Moses Lexical Reordering Table: dev.tables/lo-hier.msd2-bidirectional-fe.gz
Done loading reordering table: dev.tables/lo-hier.msd2-bidirectional-fe.gz (mem used: 71 MiB time: 0.137s)
Hierarchical reordering model:
Distinguish between left and right discontinuous: true
Use containment orientation: false
Forward orientation: hierarchical
Backward orientation: hierarchical
Non-NPLM /data/trained_model/kenlm/20171124/20171124_lm_train_data.bin
[ERROR] 2017-12-14 16:46:32.184 [pool-2-thread-1] KenLMState - State length mis-match: 1 vs. 205
[ERROR] 2017-12-14 16:46:32.184 [pool-2-thread-3] KenLMState - State length mis-match: 1 vs. 205
[ERROR] 2017-12-14 16:46:32.184 [pool-2-thread-4] KenLMState - State length mis-match: 1 vs. 205
[ERROR] 2017-12-14 16:46:32.184 [pool-2-thread-2] KenLMState - State length mis-match: 1 vs. 205
java.lang.RuntimeException: Bad state length returned from KenLM query
	at edu.stanford.nlp.mt.lm.KenLMState.<init>(KenLMState.java:39)
	at edu.stanford.nlp.mt.lm.KenLanguageModel.score(KenLanguageModel.java:167)
	at edu.stanford.nlp.mt.decoder.feat.base.NGramLanguageModelFeaturizer.ruleFeaturize(NGramLanguageModelFeaturizer.java:162)
	at edu.stanford.nlp.mt.decoder.feat.FeatureExtractor.ruleFeaturize(FeatureExtractor.java:196)
	at edu.stanford.nlp.mt.tm.ConcreteRule.<init>(ConcreteRule.java:91)
	at edu.stanford.nlp.mt.tm.AbstractPhraseGenerator.getRules(AbstractPhraseGenerator.java:60)
	at edu.stanford.nlp.mt.tm.CombinedTranslationModel.getRules(CombinedTranslationModel.java:201)
	at edu.stanford.nlp.mt.decoder.AbstractBeamInferer.getRules(AbstractBeamInferer.java:115)
	at edu.stanford.nlp.mt.decoder.CubePruningDecoder.decode(CubePruningDecoder.java:130)
	at edu.stanford.nlp.mt.decoder.AbstractBeamInferer.nbest(AbstractBeamInferer.java:193)
	at edu.stanford.nlp.mt.decoder.AbstractBeamInferer.nbest(AbstractBeamInferer.java:95)
	at edu.stanford.nlp.mt.Phrasal.decode(Phrasal.java:1425)
	at edu.stanford.nlp.mt.tune.OnlineTuner$GradientProcessor.process(OnlineTuner.java:493)
	at edu.stanford.nlp.mt.tune.OnlineTuner$GradientProcessor.process(OnlineTuner.java:450)
	at edu.stanford.nlp.util.concurrent.MulticoreWrapper$CallableJob.call(MulticoreWrapper.java:255)
	at edu.stanford.nlp.util.concurrent.MulticoreWrapper$CallableJob.call(MulticoreWrapper.java:236)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
