daormar / thot
Thot toolkit for statistical machine translation
Home Page: https://daormar.github.io/thot/
License: GNU Lesser General Public License v3.0
Hi, I'm new here and I have a problem with training.
I followed the manual to train a translation model on the toy_corpus provided by this project.
However, when I execute these commands:
src_train_corpus=${PREFIX}/share/thot/toy_corpus/sp_tok_lc.train
trg_train_corpus=${PREFIX}/share/thot/toy_corpus/en_tok_lc.train
thot_tm_train -s ${src_train_corpus} -t ${trg_train_corpus} -o tm_outdir
I get the same error many times:
awk: syntax error at source line 1
context is
{print >>> substr($1,length($1)-3)== <<<
awk: illegal statement at source line 1
awk: illegal statement at source line 1
/usr/local/bin//thot_pbs_gen_batch_sw_model: line 346: [: -eq: unary operator expected
I ignored those errors and kept executing the commands in the manual.
In the end I still get output, but I can't tell whether it is correct.
Am I doing anything wrong? Any response is appreciated.
Hi,
I am trying to use thot_tokenize and noticed the following issues:
Also, when I run thot_tm_train -s ${src_train_corpus} -t ${trg_train_corpus} -o tm
only a folder is created, and there are no descriptor files in it.
Hello, I am currently having issues with
thot_tm_train -s ${src_train_corpus} -t ${trg_train_corpus} -o tm_outdir
It keeps giving me this:
cat: /home/oluwasegun/thot_pbs_gen_batch_sw_model_sdir_3814_3821/models_per_chunk/__proc_n2.log: No such file or directory
cat: /home/oluwasegun/thot_pbs_gen_batch_sw_model_sdir_3814_3821/models_per_chunk/__proc_n3.log: No such file or directory
cat: /home/oluwasegun/thot_pbs_gen_batch_sw_model_sdir_3814_3821/models_per_chunk/__proc_n4.log: No such file or directory
cat: /home/oluwasegun/thot_pbs_gen_batch_sw_model_sdir_3814_3821/models_per_chunk/__proc_n5.log: No such file or directory
cat: /home/oluwasegun/thot_pbs_gen_batch_sw_model_sdir_3814_3821/curr_tables/generate_final_model.log: No such file or directory
Error during the execution of thot_pbs_gen_batch_sw_model (proc_chunk)
File /home/oluwasegun/tm_outdir/main/src_trg_swm.genswm_err contains information for error diagnosing
Any help?
Thanks ;)
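The last message in the log above names the file to inspect. A minimal sketch for printing the end of that file, assuming the output directory passed to `-o` is the one shown in the message; the helper name `show_train_errors` is hypothetical and not part of Thot:

```shell
#!/bin/sh
# Hypothetical helper (not part of Thot): print the end of the error file
# that the failure message points to, if that file exists.
show_train_errors() {
    outdir=$1
    # Path pattern taken from the message: <outdir>/main/src_trg_swm.genswm_err
    errfile="${outdir}/main/src_trg_swm.genswm_err"
    if [ -f "${errfile}" ]; then
        tail -n 20 "${errfile}"
    else
        echo "no error file at ${errfile}" >&2
        return 1
    fi
}

# Example call (commented out so the block can be sourced safely):
# show_train_errors "${HOME}/tm_outdir"
```

Posting the printed lines along with the report usually makes the failure easier to diagnose than the `cat: ... No such file or directory` messages alone.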
Hi, I'm having an issue running the make command on installation. Seems to be an ar command with no file inputs somewhere. Running on macOS Sierra. Output is below:
$ make
/Library/Developer/CommandLineTools/usr/bin/make all-recursive
Making all in src
Making all in nlp_common
make[3]: Nothing to be done for `all'.
Making all in incr_models
make[3]: Nothing to be done for `all'.
Making all in sw_models
make[3]: Nothing to be done for `all'.
Making all in phrase_models
make[3]: Nothing to be done for `all'.
Making all in smt_preproc
make[3]: Nothing to be done for `all'.
Making all in error_correction
make[3]: Nothing to be done for `all'.
Making all in downhill_simplex
make[3]: Nothing to be done for `all'.
Making all in stack_dec
make[3]: Nothing to be done for `all'.
Making all in exper
make[3]: Nothing to be done for `all'.
Making all in testing
make[3]: Nothing to be done for `all'.
Making all in hat_trie
make[3]: Nothing to be done for `all'.
/bin/sh ../libtool --tag=CC --mode=link gcc -W -Wno-deprecated -I./nlp_common -I./incr_models -I./sw_models -I./phrase_models -I./smt_preproc -I./error_correction -I./downhill_simplex -I./stack_dec -I./hat_trie -DTHOT_MASTER_INI_PATH="/usr/local/share/thot/ini_files/master.ini" -DTHOT_LIBDIR="/usr/local/lib" -Ino/src/include -g -O2 -g -Wall -O2 -o libhattrie.la -rpath /usr/local/lib -lm
libtool: link: gcc -dynamiclib -Wl,-undefined -Wl,dynamic_lookup -o .libs/libhattrie.0.dylib -lm -g -O2 -g -O2 -install_name /usr/local/lib/libhattrie.0.dylib -compatibility_version 1 -current_version 1.0 -Wl,-single_module
libtool: link: (cd ".libs" && rm -f "libhattrie.dylib" && ln -s "libhattrie.0.dylib" "libhattrie.dylib")
libtool: link: ar cru .libs/libhattrie.a
ar: no archive members specified
usage: ar -d [-TLsv] archive file ...
ar -m [-TLsv] archive file ...
ar -m [-abiTLsv] position archive file ...
ar -p [-TLsv] archive [file ...]
ar -q [-cTLsv] archive file ...
ar -r [-cuTLsv] archive file ...
ar -r [-abciuTLsv] position archive file ...
ar -t [-TLsv] archive [file ...]
ar -x [-ouTLsv] archive [file ...]
make[3]: *** [libhattrie.la] Error 1
make[2]: *** [all-recursive] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2
Currently, Thot supports the IBM1, IBM2, and HMM models, none of which model fertility. Are there any plans to support IBM3, IBM4, IBM5, or one of the extensions to HMM that model or simulate fertility? Obviously, IBM models 3-5 are complex and might be difficult to support. Some of the fertility extensions to HMM seem simpler and would improve accuracy.
Hello,
While trying to install on macOS High Sierra, I got this error at the make step:
/bin/sh ../libtool --tag=CXX --mode=link g++ -W -Wno-deprecated -I./nlp_common -I./incr_models -I./sw_models -I./phrase_models -I./smt_preproc -I./error_correction -I./downhill_simplex -I./stack_dec -I./hat_trie -I./picojson -DTHOT_MASTER_INI_PATH="/Users/bilge/Desktop/thot/share/thot/ini_files/master.ini" -DTHOT_LIBDIR="/Users/bilge/Desktop/thot/lib" -Ino/src/include -g -Wall -Wno-deprecated -O2 -std=c++11 libthot.la -o thot_lm_perp incr_models/thot_lm_perp.o -lgmp -lz -lpthread -ldl -lm
libtool: error: cannot find the library 'libthot.la' or unhandled argument 'libthot.la'
make[3]: *** [thot_lm_perp] Error 1
make[2]: *** [all-recursive] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2
Any idea how to solve this?
Thanks
After installing the Thot package on a freshly installed Ubuntu 18 system, 'make installcheck' failed during 'Tuning log-linear model weights' with a syntax error in file 'thot_get_nblist_segm_info', line 99, at the statement print line.encode('utf-8'). I modified the print statements in all source files containing Python code to comply with Python 3. After that I ran the installation again, and 'make installcheck' showed that all tests passed.
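The port to Python 3 described above starts with locating the affected files. A rough sketch of that search, assuming a POSIX shell and a `grep` that supports `-r`; the helper name `find_py2_prints` is hypothetical, and the pattern is only a heuristic that can also flag non-Python files (for example awk code that begins a line with `print`):

```shell
#!/bin/sh
# Hypothetical helper (not part of Thot): list files under a directory that
# contain a Python 2 style print statement such as
#   print line.encode('utf-8')
# Heuristic: a line that starts with `print`, followed by whitespace and then
# a character that is neither whitespace nor an opening parenthesis.
find_py2_prints() {
    srcdir=$1
    grep -rlE '^[[:space:]]*print[[:space:]]+[^([:space:]]' "${srcdir}"
}

# Example call (commented out so the block can be sourced safely):
# find_py2_prints ./src
```

Each reported file still has to be reviewed by hand, since the search cannot tell whether a match is really Python code.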
Hello,
How does Thot compare to a solution like ModernMT?
Do you know of any other solutions that can adapt to human post-editing?
I am opening this ticket to track a native Windows port, since Thot might be a useful tool in the Win32 landscape. Please assign this to me.
The port:
My command is: thot_tm_train -s /usr/local/share/thot/toy_corpus/train.sp -t /usr/local/share/thot/toy_corpus/train.en -o /home/gtct/yuchao/mt/tm_outdir/
I only get an empty "main" directory. I can't find tm_desc anywhere.
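A quick way to confirm the symptom described above is to search the whole output directory for the descriptor file. A minimal sketch, assuming only that the file is named `tm_desc` as in the report; the helper name `has_tm_desc` is hypothetical and not part of Thot:

```shell
#!/bin/sh
# Hypothetical helper (not part of Thot): report whether a regular file named
# tm_desc exists anywhere under the given output directory.
has_tm_desc() {
    outdir=$1
    found=$(find "${outdir}" -type f -name tm_desc 2>/dev/null | head -n 1)
    if [ -n "${found}" ]; then
        echo "found: ${found}"
    else
        echo "tm_desc not found under ${outdir}"
        return 1
    fi
}

# Example call (commented out so the block can be sourced safely):
# has_tm_desc /home/gtct/yuchao/mt/tm_outdir
```

The function returns a non-zero status when nothing is found, so it can be used directly in an `if` after the training command.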