bplank / bilstm-aux
Bidirectional Long-Short Term Memory tagger (bi-LSTM) (in DyNet) -- hierarchical (with word and character embeddings)
License: Other
When running bilty.py and loading a trained model, the following error occurs:
python3 src/bilty.py --dynet-mem 5000 --model "models/test1.model" --dev data/da-ud-dev.conllu --pred_layer 1 --dynet-gpus 1 --output predictions/pred1 --save models/test1
[dynet] random seed: 2913040757
[dynet] allocating memory: 5000MB
[dynet] memory allocation done.
loading model from file models/test1.model
Traceback (most recent call last):
File "src/bilty.py", line 767, in <module>
main()
File "src/bilty.py", line 113, in main
tagger = load(args)
File "src/bilty.py", line 188, in load
myparams = pickle.load(open(model_path+".params.pickle", "rb"))
TypeError: unsupported operand type(s) for +: 'Namespace' and 'str'
It looks like line 113 in bilty.py, tagger = load(args), should be tagger = load(args.model). Is this correct?
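For anyone hitting the same traceback, here is a minimal sketch of why the call fails and why passing args.model avoids it. The load function below is a hypothetical stand-in that only builds the path string; it is not the real load() from bilty.py, and the argument parser is reduced to the one option that matters here.

```python
import argparse


def load(model_path):
    # Hypothetical stand-in for load() in bilty.py: it only builds the
    # path that the real function would go on to open with pickle.
    return model_path + ".params.pickle"


parser = argparse.ArgumentParser()
parser.add_argument("--model")
args = parser.parse_args(["--model", "models/test1.model"])

# Passing the whole Namespace reproduces the reported TypeError,
# because a Namespace cannot be concatenated with a str.
try:
    load(args)
    error_message = None
except TypeError as err:
    error_message = str(err)

# Passing the string attribute works as the string concatenation expects.
params_path = load(args.model)
```

Whether load(args.model) is the right fix depends on what else load() reads from its argument in the actual source, so treat this only as an illustration of the type mismatch.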
To fix: the tagger needs to be reloaded from model_path (if patience is used).
Hi,
I am applying a self-trained model to Twitter messages that may contain Unicode emojis. The LSTM tagger seems to have problems with those emojis.
For instance:
lol X
" .
πππ― X
I can work around this by substituting those cases with a text constant before I call the tagger, but I wonder whether this problem is known.
I am using python3 on a Linux machine.
Best,
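The substitution workaround mentioned above could look roughly like the sketch below. It is not part of the tagger: the placeholder string and the function name are hypothetical, and the Unicode ranges are an assumption that covers common emoji blocks without being exhaustive.

```python
import re

# Assumption: these ranges cover the most common emoji blocks
# (pictographs, emoticons, dingbats, variation selector). Not exhaustive.
EMOJI_PATTERN = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF\uFE0F]+")

# Hypothetical text constant used in place of emoji-only tokens.
EMOJI_TOKEN = "<EMOJI>"


def replace_emojis(tokens):
    """Replace every token made up entirely of emoji characters."""
    return [
        EMOJI_TOKEN if EMOJI_PATTERN.fullmatch(tok) else tok
        for tok in tokens
    ]
```

Only tokens that consist entirely of emoji characters are replaced, so ordinary words and punctuation pass through unchanged. If the model was trained without the placeholder, it will still be treated as an unknown word.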
The code contains some old stuff from bilty and fails when using --save. Reported by @hectormartinez
File "src/simplebilty.py", line 78, in main
save(tagger, args)
File "src/simplebilty.py", line 135, in save
"tasks_ids": nntagger.tasks_ids,
AttributeError: 'SimpleBiltyTagger' object has no attribute 'tasks_ids'
Hi Barbara,
I tried playing with your bi-LSTM tagger. [..] However, after loading the source files successfully following your example, it throws an error:
Traceback (most recent call last):
File "src/bilty.py", line 567, in <module>
main()
File "src/bilty.py", line 86, in main
tagger.fit(args.train, args.iters, args.trainer, dev=args.dev)
File "src/bilty.py", line 222, in fit
self.predictors, self.char_rnn, self.wembeds, self.cembeds = self.build_computation_graph(num_words, num_chars)
File "src/bilty.py", line 261, in build_computation_graph
wembeds = self.model.add_lookup_parameters("lookup_wembeds", (num_words, self.in_dim))
TypeError: add_lookup_parameters() takes exactly one argument (2 given)
DyNet's library has changed. Although the documentation moved, the source code of the tagger hasn't been updated. Working on it.
The link in the readme for downloading the polyglot embeddings does not work any more.
Hello, and thank you for making this tagger available!
I tried running the tagger with the --raw and --output options, using an input file with one sentence per line and with space-separated tokens. But it seems that after prediction, once the output_preds function is called, the pred_tags for each sequence remain empty, and the output file stores only newline characters.
My current workaround is simply to reformat the input file to have one line per token and add my own dummy tags before parsing. But as far as I can see, the tagger successfully parses raw inputs anyway, and correctly stores the words and tags per sequence. This leads me to believe that the difference lies in the behavior of the predict function.
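The reformatting workaround described above can be sketched as follows. This is an illustration only: the function name is hypothetical, and the tab separator, the "_" dummy tag, and the blank line between sentences are assumptions about the token-per-line format, so adjust them to whatever the tagger actually expects.

```python
def raw_to_token_per_line(lines, dummy_tag="_"):
    """Turn one-sentence-per-line text into one token per line.

    Assumptions: token and tag are tab-separated, and sentences are
    separated by a blank line. The dummy tag is a placeholder only.
    """
    out = []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip empty input lines
        out.extend(f"{tok}\t{dummy_tag}" for tok in tokens)
        out.append("")  # blank line marks the end of the sentence
    return "\n".join(out) + "\n"
```

For example, calling raw_to_token_per_line(open("input.txt")) on a file with one sentence per line returns a string that can be written out and passed to the tagger in place of the raw file; "input.txt" is a made-up file name.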
I ran python3 embeds/bert.prep.py, which generated a folder named bert. Then I ran the command "python3 embeds/bert.py". Unfortunately, it printed the message "please provide embeddings , conl file and port". The bert folder contains five files (bert_config, bert_model.ckpt.index, bert_model.ckpt.data, bert_model.ckpt.meta, vocab). Which one should I pass as input?
"the option of running a CRF has been added": how do I run the CRF version from the command line?