Running time with GPU (tagger, closed)

glample commented on June 17, 2024
Running time with GPU

from tagger.

Comments (6)

glample commented on June 17, 2024

Hi,

This will be slower on GPU than on CPU, mostly because of the operations in the CRF layer, I guess, and also because the implementation does not support mini-batches.

Rabia-Noureen commented on June 17, 2024

Hi @glample @HaniehP, I am new to Python. Can you please help me train the model using the GoogleNews word embeddings? I am trying to train using this command:

python train.py --train dataset/eng.train --dev dataset/eng.testa --test dataset/eng.testb --lr_method=adam --tag_scheme=iob --pre_emb=GoogleNews-vectors-negative300.bin --all_emb=300

I got this error:
[image not available]

I have been stuck on this issue for about 2 months and haven't been able to resolve it. Thanks in advance.

HaniehP commented on June 17, 2024

Try 'ISO-8859-1' instead of 'UTF-8'. That helped me in another project.

Rabia-Noureen commented on June 17, 2024

@HaniehP Thank you so much for your response. For the time being, when I tried to train the model without the word embeddings, I got another error. Something seems to be wrong:
(env_name27) C:\Users\Acer\tagger-master>python train.py --train dataset/eng.train --dev dataset/eng.testa --test dataset/eng.testb
WARNING (theano.sandbox.cuda): The cuda backend is deprecated and will be removed in the next release (v0.10). Please switch to the gpuarray backend. You can get more information about how to switch at this URL:
https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end%28gpuarray%29

Using gpu device 0: GeForce GT 620M (CNMeM is enabled with initial size: 85.0% of memory, cuDNN not available)
Model location: ./models
Found 23624 unique words (203621 in total)
Found 84 unique characters
Found 17 unique named entity tags
14041 / 3250 / 3453 sentences in train / dev / test.
Saving the mappings to disk...
Compiling...
Starting epoch 0...
50, cost average: 15.406189
100, cost average: 11.704297
150, cost average: 10.767459
200, cost average: 13.812738
250, cost average: 11.460194
300, cost average: 13.207466
350, cost average: 12.146099
400, cost average: 12.428576
450, cost average: 10.977689
500, cost average: 12.830771
550, cost average: 10.062991
600, cost average: 9.834551
650, cost average: 11.481623
700, cost average: 9.460655
750, cost average: 9.907359
800, cost average: 10.251657
850, cost average: 10.405848
900, cost average: 14.113665
950, cost average: 10.436158
'.' is not recognized as an internal or external command,
operable program or batch file.
ID  NE      Total      O  S-LOC  B-PER  E-PER  S-ORG  S-MISC  B-ORG  E-ORG  S-PER  I-ORG  B-LOC  E-LOC  B-MISC  E-MISC  I-MISC  I-PER  I-LOC  Percent
 0  O       42759  42759      0      0      0      0       0      0      0      0      0      0      0       0       0       0      0      0  100.000
 1  S-LOC    1603   1603      0      0      0      0       0      0      0      0      0      0      0       0       0       0      0      0    0.000
 2  B-PER    1234   1234      0      0      0      0       0      0      0      0      0      0      0       0       0       0      0      0    0.000
 3  E-PER    1234   1234      0      0      0      0       0      0      0      0      0      0      0       0       0       0      0      0    0.000
 4  S-ORG     891    891      0      0      0      0       0      0      0      0      0      0      0       0       0       0      0      0    0.000
 5  S-MISC    665    665      0      0      0      0       0      0      0      0      0      0      0       0       0       0      0      0    0.000
 6  B-ORG     450    450      0      0      0      0       0      0      0      0      0      0      0       0       0       0      0      0    0.000
 7  E-ORG     450    450      0      0      0      0       0      0      0      0      0      0      0       0       0       0      0      0    0.000
 8  S-PER     608    608      0      0      0      0       0      0      0      0      0      0      0       0       0       0      0      0    0.000
 9  I-ORG     301    301      0      0      0      0       0      0      0      0      0      0      0       0       0       0      0      0    0.000
10  B-LOC     234    234      0      0      0      0       0      0      0      0      0      0      0       0       0       0      0      0    0.000
11  E-LOC     234    234      0      0      0      0       0      0      0      0      0      0      0       0       0       0      0      0    0.000
12  B-MISC    257    257      0      0      0      0       0      0      0      0      0      0      0       0       0       0      0      0    0.000
13  E-MISC    257    257      0      0      0      0       0      0      0      0      0      0      0       0       0       0      0      0    0.000
14  I-MISC     89     89      0      0      0      0       0      0      0      0      0      0      0       0       0       0      0      0    0.000
15  I-PER      73     73      0      0      0      0       0      0      0      0      0      0      0       0       0       0      0      0    0.000
16  I-LOC      23     23      0      0      0      0       0      0      0      0      0      0      0       0       0       0      0      0    0.000
42759/51362 (83.25026%)
Traceback (most recent call last):
  File "train.py", line 220, in <module>
    dev_data, id_to_tag, dico_tags)
  File "C:\Users\Acer\tagger-master\utils.py", line 282, in evaluate
    return float(eval_lines[1].strip().split()[-1])
IndexError: list index out of range
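
The "'.' is not recognized as an internal or external command" message printed just before the table suggests that an external scoring command did not run under the Windows shell. If so, the score output read back in evaluate was probably empty, and eval_lines[1] then raised the IndexError. A minimal sketch of a guard around the failing line, assuming the score is the last field of the second output line as in the traceback; parse_overall_score is a hypothetical helper, not part of tagger:

```python
def parse_overall_score(eval_lines):
    """Return the last field of the second line of the scorer output as a float.

    Hypothetical helper: it mirrors the expression shown in the traceback
    but fails with an explicit message when the scorer produced no output.
    """
    if len(eval_lines) < 2:
        raise RuntimeError(
            "Scorer output has %d line(s); the external evaluation "
            "command probably did not run." % len(eval_lines)
        )
    return float(eval_lines[1].strip().split()[-1])


if __name__ == "__main__":
    # Illustrative input only, not taken from the log above.
    sample = [
        "first line of scorer output",
        "accuracy:  97.00%; precision:  90.00%; recall:  88.00%; FB1:  88.99",
    ]
    print(parse_overall_score(sample))
```

A guard like this does not fix the evaluation itself; it only replaces the bare IndexError with a message that points at the scoring step.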

Rabia-Noureen commented on June 17, 2024

@HaniehP, could you please guide me on how to convert my word embeddings in a .txt file to 'ISO-8859-1'?

from tagger.

HaniehP commented on June 17, 2024

You don't need to convert the file. In your Python code, replace "codecs.open(path, 'r', 'utf8')" with "codecs.open(path, 'r', 'ISO-8859-1')".
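
A minimal sketch of that replacement, assuming the embeddings are stored as a plain-text file; read_embedding_lines and its encodings parameter are hypothetical names, not part of tagger. It tries UTF-8 first and falls back to ISO-8859-1, which accepts any byte sequence, so the fallback cannot fail but may misrender non-ASCII tokens.

```python
import codecs


def read_embedding_lines(path, encodings=("utf-8", "ISO-8859-1")):
    """Read the non-empty lines of a text embedding file.

    Hypothetical helper: each encoding is tried in order, and the first
    one that decodes the whole file without error is used.
    """
    for enc in encodings:
        try:
            with codecs.open(path, "r", enc) as f:
                return [line.rstrip() for line in f if line.strip()]
        except UnicodeDecodeError:
            # This encoding cannot decode the file; try the next one.
            continue
    raise ValueError("Could not decode %s with any of %r" % (path, encodings))
```

This only changes how the bytes are decoded. A file in binary word2vec format (as the .bin extension in the command above suggests) would presumably still need to be converted to text before a loader like this can read it.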
