
Comments (15)

masakuri commented on July 20, 2024

I'm sorry, I typed an incorrect command.
The error was solved.
I still have the same error...

from deep-crf.

aonotas commented on July 20, 2024

OK, please tell me the command you ran.

masakuri commented on July 20, 2024
$ deep-crf train input_train_jp.txt --delimiter=" " --dev_file input_dev_jp.txt --save_dir save_jpmodel_dir --save_name bilstm-cnn-crf_adam_jp --optimizer adam --word_emb_file jp_word_emb300.txt --word_emb_vocab_type replace_only --gpu 0

Thank you.

aonotas commented on July 20, 2024

I think this error occurs because the format of your training file input_train_jp.txt is wrong:
Invalid input feature sizes.

I just fixed the code, so please use the latest version and let me know the result.
I think input_train_jp.txt should look like this:

彼 O
は O
オバマ大統領 S-PERSON
です O

彼 O
は O

masakuri commented on July 20, 2024

I got the following error.
ValueError: Invalid input feature sizes: "3". Please check at line [1298]

I checked line 1298 of input_train_jp.txt and found that the "word" contains a space, like this:

ほげ[space]ほげ[space]O

"ほげ[space]ほげ" is a proper noun.

Thank you for helping me find the cause of this error.
Is it OK to solve this problem by using --delimiter="\t" and input_train_jp.txt format is like ほげ[space]ほげ[tab]O ?
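
For readers who hit the same ValueError, a quick way to locate the offending lines is to count the columns on every non-blank line. The following is only a minimal sketch and is not part of deep-crf: the function name find_bad_lines and the assumption of exactly two columns (word and label) are hypothetical.

```python
def find_bad_lines(lines, delimiter=" ", expected_columns=2):
    """Return (line_number, column_count) for every non-blank line whose
    column count differs from expected_columns.

    Hypothetical helper for illustration; it is not deep-crf's own check.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if not text.strip():
            continue  # blank lines only separate sentences
        columns = text.split(delimiter)
        if len(columns) != expected_columns:
            bad.append((number, len(columns)))
    return bad


# The fourth line has a space inside the word, so it splits into 3 columns.
sample = ["彼 O\n", "は O\n", "\n", "ほげ ほげ O\n"]
print(find_bad_lines(sample))  # prints [(4, 3)]
```

With a real file, the lines could be read with open(path, encoding="utf-8") and passed to the same function.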

masakuri commented on July 20, 2024

I fixed the format of input_train_jp.txt and ran the command ($ deep-crf train input_train_jp.txt --delimiter="\t" --dev_file input_dev_jp.txt --save_dir save_jpmodel_dir --save_name bilstm-cnn-crf_adam_jp --optimizer adam --word_emb_file jp_word_emb300.txt --word_emb_vocab_type replace_only --gpu 0), but I got the following error:

  File "build/bdist.linux-x86_64/egg/deepcrf/__init__.py", line 66, in train
  File "build/bdist.linux-x86_64/egg/deepcrf/main.py", line 102, in run
ValueError: Invalid training sizes: 0 sentences.

Any ideas?

aonotas commented on July 20, 2024

Is it OK to solve this problem by using --delimiter="\t" and input_train_jp.txt format is like ほげ[space]ほげ[tab]O ?

Yes! I think it is a good solution.

Each sentence in input_train_jp.txt must be separated from the next by a blank line (an empty line, \n).

Note that you should put an empty line (\n) between sentences. This format is called the CoNLL format.

I mean, if you have two sentences:

$ cat input_file.txt
Barack  B-PERSON
Hussein I-PERSON
Obama   E-PERSON
is      O
a       O
man     O
.       O

Yuji      B-PERSON
Matsumoto E-PERSON
is        O
a         O
man       O
.         O
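
To check how many sentences a file in this layout actually contains, the blank-line grouping can be sketched as follows. This is an illustration under assumptions, not deep-crf's own loader: read_sentences is a hypothetical name, and with delimiter=None each line is split on any run of whitespace.

```python
def read_sentences(lines, delimiter=None):
    """Group CoNLL-style lines into sentences separated by blank lines.

    Hypothetical reader for illustration; deep-crf's loader may differ.
    """
    sentences, current = [], []
    for line in lines:
        text = line.strip()
        if not text:
            # A blank line closes the current sentence, if there is one.
            if current:
                sentences.append(current)
                current = []
            continue
        current.append(tuple(text.split(delimiter)))
    if current:
        sentences.append(current)  # the last sentence may lack a trailing blank line
    return sentences


sample = """Barack  B-PERSON
Hussein I-PERSON
Obama   E-PERSON
is      O

Yuji      B-PERSON
Matsumoto E-PERSON
is        O
""".splitlines()

sentences = read_sentences(sample)
print(len(sentences))   # prints 2
print(sentences[1][0])  # prints ('Yuji', 'B-PERSON')
```

If a reader like this reports 0 sentences for a file that looks correct, the file's encoding or line endings are worth checking next.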

masakuri commented on July 20, 2024

My input_train_jp.txt file has a blank line ("\n") between sentences (more precisely, between tweets), but I still got the error...

aonotas commented on July 20, 2024

Does your input_train_jp.txt now look like the following?

あああ[tab]O

あ[tab]O
い[tab]O
う[tab]O

お[space]お[tab]O
お[tab]O

masakuri commented on July 20, 2024

Does your input_train_jp.txt now look like the following?

あああ[tab]O

あ[tab]O
い[tab]O
う[tab]O

お[space]お[tab]O
お[tab]O

Yes.

aonotas commented on July 20, 2024

OK. Could you send me your input file by e-mail, if that is all right with you?
nanigashi03[at] gmail.com

aonotas commented on July 20, 2024

Or, please try replacing [tab] with [space]:

お[space]お   =>    お_お

[tab]   => [space]

and please use --delimiter=" ".

Maybe the Unicode [tab] character causes this error?
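
The replacement described above can be sketched in a few lines. This is only an illustration under assumptions: tab_to_space_format is a hypothetical name, and every non-blank line is assumed to contain at least one tab, with the label after the last tab.

```python
def tab_to_space_format(lines):
    """Convert 'word[tab]label' lines to 'word[space]label' lines,
    replacing spaces inside the word with underscores.

    Hypothetical helper; assumes each non-blank line contains a tab
    (a line without one raises ValueError when unpacking).
    """
    converted = []
    for line in lines:
        text = line.rstrip("\n")
        if not text.strip():
            converted.append("")  # keep blank lines as sentence separators
            continue
        word, label = text.rsplit("\t", 1)
        converted.append(word.replace(" ", "_") + " " + label)
    return converted


sample = ["お お\tO\n", "お\tO\n", "\n", "あああ\tO\n"]
print(tab_to_space_format(sample))
# prints ['お_お O', 'お O', '', 'あああ O']
```

The converted lines can then be written back out, one per line, and used with --delimiter=" ".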

masakuri commented on July 20, 2024

replace [tab] to [space]:

お[space]お => お_お

[tab] => [space]
use --delimiter=" "

It worked!!!
Thank you very much for your help!!!

aonotas commented on July 20, 2024

OK.
It seems that our code, or the input format with [tab], causes that error.

masakuri commented on July 20, 2024

I see. Thank you very much.
I changed the issue title so that it describes the content.

