Comments (6)
You're right, the result is completely wrong, but there is not enough information to debug it. Could you please try the command-line version (analyzeText) and paste (or attach) here both the input text and all the console output?
from lima.
As you can see, the results are much better under Linux:
# sent_id = 1
# text = February 23 - A revolt against the government of King Joseph I of Portugal takes place in the city of Oporto.
1 February February PROPN _ NUMBER=SING _ _ _ NE=DateTime.DATE|Pos=1|Len=8
2 23 23 NUM _ _ _ _ _ NE=DateTime.DATE|Pos=10|Len=2
3 - - COLON _ _ 3 Dummy _ Pos=13|Len=1
4 A a DET _ _ 4 det _ Pos=15|Len=1
5 revolt revolt NOUN _ NUMBER=SING 13 SUJ_V _ Pos=17|Len=6
6 against against ADP _ _ 7 PREPSUB _ Pos=24|Len=7
7 the the DET _ _ 7 det _ Pos=32|Len=3
8 government government NOUN _ NUMBER=SING 4 COMPDUNOM _ Pos=36|Len=10
9 of of ADP _ _ 10 PREPSUB _ Pos=47|Len=2
10 King king NOUN _ NUMBER=SING 10 ADJPRENSUB _ Pos=50|Len=4
11 Joseph Joseph PROPN _ NUMBER=SING _ _ _ NE=Person.PERSON|Pos=55|Len=6
12 I I PRON _ _ _ _ _ NE=Person.PERSON|Pos=62|Len=1
13-14 joseph _ _ _ _ _ _ _ _
13 of of ADP _ _ 12 PREPSUB _ Pos=64|Len=2
14 Portugal Portugal PROPN _ NUMBER=SING _ _ _ NE=Location.LOCATION|Pos=67|Len=8
15 takes take VERB _ _ 0 _ _ Pos=76|Len=5
16 place place NOUN _ NUMBER=SING 13 COD_V _ Pos=82|Len=5
17 in in ADP _ _ 17 PREPSUB _ Pos=88|Len=2
18 the the DET _ _ 17 det _ Pos=91|Len=3
19 city city NOUN _ NUMBER=SING 14 COMPDUNOM _ Pos=95|Len=4
20 of of ADP _ _ 19 PREPSUB _ Pos=100|Len=2
21 Oporto Oporto PROPN _ NUMBER=SING _ _ _ NE=Location.LOCATION|Pos=103|Len=6
22 . . SENT _ _ 0 _ _ Pos=109|Len=1
We need more information to understand what is happening under Windows.
from lima.
This is what I get. I don't know whether I am supposed to set something to print more logs.
H:\test_lima_windows>analyzeText -l eng joseph_I.txt
Analyzing 1/1 (100.00%) 'joseph_I.txt'# global.columns = ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
# sent_id = 1
# text = February 23 - A revolt against the government of King Joseph I of Portugal takes place in the city of Oporto.
1 February February PROPN _ NUMBER=SING _ _ _ NE=DateTime.DATE|Pos=1|Len=8
2 23 23 NUM _ _ _ _ _ NE=DateTime.DATE|Pos=10|Len=2
3 - - COMMA _ _ 3 Dummy _ Pos=13|Len=1
4 A A PROPN _ NUMBER=SING 4 ADJPRENSUB _ Pos=15|Len=1
5 revolt revolt NOUN _ NUMBER=SING 5 ADJPRENSUB _ Pos=17|Len=6
6 against against NOUN _ NUMBER=SING 6 ADJPRENSUB _ Pos=24|Len=7
7 the the NOUN _ NUMBER=SING 7 ADJPRENSUB _ Pos=32|Len=3
8 government government NOUN _ NUMBER=SING 8 ADJPRENSUB _ Pos=36|Len=10
9 of of NOUN _ NUMBER=SING 10 ADJPRENSUB _ Pos=47|Len=2
10 King King PROPN _ NUMBER=SING 10 SUBSUBJUX _ Pos=50|Len=4
11 Joseph Joseph PROPN _ NUMBER=SING _ _ _ NE=Person.PERSON|Pos=55|Len=6
12 I i NUM _ NUMBER=SING _ _ _ NE=Person.PERSON|Pos=62|Len=1
13 of of NOUN _ NUMBER=SING 12 ADJPRENSUB _ Pos=64|Len=2
14 Portugal Portugal PROPN _ NUMBER=SING _ _ _ NE=Location.LOCATION|Pos=67|Len=8
15 takes takes NOUN _ NUMBER=SING 14 ADJPRENSUB _ Pos=76|Len=5
16 place place NOUN _ NUMBER=SING 15 ADJPRENSUB _ Pos=82|Len=5
17 in in NOUN _ NUMBER=SING 16 ADJPRENSUB _ Pos=88|Len=2
18 the the NOUN _ NUMBER=SING 17 ADJPRENSUB _ Pos=91|Len=3
19 city city NOUN _ NUMBER=SING 18 ADJPRENSUB _ Pos=95|Len=4
20 of of NOUN _ NUMBER=SING 19 ADJPRENSUB _ Pos=100|Len=2
21 Oporto Oporto PROPN _ NUMBER=SING _ _ _ NE=Location.LOCATION|Pos=103|Len=6
22 . . SENT _ _ 0 _ _ Pos=109|Len=1
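To see exactly where this output diverges from the Linux one, the two can be compared token by token. The following is only a minimal sketch: it assumes both outputs were saved as text with whitespace-separated columns in the order ID FORM LEMMA UPOS ..., and the helper names `token_rows` and `diff_outputs` are hypothetical, not part of LIMA.

```python
def token_rows(text):
    """Map token ID -> (FORM, LEMMA, UPOS) for each token line.

    Comment lines, console noise and multiword ranges such as "13-14"
    are skipped because their first field is not a plain integer.
    """
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0].isdigit():
            rows[int(fields[0])] = (fields[1], fields[2], fields[3])
    return rows


def diff_outputs(reference, candidate):
    """Return (id, reference_row, candidate_row) for tokens that differ."""
    ref, cand = token_rows(reference), token_rows(candidate)
    return [
        (tid, ref[tid], cand[tid])
        for tid in sorted(ref.keys() & cand.keys())
        if ref[tid] != cand[tid]
    ]


if __name__ == "__main__":
    linux = "4 A a DET _ _ 4 det _ Pos=15|Len=1"
    windows = "4 A A PROPN _ NUMBER=SING 4 ADJPRENSUB _ Pos=15|Len=1"
    for tid, ref_row, cand_row in diff_outputs(linux, windows):
        print(tid, ref_row, "!=", cand_row)
```

Only the form, lemma and tag are compared here; the dependency columns could be added to the tuple in the same way.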
from lima.
@victorbocharov, you were the last developer to get a successful Windows build. Have you noticed problems like this?
from lima.
No, I haven't. Moreover, I don't have Windows computers, so I won't be able to reproduce this. I can only suggest a few guesses:
- PoS tags seem to be assigned from tokenization rules only: starts with a capital letter => PROPN, digits => NUM, ...
- lemmatization doesn't work (takes -> takes)
- NER works
It looks like the English dictionary isn't used or is empty. @kleag : How can we check this?
@ebarbot : Is the pipeline "main" unchanged?
@ebarbot : How old is the version of LIMA?
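The symptoms listed above can be spotted mechanically in a saved output. This is only a rough sketch of one heuristic, not a LIMA tool: the function name, the small closed-class word list and the choice of "NOUN/PROPN on a function word" as the warning sign are all assumptions made for illustration.

```python
# Hypothetical, deliberately tiny list of English function words that a
# working dictionary should never tag as NOUN or PROPN.
CLOSED_CLASS = {"a", "the", "of", "in", "against"}
OPEN_CLASS_TAGS = {"NOUN", "PROPN"}


def suspicious_function_words(conllu_text):
    """Return (FORM, UPOS) pairs for function words tagged NOUN or PROPN.

    Assumes whitespace-separated columns in the order
    ID FORM LEMMA UPOS ...; non-token lines are skipped.
    """
    hits = []
    for line in conllu_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        form, upos = fields[1], fields[3]
        if form.lower() in CLOSED_CLASS and upos in OPEN_CLASS_TAGS:
            hits.append((form, upos))
    return hits


if __name__ == "__main__":
    sample = "7 the the NOUN _ NUMBER=SING 7 ADJPRENSUB _ Pos=32|Len=3"
    print(suspicious_function_words(sample))
```

A non-empty result is consistent with the guess that tags come from fallback rules rather than from the dictionary, but it does not by itself show why the dictionary is not being used.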
from lima.
I downloaded the 3.0.0.20210912222206-0c3404de version, and if I explicitly write analyzeText -l eng -p main joseph_I.txt
I get the same result.
from lima.