ziqizhang / sti

Implementation of algorithms for semantic table interpretation, including the TableMiner+ method

Java 98.36% JavaScript 0.58% HTML 0.94% CSS 0.10% Shell 0.02%
semantic-table-interpretation entity-linking web-table webtable classification relation-extraction dbpedia freebase semantic-web

sti's People

Contributors

ir-ischool-uos, ziqizhang

sti's Issues

AttributeValueMatcher not stripping datatype string hence not matching values correctly

TODO

(Thanks to Josef Janoušek from the Odalic project)

"The score method of AttributeValueMatcher ( https://github.com/ziqizhang/sti/blob/master/sti-main/src/uk/ac/shef/dcs/sti/core/scorer/AttributeValueMatcher.java#L104 ) could not match the input cell value 694 (of datatype NUMBER) against the attribute whose value is "694"^^http://www.w3.org/2001/XMLSchema#positiveInteger. In the DBpedia knowledge base, the text representation of a literal attribute value also contains the data type (according to XML Schema), as shown at https://dbpedia.org/sparql?default-graph-uri=http%3A%2F%2Fdbpedia.org&query=select+distinct+%3Fp+%3Fo+where+{%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FA_Game_of_Thrones%3E+%3Fp+%3Fo}&format=text%2Fhtml&CXML_redir_for_subjs=121&CXML_redir_for_hrefs=&timeout=30000&debug=on . Because the value was not matched, all attributes received the score 0.0 and the relation was not discovered.
So I changed how the attributes used for matching are collected ( https://github.com/ziqizhang/sti/blob/master/sti-main/src/uk/ac/shef/dcs/sti/core/algorithm/tmp/TColumnColumnRelationEnumerator.java#L65 ): when the attribute value contains "^^", I cut off the datatype part of the string and set only the number (e.g. 694) as the value of the attribute. I also set the valueURI of the attribute to null, because otherwise the method classifyAttributeValueDataType of AttributeValueMatcher ( https://github.com/ziqizhang/sti/blob/master/sti-main/src/uk/ac/shef/dcs/sti/core/scorer/AttributeValueMatcher.java#L185 ) sets the datatype to named_entity. After these changes the value of the attribute is just 694 and the datatype is set to NUMBER, so the score method of AttributeValueMatcher can match it with the input cell value and the relation is discovered."
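
The string handling described above can be sketched roughly as follows. This is only an illustration of cutting the datatype suffix off a literal such as "694"^^http://www.w3.org/2001/XMLSchema#positiveInteger; the class LiteralDatatypeStripper and its methods are hypothetical names, not part of the sti code base, and the actual change in TColumnColumnRelationEnumerator may look different.

```java
/** Hypothetical helper for illustration; not part of the sti code base. */
final class LiteralDatatypeStripper {

    /** Marker that separates the lexical value from the datatype in a typed literal string. */
    private static final String DATATYPE_SEPARATOR = "^^";

    private LiteralDatatypeStripper() {}

    /** True if the raw attribute value carries a datatype suffix. */
    static boolean hasDatatype(String rawValue) {
        return rawValue.contains(DATATYPE_SEPARATOR);
    }

    /**
     * Returns the lexical value of a literal, e.g. 694 for
     * "694"^^http://www.w3.org/2001/XMLSchema#positiveInteger.
     * Values without "^^" are returned as-is, apart from trimming
     * and removal of one pair of surrounding double quotes.
     */
    static String lexicalValue(String rawValue) {
        int sep = rawValue.lastIndexOf(DATATYPE_SEPARATOR);
        String value = sep >= 0 ? rawValue.substring(0, sep) : rawValue;
        value = value.trim();
        // Drop one pair of surrounding double quotes, if present.
        if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
            value = value.substring(1, value.length() - 1);
        }
        return value;
    }

    public static void main(String[] args) {
        String raw = "\"694\"^^http://www.w3.org/2001/XMLSchema#positiveInteger";
        System.out.println(lexicalValue(raw)); // prints 694
    }
}
```

As the report notes, stripping the suffix alone is not enough: the attribute's valueURI also has to be cleared, otherwise classifyAttributeValueDataType still classifies the value as named_entity.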

Upgrade to Any23 2.0

Hi folks, we recently released Any23 2.0, which has lots of improvements. Artifacts are available on Maven Central and, as far as I recall, there are no API breakages. Please reach us at user@any23.apache.org if you have any problems upgrading.

Exceptions after the LEARNING phase

Hi,
I installed the project (with NodeJS UI) following the instructions.

But I have some issues during/after the LEARNING phase; I tried two different tables, but the annotation process can't proceed further. With this table https://en.wikipedia.org/wiki/Commedia_all%27italiana I get an HttpException (detailed log following).
The KG endpoint is the default dbpedia.org/sparql and the parser is Wikipedia tables.
The process ends up with the following message:

"Your task is complete. Visit http://localhost:3000/user1/index.htm for your output. Thanks for using TableMiner+"

but index.htm is missing (error 404).
Inside the folder ui/tmp/user1 there are just 3 files:

_wiki_Commedia_all_27italiana.download.html (html page with red boxes)
_wiki_Commedia_all_27italiana.download.html.original (original html page)
xpaths.json (empty)
2019-07-25 09:39:24 INFO TableMinerPlusBatch:46 - Initializing entity cache...

2019-07-25 09:39:26 INFO TableMinerPlusBatch:50 - Initializing KBSearch...

2019-07-25 09:39:27 INFO TableMinerPlusBatch:67 - Initializing SUBJECT COLUMN DETECTION components ...

Thu Jul 25 09:39:27 UTC 2019 loading exception data for lemmatiser...

Thu Jul 25 09:39:27 UTC 2019 loading done

2019-07-25 09:39:28 INFO TableMinerPlusBatch:94 - Initializing LEARNING components ...

2019-07-25 09:39:28 INFO TableMinerPlusBatch:136 - Initializing UPDATE components ...

2019-07-25 09:39:28 INFO TableMinerPlusBatch:149 - Initializing RELATIONLEARNING components ...

2019-07-25 09:39:28 INFO TMPInterpreter:49 - >    PHASE: Detecting subject column...

2019-07-25 09:39:29 INFO TMPInterpreter:65 - >    PHASE: LEARNING ...

2019-07-25 09:39:29 INFO TMPInterpreter:82 - >> Column=0

2019-07-25 09:39:29 INFO LEARNINGPreliminaryColumnClassifier:70 - >> (LEANRING) Preliminary Column Classification begins

2019-07-25 09:39:29 INFO LEARNINGPreliminaryColumnClassifier:81 - >> cold start disambiguation, row(s) [12]/66,(The Last Judgement) NAMED_ENTITY

2019-07-25 09:39:29 INFO TCellDisambiguator:34 - >> (cold start disamb), candidates=10

2019-07-25 09:39:29 INFO TColumnClassifier:38 - >> update candidate clazz on column, existing=0

2019-07-25 09:39:29 INFO LEARNINGPreliminaryColumnClassifier:81 - >> cold start disambiguation, row(s) [18]/66,(Il diavolo) NAMED_ENTITY

2019-07-25 09:39:29 INFO TCellDisambiguator:34 - >> (cold start disamb), candidates=2

2019-07-25 09:39:29 INFO TColumnClassifier:38 - >> update candidate clazz on column, existing=2

2019-07-25 09:39:29 INFO LEARNINGPreliminaryColumnClassifier:117 - >> (LEARNING) Preliminary Column Classification converged, rows:2/66

2019-07-25 09:39:29 INFO LEARNINGPreliminaryDisamb:39 - >> (LEARNING) Preliminary Disambiguation begins

2019-07-25 09:39:29 INFO LEARNINGPreliminaryDisamb:46 - >> re-annotate cells involved in cold start disambiguation

2019-07-25 09:39:29 INFO LEARNINGPreliminaryDisamb:50 - >> constrained cell disambiguation for the rest cells in this column

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([14]/66,0) (March on Rome) DATE candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([59]/66,0) (La stanza del vescovo) SHORT_TEXT candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([58]/66,0) (The Career of a Chambermaid) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([35]/66,0) (A Question of Honour) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([50]/66,0) (Vogliamo i colonnelli) SHORT_TEXT candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([57]/66,0) (Brutti, sporchi e cattivi) SHORT_TEXT candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([60]/66,0) (Un borghese piccolo piccolo) SHORT_TEXT candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([27]/66,0) (The Birds, the Bees and the Italians) NAMED_ENTITY candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([41]/66,0) (The Libertine) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([48]/66,0) (The Seduction of Mimi) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([19]/66,0) (Il Boom) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([13]/66,0) (Mafioso) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([6]/66,0) (A Difficult Life) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([1]/66,0) (The Great War) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([47]/66,0) (Bello, onesto, emigrato Australia sposerebbe compaesana illibata) SHORT_TEXT candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([54]/66,0) (Profumo di donna) SHORT_TEXT candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([38]/66,0) (La ragazza con la pistola) SHORT_TEXT candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([3]/66,0) (Love and Larceny) NAMED_ENTITY candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([34]/66,0) (L'ombrellone) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([56]/66,0) (Amici miei) NAMED_ENTITY candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([7]/66,0) (Audace colpo dei soliti ignoti) SHORT_TEXT candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([43]/66,0) (Brancaleone alle Crociate) NAMED_ENTITY candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([5]/66,0) (Adua e le compagne) SHORT_TEXT candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([46]/66,0) (Secret Fantasy) NAMED_ENTITY candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([9]/66,0) (Divorce, Italian Style) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([15]/66,0) (The Conjugal Bed) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([22]/66,0) (Seduced and Abandoned) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([24]/66,0) (Il successo) NAMED_ENTITY candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([45]/66,0) (In nome del popolo italiano) SHORT_TEXT candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([49]/66,0) (Lo scopone scientifico) SHORT_TEXT candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([17]/66,0) (Alta Infedeltà) NAMED_ENTITY candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([26]/66,0) (Casanova 70) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([40]/66,0) (Il Commissario Pepe) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([25]/66,0) (Le bambole) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([63]/66,0) (Caro papà) NAMED_ENTITY candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([31]/66,0) (The Man, the Woman and the Money) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([20]/66,0) (Yesterday, Today and Tomorrow) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([36]/66,0) (The Tiger and the Pussycat) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([44]/66,0) (Between Miracles) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([64]/66,0) (Amici miei Atto II) NAMED_ENTITY candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([39]/66,0) (Il medico della mutua) SHORT_TEXT candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([52]/66,0) (C'eravamo tanto amati) SHORT_TEXT candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([2]/66,0) (Il vedovo) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([55]/66,0) (Romanzo popolare) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([23]/66,0) (Se permettete parliamo di donne) SHORT_TEXT candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([30]/66,0) (Io la conoscevo bene) SHORT_TEXT candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([4]/66,0) (Everybody Go Home) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([0]/66,0) (Big Deal on Madonna Street) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([11]/66,0) (The Easy Life) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([51]/66,0) (Pane e cioccolata) SHORT_TEXT candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([28]/66,0) (I complessi) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([33]/66,0) (L'Armata Brancaleone) NAMED_ENTITY candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([61]/66,0) (Traffic Jam (film)) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([8]/66,0) (The Fascist) NAMED_ENTITY candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([29]/66,0) (Il Gaucho) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([42]/66,0) (Vedo nudo) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([65]/66,0) (Café Express) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([16]/66,0) (I mostri) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([10]/66,0) (Boccaccio '70) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([37]/66,0) (The Witches) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([21]/66,0) (The Reunion) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([53]/66,0) (Swept Away by an Unusual Destiny in the Blue Sea of August) UNKNOWN candidates=0

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([32]/66,0) (Me, Me, Me... and the Others) NAMED_ENTITY candidates=1

2019-07-25 09:39:29 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([62]/66,0) (Signore e signori, buonanotte) SHORT_TEXT candidates=0

2019-07-25 09:39:29 INFO LEARNINGPreliminaryDisamb:87 - >> constrained cell disambiguation complete 40/66 rows
2019-07-25 09:39:29 INFO LEARNINGPreliminaryDisamb:88 - >> reset candidate column class annotations

2019-07-25 09:39:29 INFO TMPInterpreter:82 - >> Column=2

2019-07-25 09:39:29 INFO LEARNINGPreliminaryColumnClassifier:70 - >> (LEANRING) Preliminary Column Classification begins

2019-07-25 09:39:29 INFO LEARNINGPreliminaryColumnClassifier:81 - >> cold start disambiguation, row(s) [2, 3, 6, 11, 14, 16, 29, 34, 36, 42, 45, 54, 58, 59, 63]/27,(Dino Risi) NAMED_ENTITY

2019-07-25 09:39:29 INFO TCellDisambiguator:34 - >> (cold start disamb), candidates=2

2019-07-25 09:39:30 INFO TColumnClassifier:38 - >> update candidate clazz on column, existing=0

2019-07-25 09:39:30 INFO LEARNINGPreliminaryColumnClassifier:81 - >> cold start disambiguation, row(s) [0, 1, 26, 33, 38, 43, 50, 55, 56, 60, 64]/27,(Mario Monicelli) NAMED_ENTITY

2019-07-25 09:39:30 INFO TCellDisambiguator:34 - >> (cold start disamb), candidates=2

2019-07-25 09:39:30 INFO TColumnClassifier:38 - >> update candidate clazz on column, existing=47

2019-07-25 09:39:30 INFO LEARNINGPreliminaryColumnClassifier:117 - >> (LEARNING) Preliminary Column Classification converged, rows:26/27

2019-07-25 09:39:30 INFO LEARNINGPreliminaryDisamb:39 - >> (LEARNING) Preliminary Disambiguation begins

2019-07-25 09:39:30 INFO LEARNINGPreliminaryDisamb:46 - >> re-annotate cells involved in cold start disambiguation

2019-07-25 09:39:30 INFO LEARNINGPreliminaryDisamb:50 - >> constrained cell disambiguation for the rest cells in this column

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([23, 40, 52, 57]/27,2) (Ettore Scola) NAMED_ENTITY candidates=1

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([4, 49, 61]/27,2) (Luigi Comencini) NAMED_ENTITY candidates=1

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([35, 39, 47]/27,2) (Luigi Zampa) NAMED_ENTITY candidates=1

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([9, 22, 27]/27,2) (Pietro Germi) NAMED_ENTITY candidates=1

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([12, 19, 20]/27,2) (Vittorio De Sica) NAMED_ENTITY candidates=1

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([5, 30]/27,2) (Antonio Pietrangeli) NAMED_ENTITY candidates=1

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([41, 46]/27,2) (Pasquale Festa Campanile) NAMED_ENTITY candidates=1

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([7, 65]/27,2) (Nanni Loy) NAMED_ENTITY candidates=1

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([48, 53]/27,2) (Lina Wertmüller) NAMED_ENTITY candidates=1

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([13]/27,2) (Alberto Lattuada) NAMED_ENTITY candidates=1

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([18]/27,2) (Gian Luigi Polidoro) NAMED_ENTITY candidates=1

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([44]/27,2) (Nino Manfredi) NAMED_ENTITY candidates=0

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([31]/27,2) (Eduardo De Filippo, Marco Ferreri, Luciano Salce) NAMED_ENTITY candidates=0

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([24]/27,2) (Mauro Morassi, Dino Risi) NAMED_ENTITY candidates=0

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([25]/27,2) (Mauro Bolognini, Luigi Comencini, Dino Risi, Franco Rossi) NAMED_ENTITY candidates=0

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([15]/27,2) (Marco Ferreri) NAMED_ENTITY candidates=1

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([8]/27,2) (Luciano Salce) NAMED_ENTITY candidates=1

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([17]/27,2) (Mario Monicelli, Franco Rossi, Elio Petri, Luciano Salce) NAMED_ENTITY candidates=0

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([51]/27,2) (Franco Brusati) NAMED_ENTITY candidates=1

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([28]/27,2) (Dino Risi, Luigi Filippo D'Amico, Franco Rossi) NAMED_ENTITY candidates=0

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([62]/27,2) (Luigi Comencini, Nanni Loy, Mario Monicelli, Ettore Scola, Luigi Magni) UNKNOWN candidates=0

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([37]/27,2) (Luchino Visconti, Pier Paolo Pasolini, Vittorio De Sica, Franco Rossi, Mauro Bolognini) UNKNOWN candidates=0

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([10]/27,2) (Mario Monicelli, Federico Fellini, Luchino Visconti, Vittorio De Sica) NAMED_ENTITY candidates=0

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([21]/27,2) (Damiano Damiani) NAMED_ENTITY candidates=1

2019-07-25 09:39:30 INFO TCellDisambiguator:64 - >> (constrained disambiguation in LEARNING) , position at ([32]/27,2) (Alessandro Blasetti) NAMED_ENTITY candidates=1

2019-07-25 09:39:30 INFO LEARNINGPreliminaryDisamb:87 - >> constrained cell disambiguation complete 31/27 rows

2019-07-25 09:39:30 INFO LEARNINGPreliminaryDisamb:88 - >> reset candidate column class annotations

2019-07-25 09:39:30 INFO TMPInterpreter:82 - >> Column=3

2019-07-25 09:39:30 INFO LEARNINGPreliminaryColumnClassifier:70 - >> (LEANRING) Preliminary Column Classification begins

2019-07-25 09:39:30 INFO LEARNINGPreliminaryColumnClassifier:81 - >> cold start disambiguation, row(s) [0]/63,(Marcello Mastroianni, Vittorio Gassman, Totò) NAMED_ENTITY

uk.ac.shef.dcs.sti.STIException: uk.ac.shef.dcs.kbsearch.KBSearchException: HttpException: 500

at uk.ac.shef.dcs.sti.core.algorithm.tmp.TMPInterpreter.start(TMPInterpreter.java:105)
at uk.ac.shef.dcs.sti.experiment.STIBatch.process(STIBatch.java:326)
at uk.ac.shef.dcs.sti.ui.TableMinerPlusSingle.process(TableMinerPlusSingle.java:58)
at uk.ac.shef.dcs.sti.ui.TableMinerPlusSingle.main(TableMinerPlusSingle.java:147)
Caused by: uk.ac.shef.dcs.kbsearch.KBSearchException: HttpException: 500
at uk.ac.shef.dcs.kbsearch.sparql.DBpediaSearch.findEntityCandidates(DBpediaSearch.java:133)
at uk.ac.shef.dcs.sti.core.algorithm.tmp.LEARNINGPreliminaryColumnClassifier.runPreliminaryColumnClassifier(LEARNINGPreliminaryColumnClassifier.java:95)
at uk.ac.shef.dcs.sti.core.algorithm.tmp.LEARNING.learn(LEARNING.java:27)
at uk.ac.shef.dcs.sti.core.algorithm.tmp.TMPInterpreter.start(TMPInterpreter.java:83)
... 3 more
Caused by: HttpException: 500
at org.apache.jena.sparql.engine.http.HttpQuery.rewrap(HttpQuery.java:411)
at org.apache.jena.sparql.engine.http.HttpQuery.execGet(HttpQuery.java:355)
at org.apache.jena.sparql.engine.http.HttpQuery.exec(HttpQuery.java:292)
at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execResultSetInner(QueryEngineHTTP.java:359)
at org.apache.jena.sparql.engine.http.QueryEngineHTTP.execSelect(QueryEngineHTTP.java:351)
at uk.ac.shef.dcs.kbsearch.sparql.SPARQLSearch.queryByLabel(SPARQLSearch.java:143)
at uk.ac.shef.dcs.kbsearch.sparql.DBpediaSearch.findEntityCandidates(DBpediaSearch.java:105)
... 6 more

missed: 0_https://en.wikipedia.org/wiki/Commedia_all%27italiana

java.io.FileNotFoundException: resources/failed.txt (No such file or directory)
at java.io.FileOutputStream.open0(Native Method)
at java.io.FileOutputStream.open(FileOutputStream.java:270)
at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
at java.io.FileOutputStream.<init>(FileOutputStream.java:133)
at java.io.FileWriter.<init>(FileWriter.java:78)
at uk.ac.shef.dcs.sti.experiment.STIBatch.recordFailure(STIBatch.java:354)
at uk.ac.shef.dcs.sti.ui.TableMinerPlusSingle.process(TableMinerPlusSingle.java:80)
at uk.ac.shef.dcs.sti.ui.TableMinerPlusSingle.main(TableMinerPlusSingle.java:147)

With this other table, https://it.wikipedia.org/wiki/Gand%C3%ADa_Shore , the exception is instead org.apache.jena.query.QueryParseException, because a cell in the table contains a " character.

2019-07-22 13:51:51 INFO TableMinerPlusBatch:46 - Initializing entity cache...

2019-07-22 13:51:53 INFO TableMinerPlusBatch:50 - Initializing KBSearch...

2019-07-22 13:51:54 INFO TableMinerPlusBatch:67 - Initializing SUBJECT COLUMN DETECTION components ...

Mon Jul 22 13:51:54 UTC 2019 loading exception data for lemmatiser...

Mon Jul 22 13:51:54 UTC 2019 loading done

2019-07-22 13:51:55 INFO TableMinerPlusBatch:94 - Initializing LEARNING components ...

2019-07-22 13:51:55 INFO TableMinerPlusBatch:136 - Initializing UPDATE components ...

2019-07-22 13:51:55 INFO TableMinerPlusBatch:149 - Initializing RELATIONLEARNING components ...

2019-07-22 13:51:55 INFO TMPInterpreter:49 - > PHASE: Detecting subject column...

2019-07-22 13:51:56 INFO TMPInterpreter:65 - > PHASE: LEARNING ...

2019-07-22 13:51:56 INFO TMPInterpreter:82 - >> Column=0

2019-07-22 13:51:56 INFO LEARNINGPreliminaryColumnClassifier:70 - >> (LEANRING) Preliminary Column Classification begins

2019-07-22 13:51:56 INFO LEARNINGPreliminaryColumnClassifier:81 - >> cold start disambiguation, row(s) [0]/8,(José "Labrador" Sancho) NAMED_ENTITY

uk.ac.shef.dcs.sti.STIException: uk.ac.shef.dcs.kbsearch.KBSearchException: org.apache.jena.query.QueryParseException: Lexical error at line 2, column 44. Encountered: "\"" (34), after : "Labrador"
at uk.ac.shef.dcs.sti.core.algorithm.tmp.TMPInterpreter.start(TMPInterpreter.java:105)
at uk.ac.shef.dcs.sti.experiment.STIBatch.process(STIBatch.java:326)
at uk.ac.shef.dcs.sti.ui.TableMinerPlusSingle.process(TableMinerPlusSingle.java:58)

at uk.ac.shef.dcs.sti.ui.TableMinerPlusSingle.main(TableMinerPlusSingle.java:147)
Caused by: uk.ac.shef.dcs.kbsearch.KBSearchException: org.apache.jena.query.QueryParseException: Lexical error at line 2, column 44. Encountered: "\"" (34), after : "Labrador"
at uk.ac.shef.dcs.kbsearch.sparql.DBpediaSearch.findEntityCandidates(DBpediaSearch.java:133)
at uk.ac.shef.dcs.sti.core.algorithm.tmp.LEARNINGPreliminaryColumnClassifier.runPreliminaryColumnClassifier(LEARNINGPreliminaryColumnClassifier.java:95)
at uk.ac.shef.dcs.sti.core.algorithm.tmp.LEARNING.learn(LEARNING.java:27)
at uk.ac.shef.dcs.sti.core.algorithm.tmp.TMPInterpreter.start(TMPInterpreter.java:83)
... 3 more
Caused by: org.apache.jena.query.QueryParseException: Lexical error at line 2, column 44. Encountered: "\"" (34), after : "Labrador"
at org.apache.jena.sparql.lang.ParserSPARQL11.perform(ParserSPARQL11.java:110)
at org.apache.jena.sparql.lang.ParserSPARQL11.parse$(ParserSPARQL11.java:52)
at org.apache.jena.sparql.lang.SPARQLParser.parse(SPARQLParser.java:34)
at org.apache.jena.query.QueryFactory.parse(QueryFactory.java:147)
at org.apache.jena.query.QueryFactory.create(QueryFactory.java:79)
at org.apache.jena.query.QueryFactory.create(QueryFactory.java:52)
at org.apache.jena.query.QueryFactory.create(QueryFactory.java:40)
at uk.ac.shef.dcs.kbsearch.sparql.SPARQLSearch.queryByLabel(SPARQLSearch.java:139)
at uk.ac.shef.dcs.kbsearch.sparql.DBpediaSearch.findEntityCandidates(DBpediaSearch.java:105)
... 6 more
java.io.FileNotFoundException: resources/failed.txt (No such file or directory)
at java.io.FileOutputStream.open0(Native Method)
at java.io.FileOutputStream.open(FileOutputStream.java:270)
at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
at java.io.FileOutputStream.<init>(FileOutputStream.java:133)
at java.io.FileWriter.<init>(FileWriter.java:78)
at uk.ac.shef.dcs.sti.experiment.STIBatch.recordFailure(STIBatch.java:354)
at uk.ac.shef.dcs.sti.ui.TableMinerPlusSingle.process(TableMinerPlusSingle.java:80)
at uk.ac.shef.dcs.sti.ui.TableMinerPlusSingle.main(TableMinerPlusSingle.java:147)

missed: 0_ https://it.wikipedia.org/wiki/Gandía_Shore
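
The QueryParseException above suggests that the cell text is placed into the SPARQL query string without escaping, so the " in José "Labrador" Sancho ends the string literal early. A minimal sketch of escaping a label before embedding it is shown below. The class SparqlLiterals and the query shape are assumptions made for illustration; they do not reproduce the query that SPARQLSearch.queryByLabel actually builds.

```java
/** Hypothetical helper for illustration; not part of the sti code base. */
final class SparqlLiterals {

    private SparqlLiterals() {}

    /** Escapes characters that would break a double-quoted SPARQL string literal. */
    static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 8);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break; // backslash
                case '"':  sb.append("\\\""); break; // double quote
                case '\n': sb.append("\\n");  break; // line feed
                case '\r': sb.append("\\r");  break; // carriage return
                case '\t': sb.append("\\t");  break; // tab
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Builds an assumed, simplified label lookup query with the label safely quoted. */
    static String labelQuery(String label) {
        return "SELECT DISTINCT ?s WHERE { "
             + "?s <http://www.w3.org/2000/01/rdf-schema#label> \"" + escape(label) + "\" "
             + "} LIMIT 10";
    }

    public static void main(String[] args) {
        // \u00e9 is the letter e with an acute accent, as in the failing cell value.
        System.out.println(labelQuery("Jos\u00e9 \"Labrador\" Sancho"));
    }
}
```

Apache Jena also ships org.apache.jena.query.ParameterizedSparqlString, which may be a more robust way to inject literal values than manual string concatenation.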

I also manually created the file resources/failed.txt, but I obtained the same FileNotFoundException.
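
One possible explanation, offered only as an assumption, is that resources/failed.txt is resolved against the working directory of the Java process, which may differ from the directory where the file was created by hand. The sketch below prints the resolved absolute path and creates missing parent directories before appending. FailureLog is a hypothetical class and is not how STIBatch.recordFailure is implemented.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper for illustration; not part of the sti code base. */
final class FailureLog {

    private FailureLog() {}

    /**
     * Appends one entry to the log file, creating the file and any missing
     * parent directories first. Returns the absolute path that was written,
     * which shows where a relative path was actually resolved.
     */
    static Path append(Path logFile, String entry) throws IOException {
        Path absolute = logFile.toAbsolutePath();
        Path parent = absolute.getParent();
        if (parent != null) {
            Files.createDirectories(parent); // no-op if the directory already exists
        }
        Files.write(absolute,
                (entry + System.lineSeparator()).getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        return absolute;
    }

    public static void main(String[] args) throws IOException {
        Path written = append(Paths.get("resources", "failed.txt"), "missed: example");
        System.out.println("Wrote to " + written);
    }
}
```

Running the main method from the same directory the tool is launched from shows which resources folder the relative path points to.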

Could you help us to solve those issues?

Thank you!
