stanford-corenlp-python's People

Contributors

abhaga, dasmith, emilmont, jcccf


stanford-corenlp-python's Issues

Error while launching the server, i.e. running the command python corenlp.py

This is the error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "corenlp.py", line 176, in __init__
    self.corenlp.expect("done.", timeout=200) # Loading PCFG (~3sec)
  File "/Users/mihir.saxena/virtualenvironment/my_new_project/lib/python2.7/site-packages/pexpect/spawnbase.py", line 327, in expect
    timeout, searchwindowsize, async_)
  File "/Users/mihir.saxena/virtualenvironment/my_new_project/lib/python2.7/site-packages/pexpect/spawnbase.py", line 355, in expect_list
    return exp.expect_loop(timeout)
  File "/Users/mihir.saxena/virtualenvironment/my_new_project/lib/python2.7/site-packages/pexpect/expect.py", line 102, in expect_loop
    return self.eof(e)
  File "/Users/mihir.saxena/virtualenvironment/my_new_project/lib/python2.7/site-packages/pexpect/expect.py", line 49, in eof
    raise EOF(msg)
pexpect.exceptions.EOF: End Of File (EOF). Empty string style platform.
<pexpect.pty_spawn.spawn object at 0x10ca092d0>
command: /usr/bin/java
args: ['/usr/bin/java', '-Xmx1800m', '-cp', './stanford-corenlp-full-2014-08-27/stanford-corenlp-3.4.1.jar:./stanford-corenlp-full-2014-08-27/stanford-corenlp-3.4.1-models.jar:./stanford-corenlp-full-2014-08-27/joda-time.jar:./stanford-corenlp-full-2014-08-27/xom.jar:./stanford-corenlp-full-2014-08-27/jollyday.jar', 'edu.stanford.nlp.pipeline.StanfordCoreNLP', '-props', 'default.properties']
buffer (last 100 chars): ''
before (last 100 chars): 'aders.java:185)\r\n\tat java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:496)\r\n\t... 34 more\r\n'
after: <class 'pexpect.exceptions.EOF'>
match: None
match_index: None
exitstatus: None
flag_eof: True
pid: 46580
child_fd: 6
closed: False
timeout: 30
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1
searcher: searcher_re:
0: re.compile("done.")

I have verified that all the jar files are of the version specified in the corenlp.py code. Earlier I had used a newer version and updated corenlp.py accordingly; in either case I get the same error. I am not able to figure it out. Kindly look into this and please suggest a solution.
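When the launched Java process dies during startup (a missing or renamed jar is the usual cause), pexpect reports only an opaque EOF like the one above. A minimal pre-flight check along these lines can surface the real cause first; the directory and jar names below are the 3.4.1 defaults from the traceback and should be adjusted for your install:

```python
import os

# Verify every jar that corenlp.py will put on the Java classpath actually
# exists before spawning, since a missing jar kills the process immediately
# and pexpect only reports EOF.
CORENLP_DIR = "./stanford-corenlp-full-2014-08-27"  # adjust for your install
JARS = [
    "stanford-corenlp-3.4.1.jar",
    "stanford-corenlp-3.4.1-models.jar",
    "joda-time.jar",
    "xom.jar",
    "jollyday.jar",
]

def missing_jars(directory, jars):
    """Return the jar names that are not present in the directory."""
    return [j for j in jars if not os.path.exists(os.path.join(directory, j))]
```

Running `missing_jars(CORENLP_DIR, JARS)` before constructing `StanfordCoreNLP()` turns the opaque EOF into an actionable list of missing files.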

Very long texts

I am trying to parse a text that is 1297 characters long, but it returns an empty sentence. If I use a different timeout value in client.py, say 200.0, the code raises a jsonrpc.RPCTransportError: timed out exception after that time passes.

Could you tell me what I am supposed to modify in the code to make client.py work with longer texts?

Thanks,
michele.
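Besides raising the hard-coded timeouts in jsonrpc.py (which a later issue on this page reports doing successfully), one client-side workaround is to never send the whole paragraph in a single RPC call. This is a sketch under that assumption; `parse_fn` stands in for `nlp.parse` / `server.parse`:

```python
import re

# Split long text into smaller chunks on sentence-ish boundaries and parse
# each chunk separately, so no single RPC call can hit the transport timeout.
def parse_in_chunks(text, parse_fn, max_chars=500):
    """Split `text` into chunks of at most max_chars characters, parse each
    chunk with parse_fn, and return the list of results."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    chunks, current = [], ""
    for s in sentences:
        if current and len(current) + len(s) + 1 > max_chars:
            chunks.append(current)
            current = s
        else:
            current = (current + " " + s).strip()
    if current:
        chunks.append(current)
    return [parse_fn(c) for c in chunks]
```

The results then need to be merged by the caller (e.g. concatenating the `sentences` lists of the decoded JSON responses).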

How can I use the -nthreads argument?

I read on the corenlp page that multithreading is supported for the parser by use of the -nthreads k argument. How can I implement this with the python wrapper?
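Since corenlp.py launches the pipeline with `-props default.properties` rather than exposing extra command-line flags, one hedged option (assuming your CoreNLP version honors the property form of the flag) is to set the thread counts in that file:

```
# default.properties (sketch): ask CoreNLP to use 4 threads.
# CoreNLP reads a global nthreads property and per-annotator
# variants such as parse.nthreads.
annotators = tokenize, ssplit, pos, lemma, ner, parse, dcoref
nthreads = 4
parse.nthreads = 4
```

Note that the wrapper itself still talks to a single interactive shell, so parallelism applies inside CoreNLP's annotators, not across RPC requests.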

jsonrpc.py randomly fails

I am processing large paragraphs using this Python interface. If it matters, I have set the encoding to UTF-8 because of some characters in the data, and the paragraphs/sentences are fairly large. When I try to execute a script and make a request to the running CoreNLP server, it fails randomly by throwing the error:

jsonrpc.RPCParseError: <RPCFault -32700: 'Parse error.' (u'No valid JSON. (Unterminated string starting at: line 1 column 50 (char 49))')>

And I use the word "randomly" because if and when it fails and I simply try 3-4 more times, it starts working perfectly. This is a problem when I iteratively make calls to the server, as it can throw an error at any point in the loop and fail.

Does it have anything to do with the fact that

a) The paragraph/sentence size is fairly large (usually 200-400 words).
OR
b) I am using UTF8 encoding.

Or is it something completely else?

Note: I am using Python 2.7.12 (if that matters)
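Given that the failure is transient and, as described above, repeating the same request a few times succeeds, a small retry wrapper is one pragmatic workaround. This is a sketch; `parse_fn` stands in for the wrapper's `nlp.parse`, and `retryable` should be set to the exception type you actually see (e.g. `jsonrpc.RPCParseError`):

```python
import time

# Retry a flaky RPC call a few times with a short pause between attempts,
# re-raising the last error only after all retries are exhausted.
def parse_with_retry(parse_fn, text, retries=4, delay=1.0,
                     retryable=(Exception,)):
    last_err = None
    for attempt in range(retries):
        try:
            return parse_fn(text)
        except retryable as err:
            last_err = err
            time.sleep(delay)
    raise last_err
```

This papers over the symptom rather than fixing the underlying truncated-response bug, but it makes long iterative loops survivable.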

~400ms latency problems

I noticed a parse through the json-rpc takes 400ms longer than using the java interactive shell.

What's the best way to cut this down? Is it a python issue?

Happy to work on this for a pull request.

Python 3 support

I could not find this documented, but as far as I can see, this module works only with Python 2. Is there any chance of using it with Python 3, or has anyone already forked such a version?

Sentiment Analysis Confidence Scores

Hello,

For sentiment analysis I'm able to obtain the score that corresponds to the class with the highest estimated probability, but I'm unable to produce the estimations themselves (e.g. [very_negative = 0.60, negative = 0.25, neutral = 0.10, positive = 0.025, very_positive = 0.025]). I'd like to filter probabilities below a certain confidence threshold.

Thank you.

weird UnicodeDecodeError in StanfordCoreNLP.parse()

Hi Dustin,

I just found a really weird error. While corenlp can parse '100 dollars' just fine, '100 yen' causes it to crash.

Python 2.7.3 (default, Feb 27 2014, 19:37:34) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import corenlp
>>> c = corenlp.StanfordCoreNLP()
Loading Models: 5/5                                                                                                                                                                                                                         
>>> c.parse('100 dollars')
'{"sentences": [{"parsetree": "(ROOT (X (NP (CD 100) (NNS dollars))))", "text": "100 dollars", "dependencies": [["root", "ROOT", "dollars"], ["num", "dollars", "100"]], "words": [["100", {"NormalizedNamedEntityTag": "$100.0", "Lemma": "100", "CharacterOffsetEnd": "3", "PartOfSpeech": "CD", "CharacterOffsetBegin": "0", "NamedEntityTag": "MONEY"}], ["dollars", {"NormalizedNamedEntityTag": "$100.0", "Lemma": "dollar", "CharacterOffsetEnd": "11", "PartOfSpeech": "NNS", "CharacterOffsetBegin": "4", "NamedEntityTag": "MONEY"}]]}]}'

>>> c.parse('100 yen')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/corenlp-3.4.1-py2.7.egg/corenlp.py", line 240, in parse
    response = self._parse(text)
  File "/usr/local/lib/python2.7/dist-packages/corenlp-3.4.1-py2.7.egg/corenlp.py", line 230, in _parse
    raise e
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 169: ordinal not in range(128)

Any ideas?

Windows run of "python corenlp.py" Error

Use Windows 7 machine,
Python 2.7.11
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)

Traceback (most recent call last):
  File "corenlp.py", line 257, in <module>
    nlp = StanfordCoreNLP()
  File "corenlp.py", line 163, in __init__
    self.corenlp = pexpect.spawn(start_corenlp)
AttributeError: 'module' object has no attribute 'spawn'

Corenlp.py does not load any modules

Traceback (most recent call last):
  File "D:\fahma\corefernce resolution\stanford-corenlp-python-master\corenlp.py", line 281, in <module>
    nlp = StanfordCoreNLP()
  File "D:\fahma\corefernce resolution\stanford-corenlp-python-master\corenlp.py", line 173, in __init__
    self.corenlp.expect("done.", timeout=20) # Load pos tagger model (~5sec)
  File "C:\Python27\lib\site-packages\pexpect\spawnbase.py", line 341, in expect
    timeout, searchwindowsize, async_)
  File "C:\Python27\lib\site-packages\pexpect\spawnbase.py", line 369, in expect_list
    return exp.expect_loop(timeout)
  File "C:\Python27\lib\site-packages\pexpect\expect.py", line 117, in expect_loop
    return self.eof(e)
  File "C:\Python27\lib\site-packages\pexpect\expect.py", line 63, in eof
    raise EOF(msg)
EOF: End Of File (EOF).
<pexpect.popen_spawn.PopenSpawn object at 0x021863B0>
searcher: searcher_re:
    0: re.compile('done.')

Import stanford-corenlp-python as a module

When I try importing the corenlp class from a Python script (exampleRun.py) that is not in the stanford-corenlp-pyhton directory, like this:

from corenlp import *
corenlp = StanfordCoreNLP("path_to_stanford-corenlp-full-2014-08-27/")

the following error is raised from pexpect:

Loading Models: 0/5
Traceback (most recent call last):
  File "/home/matteorr/Project1/exampleRun.py", line 4, in <module>
    corenlp = StanfordCoreNLP("/home/matteorr/stanford-corenlp-pyhton/stanford-corenlp-full-2014-08-27/")
  File "/home/matteorr/stanford-corenlp-pyhton/corenlp.py", line 168, in __init__
    self.corenlp.expect("done.", timeout=20) # Load pos tagger model (~5sec)
  File "/usr/lib/python2.7/dist-packages/pexpect.py", line 1311, in expect
    return self.expect_list(compiled_pattern_list, timeout, searchwindowsize)
  File "/usr/lib/python2.7/dist-packages/pexpect.py", line 1325, in expect_list
    return self.expect_loop(searcher_re(pattern_list), timeout, searchwindowsize)
  File "/usr/lib/python2.7/dist-packages/pexpect.py", line 1396, in expect_loop
    raise EOF (str(e) + '\n' + str(self))
pexpect.EOF: End Of File (EOF) in read_nonblocking(). Exception style platform.
<pexpect.spawn object at 0x7f7106fb3650>
version: 2.3 ($Revision: 399 $)
command: /usr/bin/java
args: ['/usr/bin/java', '-Xmx1800m', '-cp', '/home/matteorr/stanford-corenlp-pyhton/stanford-corenlp-full-2014-08-27/stanford-corenlp-3.4.1.jar:/home/matteorr/stanford-corenlp-pyhton/stanford-corenlp-full-2014-08-27/stanford-corenlp-3.4.1-models.jar:/home/matteorr/stanford-corenlp-pyhton/stanford-corenlp-full-2014-08-27/joda-time.jar:/home/matteorr/stanford-corenlp-pyhton/stanford-corenlp-full-2014-08-27/xom.jar:/home/matteorr/stanford-corenlp-pyhton/stanford-corenlp-full-2014-08-27/jollyday.jar', 'edu.stanford.nlp.pipeline.StanfordCoreNLP', '-props', 'default.properties']
searcher: searcher_re:
0: re.compile("done.")
buffer (last 100 chars):
before (last 100 chars): va:448)
at edu.stanford.nlp.util.StringUtils.argsToProperties(StringUtils.java:869)
... 2 more

after: <class 'pexpect.EOF'>
match: None
match_index: None
exitstatus: None
flag_eof: True
pid: 28392
child_fd: 3
closed: False
timeout: 30
delimiter: <class 'pexpect.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1

The same script, run in the same directory as corenlp.py, works fine.
Is this expected behavior, or is something wrong?

Thanks in advance for your help.
I apologize if this was not the correct place to post this issue.

Best regards,

matteorr

Installation error due to hard coding in corenlp.py

In the StanfordCoreNLP class in corenlp.py, the jar versions are hard-coded, so jars of any newer version are not accepted, which produces an error while launching the server.

The lookup needs to be done differently.

Getting sentiment value via server implementation

Hi, I am interested in using the server implementation of your wrapper, but it doesn't seem to output the sentiment score, whereas the package implementation has a field for it. What is the cause of this difference?

RPCInternalError: <RPCFault -32603: 'Internal error.' (None)>

/Users/danielsampetethiyagu/github/image_caption_using_attention/coreNlpUtil.pyc in parseText(sentences)
22 def parseText(sentences):
23
---> 24 parseResult = nlp.parse(sentences)
25
26 if len(parseResult['sentences']) == 1:

/Users/danielsampetethiyagu/github/image_caption_using_attention/coreNlpUtil.pyc in parse(self, text)
16
17 def parse(self, text):
---> 18 return json.loads(self.server.parse(text))
19
20

/Users/danielsampetethiyagu/github/image_caption_using_attention/jsonrpc.py in call(self, *args, **kwargs)
932 return _method(self.__req, "%s.%s" % (self.__name, name))
933 def call(self, *args, **kwargs):
--> 934 return self.__req(self.__name, args, kwargs)
935
936 #=========================================

/Users/danielsampetethiyagu/github/image_caption_using_attention/jsonrpc.py in __req(self, methodname, args, kwargs, id)
905 except Exception,err:
906 raise RPCTransportError(err)
--> 907 resp = self.__data_serializer.loads_response( resp_str )
908 return resp[0]
909

/Users/danielsampetethiyagu/github/image_caption_using_attention/jsonrpc.py in loads_response(self, string)
624 raise RPCInvalidMethodParams(error_data)
625 elif data["error"]["code"] == INTERNAL_ERROR:
--> 626 raise RPCInternalError(error_data)
627 elif data["error"]["code"] == PROCEDURE_EXCEPTION:
628 raise RPCProcedureException(error_data)

RPCInternalError: <RPCFault -32603: 'Internal error.' (None)>

jsonrpc import error: ValueError, err :

Hello. I've been trying to use corenlp as a wrapper for Stanford NLP for coreference resolution, but I'm having issues with the corenlp.py file. There was one error in the downloaded file:

except Exception, err:

needs to be written as

except Exception as err:

But when I correct this, the jsonrpc import doesn't work, as a method within the import throws this error:

Traceback (most recent call last):
  File "corenlp.py", line 24, in <module>
    import jsonrpc, pexpect
  File "D:\NLP\NaturalLanguageProcessing\stanford-corenlp-python\jsonrpc.py", line 376
    except ValueError, err:
                     ^
SyntaxError: invalid syntax

Any help would be much appreciated; thanks in advance. It would also be a great help if you could suggest any known APIs for coreference resolution, or another wrapper for Stanford NLP that supports coreference resolution.

hardcoded lib and jar versions

I noticed some hardcoded lib and jar versions within the Python source code itself. Are these libraries only compatible with certain versions of CoreNLP, or are we expected to search through the code and change every reference to specific filenames and jars whenever we update our local CoreNLP?

Could you please explain what the result of coreference resolution means?

I tried the tool and got a result like:
Barack Obama was born in Hawaii. He is the president. Obama was elected in 2008.
"coref": [[[["He", 1, 0, 0, 1], ["Barack Obama", 0, 1, 0, 2]], [["the president", 1, 3, 2, 4], ["Barack Obama", 0, 1, 0, 2]], [["Obama", 2, 0, 0, 1], ["Barack Obama", 0, 1, 0, 2]]]]
So could you please explain what it means, especially what the indices in the lists mean?
Thank you very much!
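For what it's worth, the tuples can be read as follows. This is an interpretation inferred from the example output above, not official documentation: each entry pairs a mention with its antecedent, and each mention appears to be [text, sentence_index, head_token_index, start_token, end_token], with 0-based indices and an exclusive end:

```python
# Decode the coref output shown above under the assumed layout
# [text, sentence_index, head_token_index, start_token, end_token].
coref = [[[["He", 1, 0, 0, 1], ["Barack Obama", 0, 1, 0, 2]],
          [["the president", 1, 3, 2, 4], ["Barack Obama", 0, 1, 0, 2]],
          [["Obama", 2, 0, 0, 1], ["Barack Obama", 0, 1, 0, 2]]]]

def describe(coref):
    """Render each (mention, antecedent) pair as a readable line."""
    lines = []
    for chain in coref:
        for mention, antecedent in chain:
            m_text, m_sent, m_head, m_start, m_end = mention
            lines.append("%r (sentence %d, tokens %d-%d) -> %r"
                         % (m_text, m_sent, m_start, m_end, antecedent[0]))
    return lines
```

Under that reading, "He" is token 0 of sentence 1 and corefers with "Barack Obama", tokens 0-2 of sentence 0.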

Python3.5.3 issues

Python 3 doesn't handle:
except ValueError, err:
^
SyntaxError: invalid syntax

It needs the "as" form. There are further issues with the print statements.

I can push version for py3, if you'd like. Just let me know.

-EV
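The two blockers mentioned above, side by side, for anyone porting jsonrpc.py and corenlp.py by hand:

```python
# Python 2-only form (a SyntaxError under Python 3):
#     except ValueError, err:
#         print "bad value", err
# Python 2.6+/3-compatible form, needed throughout both files:
try:
    int("not a number")
except ValueError as err:
    message = "bad value: %s" % err
print(message)
```

`2to3` or `python-modernize` automates exactly these two rewrites.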

pexpect.exceptions.EOF: End Of File (EOF). Exception style platform.

python corenlp/corenlp.py -H ip -p 3456
Traceback (most recent call last):
  File "corenlp/corenlp.py", line 592, in <module>
    main()
  File "corenlp/corenlp.py", line 580, in main
    nlp = StanfordCoreNLP(options.corenlp, properties=options.properties, serving=True)
  File "corenlp/corenlp.py", line 435, in __init__
    self._spawn_corenlp()
  File "corenlp/corenlp.py", line 424, in _spawn_corenlp
    self.corenlp.expect("\nNLP> ")
  File "/usr/local/lib/python2.7/dist-packages/pexpect/spawnbase.py", line 315, in expect
    timeout, searchwindowsize, async)
  File "/usr/local/lib/python2.7/dist-packages/pexpect/spawnbase.py", line 339, in expect_list
    return exp.expect_loop(timeout)
  File "/usr/local/lib/python2.7/dist-packages/pexpect/expect.py", line 102, in expect_loop
    return self.eof(e)
  File "/usr/local/lib/python2.7/dist-packages/pexpect/expect.py", line 49, in eof
    raise EOF(msg)
pexpect.exceptions.EOF: End Of File (EOF). Exception style platform.
<pexpect.pty_spawn.spawn object at 0x7ff999081510>
command: /usr/bin/java
args: ['/usr/bin/java', '-Xmx3g', '-cp', 'stanford-corenlp-full-2014-08-27/stanford-corenlp-3.4.1.jar:stanford-corenlp-full-2014-08-27/stanford-corenlp-3.4.1-models.jar:stanford-corenlp-full-2014-08-27/xom.jar:stanford-corenlp-full-2014-08-27/joda-time.jar:stanford-corenlp-full-2014-08-27/jollyday.jar:stanford-corenlp-full-2014-08-27/ejml-0.23.jar', 'edu.stanford.nlp.pipeline.StanfordCoreNLP', '-props', '/root/corenlp-python/corenlp/default.properties']
searcher: None
buffer (last 100 chars): ''
before (last 100 chars): ' ner\r\nLoading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... '
after: <class 'pexpect.exceptions.EOF'>
match: None
match_index: None
exitstatus: None
flag_eof: True
pid: 5804
child_fd: 6
closed: False
timeout: 60
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 8192
ignorecase: False
searchwindowsize: 80
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1

corenlp.py fails for 3.9.0

I'm aware that the repo mentions the code for stanford-corenlp-3.4.1, but I had 3.9.0, and changed the path and models in corenlp.py accordingly.

Then it gets stuck on Loading models 4/5, and then throws a timeout error. Please look into this.

Certain characters lead to Internal Error

I am trying to parse the sentence

WASHINGTON — Republicans on Thursday vowed a swift and forceful response to the executive action on immigration that President Obama is to announce in a prime-time address, accusing the president of exceeding the power of his office and promising a legislative fight when they take full control of Congress next year.

but I keep getting the error

Traceback (most recent call last):
  File "client.py", line 19, in <module>
    result = nlp.parse(text2)
  File "client.py", line 12, in parse
    return json.loads(self.server.parse(text))
  File "/Users/Pi_Joules/projects/kompact/stanford-corenlp-python/jsonrpc.py", line 934, in __call__
    return self.__req(self.__name, args, kwargs)
  File "/Users/Pi_Joules/projects/kompact/stanford-corenlp-python/jsonrpc.py", line 907, in __req
    resp = self.__data_serializer.loads_response( resp_str )
  File "/Users/Pi_Joules/projects/kompact/stanford-corenlp-python/jsonrpc.py", line 626, in loads_response
    raise RPCInternalError(error_data)
jsonrpc.RPCInternalError: <RPCFault -32603: 'Internal error.' (None)>

The error doesn't appear, though, when I remove the em dash (—) from the first sentence. The same goes for curly single and double quotes like “”. Is there any way I can still parse these characters in this wrapper?

Thanks

Attribute error in client.py

Hi, I have nltk version 3.0.3 and I am getting this error:

tree = Tree.parse(result['sentences'][0]['parsetree'])
AttributeError: type object 'Tree' has no attribute 'parse'
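This is almost certainly the nltk 3.x rename: `Tree.parse` became `Tree.fromstring`, which is why nltk 3.0.3 raises the AttributeError above. A small compatibility shim (guarded so it also loads when nltk is not installed):

```python
# Pick whichever tree-parsing entry point this nltk version provides:
# Tree.fromstring (nltk >= 3.0) or Tree.parse (nltk 2.x).
def get_tree_parser():
    try:
        from nltk import Tree
    except ImportError:
        return None  # nltk not installed
    return getattr(Tree, "fromstring", None) or getattr(Tree, "parse", None)

parser = get_tree_parser()
if parser is not None:
    tree = parser("(ROOT (NP (DT This)))")
```

In client.py this amounts to replacing `Tree.parse(...)` with `Tree.fromstring(...)`.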

How do I change models for NER?

How do I set the model:

ner.model.3class = /u/nlp/data/ner/goodClassifiers/all.3class.distsim.crf.ser.gz
ner.model.7class = /u/nlp/data/ner/goodClassifiers/muc.distsim.crf.ser.gz
ner.model.MISCclass = /u/nlp/data/ner/goodClassifiers/conll.distsim.crf.ser.gz
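Note the `ner.model.3class`-style names above come from the standalone Stanford NER tool; in CoreNLP's properties, the `ner` annotator reads a comma-separated `ner.model` property instead. Since this wrapper launches CoreNLP with `-props default.properties`, one hedged option (property name per CoreNLP's documentation; paths are the examples from the question) is:

```
# default.properties (sketch): choose which NER classifiers the ner
# annotator loads, in order.
annotators = tokenize, ssplit, pos, lemma, ner
ner.model = /u/nlp/data/ner/goodClassifiers/all.3class.distsim.crf.ser.gz,/u/nlp/data/ner/goodClassifiers/muc.distsim.crf.ser.gz
```

Whether your CoreNLP version accepts this exact property should be checked against its own documentation.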

parse returning as a string rather than a dictionary.

I'm trying to follow the instructions:

from corenlp import *
corenlp = StanfordCoreNLP()
corenlp.parse("This is a test.")

When I do this it returns something like this:
'{"coref": [[[["This", 0, 0, 0, 1], ["a test", 0, 3, 2, 4]]]], "sentences": [{"parsetree": "(ROOT (S (NP (DT This)) (VP (VBZ is) (NP (DT a) (NN test))) (. .)))", "text": "This is a test.", "dependencies": [["root", "ROOT", "test"], ["nsubj", "test", "This"], ["cop", "test", "is"], ["det", "test", "a"]], "words": [["This", {"NamedEntityTag": "O", "CharacterOffsetEnd": "4", "Lemma": "this", "PartOfSpeech": "DT", "CharacterOffsetBegin": "0"}], ["is", {"NamedEntityTag": "O", "CharacterOffsetEnd": "7", "Lemma": "be", "PartOfSpeech": "VBZ", "CharacterOffsetBegin": "5"}], ["a", {"NamedEntityTag": "O", "CharacterOffsetEnd": "9", "Lemma": "a", "PartOfSpeech": "DT", "CharacterOffsetBegin": "8"}], ["test", {"NamedEntityTag": "O", "CharacterOffsetEnd": "14", "Lemma": "test", "PartOfSpeech": "NN", "CharacterOffsetBegin": "10"}], [".", {"NamedEntityTag": "O", "CharacterOffsetEnd": "15", "Lemma": ".", "PartOfSpeech": ".", "CharacterOffsetBegin": "14"}]]}]}'

Where it is a dictionary wrapped in quotes making it a string. I'm not sure what I'm doing wrong...
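This is expected behavior rather than a bug: the wrapper returns the server's JSON response as a raw string, and it is up to the caller to decode it:

```python
import json

# The parse() result is a JSON *string*; decode it to get a dictionary.
# `raw` here is a shortened stand-in for the real corenlp.parse(...) output.
raw = '{"sentences": [{"text": "This is a test.", "parsetree": "(ROOT ...)"}]}'
result = json.loads(raw)   # in practice: json.loads(corenlp.parse("This is a test."))
assert result["sentences"][0]["text"] == "This is a test."
```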

About connection refused

When I run

result = loads(server.parse("Hello world. It is so beautiful"))

I get a connection error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "jsonrpc.py", line 934, in __call__
    return self.__req(self.__name, args, kwargs)
  File "jsonrpc.py", line 906, in __req
    raise RPCTransportError(err)
jsonrpc.RPCTransportError: [Errno 111] Connection refused


Corenlp.py does not go further after loading all 5 modules

Traceback (most recent call last):
  File "corenlp.py", line 257, in <module>
    nlp = StanfordCoreNLP()
  File "corenlp.py", line 178, in __init__
    self.corenlp.expect("Entering interactive shell.")
  File "/home/whiskey/.local/lib/python2.7/site-packages/pexpect/spawnbase.py", line 341, in expect
    timeout, searchwindowsize, async_)
  File "/home/whiskey/.local/lib/python2.7/site-packages/pexpect/spawnbase.py", line 369, in expect_list
    return exp.expect_loop(timeout)
  File "/home/whiskey/.local/lib/python2.7/site-packages/pexpect/expect.py", line 116, in expect_loop
    return self.timeout(e)
  File "/home/whiskey/.local/lib/python2.7/site-packages/pexpect/expect.py", line 80, in timeout
    raise TIMEOUT(msg)
pexpect.exceptions.TIMEOUT: Timeout exceeded.
<pexpect.pty_spawn.spawn object at 0x7f1cbb072050>
command: /usr/bin/java
args: ['/usr/bin/java', '-Xmx1800m', '-cp', './stanford-corenlp-full-2018-02-27/stanford-corenlp-3.9.1.jar:./stanford-corenlp-full-2018-02-27/stanford-corenlp-3.9.1-models.jar:./stanford-corenlp-full-2018-02-27/joda-time.jar:./stanford-corenlp-full-2018-02-27/xom.jar:./stanford-corenlp-full-2018-02-27/jollyday.jar', 'edu.stanford.nlp.pipeline.StanfordCoreNLP', '-props', 'default.properties']
buffer (last 100 chars): '[0.7 sec].\r\nAdding annotator dcoref\r\n'
before (last 100 chars): '[0.7 sec].\r\nAdding annotator dcoref\r\n'
after: <class 'pexpect.exceptions.TIMEOUT'>
match: None
match_index: None
exitstatus: None
flag_eof: False
pid: 7185
child_fd: 5
closed: False
timeout: 30
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1
searcher: searcher_re:
    0: re.compile("Entering interactive shell.")

Support for sentiment analysis

Hi, I was planning to use the python wrapper but I am not sure if it has support for sentiment analysis like the original sanfordCoreNLP. If yes, please share some documentation.

Instantiate StanfordCoreNLP with different annotators

I'm using the StanfordCoreNLP class to do NER on some text. Then somewhere else in my program I only need to do POS tagging, but performance is uselessly slowed down by NER. I see that I can edit the default.properties file to remove the annotators I don't need, but that would change every instance of StanfordCoreNLP, which won't work.

Right now I'm thinking of modifying StanfordCoreNLP's init to allow a custom string for props to be passed, and create several files that contain the annotator lists I need. This might work for now, but I'd like to know if you see a better way, and if you'd be interested in allowing StanfordCoreNLP instances to be created with an optional annotator list.
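The approach described above can be sketched as follows. Note that some revisions of corenlp.py already accept a `properties=` argument in `__init__` (a traceback elsewhere on this page shows `StanfordCoreNLP(options.corenlp, properties=options.properties, serving=True)`), so generating one properties file per annotator set may be all that is needed:

```python
import os
import tempfile

# Write a minimal CoreNLP properties file listing only the annotators one
# instance needs, and return its path.
def write_props(annotators, directory=None):
    fd, path = tempfile.mkstemp(suffix=".properties", dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write("annotators = %s\n" % ", ".join(annotators))
    return path

pos_only = write_props(["tokenize", "ssplit", "pos"])
# Hypothetical usage, assuming a corenlp.py revision with a properties kwarg:
#   tagger = StanfordCoreNLP(corenlp_dir, properties=pos_only)
#   ner    = StanfordCoreNLP(corenlp_dir, properties=write_props(
#                ["tokenize", "ssplit", "pos", "lemma", "ner"]))
```

Each instance then pays only for the annotators it actually uses, without touching the shared default.properties.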

Dependency Problem

The IDs you stripped from the dependencies in remove_id() should stay there. If two identical words occur in the same sentence, and you strip the word-id from the results, there's no way for us to easily disambiguate them (hence, why Stanford explicitly put them there)

I have an error and, if it's something you're aware of, wondered if you could help me with a fix?

python corenlp.py
Traceback (most recent call last):
  File "corenlp.py", line 257, in <module>
    nlp = StanfordCoreNLP()
  File "corenlp.py", line 163, in __init__
    self.corenlp = pexpect.spawn(start_corenlp)
  File "/usr/local/lib/python2.7/dist-packages/pexpect/pty_spawn.py", line 198, in __init__
    self._spawn(command, args, preexec_fn, dimensions)
  File "/usr/local/lib/python2.7/dist-packages/pexpect/pty_spawn.py", line 271, in _spawn
    'executable: %s.' % self.command)
pexpect.exceptions.ExceptionPexpect: The command was not found or was not executable: java.

How can corenlp handle non-ASCII strings?

I passed the word 'Víctor', which contains a non-ASCII character, to corenlp.parse because I would like to get its lemma. But corenlp.parse('Víctor') gives the error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)

How can I change the corenlp settings so that corenlp can handle non-ASCII strings?
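One hedged workaround, given that the Python 2 code paths mix byte strings and unicode and fall back to the ascii codec: hand the wrapper UTF-8 bytes rather than a unicode object. A sketch:

```python
# -*- coding: utf-8 -*-

# Normalize input to UTF-8 bytes before passing it to the wrapper, so the
# Python 2 internals never attempt an implicit ascii decode/encode.
def to_utf8(text):
    """Return `text` as UTF-8 bytes for either bytes or unicode input."""
    if isinstance(text, bytes):
        return text
    return text.encode("utf-8")

# Hypothetical usage: corenlp.parse(to_utf8(u'Víctor'))
```

If the failure instead happens while decoding the server's response, the same idea applies in reverse: decode the raw response with `.decode("utf-8")` before handing it to the JSON parser.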

No Valid JSON error

I am a newbie to these tools (and to JSON in particular). I am getting a parse error with error code -32700. Please help me fix it.
I have attached a screenshot of the error.

Multiple occurrences of a word not handled properly while creating tuples

If there are multiple occurrences of a word in a sentence, lack of ids makes it impossible to identify the source and target of a dependency correctly.

If you are open to accepting a patch for this, I can submit one. My idea is to keep the ids in the "tuples" and store the dependents of a word in the "words" array.
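The ambiguity described above is easy to demonstrate. CoreNLP emits tokens like "dollars-2" (word plus 1-based token index), and corenlp.py's remove_id() strips the suffix; the function below is a hypothetical stand-in mirroring that behavior, not the exact source:

```python
# Stand-in for corenlp.py's remove_id(): strip the trailing "-N" token index.
def remove_id(word):
    return word[:word.rindex("-")] if "-" in word else word

# With the ids, the two dependents are distinct tokens; without them, the
# two "dog" occurrences collapse and can no longer be told apart.
deps_with_ids = [("nsubj", "likes-2", "dog-4"), ("dobj", "likes-2", "dog-7")]
deps_stripped = [(rel, remove_id(g), remove_id(d))
                 for rel, g, d in deps_with_ids]
# deps_stripped now contains two tuples that both read ("...", "likes", "dog").
```

Keeping the ids in the "tuples" (as the proposed patch suggests) preserves exactly the information this example loses.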

RPCTransportError: argument must be an int, or have a fileno() method.

Hi guys,

I am getting this error when I try to parse multiple sentences in parallel. Everything works fine if I parse sequentially.

parseResult = nlp.parse(sentences)
  File "/Users/Vikram/Kiwi/django/app/app/app/coreNlpUtil.py", line 18, in parse
    return json.loads(self.server.parse(text))
  File "/Users/Vikram/Kiwi/django/app/app/app/jsonrpc.py", line 933, in __call__
    return self.__req(self.__name, args, kwargs)
  File "/Users/Vikram/Kiwi/django/app/app/app/jsonrpc.py", line 906, in __req
    resp = self.__data_serializer.loads_response( resp_str )
  File "/Users/Vikram/Kiwi/django/app/app/app/jsonrpc.py", line 594, in loads_response
    raise RPCParseError("No valid JSON. (%s)" % str(err))
RPCParseError: <RPCFault -32700: 'Parse error.' ('No valid JSON. (No JSON object could be decoded)')>

parseResult = nlp.parse(sentences)
  File "/Users/Vikram/Kiwi/django/app/app/app/coreNlpUtil.py", line 18, in parse
    return json.loads(self.server.parse(text))
  File "/Users/Vikram/Kiwi/django/app/app/app/jsonrpc.py", line 933, in __call__
    return self.__req(self.__name, args, kwargs)
  File "/Users/Vikram/Kiwi/django/app/app/app/jsonrpc.py", line 905, in __req
    raise RPCTransportError(err)
RPCTransportError: argument must be an int, or have a fileno() method.

How can I fix this issue?

Error in client.py

When I run client.py, it says that Tree has no attribute 'parse'.
Also, I am not sure how to extract the dependencies using this.

Parsing Q

I'm not sure why, but when I pass 'Q' to the coreNLP server, it breaks down.

Here is the code I'm using:

>>> import jsonrpc
>>> server = jsonrpc.ServerProxy(jsonrpc.JsonRpc20(),jsonrpc.TransportTcpIp(addr=("127.0.0.1", 8080)))
>>> server.parse('Q')
u'{"sentences": []}'

Here is the server error:

NLP> 
========================================
Q
Annotation pipeline timing information:
PTBTokenizerAnnotator: 0.0 sec.
WordsToSentencesAnnotator: 0.0 sec.
POSTaggerAnnotator: 0.0 sec.
MorphaAnnotator: 0.0 sec.
NERCombinerAnnotator: 0.1 sec.
ParserAnnotator: 0.5 sec.
DeterministicCorefAnnotator: 0.0 sec.
TOTAL: 0.7 sec. for 11 tokens at 16.5 tokens/sec.
Pipeline setup: 13.4 sec.
Total time for StanfordCoreNLP pipeline: 75.1 sec.

I'm not sure, if this is a feature or a bug.

jsonrpc.RPCInternalError: <RPCFault -32603: 'Internal error.' (None)>

I am trying to parse Arabic text with Python, and I got this error:

Traceback (most recent call last):
  File "client.py", line 16, in <module>
    result = nlp.parse(u"ﻊﻗﻮﺘﻤﻟا ﻦﻣ .ﺕﺎﺑﺎﻐﻟﺎﺑ ﻯﺫﻷا ﺕﺎﻄﻗﺎﺴﺘﻟا ﻲﻓﻭ ﺓﺭاﺮﺤﻟا ﻲﻓ ﺕاﺮﻴﻐﺘﻟا ﻖﺤﻠﺗ")
  File "client.py", line 13, in parse
    return json.loads(self.server.parse(text))
  File "/home/arezki/stanford-corenlp-python/jsonrpc.py", line 934, in __call__
    return self.__req(self.__name, args, kwargs)
  File "/home/arezki/stanford-corenlp-python/jsonrpc.py", line 907, in __req
    resp = self.__data_serializer.loads_response( resp_str )
  File "/home/arezki/stanford-corenlp-python/jsonrpc.py", line 626, in loads_response
    raise RPCInternalError(error_data)
jsonrpc.RPCInternalError: <RPCFault -32603: 'Internal error.' (None)>

AttributeError: 'StanfordCoreNLP' object has no attribute 'parse_imperative'

Hi Dustin,

I am not sure if you are aware of the problem, when I try to run the corenlp.py, I get the following error

Starting the Stanford Core NLP parser.
Loading Models: 5/5 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| plays hard to get, smiles from time to time
NLP tools loaded.
Traceback (most recent call last):
  File "corenlp.py", line 295, in <module>
    server.register_function(nlp.parse_imperative)
AttributeError: 'StanfordCoreNLP' object has no attribute 'parse_imperative'

Commenting out the line 295 solved the problem. I have quickly scanned the code, and could not locate parse_imperative method. I am not very experienced with Python, may be I have missed something.

I wanted you to know

Thanks for the great work! Keep up.

Error when processing Chinese text

After I start the server (with trained Chinese models and properties file), I test the server with a Chinese sentence by replacing the example English sentence in client.py, i.e.

#result = nlp.parse(u"Hello world!  It is so beautiful.")
result = nlp.parse(u"今天天气真不错啊!")

Traceback (most recent call last):
  File "client.py", line 17, in <module>
    result = nlp.parse(u"今天天气真不错啊!")
  File "client.py", line 13, in parse
    return json.loads(self.server.parse(text))
  File "/home/kqc/github/stanford-corenlp-python/jsonrpc.py", line 934, in __call__
    return self.__req(self.__name, args, kwargs)
  File "/home/kqc/github/stanford-corenlp-python/jsonrpc.py", line 907, in __req
    resp = self.__data_serializer.loads_response( resp_str )
  File "/home/kqc/github/stanford-corenlp-python/jsonrpc.py", line 626, in loads_response
    raise RPCInternalError(error_data)
jsonrpc.RPCInternalError: <RPCFault -32603: 'Internal error.' (None)>

Could you show me how to fix this?

RPCTransportError: timed out

Hi,
I was trying to use the client.py code to parse a long paragraph. It generates the following error message:

  File "/home/mings/Toolkits/stanford-corenlp-python/jsonrpc.py", line 934, in __call__
    return self.__req(self.__name, args, kwargs)
  File "/home/mings/Toolkits/stanford-corenlp-python/jsonrpc.py", line 906, in __req
    raise RPCTransportError(err)
jsonrpc.RPCTransportError: timed out

I find this inconsistent: sometimes it is able to parse, and sometimes it is not.

[EDIT]
I changed the default timeouts in jsonrpc.py to 20 secs, it seems to work fine now.

[Errno 10061] No connection could be made because the target machine actively refused it

Hi,
When I type

server = jsonrpc.ServerProxy(jsonrpc.JsonRpc20(), jsonrpc.TransportTcpIp(addr=("127.0.0.1", 8080)))

and then

result = loads(server.parse("Hello world. It is so beautiful"))

this error appears:

Traceback (most recent call last):
  File "<pyshell#27>", line 1, in <module>
    result = loads(server.parse("Hello world. It is so beautiful"))
RPCTransportError: [Errno 10061] No connection could be made because the target machine actively refused it

I turned off my firewall, but that did not solve the error. What should I do?
