stanford-corenlp-python's People

Contributors

abhaga, dasmith, emilmont, jcccf


stanford-corenlp-python's Issues

Error while launching the server, i.e. running the command python corenlp.py

This is the error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "corenlp.py", line 176, in __init__
    self.corenlp.expect("done.", timeout=200) # Loading PCFG (~3sec)
  File "/Users/mihir.saxena/virtualenvironment/my_new_project/lib/python2.7/site-packages/pexpect/spawnbase.py", line 327, in expect
    timeout, searchwindowsize, async_)
  File "/Users/mihir.saxena/virtualenvironment/my_new_project/lib/python2.7/site-packages/pexpect/spawnbase.py", line 355, in expect_list
    return exp.expect_loop(timeout)
  File "/Users/mihir.saxena/virtualenvironment/my_new_project/lib/python2.7/site-packages/pexpect/expect.py", line 102, in expect_loop
    return self.eof(e)
  File "/Users/mihir.saxena/virtualenvironment/my_new_project/lib/python2.7/site-packages/pexpect/expect.py", line 49, in eof
    raise EOF(msg)
pexpect.exceptions.EOF: End Of File (EOF). Empty string style platform.
<pexpect.pty_spawn.spawn object at 0x10ca092d0>
command: /usr/bin/java
args: ['/usr/bin/java', '-Xmx1800m', '-cp', './stanford-corenlp-full-2014-08-27/stanford-corenlp-3.4.1.jar:./stanford-corenlp-full-2014-08-27/stanford-corenlp-3.4.1-models.jar:./stanford-corenlp-full-2014-08-27/joda-time.jar:./stanford-corenlp-full-2014-08-27/xom.jar:./stanford-corenlp-full-2014-08-27/jollyday.jar', 'edu.stanford.nlp.pipeline.StanfordCoreNLP', '-props', 'default.properties']
buffer (last 100 chars): ''
before (last 100 chars): 'aders.java:185)\r\n\tat java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:496)\r\n\t... 34 more\r\n'
after: <class 'pexpect.exceptions.EOF'>
match: None
match_index: None
exitstatus: None
flag_eof: True
pid: 46580
child_fd: 6
closed: False
timeout: 30
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1
searcher: searcher_re:
0: re.compile("done.")

I have verified that all the jar files are of the version specified in the corenlp.py code. Earlier I had used a newer version and updated corenlp.py accordingly; in either case I get the same error. I am not able to figure it out. Kindly look into this and please suggest a solution.
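When the launched Java process dies during startup (a missing or renamed jar is the usual cause), pexpect reports only an opaque EOF like the one above. A minimal pre-flight check along these lines can surface the real cause first; the directory and jar names below are the 3.4.1 defaults from the traceback and should be adjusted for your install:

```python
import os

# Verify every jar that corenlp.py will put on the Java classpath actually
# exists before spawning, since a missing jar kills the process immediately
# and pexpect only reports EOF.
CORENLP_DIR = "./stanford-corenlp-full-2014-08-27"  # adjust for your install
JARS = [
    "stanford-corenlp-3.4.1.jar",
    "stanford-corenlp-3.4.1-models.jar",
    "joda-time.jar",
    "xom.jar",
    "jollyday.jar",
]

def missing_jars(directory, jars):
    """Return the jar names that are not present in the directory."""
    return [j for j in jars if not os.path.exists(os.path.join(directory, j))]
```

Running `missing_jars(CORENLP_DIR, JARS)` before constructing `StanfordCoreNLP()` turns the opaque EOF into an actionable list of missing files.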

Very long texts

I am trying to parse a text that is 1297 characters long, but it returns an empty sentence. If I use a different timeout value in client.py, say 200.0, the code raises a jsonrpc.RPCTransportError: timed out exception after that time passes.

Could you tell me what I am supposed to modify in the code to make client.py work with longer texts?

Thanks,
michele.
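Besides raising the hard-coded timeouts in jsonrpc.py (which a later issue on this page reports doing successfully), one client-side workaround is to never send the whole paragraph in a single RPC call. This is a sketch under that assumption; `parse_fn` stands in for `nlp.parse` / `server.parse`:

```python
import re

# Split long text into smaller chunks on sentence-ish boundaries and parse
# each chunk separately, so no single RPC call can hit the transport timeout.
def parse_in_chunks(text, parse_fn, max_chars=500):
    """Split `text` into chunks of at most max_chars characters, parse each
    chunk with parse_fn, and return the list of results."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    chunks, current = [], ""
    for s in sentences:
        if current and len(current) + len(s) + 1 > max_chars:
            chunks.append(current)
            current = s
        else:
            current = (current + " " + s).strip()
    if current:
        chunks.append(current)
    return [parse_fn(c) for c in chunks]
```

The results then need to be merged by the caller (e.g. concatenating the `sentences` lists of the decoded JSON responses).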

How can I use the -nthreads argument?

I read on the corenlp page that multithreading is supported for the parser by use of the -nthreads k argument. How can I implement this with the python wrapper?
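Since corenlp.py launches the pipeline with `-props default.properties` rather than exposing extra command-line flags, one hedged option (assuming your CoreNLP version honors the property form of the flag) is to set the thread counts in that file:

```
# default.properties (sketch): ask CoreNLP to use 4 threads.
# CoreNLP reads a global nthreads property and per-annotator
# variants such as parse.nthreads.
annotators = tokenize, ssplit, pos, lemma, ner, parse, dcoref
nthreads = 4
parse.nthreads = 4
```

Note that the wrapper itself still talks to a single interactive shell, so parallelism applies inside CoreNLP's annotators, not across RPC requests.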

jsonrpc.py randomly fails

I am processing large paragraphs using this Python interface. If it matters, I have set the encoding to UTF-8 because of some characters in the data, and the paragraphs/sentences are fairly large. When I try to execute a script and make a request to the running CoreNLP server, it fails randomly by throwing the error:

jsonrpc.RPCParseError: <RPCFault -32700: 'Parse error.' (u'No valid JSON. (Unterminated string starting at: line 1 column 50 (char 49))')>

And I use the word "randomly" because if and when it fails and I simply try 3-4 more times, it starts working perfectly. This is a problem when I iteratively make calls to the server, as it can throw an error at any point in the loop and fail.

Does it have anything to do with the fact that

a) The paragraph/sentence size is fairly large (usually 200-400 words).
OR
b) I am using UTF8 encoding.

Or is it something completely else?

Note: I am using Python 2.7.12 (if that matters)
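Given that the failure is transient and, as described above, repeating the same request a few times succeeds, a small retry wrapper is one pragmatic workaround. This is a sketch; `parse_fn` stands in for the wrapper's `nlp.parse`, and `retryable` should be set to the exception type you actually see (e.g. `jsonrpc.RPCParseError`):

```python
import time

# Retry a flaky RPC call a few times with a short pause between attempts,
# re-raising the last error only after all retries are exhausted.
def parse_with_retry(parse_fn, text, retries=4, delay=1.0,
                     retryable=(Exception,)):
    last_err = None
    for attempt in range(retries):
        try:
            return parse_fn(text)
        except retryable as err:
            last_err = err
            time.sleep(delay)
    raise last_err
```

This papers over the symptom rather than fixing the underlying truncated-response bug, but it makes long iterative loops survivable.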

~400ms latency problems

I noticed a parse through the json-rpc takes 400ms longer than using the java interactive shell.

What's the best way to cut this down? Is it a python issue?

Happy to work on this for a pull request.

Python 3 support

I could not find this documented, but as far as I can see, this module works only with Python 2. Is there any chance of using it with Python 3, or has anyone already forked such a version?

Sentiment Analysis Confidence Scores

Hello,

For sentiment analysis I'm able to obtain the score that corresponds to the class with the highest estimated probability, but I'm unable to produce the estimations themselves (e.g. [very_negative = 0.60, negative = 0.25, neutral = 0.10, positive = 0.025, very_positive = 0.025]). I'd like to filter probabilities below a certain confidence threshold.

Thank you.

weird UnicodeDecodeError in StanfordCoreNLP.parse()

Hi Dustin,

I just found a really weird error. While corenlp can parse '100 dollars' just fine, '100 yen' causes it to crash.

Python 2.7.3 (default, Feb 27 2014, 19:37:34) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import corenlp
>>> c = corenlp.StanfordCoreNLP()
Loading Models: 5/5                                                                                                                                                                                                                         
>>> c.parse('100 dollars')
'{"sentences": [{"parsetree": "(ROOT (X (NP (CD 100) (NNS dollars))))", "text": "100 dollars", "dependencies": [["root", "ROOT", "dollars"], ["num", "dollars", "100"]], "words": [["100", {"NormalizedNamedEntityTag": "$100.0", "Lemma": "100", "CharacterOffsetEnd": "3", "PartOfSpeech": "CD", "CharacterOffsetBegin": "0", "NamedEntityTag": "MONEY"}], ["dollars", {"NormalizedNamedEntityTag": "$100.0", "Lemma": "dollar", "CharacterOffsetEnd": "11", "PartOfSpeech": "NNS", "CharacterOffsetBegin": "4", "NamedEntityTag": "MONEY"}]]}]}'

>>> c.parse('100 yen')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/corenlp-3.4.1-py2.7.egg/corenlp.py", line 240, in parse
    response = self._parse(text)
  File "/usr/local/lib/python2.7/dist-packages/corenlp-3.4.1-py2.7.egg/corenlp.py", line 230, in _parse
    raise e
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 169: ordinal not in range(128)

Any ideas?

Windows run of "python corenlp.py" Error

Use Windows 7 machine,
Python 2.7.11
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)

Traceback (most recent call last):
  File "corenlp.py", line 257, in <module>
    nlp = StanfordCoreNLP()
  File "corenlp.py", line 163, in __init__
    self.corenlp = pexpect.spawn(start_corenlp)
AttributeError: 'module' object has no attribute 'spawn'

Corenlp.py does not load any modules

Traceback (most recent call last):
  File "D:\fahma\corefernce resolution\stanford-corenlp-python-master\corenlp.py", line 281, in <module>
    nlp = StanfordCoreNLP()
  File "D:\fahma\corefernce resolution\stanford-corenlp-python-master\corenlp.py", line 173, in __init__
    self.corenlp.expect("done.", timeout=20) # Load pos tagger model (~5sec)
  File "C:\Python27\lib\site-packages\pexpect\spawnbase.py", line 341, in expect
    timeout, searchwindowsize, async_)
  File "C:\Python27\lib\site-packages\pexpect\spawnbase.py", line 369, in expect_list
    return exp.expect_loop(timeout)
  File "C:\Python27\lib\site-packages\pexpect\expect.py", line 117, in expect_loop
    return self.eof(e)
  File "C:\Python27\lib\site-packages\pexpect\expect.py", line 63, in eof
    raise EOF(msg)
EOF: End Of File (EOF).
<pexpect.popen_spawn.PopenSpawn object at 0x021863B0>
searcher: searcher_re:
    0: re.compile('done.')

Import stanford-corenlp-python as a module

When I try importing the corenlp class from a Python script (exampleRun.py) that is not in the stanford-corenlp-pyhton directory, like this:

from corenlp import *
corenlp = StanfordCoreNLP("path_to_stanford-corenlp-full-2014-08-27/")

the following error is raised from pexpect:

Loading Models: 0/5
Traceback (most recent call last):
  File "/home/matteorr/Project1/exampleRun.py", line 4, in <module>
    corenlp = StanfordCoreNLP("/home/matteorr/stanford-corenlp-pyhton/stanford-corenlp-full-2014-08-27/")
  File "/home/matteorr/stanford-corenlp-pyhton/corenlp.py", line 168, in __init__
    self.corenlp.expect("done.", timeout=20) # Load pos tagger model (~5sec)
  File "/usr/lib/python2.7/dist-packages/pexpect.py", line 1311, in expect
    return self.expect_list(compiled_pattern_list, timeout, searchwindowsize)
  File "/usr/lib/python2.7/dist-packages/pexpect.py", line 1325, in expect_list
    return self.expect_loop(searcher_re(pattern_list), timeout, searchwindowsize)
  File "/usr/lib/python2.7/dist-packages/pexpect.py", line 1396, in expect_loop
    raise EOF (str(e) + '\n' + str(self))
pexpect.EOF: End Of File (EOF) in read_nonblocking(). Exception style platform.
<pexpect.spawn object at 0x7f7106fb3650>
version: 2.3 ($Revision: 399 $)
command: /usr/bin/java
args: ['/usr/bin/java', '-Xmx1800m', '-cp', '/home/matteorr/stanford-corenlp-pyhton/stanford-corenlp-full-2014-08-27/stanford-corenlp-3.4.1.jar:/home/matteorr/stanford-corenlp-pyhton/stanford-corenlp-full-2014-08-27/stanford-corenlp-3.4.1-models.jar:/home/matteorr/stanford-corenlp-pyhton/stanford-corenlp-full-2014-08-27/joda-time.jar:/home/matteorr/stanford-corenlp-pyhton/stanford-corenlp-full-2014-08-27/xom.jar:/home/matteorr/stanford-corenlp-pyhton/stanford-corenlp-full-2014-08-27/jollyday.jar', 'edu.stanford.nlp.pipeline.StanfordCoreNLP', '-props', 'default.properties']
searcher: searcher_re:
0: re.compile("done.")
buffer (last 100 chars):
before (last 100 chars): va:448)
at edu.stanford.nlp.util.StringUtils.argsToProperties(StringUtils.java:869)
... 2 more

after: <class 'pexpect.EOF'>
match: None
match_index: None
exitstatus: None
flag_eof: True
pid: 28392
child_fd: 3
closed: False
timeout: 30
delimiter: <class 'pexpect.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1

The same script, run in the same directory as corenlp.py, works fine.
Is this expected behavior, or is something wrong?

Thanks in advance for your help.
I apologize if this was not the correct place to post this issue.

Best regards,

matteorr

Installation error due to hard coding in corenlp.py

In the StanfordCoreNLP class in corenlp.py, the jar versions are hard-coded, so jars of any newer version are not accepted, which produces an error while launching the server.

The lookup needs to be done differently.

Getting sentiment value via server implementation

Hi, I am interested in using the server implementation of your wrapper, but it doesn't seem to output the sentiment score, whereas the package implementation has a field for it. What is the cause of this difference?

RPCInternalError: <RPCFault -32603: 'Internal error.' (None)>

/Users/danielsampetethiyagu/github/image_caption_using_attention/coreNlpUtil.pyc in parseText(sentences)
22 def parseText(sentences):
23
---> 24 parseResult = nlp.parse(sentences)
25
26 if len(parseResult['sentences']) == 1:

/Users/danielsampetethiyagu/github/image_caption_using_attention/coreNlpUtil.pyc in parse(self, text)
16
17 def parse(self, text):
---> 18 return json.loads(self.server.parse(text))
19
20

/Users/danielsampetethiyagu/github/image_caption_using_attention/jsonrpc.py in call(self, *args, **kwargs)
932 return _method(self.__req, "%s.%s" % (self.__name, name))
933 def call(self, *args, **kwargs):
--> 934 return self.__req(self.__name, args, kwargs)
935
936 #=========================================

/Users/danielsampetethiyagu/github/image_caption_using_attention/jsonrpc.py in __req(self, methodname, args, kwargs, id)
905 except Exception,err:
906 raise RPCTransportError(err)
--> 907 resp = self.__data_serializer.loads_response( resp_str )
908 return resp[0]
909

/Users/danielsampetethiyagu/github/image_caption_using_attention/jsonrpc.py in loads_response(self, string)
624 raise RPCInvalidMethodParams(error_data)
625 elif data["error"]["code"] == INTERNAL_ERROR:
--> 626 raise RPCInternalError(error_data)
627 elif data["error"]["code"] == PROCEDURE_EXCEPTION:
628 raise RPCProcedureException(error_data)

RPCInternalError: <RPCFault -32603: 'Internal error.' (None)>

jsonrpc import error: ValueError, err :

Hello. I've been trying to use corenlp as a wrapper for Stanford NLP for coreference resolution, but I'm having issues with the corenlp.py file. There was one error in the downloaded file:

except Exception, err:

needs to be written as

except Exception as err:

But when I correct this, the jsonrpc import doesn't work, as a method within the import throws this error:

Traceback (most recent call last):
  File "corenlp.py", line 24, in <module>
    import jsonrpc, pexpect
  File "D:\NLP\NaturalLanguageProcessing\stanford-corenlp-python\jsonrpc.py", line 376
    except ValueError, err:
                     ^
SyntaxError: invalid syntax

Any help would be much appreciated; thanks in advance. It would also be a great help if you could suggest any known APIs for coreference resolution, or another wrapper for Stanford NLP that supports coreference resolution.

hardcoded lib and jar versions

I noticed some hardcoded lib and jar versions within the Python source code itself. Are these libraries only compatible with certain versions of CoreNLP, or are we expected to search through the code and change every reference to specific filenames and jars whenever we update our local CoreNLP?

Could you please explain what the result of coreference resolution means?

I tried the tool and got a result like:
Barack Obama was born in Hawaii. He is the president. Obama was elected in 2008.
"coref": [[[["He", 1, 0, 0, 1], ["Barack Obama", 0, 1, 0, 2]], [["the president", 1, 3, 2, 4], ["Barack Obama", 0, 1, 0, 2]], [["Obama", 2, 0, 0, 1], ["Barack Obama", 0, 1, 0, 2]]]]
So could you please explain what it means, especially what the indices in the lists mean?
Thank you very much!
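For what it's worth, the tuples can be read as follows. This is an interpretation inferred from the example output above, not official documentation: each entry pairs a mention with its antecedent, and each mention appears to be [text, sentence_index, head_token_index, start_token, end_token], with 0-based indices and an exclusive end:

```python
# Decode the coref output shown above under the assumed layout
# [text, sentence_index, head_token_index, start_token, end_token].
coref = [[[["He", 1, 0, 0, 1], ["Barack Obama", 0, 1, 0, 2]],
          [["the president", 1, 3, 2, 4], ["Barack Obama", 0, 1, 0, 2]],
          [["Obama", 2, 0, 0, 1], ["Barack Obama", 0, 1, 0, 2]]]]

def describe(coref):
    """Render each (mention, antecedent) pair as a readable line."""
    lines = []
    for chain in coref:
        for mention, antecedent in chain:
            m_text, m_sent, m_head, m_start, m_end = mention
            lines.append("%r (sentence %d, tokens %d-%d) -> %r"
                         % (m_text, m_sent, m_start, m_end, antecedent[0]))
    return lines
```

Under that reading, "He" is token 0 of sentence 1 and corefers with "Barack Obama", tokens 0-2 of sentence 0.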

Python3.5.3 issues

Python 3 doesn't handle:
except ValueError, err:
^
SyntaxError: invalid syntax

It needs the "as" form. There are further issues with the print statements.

I can push version for py3, if you'd like. Just let me know.

-EV
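The two blockers mentioned above, side by side, for anyone porting jsonrpc.py and corenlp.py by hand:

```python
# Python 2-only form (a SyntaxError under Python 3):
#     except ValueError, err:
#         print "bad value", err
# Python 2.6+/3-compatible form, needed throughout both files:
try:
    int("not a number")
except ValueError as err:
    message = "bad value: %s" % err
print(message)
```

`2to3` or `python-modernize` automates exactly these two rewrites.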

pexpect.exceptions.EOF: End Of File (EOF). Exception style platform.

python corenlp/corenlp.py -H ip -p 3456
Traceback (most recent call last):
  File "corenlp/corenlp.py", line 592, in <module>
    main()
  File "corenlp/corenlp.py", line 580, in main
    nlp = StanfordCoreNLP(options.corenlp, properties=options.properties, serving=True)
  File "corenlp/corenlp.py", line 435, in __init__
    self._spawn_corenlp()
  File "corenlp/corenlp.py", line 424, in _spawn_corenlp
    self.corenlp.expect("\nNLP> ")
  File "/usr/local/lib/python2.7/dist-packages/pexpect/spawnbase.py", line 315, in expect
    timeout, searchwindowsize, async)
  File "/usr/local/lib/python2.7/dist-packages/pexpect/spawnbase.py", line 339, in expect_list
    return exp.expect_loop(timeout)
  File "/usr/local/lib/python2.7/dist-packages/pexpect/expect.py", line 102, in expect_loop
    return self.eof(e)
  File "/usr/local/lib/python2.7/dist-packages/pexpect/expect.py", line 49, in eof
    raise EOF(msg)
pexpect.exceptions.EOF: End Of File (EOF). Exception style platform.
<pexpect.pty_spawn.spawn object at 0x7ff999081510>
command: /usr/bin/java
args: ['/usr/bin/java', '-Xmx3g', '-cp', 'stanford-corenlp-full-2014-08-27/stanford-corenlp-3.4.1.jar:stanford-corenlp-full-2014-08-27/stanford-corenlp-3.4.1-models.jar:stanford-corenlp-full-2014-08-27/xom.jar:stanford-corenlp-full-2014-08-27/joda-time.jar:stanford-corenlp-full-2014-08-27/jollyday.jar:stanford-corenlp-full-2014-08-27/ejml-0.23.jar', 'edu.stanford.nlp.pipeline.StanfordCoreNLP', '-props', '/root/corenlp-python/corenlp/default.properties']
searcher: None
buffer (last 100 chars): ''
before (last 100 chars): ' ner\r\nLoading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... '
after: <class 'pexpect.exceptions.EOF'>
match: None
match_index: None
exitstatus: None
flag_eof: True
pid: 5804
child_fd: 6
closed: False
timeout: 60
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 8192
ignorecase: False
searchwindowsize: 80
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1

corenlp.py fails for 3.9.0

I'm aware that the repo mentions the code for stanford-corenlp-3.4.1, but I had 3.9.0, and changed the path and models in corenlp.py accordingly.

Then it gets stuck on Loading models 4/5, and then throws a timeout error. Please look into this.

Certain characters lead to Internal Error

I am trying to parse the sentence

WASHINGTON — Republicans on Thursday vowed a swift and forceful response to the executive action on immigration that President Obama is to announce in a prime-time address, accusing the president of exceeding the power of his office and promising a legislative fight when they take full control of Congress next year.

but I keep getting the error

Traceback (most recent call last):
  File "client.py", line 19, in <module>
    result = nlp.parse(text2)
  File "client.py", line 12, in parse
    return json.loads(self.server.parse(text))
  File "/Users/Pi_Joules/projects/kompact/stanford-corenlp-python/jsonrpc.py", line 934, in __call__
    return self.__req(self.__name, args, kwargs)
  File "/Users/Pi_Joules/projects/kompact/stanford-corenlp-python/jsonrpc.py", line 907, in __req
    resp = self.__data_serializer.loads_response( resp_str )
  File "/Users/Pi_Joules/projects/kompact/stanford-corenlp-python/jsonrpc.py", line 626, in loads_response
    raise RPCInternalError(error_data)
jsonrpc.RPCInternalError: <RPCFault -32603: 'Internal error.' (None)>

The error doesn't appear, though, when I remove the em dash (—) from the first sentence. The same goes for curly single and double quotes like “”. Is there any way I can still parse these characters in this wrapper?

Thanks

Attribute error in client.py

Hi, I have nltk version 3.0.3 and I am getting this error:

tree = Tree.parse(result['sentences'][0]['parsetree'])
AttributeError: type object 'Tree' has no attribute 'parse'
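This is almost certainly the nltk 3.x rename: `Tree.parse` became `Tree.fromstring`, which is why nltk 3.0.3 raises the AttributeError above. A small compatibility shim (guarded so it also loads when nltk is not installed):

```python
# Pick whichever tree-parsing entry point this nltk version provides:
# Tree.fromstring (nltk >= 3.0) or Tree.parse (nltk 2.x).
def get_tree_parser():
    try:
        from nltk import Tree
    except ImportError:
        return None  # nltk not installed
    return getattr(Tree, "fromstring", None) or getattr(Tree, "parse", None)

parser = get_tree_parser()
if parser is not None:
    tree = parser("(ROOT (NP (DT This)))")
```

In client.py this amounts to replacing `Tree.parse(...)` with `Tree.fromstring(...)`.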

How do I change models for NER?

How do I set the model:

ner.model.3class = /u/nlp/data/ner/goodClassifiers/all.3class.distsim.crf.ser.gz
ner.model.7class = /u/nlp/data/ner/goodClassifiers/muc.distsim.crf.ser.gz
ner.model.MISCclass = /u/nlp/data/ner/goodClassifiers/conll.distsim.crf.ser.gz
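Note the `ner.model.3class`-style names above come from the standalone Stanford NER tool; in CoreNLP's properties, the `ner` annotator reads a comma-separated `ner.model` property instead. Since this wrapper launches CoreNLP with `-props default.properties`, one hedged option (property name per CoreNLP's documentation; paths are the examples from the question) is:

```
# default.properties (sketch): choose which NER classifiers the ner
# annotator loads, in order.
annotators = tokenize, ssplit, pos, lemma, ner
ner.model = /u/nlp/data/ner/goodClassifiers/all.3class.distsim.crf.ser.gz,/u/nlp/data/ner/goodClassifiers/muc.distsim.crf.ser.gz
```

Whether your CoreNLP version accepts this exact property should be checked against its own documentation.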

parse returning as a string rather than a dictionary.

I'm trying to follow the instructions:

from corenlp import *
corenlp = StanfordCoreNLP()
corenlp.parse("This is a test.")

When I do this it returns something like this:
'{"coref": [[[["This", 0, 0, 0, 1], ["a test", 0, 3, 2, 4]]]], "sentences": [{"parsetree": "(ROOT (S (NP (DT This)) (VP (VBZ is) (NP (DT a) (NN test))) (. .)))", "text": "This is a test.", "dependencies": [["root", "ROOT", "test"], ["nsubj", "test", "This"], ["cop", "test", "is"], ["det", "test", "a"]], "words": [["This", {"NamedEntityTag": "O", "CharacterOffsetEnd": "4", "Lemma": "this", "PartOfSpeech": "DT", "CharacterOffsetBegin": "0"}], ["is", {"NamedEntityTag": "O", "CharacterOffsetEnd": "7", "Lemma": "be", "PartOfSpeech": "VBZ", "CharacterOffsetBegin": "5"}], ["a", {"NamedEntityTag": "O", "CharacterOffsetEnd": "9", "Lemma": "a", "PartOfSpeech": "DT", "CharacterOffsetBegin": "8"}], ["test", {"NamedEntityTag": "O", "CharacterOffsetEnd": "14", "Lemma": "test", "PartOfSpeech": "NN", "CharacterOffsetBegin": "10"}], [".", {"NamedEntityTag": "O", "CharacterOffsetEnd": "15", "Lemma": ".", "PartOfSpeech": ".", "CharacterOffsetBegin": "14"}]]}]}'

Where it is a dictionary wrapped in quotes making it a string. I'm not sure what I'm doing wrong...
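This is expected behavior rather than a bug: the wrapper returns the server's JSON response as a raw string, and it is up to the caller to decode it:

```python
import json

# The parse() result is a JSON *string*; decode it to get a dictionary.
# `raw` here is a shortened stand-in for the real corenlp.parse(...) output.
raw = '{"sentences": [{"text": "This is a test.", "parsetree": "(ROOT ...)"}]}'
result = json.loads(raw)   # in practice: json.loads(corenlp.parse("This is a test."))
assert result["sentences"][0]["text"] == "This is a test."
```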

About connection refused

When I run

result = loads(server.parse("Hello world. It is so beautiful"))

I get a connection error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "jsonrpc.py", line 934, in __call__
    return self.__req(self.__name, args, kwargs)
  File "jsonrpc.py", line 906, in __req
    raise RPCTransportError(err)
jsonrpc.RPCTransportError: [Errno 111] Connection refused


Corenlp.py does not go further after loading all 5 modules

Traceback (most recent call last):
  File "corenlp.py", line 257, in <module>
    nlp = StanfordCoreNLP()
  File "corenlp.py", line 178, in __init__
    self.corenlp.expect("Entering interactive shell.")
  File "/home/whiskey/.local/lib/python2.7/site-packages/pexpect/spawnbase.py", line 341, in expect
    timeout, searchwindowsize, async_)
  File "/home/whiskey/.local/lib/python2.7/site-packages/pexpect/spawnbase.py", line 369, in expect_list
    return exp.expect_loop(timeout)
  File "/home/whiskey/.local/lib/python2.7/site-packages/pexpect/expect.py", line 116, in expect_loop
    return self.timeout(e)
  File "/home/whiskey/.local/lib/python2.7/site-packages/pexpect/expect.py", line 80, in timeout
    raise TIMEOUT(msg)
pexpect.exceptions.TIMEOUT: Timeout exceeded.
<pexpect.pty_spawn.spawn object at 0x7f1cbb072050>
command: /usr/bin/java
args: ['/usr/bin/java', '-Xmx1800m', '-cp', './stanford-corenlp-full-2018-02-27/stanford-corenlp-3.9.1.jar:./stanford-corenlp-full-2018-02-27/stanford-corenlp-3.9.1-models.jar:./stanford-corenlp-full-2018-02-27/joda-time.jar:./stanford-corenlp-full-2018-02-27/xom.jar:./stanford-corenlp-full-2018-02-27/jollyday.jar', 'edu.stanford.nlp.pipeline.StanfordCoreNLP', '-props', 'default.properties']
buffer (last 100 chars): '[0.7 sec].\r\nAdding annotator dcoref\r\n'
before (last 100 chars): '[0.7 sec].\r\nAdding annotator dcoref\r\n'
after: <class 'pexpect.exceptions.TIMEOUT'>
match: None
match_index: None
exitstatus: None
flag_eof: False
pid: 7185
child_fd: 5
closed: False
timeout: 30
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1
searcher: searcher_re:
    0: re.compile("Entering interactive shell.")

Support for sentiment analysis

Hi, I was planning to use the python wrapper but I am not sure if it has support for sentiment analysis like the original sanfordCoreNLP. If yes, please share some documentation.

Instantiate StanfordCoreNLP with different annotators

I'm using the StanfordCoreNLP class to do NER on some text. Then somewhere else in my program I only need to do POS tagging, but performance is uselessly slowed down by NER. I see that I can edit the default.properties file to remove the annotators I don't need, but that would change every instance of StanfordCoreNLP, which won't work.

Right now I'm thinking of modifying StanfordCoreNLP's init to allow a custom string for props to be passed, and create several files that contain the annotator lists I need. This might work for now, but I'd like to know if you see a better way, and if you'd be interested in allowing StanfordCoreNLP instances to be created with an optional annotator list.
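The approach described above can be sketched as follows. Note that some revisions of corenlp.py already accept a `properties=` argument in `__init__` (a traceback elsewhere on this page shows `StanfordCoreNLP(options.corenlp, properties=options.properties, serving=True)`), so generating one properties file per annotator set may be all that is needed:

```python
import os
import tempfile

# Write a minimal CoreNLP properties file listing only the annotators one
# instance needs, and return its path.
def write_props(annotators, directory=None):
    fd, path = tempfile.mkstemp(suffix=".properties", dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write("annotators = %s\n" % ", ".join(annotators))
    return path

pos_only = write_props(["tokenize", "ssplit", "pos"])
# Hypothetical usage, assuming a corenlp.py revision with a properties kwarg:
#   tagger = StanfordCoreNLP(corenlp_dir, properties=pos_only)
#   ner    = StanfordCoreNLP(corenlp_dir, properties=write_props(
#                ["tokenize", "ssplit", "pos", "lemma", "ner"]))
```

Each instance then pays only for the annotators it actually uses, without touching the shared default.properties.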

Dependency Problem

The IDs you stripped from the dependencies in remove_id() should stay there. If two identical words occur in the same sentence, and you strip the word-id from the results, there's no way for us to easily disambiguate them (hence, why Stanford explicitly put them there)

I have an error and, if it's something you're aware of, wondered if you could help me with a fix?

python corenlp.py
Traceback (most recent call last):
  File "corenlp.py", line 257, in <module>
    nlp = StanfordCoreNLP()
  File "corenlp.py", line 163, in __init__
    self.corenlp = pexpect.spawn(start_corenlp)
  File "/usr/local/lib/python2.7/dist-packages/pexpect/pty_spawn.py", line 198, in __init__
    self._spawn(command, args, preexec_fn, dimensions)
  File "/usr/local/lib/python2.7/dist-packages/pexpect/pty_spawn.py", line 271, in _spawn
    'executable: %s.' % self.command)
pexpect.exceptions.ExceptionPexpect: The command was not found or was not executable: java.

How can corenlp handle non-ASCII strings?

I passed the word 'Víctor', which contains a non-ASCII character, to corenlp.parse because I would like to get its lemma. But corenlp.parse('Víctor') gives the error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)

How can I change the corenlp settings so that corenlp can handle non-ASCII strings?
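One hedged workaround, given that the Python 2 code paths mix byte strings and unicode and fall back to the ascii codec: hand the wrapper UTF-8 bytes rather than a unicode object. A sketch:

```python
# -*- coding: utf-8 -*-

# Normalize input to UTF-8 bytes before passing it to the wrapper, so the
# Python 2 internals never attempt an implicit ascii decode/encode.
def to_utf8(text):
    """Return `text` as UTF-8 bytes for either bytes or unicode input."""
    if isinstance(text, bytes):
        return text
    return text.encode("utf-8")

# Hypothetical usage: corenlp.parse(to_utf8(u'Víctor'))
```

If the failure instead happens while decoding the server's response, the same idea applies in reverse: decode the raw response with `.decode("utf-8")` before handing it to the JSON parser.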

No Valid JSON error

I am a newbie to these tools (and to JSON in particular). I am getting a parse error with error code -32700. Please help me fix it.
I have attached a screenshot of the error.

Multiple occurrences of a word not handled properly while creating tuples

If there are multiple occurrences of a word in a sentence, lack of ids makes it impossible to identify the source and target of a dependency correctly.

If you are open to accepting a patch for this, I can submit one. My idea is to keep the ids in the "tuples" and store the dependents of a word in the "words" array.
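The ambiguity described above is easy to demonstrate. CoreNLP emits tokens like "dollars-2" (word plus 1-based token index), and corenlp.py's remove_id() strips the suffix; the function below is a hypothetical stand-in mirroring that behavior, not the exact source:

```python
# Stand-in for corenlp.py's remove_id(): strip the trailing "-N" token index.
def remove_id(word):
    return word[:word.rindex("-")] if "-" in word else word

# With the ids, the two dependents are distinct tokens; without them, the
# two "dog" occurrences collapse and can no longer be told apart.
deps_with_ids = [("nsubj", "likes-2", "dog-4"), ("dobj", "likes-2", "dog-7")]
deps_stripped = [(rel, remove_id(g), remove_id(d))
                 for rel, g, d in deps_with_ids]
# deps_stripped now contains two tuples that both read ("...", "likes", "dog").
```

Keeping the ids in the "tuples" (as the proposed patch suggests) preserves exactly the information this example loses.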

RPCTransportError: argument must be an int, or have a fileno() method.

Hi guys,

I am getting this error when I try to parse multiple sentences in parallel. Everything works fine if I parse sequentially.

parseResult = nlp.parse(sentences)
  File "/Users/Vikram/Kiwi/django/app/app/app/coreNlpUtil.py", line 18, in parse
    return json.loads(self.server.parse(text))
  File "/Users/Vikram/Kiwi/django/app/app/app/jsonrpc.py", line 933, in __call__
    return self.__req(self.__name, args, kwargs)
  File "/Users/Vikram/Kiwi/django/app/app/app/jsonrpc.py", line 906, in __req
    resp = self.__data_serializer.loads_response( resp_str )
  File "/Users/Vikram/Kiwi/django/app/app/app/jsonrpc.py", line 594, in loads_response
    raise RPCParseError("No valid JSON. (%s)" % str(err))
RPCParseError: <RPCFault -32700: 'Parse error.' ('No valid JSON. (No JSON object could be decoded)')>

parseResult = nlp.parse(sentences)
  File "/Users/Vikram/Kiwi/django/app/app/app/coreNlpUtil.py", line 18, in parse
    return json.loads(self.server.parse(text))
  File "/Users/Vikram/Kiwi/django/app/app/app/jsonrpc.py", line 933, in __call__
    return self.__req(self.__name, args, kwargs)
  File "/Users/Vikram/Kiwi/django/app/app/app/jsonrpc.py", line 905, in __req
    raise RPCTransportError(err)
RPCTransportError: argument must be an int, or have a fileno() method.

How can I fix this issue?

Error in client.py

When I run client.py, it says that Tree has no attribute 'parse'.
Also, I am not sure how to extract the dependencies using this.

Parsing Q

I'm not sure why, but when I pass 'Q' to the coreNLP server, it breaks down.

Here is the code I'm using:

>>> import jsonrpc
>>> server = jsonrpc.ServerProxy(jsonrpc.JsonRpc20(),jsonrpc.TransportTcpIp(addr=("127.0.0.1", 8080)))
>>> server.parse('Q')
u'{"sentences": []}'

Here is the server error:

NLP> 
========================================
Q
Annotation pipeline timing information:
PTBTokenizerAnnotator: 0.0 sec.
WordsToSentencesAnnotator: 0.0 sec.
POSTaggerAnnotator: 0.0 sec.
MorphaAnnotator: 0.0 sec.
NERCombinerAnnotator: 0.1 sec.
ParserAnnotator: 0.5 sec.
DeterministicCorefAnnotator: 0.0 sec.
TOTAL: 0.7 sec. for 11 tokens at 16.5 tokens/sec.
Pipeline setup: 13.4 sec.
Total time for StanfordCoreNLP pipeline: 75.1 sec.

I'm not sure, if this is a feature or a bug.

jsonrpc.RPCInternalError: <RPCFault -32603: 'Internal error.' (None)>

I am trying to parse Arabic text with Python, and I got this error:

Traceback (most recent call last):
  File "client.py", line 16, in <module>
    result = nlp.parse(u"ﻊﻗﻮﺘﻤﻟا ﻦﻣ .ﺕﺎﺑﺎﻐﻟﺎﺑ ﻯﺫﻷا ﺕﺎﻄﻗﺎﺴﺘﻟا ﻲﻓﻭ ﺓﺭاﺮﺤﻟا ﻲﻓ ﺕاﺮﻴﻐﺘﻟا ﻖﺤﻠﺗ")
  File "client.py", line 13, in parse
    return json.loads(self.server.parse(text))
  File "/home/arezki/stanford-corenlp-python/jsonrpc.py", line 934, in __call__
    return self.__req(self.__name, args, kwargs)
  File "/home/arezki/stanford-corenlp-python/jsonrpc.py", line 907, in __req
    resp = self.__data_serializer.loads_response( resp_str )
  File "/home/arezki/stanford-corenlp-python/jsonrpc.py", line 626, in loads_response
    raise RPCInternalError(error_data)
jsonrpc.RPCInternalError: <RPCFault -32603: 'Internal error.' (None)>

AttributeError: 'StanfordCoreNLP' object has no attribute 'parse_imperative'

Hi Dustin,

I am not sure if you are aware of the problem, when I try to run the corenlp.py, I get the following error

Starting the Stanford Core NLP parser.
Loading Models: 5/5 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| plays hard to get, smiles from time to time
NLP tools loaded.
Traceback (most recent call last):
  File "corenlp.py", line 295, in <module>
    server.register_function(nlp.parse_imperative)
AttributeError: 'StanfordCoreNLP' object has no attribute 'parse_imperative'

Commenting out the line 295 solved the problem. I have quickly scanned the code, and could not locate parse_imperative method. I am not very experienced with Python, may be I have missed something.

I wanted you to know

Thanks for the great work! Keep up.

Error when processing Chinese text

After I start the server (with trained Chinese models and properties file), I test the server with a Chinese sentence by replacing the example English sentence in client.py, i.e.

#result = nlp.parse(u"Hello world!  It is so beautiful.")
result = nlp.parse(u"今天天气真不错啊!")

Traceback (most recent call last):
  File "client.py", line 17, in <module>
    result = nlp.parse(u"今天天气真不错啊!")
  File "client.py", line 13, in parse
    return json.loads(self.server.parse(text))
  File "/home/kqc/github/stanford-corenlp-python/jsonrpc.py", line 934, in __call__
    return self.__req(self.__name, args, kwargs)
  File "/home/kqc/github/stanford-corenlp-python/jsonrpc.py", line 907, in __req
    resp = self.__data_serializer.loads_response( resp_str )
  File "/home/kqc/github/stanford-corenlp-python/jsonrpc.py", line 626, in loads_response
    raise RPCInternalError(error_data)
jsonrpc.RPCInternalError: <RPCFault -32603: 'Internal error.' (None)>

Could you show me how to fix this?

RPCTransportError: timed out

Hi,
I was trying to use the client.py code to parse a long paragraph. It generates the following error message:

  File "/home/mings/Toolkits/stanford-corenlp-python/jsonrpc.py", line 934, in __call__
    return self.__req(self.__name, args, kwargs)
  File "/home/mings/Toolkits/stanford-corenlp-python/jsonrpc.py", line 906, in __req
    raise RPCTransportError(err)
jsonrpc.RPCTransportError: timed out

I find this inconsistent: sometimes it is able to parse, and sometimes it is not.

[EDIT]
I changed the default timeouts in jsonrpc.py to 20 secs, it seems to work fine now.

[Errno 10061] No connection could be made because the target machine actively refused it

Hi,
When I type

server = jsonrpc.ServerProxy(jsonrpc.JsonRpc20(), jsonrpc.TransportTcpIp(addr=("127.0.0.1", 8080)))

and then

result = loads(server.parse("Hello world. It is so beautiful"))

this error appears:

Traceback (most recent call last):
  File "<pyshell#27>", line 1, in <module>
    result = loads(server.parse("Hello world. It is so beautiful"))
RPCTransportError: [Errno 10061] No connection could be made because the target machine actively refused it

I turned off my firewall, but that did not solve the error. What should I do?
