ralhei / pyrserve
A python client for Rserve (network layer to remote R server)
License: Other
I could really use this in my project; I'm struggling to figure out how to add it. Maybe with a little help, I could contribute it?
In rserve news:
Additions in version 1.7
... Another major change is the new, optional object capability mode in which all commands are disabled except for CMD_OCcall. In this mode the server does not send an ID string, but instead sends a regular QAP1 message with CMD_OCinit. This message is guaranteed to have at least 16 bytes of payload so it will satisfy the read for an ID string. The command has been chosen to correspond to "RsOC" (in little-endian) as to identify this mode. The payload is DT_SEXP which holds all initial capabilities that can be used in CMD_OCcall. Each CMD_OCcall is DT_SEXP encoding a call (i.e., LANGSXP) with an OCref object in place of the closure. Rserve will de-reference it before calling eval. The main purpose of this mode is to create a basis for a secure interface where arbitrary evaluation is not possible. Only code exposed by capabilities can be executed.
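The handshake difference described in the news entry could be detected on the client side roughly like this. It is only a sketch: `classify_handshake` is a hypothetical helper, not part of pyRserve, and it assumes the classic ID string starts with the bytes "Rsrv".

```python
import struct

# "RsOC" read as a little-endian 32-bit integer, per the news entry above.
CMD_OCINIT = struct.unpack('<I', b'RsOC')[0]


def classify_handshake(first_bytes):
    """Guess the server mode from the first bytes it sends (hypothetical helper)."""
    if len(first_bytes) < 4:
        raise ValueError('need at least 4 bytes to classify the handshake')
    magic = first_bytes[:4]
    if magic == b'Rsrv':
        return 'id-string'   # assumed: classic mode, the ID string follows
    if magic == b'RsOC':
        return 'ocap'        # object capability mode: QAP1 message with CMD_OCinit
    return 'unknown'
```

Since the OCinit message is guaranteed to carry at least 16 bytes of payload, a client could keep reading the same number of bytes it reads today and branch on the result afterwards.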
currently, the S4 data typecode isn’t supported. here is a bit of info on how they work:
S4 classes are simply data-less objects of a special type (S4SXP) with attributes.
a formal class definition (e.g. getClass('dgeMatrix') → .__C__dgeMatrix) contains data useful for validation (i.e. the types the attributes may take on and validation functions) and the class hierarchy.
attributes present in the class definition are called “slots” and can be accessed by slot() and the @ operator.
a NULL in a slot is represented by the name object `\001NULL\001` (since attributes can’t be NULL itself).
i propose we simply convert them to a class S4(dict) that has all the slots as dict entries. if users want the class metadata, they can call getClass or more specific functions in R, and when you send a S4(slot1=spam, slot2=eggs) from python to R, R will do the validation.
deserialization would be easy: just extract and convert the attributes, making sure to convert `\001NULL\001` → None
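A minimal sketch of what the proposed container might look like. Everything here is hypothetical (the `from_attributes` constructor, the helper name, and the assumption that the parser hands over the attributes as (name, value) pairs); it only illustrates the marker-to-None conversion.

```python
# Name object R uses to stand in for NULL inside a slot, as described above.
S4_NULL_MARKER = '\x01NULL\x01'


class S4(dict):
    """Hypothetical container: one dict entry per slot of an S4 object."""

    @classmethod
    def from_attributes(cls, attributes):
        """Build an S4 from (slot name, value) pairs, mapping the marker to None."""
        return cls(
            (name, None if _is_null_marker(value) else value)
            for name, value in attributes
        )


def _is_null_marker(value):
    # Only a plain string equal to the marker counts; arrays and lists pass through.
    return isinstance(value, str) and value == S4_NULL_MARKER
```

Because S4 subclasses dict, the keyword form from the proposal, S4(slot1=spam, slot2=eggs), works without extra code.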
Great package. I'd like to be able to go back and forth between TaggedLists and ordered dictionaries. I was thinking that we could throw a warning if the TaggedList has multiple keys with the same name, like this:
from pyRserve import TaggedList
from collections import OrderedDict
import warnings

class TaggedListKeyDuplicates(Warning):
    pass

class TaggedList2(TaggedList):
    @classmethod
    def from_dict(cls, dict_):
        return TaggedList2(list(dict_.items()))

    def asdict(self):
        if len(set(self.keys)) != len(self.keys):
            warnings.warn('Items in list have non-unique names; data may be lost', TaggedListKeyDuplicates)
        return OrderedDict(self.astuples())
So you could use it like this:
d = TaggedList2.from_dict({'a':[1,2,3], 'b':2})
d.asdict()
If I packed this up w/ some docs and unit-tests, would you accept a PR?
First of all, thanks for this very useful package. Based on the traceback, I can only assume this is a parser error; if it is not, sorry.
[127.0.0.1:2200] out: >>> c.eval("packageVersion('RSQLite')")
[127.0.0.1:2200] out: [array([1, 0, 0], dtype=int32)]
# Connect to a DB, or create it in the wd() if it doesn't exist
[127.0.0.1:2200] out: >>> c.eval("con <- dbConnect(RSQLite::SQLite(), dbname='testdb')")
[127.0.0.1:2200] out: Traceback (most recent call last):
[127.0.0.1:2200] out: File "<console>", line 1, in <module>
[127.0.0.1:2200] out: File "/server/env.example.com/local/lib/python2.7/site-packages/pyRserve/rconn.py", line 76, in decoCheckIfClosed
[127.0.0.1:2200] out: return func(self, *args, **kw)
[127.0.0.1:2200] out: File "/server/env.example.com/local/lib/python2.7/site-packages/pyRserve/rconn.py", line 159, in eval
[127.0.0.1:2200] out: message = rparse(src, atomicArray=atomicArray)
[127.0.0.1:2200] out: File "/server/env.example.com/local/lib/python2.7/site-packages/pyRserve/rparser.py", line 606, in rparse
[127.0.0.1:2200] out: return rparser.parse()
[127.0.0.1:2200] out: File "/server/env.example.com/local/lib/python2.7/site-packages/pyRserve/rparser.py", line 414, in parse
[127.0.0.1:2200] out: message = self._parse()
[127.0.0.1:2200] out: File "/server/env.example.com/local/lib/python2.7/site-packages/pyRserve/rparser.py", line 441, in _parse
[127.0.0.1:2200] out: expression = self._parseExpr()
[127.0.0.1:2200] out: File "/server/env.example.com/local/lib/python2.7/site-packages/pyRserve/rparser.py", line 448, in _parseExpr
[127.0.0.1:2200] out: lexeme = self.lexer.nextExprHdr()
[127.0.0.1:2200] out: File "/server/env.example.com/local/lib/python2.7/site-packages/pyRserve/rparser.py", line 265, in nextExprHdr
[127.0.0.1:2200] out: (hex(rTypeCode), startLexpos, length))
[127.0.0.1:2200] out: RParserError: Unknown SEXP type 0x7 found at lexpos 20, length 180
# The connection and the DB is made, despite the warnings
[127.0.0.1:2200] out: >>> c.eval('class(con)')
[127.0.0.1:2200] out: AttrArray(['SQLiteConnection'],
[127.0.0.1:2200] out: dtype='|S16', attr={'package': array(['RSQLite'],
[127.0.0.1:2200] out: dtype='|S7')})
Just for reference, the same thing in R would be:
> con <- dbConnect(RSQLite::SQLite(), dbname='testdb')
> con
<SQLiteConnection>
> class(con)
[1] "SQLiteConnection"
attr(,"package")
[1] "RSQLite"
I could imagine a parser error on <SQLiteConnection>, but note that there is no output from the command generating the error, con <- dbConnect(RSQLite::SQLite(), dbname='testdb').
I believe this should reproduce it:
$ R CMD Rserve --no-save --RS-port 54321
R version 3.6.0 (2019-04-26) -- "Planting of a Tree"
[...]
Rserve started in daemon mode.
$ python -c "import pyRserve; import numpy; print(pyRserve.__version__); print(numpy.__version__); pyRserve.connect(port=54321).eval('1')"
0.9.1
1.16.3
[...]/pyRserve/rparser.py:316: DeprecationWarning: The binary mode of fromstring is deprecated, as it behaves surprisingly on unicode inputs. Use frombuffer instead
data = numpy.fromstring(raw, dtype=numpyMap[lexeme.rTypeCode])
Exception class EndOfDataError(RserveError) in rparser.py should instead extend from rexceptions.PyRserveError.
Somewhere around 32M entries, vectors can’t be transferred anymore. i assume DT_LARGE would be used for that?
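If DT_LARGE is indeed the relevant mechanism, building a header could look roughly like the sketch below. It assumes the layout from Rserve's protocol definitions as I understand them: one type byte plus a 24-bit length in a little-endian word, widened by four more bytes when the DT_LARGE flag (assumed to be 64) is set. `pack_header` is a hypothetical helper, not pyRserve code.

```python
import struct

DT_LARGE = 64             # assumed flag value from Rserve's protocol definitions
MAX_SHORT_LEN = 0xFFFFFF  # largest length that fits in 24 bits


def pack_header(type_code, length):
    """Return a 4-byte header, or an 8-byte one when the length needs DT_LARGE."""
    if length <= MAX_SHORT_LEN:
        return struct.pack('<I', (length << 8) | type_code)
    # Low 24 bits of the length share the first word with the flagged type byte;
    # the remaining high bits go into a second 32-bit word.
    low = ((length & 0xFFFFFF) << 8) | type_code | DT_LARGE
    high = length >> 24
    return struct.pack('<II', low, high)
```

Whether this is what actually limits the transfer size reported here would need checking against the serializer.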
I'm new to pyRserve and Rserve, and I have a problem with data.frame representation. I have a function that reads a data.frame object and prints the content of one variable:
### test.R
r_test <- function(dataset, varname) {
  print(dataset[, varname])
}
and a main function in Python that uses the previous function on the iris data.frame:
### main.py
import pyRserve
conn = pyRserve.connect()
conn.r.data('iris')
conn.r.source("tmp.R")
conn.r.r_test(conn.r.iris, 'Species')
This returns an error:
pyRserve.rexceptions.REvalError: Error in dataset[, varname] : number of dimensions incorrect
I tried to assign a value to a TaggedList element, but I got the following error:
TypeError: 'builtin_function_or_method' object is not subscriptable
The cause appears to be a mistake in how the index function is called in __setitem__: square brackets are used instead of parentheses. __delitem__ has the same mistake. Compare to __getitem__, which works correctly. The following is from the TaggedList definition in taggedContainers.py from master:
    def __getitem__(self, i):
        if type(i) == str:
            i = self.keys.index(i)
        return self.values[i]

    def __setitem__(self, i, item):
        if type(i) == str:
            i = self.keys.index[i]
        self.values[i] = item

    def __delitem__(self, i):
        if type(i) == str:
            i = self.keys.index[i]
        del self.keys[i]
        del self.values[i]
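For illustration, here is the same logic with index called rather than subscripted. `MiniTaggedList` is a stripped-down, hypothetical stand-in so the example runs on its own; the real TaggedList has more to it.

```python
class MiniTaggedList(object):
    """Stripped-down stand-in for TaggedList, only to illustrate the fix."""

    def __init__(self, pairs):
        self.keys = [key for key, _ in pairs]
        self.values = [value for _, value in pairs]

    def __getitem__(self, i):
        if isinstance(i, str):
            i = self.keys.index(i)
        return self.values[i]

    def __setitem__(self, i, item):
        if isinstance(i, str):
            i = self.keys.index(i)   # call index(), do not subscript it
        self.values[i] = item

    def __delitem__(self, i):
        if isinstance(i, str):
            i = self.keys.index(i)   # same fix as in __setitem__
        del self.keys[i]
        del self.values[i]
```

With this change, assigning or deleting by tag name behaves the same way as reading by tag name.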
I'm trying to use pyRserve to retrieve objects returned by functions in R's survey package. However, for some objects I get the following error:
...
File "c:\programs\Anaconda3\lib\site-packages\pyRserve\rparser.py", line 547, in xt_array
data.attr[tag] = value
TypeError: list indices must be integers, not str
I can successfully retrieve other survey objects from my R program, so I don't think it's an error in how I'm writing the code. Here's an example of the structure of one of the objects I'm having trouble with:
> str(rowper_marg)
Class 'svystat' atomic [1:4] 0.0216 0.0504 0.2164 0.7116
..- attr(*, "var")= num [1:4, 1:4] 1.14e-06 -2.25e-07 -4.29e-08 -8.77e-07 -2.25e-07 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:4] "RSKPKCIG(1) No risk" "RSKPKCIG(2) Slight risk" "RSKPKCIG(3) Moderate risk" "RSKPKCIG(4) Great risk"
.. .. ..$ : chr [1:4] "RSKPKCIG(1) No risk" "RSKPKCIG(2) Slight risk" "RSKPKCIG(3) Moderate risk" "RSKPKCIG(4) Great risk"
..- attr(*, "statistic")= chr "mean"
I can work around the error by using accessor functions provided by survey to return simpler objects which do work with pyRserve. For example, I can use the coef function to get just the coefficients stored in the svystat object. The structure of that object in R looks like:
> str(coef(rowper_marg))
Named num [1:4] 0.0216 0.0504 0.2164 0.7116
- attr(*, "names")= chr [1:4] "RSKPKCIG(1) No risk" "RSKPKCIG(2) Slight risk" "RSKPKCIG(3) Moderate risk" "RSKPKCIG(4) Great risk"
And when I pull this in through pyRserve, the resulting Python object looks like:
TaggedArray([ 0.02155724, 0.05039401, 0.21641401, 0.71163474], key=['RSKPKCIG(1) No risk', 'RSKPKCIG(2) Slight risk', 'RSKPKCIG(3) Moderate risk', 'RSKPKCIG(4) Great risk'])
I'm using pyRserve 0.8.4 on Windows 7 with R 3.1.2. I could test on Linux if you think it would make a difference. I'm happy to provide additional information if necessary.
import pyRserve produces an error in Python 3.4.1. The problem is the except statement in rconn.py on line 79:
except socket.error, msg:
Using the comma in except statements is invalid syntax in Python 3. Using as in place of the comma works in Python 3, but I'm not sure if it will work in the older supported Python versions.
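On the compatibility question: the as form has been accepted since Python 2.6, so it should only be a problem if versions older than that still need to be supported. A small sketch of the pattern, where `describe_socket_failure` and `failing_action` are hypothetical names used only for this example:

```python
import socket


def describe_socket_failure(action):
    """Run action() and return the error text if it raises socket.error."""
    try:
        action()
    except socket.error as msg:   # valid in Python 2.6+ and in Python 3
        return str(msg)
    return None


def failing_action():
    # Stand-in for a connection attempt that fails.
    raise socket.error('connection refused (simulated)')
```

Calling describe_socket_failure(failing_action) returns the simulated error text instead of raising.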
yet setup.py declares it as such.
Could you add pytest in setup.py using tests_require=['pytest'], instead?
Just found this package, I think it will be useful. I configured my Rserve to require authentication, but I can't figure out how to reciprocate on the client.
Hi all,
pyRserve is a very nice project, and I want to use it in my web app. My question is how to make the conn.voidEval() function asynchronous, for example by combining it with aiohttp.
Thanks in advance.
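One common approach for a blocking client call is to push it onto a worker thread from the event loop. The sketch below shows that pattern with asyncio only; `blocking_void_eval` is a hypothetical stand-in for a call such as conn.voidEval(expression), so that the example runs without an Rserve instance.

```python
import asyncio
import time


def blocking_void_eval(expression):
    """Hypothetical stand-in for a blocking call such as conn.voidEval(expression)."""
    time.sleep(0.01)
    return None


async def void_eval_async(expression):
    """Run the blocking call in the default thread pool so the event loop stays free."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_void_eval, expression)


async def main():
    # Several evaluations are awaited concurrently; each runs in a worker thread.
    return await asyncio.gather(*(void_eval_async('x <- %d' % i) for i in range(3)))
```

Inside an aiohttp handler the same `await void_eval_async(...)` call would apply. Whether a single connection can safely be shared between threads is an open question here, so one connection per worker may be the safer assumption.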
see here
>>> a = np.array([('a', 1), ('b', 2)], dtype=[('h1', '<U1'), ('h2', '<i4')])
>>> a['h1']
array(['a', 'b'], dtype='<U1')
>>> a[a.dtype.names[0]]
array(['a', 'b'], dtype='<U1')
>>> a[0]
('a', 1)
and if we really need easy indexing by col number, this class is better in every respect (performance, versatility, …)
This should reproduce it:
$ R CMD Rserve --no-save --RS-port 54321
R version 3.6.0 (2019-04-26) -- "Planting of a Tree"
[...]
$ python --version
Python 3.6.8
$ python -c "import pyRserve; print(pyRserve.__version__); pyRserve.connect(port=54321).eval('NA_character_')"
0.9.1
[...]
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
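The traceback suggests that the payload for NA_character_ begins with the byte 0xff, which is never valid as the start of a UTF-8 sequence. A possible guard is sketched below. It assumes, based only on the error message, that a lone 0xff byte marks an NA string; `decode_r_string` is a hypothetical helper and not the parser's actual code.

```python
NA_STRING_MARKER = b'\xff'   # assumed NA encoding, inferred from the traceback


def decode_r_string(raw):
    """Decode one string payload, returning None for the assumed NA marker."""
    if raw == NA_STRING_MARKER:
        return None
    return raw.decode('utf-8')
```

Mapping NA to None is only one option; the parser could equally return some dedicated NA sentinel.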
If you try to install from the source distribution hosted on PyPI, you get
* Getting build dependencies for wheel...
Traceback (most recent call last):
File "/usr/lib/python3.11/site-packages/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
main()
[…]
File "/usr/lib/python3.11/site-packages/setuptools/build_meta.py", line 338, in run_setup
exec(code, locals())
File "<string>", line 7, in <module>
FileNotFoundError: [Errno 2] No such file or directory: 'requirements.txt'
I suggest switching to pyproject.toml and away from setuptools to stop having to deal with its MANIFEST.in silliness.
Python 3.5.1 (default, Nov 10 2016, 10:33:53)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyRserve
>>> conn = pyRserve.connect()
>>> print(conn)
<Handle to Rserve on localhost:6311>
>>> conn.eval('2+3')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/toto/.pyenv/versions/tata/lib/python3.5/site-packages/pyRserve/rconn.py", line 78, in decoCheckIfClosed
return func(self, *args, **kw)
File "/home/toto/.pyenv/versions/tata/lib/python3.5/site-packages/pyRserve/rconn.py", line 157, in eval
self._reval(aString, void)
File "/home/toto/.pyenv/versions/tata/lib/python3.5/site-packages/pyRserve/rconn.py", line 144, in _reval
rEval(aString, fp=self.sock, void=void)
File "/home/toto/.pyenv/versions/tata/lib/python3.5/site-packages/pyRserve/rserializer.py", line 404, in rEval
return s.finalize()
File "/home/toto/.pyenv/versions/tata/lib/python3.5/site-packages/pyRserve/rserializer.py", line 103, in finalize
self._buffer.write('\x00\x00\x00\x00') # data offset, zero by default
TypeError: a bytes-like object is required, not 'str'
>>> quit()
It works if I comment out the following line:
self._buffer.write('\x00\x00\x00\x00') # data offset, zero by default
Any idea?
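The TypeError indicates that the buffer only accepts bytes on Python 3, while '\x00\x00\x00\x00' is a str there. A sketch of the difference, assuming `_buffer` behaves like io.BytesIO (an inference from the error message, not checked against rserializer.py):

```python
import io
import struct

buffer = io.BytesIO()

try:
    buffer.write('\x00\x00\x00\x00')       # str: rejected on Python 3
    str_write_failed = False
except TypeError:
    str_write_failed = True

buffer.write(b'\x00\x00\x00\x00')          # bytes literal: works on Python 2.6+ and 3
buffer.write(struct.pack('<I', 0))         # same four bytes, spelled as a 32-bit zero

written = buffer.getvalue()
```

Adding the b prefix keeps the four offset bytes in the message, whereas commenting the line out drops them.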
Dear Sir,
I found that the package works well for light calculations. However, in my case, I pass a list of tensors to perform a tensor calculation. It reported an error (An existing connection was forcibly closed by the remote host) when running rparser.py.
The tensors are sparse. I am not sure what work rparser.py is doing. I am looking forward to your reply.
Sincerely
Can one call R graphics functions from Python via pyRserve?
could you please do
git rebase -i d7a999f^
#now change the “pick” in front of the hash d7a999f to “edit”
git commit --amend --author="Phil Schaf <[email protected]>"
git rebase --continue
git push -f
i’d really like to be listed as contributor by github :)