uqfoundation / pathos
parallel graph management and execution in heterogeneous computing
Home Page: http://pathos.rtfd.io
License: Other
both get parent of pid, and get all children of ppid; similar for pgid
better fit in pox?
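A rough sketch of the helpers this issue asks for, using only the stdlib. The function names and signatures here are hypothetical, not the pathos (or pox) API, and the `--ppid` flag assumes GNU ps, so the children lookup is Linux-oriented:

```python
# hypothetical pid/ppid helpers; not the actual pathos.core API
import os
import subprocess

def get_ppid(pid=None):
    """Return the parent pid of `pid` (default: the current process)."""
    if pid is None:
        return os.getppid()
    out = subprocess.check_output(['ps', '-o', 'ppid=', '-p', str(pid)])
    return int(out)

def get_children(ppid):
    """Return the pids of all direct children of `ppid` (GNU ps only)."""
    out = subprocess.check_output(['ps', '-o', 'pid=', '--ppid', str(ppid)])
    return [int(tok) for tok in out.split()]
```

A pgid variant would follow the same shape with `ps -o pgid=`.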
convert pp-1.6.4 fork to python3
needs test cases for both 'interactive' and 'scripted' behavior (see #12)
Something like:
from pathos.connection import Launcher
from pathos.secure.connection import Launcher
from pathos.secure.copier import Launcher
from pathos.secure.tunnel import Tunnel
and pathos.core
to:
from pathos import copy, getpid, getppid # etc
While trying to install processing-0.52-pathos.zip on my 64-bit Windows machine:
pip install processing-0.52-pathos.zip
... I get an error (see below). It seems like Windows-64-bit is not supported.
Are there any plans to support Windows-64 bit?
Thanks.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "c:\users\<username>\appdata\local\temp\pip-rwarpe-build\setup.py", line 210, in <module>
    'Programming Language :: Python',
  File "C:\Python27\lib\distutils\core.py", line 151, in setup
    dist.run_commands()
  File "C:\Python27\lib\distutils\dist.py", line 953, in run_commands
    self.run_command(cmd)
  File "C:\Python27\lib\distutils\dist.py", line 972, in run_command
    cmd_obj.run()
  File "C:\Python27\lib\site-packages\setuptools\command\install.py", line 61, in run
    return orig.install.run(self)
  File "C:\Python27\lib\distutils\command\install.py", line 563, in run
    self.run_command('build')
  File "C:\Python27\lib\distutils\cmd.py", line 326, in run_command
    self.distribution.run_command(command)
  File "C:\Python27\lib\distutils\dist.py", line 972, in run_command
    cmd_obj.run()
  File "C:\Python27\lib\distutils\command\build.py", line 127, in run
    self.run_command(cmd_name)
  File "C:\Python27\lib\distutils\cmd.py", line 326, in run_command
    self.distribution.run_command(command)
  File "C:\Python27\lib\distutils\dist.py", line 972, in run_command
    cmd_obj.run()
  File "C:\Python27\lib\site-packages\setuptools\command\build_ext.py", line 54, in run
    _build_ext.run(self)
  File "C:\Python27\lib\distutils\command\build_ext.py", line 337, in run
    self.build_extensions()
  File "C:\Python27\lib\distutils\command\build_ext.py", line 446, in build_extensions
    self.build_extension(ext)
  File "C:\Python27\lib\site-packages\setuptools\command\build_ext.py", line 187, in build_extension
    _build_ext.build_extension(self, ext)
  File "C:\Python27\lib\distutils\command\build_ext.py", line 496, in build_extension
    depends=ext.depends)
  File "C:\Python27\lib\distutils\msvc9compiler.py", line 473, in compile
    self.initialize()
  File "C:\Python27\lib\distutils\msvc9compiler.py", line 383, in initialize
    vc_env = query_vcvarsall(VERSION, plat_spec)
  File "C:\Python27\lib\distutils\msvc9compiler.py", line 299, in query_vcvarsall
    raise ValueError(str(list(result.keys())))
ValueError: [u'path']
Cleaning up...
Command C:\Python27\python.exe -c "import setuptools, tokenize;__file__='c:\users\<username>\appdata\local\temp\pip-rwarpe-build\setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record c:\users\<username>\appdata\local\temp\pip-iw7i9v-record\install-record.txt --single-version-externally-managed --compile failed with error code 1 in c:\users\<username>\appdata\local\temp\pip-rwarpe-build
Storing debug log for failure in C:\Users\<username>\pip\pip.log
this example could also be a first step at making a simple parallel map through ssh.
should popen-based methods in pathos.core be extended to meet the pipe and map API?
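One possible shape for such an extension, sketched with the stdlib: each input is fed on stdin to a subordinate process and the outputs are collected in order. `popen_map` is a hypothetical name, not the pathos.core API:

```python
# sketch of a popen-based map; hypothetical, not the pathos.core interface
import subprocess
import sys

def popen_map(command, inputs):
    """Run `command` once per input, piping each input to its stdin."""
    results = []
    for item in inputs:
        proc = subprocess.Popen(command, stdin=subprocess.PIPE,
                                stdout=subprocess.PIPE, text=True)
        out, _ = proc.communicate(str(item))
        results.append(out.strip())
    return results

# e.g. square numbers in subordinate python interpreters
square = [sys.executable, '-c', 'import sys; print(int(sys.stdin.read())**2)']
```

The same shape would extend to ssh by prefixing the command with `['ssh', host, ...]`, which is presumably where a pipe/map API convergence would pay off.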
add tutorial to cover major features; should be built white-paper style to demonstrate solving a problem or set of problems
I have a large sequence of data on disk. I'd like to stream it into memory, distribute it to various processes where expensive work is done, collect the results piece by piece onto a master process and perform a reduction. I.e. I would like the following to work:
from pathos.multiprocessing import Pool
p = Pool(32)
seq = load_lazily_from_disk(...)
out = p.map(func, seq, chunksize=100)
result = reduce(binop, out)
Sadly this doesn't work, because p.map fully evaluates my sequence. See:
def mapAsync(self, func, iterable, chunksize=None, callback=None):
    '''
    Asynchronous equivalent of `map()` builtin
    '''
    assert self._state == RUN
    if not hasattr(iterable, '__len__'):
        iterable = list(iterable)  # <--- fully evaluates the sequence
Maybe we can avoid this somehow? The combination of multiprocessing and streaming would be very helpful for me in particular (and others generally I think.)
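In the meantime, a chunked workaround can keep memory bounded: slice the lazy sequence into fixed-size blocks and reduce each block's results as they arrive. This is a sketch; `chunks` and `streaming_mapreduce` are illustrative names, and the sequential builtin `map` stands in where `pool.map` would go:

```python
# streaming map-reduce sketch: only one chunk is materialized at a time
from functools import reduce
from itertools import islice
import operator

def chunks(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block

def streaming_mapreduce(func, seq, binop, initial, chunksize=100):
    """Map `func` over `seq` chunk by chunk, folding results into `initial`."""
    acc = initial
    for block in chunks(seq, chunksize):
        out = map(func, block)  # swap in pool.map here for parallelism
        acc = reduce(binop, out, acc)
    return acc

# sum of squares over a lazily generated sequence
total = streaming_mapreduce(lambda x: x * x, iter(range(10)),
                            operator.add, 0, chunksize=4)
```

The tradeoff is that each chunk is a synchronization barrier, so chunksize trades memory for pool utilization.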
Tried to install pathos. I followed the instructions according to the readme file in pathos/external, with two exceptions:
To install the external packages:
I had to use pp-1.6.4.4.zip, otherwise the pathos installation tried to download 1.6.4.4 and failed with an error: ImportError: cannot import name version
$ unzip pp-1.6.4.2.zip
$ cd ../pp-1.6.4.2
$ python setup.py build
$ python setup.py install
The setup.py file was not in the folder pathos/external/pyre but in the folder pathos/external
$ cd ../pyre-0.8-pathos
$ python setup.py build
$ python setup.py install
After these changes the installation of pathos seemed to work.
lawyers...
I love being able to just add ==, e.g. Django==1.7.1, to my requirements file, and have it automatically install whenever I call pip install -r requirements.txt. I was simply wondering if pathos could be added to pip so that I would be able to do that, as pathos seems like a really cool library.
fold in hydra-style graph proxies
pathos.core needs refactoring and extension to provide popen/ssh pipes and maps, fitting with the rest of the package.
>>> from pathos.pp import ParallelPythonPool
>>> pool = ParallelPythonPool()
>>> pool
<pool ParallelPythonPool(ncpus=*, servers=None)>
>>>
>>> f = lambda x:x**2
>>>
>>> # this works, because dill looks for a lambda "named" f
>>> pool.map(f, [1,2,3,4,5])
[1, 4, 9, 16, 25]
>>>
>>> # this fails to find our new lambda, as it's not in the calling namespace.
>>> pool.map(lambda x:x**3, [1,2,3,4,5])
[1, 4, 9, 16, 25]
If the order of the above map calls was reversed, a NameError (or some error) should be thrown. However, instead, the pp session hangs.
multiprocessing has a known issue where certain global random states (and thus the seeds) are copied to all the spawned processes (most notably for anything depending on numpy.random). A potential fix for this is to provide processing (the pathos fork) or pathos.multiprocessing with a "set random_state" function. Then optionally, provide some API extension to enable easy triggering of the "special" seeding (i.e. generating a new random state for each process).
after running mystic/examples_UQ/MM2_surrogate_diam_batchgrid.py, the spawned python instances are not shut down… thus requiring killall python to be run to kill all the zombie jobs.
on linux cluster with python 2.6.4, MM2_surrogate_diam_batchgrid.py fails with:
Traceback (most recent call last):
  File "MM2_surrogate_diam_batchgrid.py", line 197, in <module>
    diameter = UQ(RVstart,RVend,lower_bounds,upper_bounds)
  File "MM2_surrogate_diam_batchgrid.py", line 126, in UQ
    results = Pool(nnodes).map(optimize, cf,lb,ub,nb)
  File "/home/mmckerns/lib/python2.6/site-packages/pathos-0.2a1.dev-py2.6.egg/pathos/multiprocessing.py", line 108, in map
    return __STATE['pool'].map(star(f), zip(*args)) # chunksize
  File "/home/mmckerns/lib/python2.6/site-packages/processing-0.52_pathos-py2.6-linux-x86_64.egg/processing/pool.py", line 130, in map
    return self.mapAsync(func, iterable, chunksize).get()
  File "/home/mmckerns/lib/python2.6/site-packages/processing-0.52_pathos-py2.6-linux-x86_64.egg/processing/pool.py", line 373, in get
    raise self._value
thread.error: can't start new thread
similarly, MPI2_surrogate_diam_batchgrid.py fails with:
Traceback (most recent call last):
  File "/home/mmckerns/bin/ezpool.py", line 5, in <module>
    pkg_resources.run_script('pyina==0.2a1.dev', 'ezpool.py')
  File "/home/mmckerns/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/pkg_resources.py", line 489, in run_script
  File "/home/mmckerns/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/pkg_resources.py", line 1207, in run_script
  File "/home/mmckerns/lib/python2.6/site-packages/pyina-0.2a1.dev-py2.6.egg/EGG-INFO/scripts/ezpool.py", line 57, in <module>
    res = parallel_map(func, *args, **kwds) #XXX: called on ALL nodes ?
  File "/home/mmckerns/lib/python2.6/site-packages/pyina-0.2a1.dev-py2.6.egg/pyina/mpi_pool.py", line 65, in parallel_map
    pool = MPool(1) #XXX: poor pickling... use iSend/iRecv instead?
  File "/home/mmckerns/lib/python2.6/site-packages/processing-0.52_pathos-py2.6-linux-x86_64.egg/processing/pool.py", line 100, in __init__
    self._task_handler.start()
  File "/usr/local/python-2.6.4/lib/python2.6/threading.py", line 471, in start
    _start_new_thread(self.__bootstrap, ())
thread.error: can't start new thread
solved: [104.90247657509542, 1.0639249013840923e-10, 2.2856829028068693, 60.000000000004114]
solved: [98.740640792599351, 0.68621813955341593, 2.1000000001085253, 2.7999999999909924]
Both work fine on mac OSX with python 2.7.8 and 2.6.9
For pathos.multiprocessing, async_map produces the correct results...
dude@hilbert>$ python async_map.py
<pool ThreadingPool(nthreads=4)>
y = busy_add(x1,x2)
x1 = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
x2 = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
I'm sleepy...
Z z z z z z z z z z z z z z z z z z z z z z z z z z z
I'm awake
y = [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
However, for pathos.pp, the results are wrong…
dude@hilbert>$ python async_map.py
<pool ParallelPythonPool(ncpus=4, servers=None)>
y = busy_add(x1,x2)
x1 = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
x2 = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
I'm sleepy...
Z z z z z z z z z z z z z z z z z z z z z z z z z
I'm awake
y = [0, None, None, None, None, None, None, None, None, None]
The design and implementation of the ssh launchers in IPython.parallel.apps.launcher are similar to those of the pathos ssh launcher objects. These should use a common API (and probably be able to use paramiko).
some support is provided (e.g. through pp) for pools that spawn through a mix of remote and local workers. this needs to be generalized. for most cases, pathos graphs are fixed -- there can be nested parallelism, but in general mixed pools aren't possible right now. Should look more closely at scoop, as it appears they have some experience in this space.
The design and implementation of pool and map in IPython.parallel are similar to those of the pathos pool and map objects. These should use a common API.
possibly better naming is needed, at least some thought.
Migrated from: http://trac.mystic.cacr.caltech.edu/project/pathos/ticket/144
(test-pathos)mrocklin@linux2:~$ pip install pathos
Downloading/unpacking pathos
Could not find a version that satisfies the requirement pathos (from versions: 0.1a1)
Cleaning up...
No distributions matching the version for pathos
Storing complete log in /home/mrocklin/.pip/pip.log
should show an example both directly and through an ssh-tunnel
migrate fork of 1.5.7 to fork of 1.6.4
convert to python3.x
After checking out master and then running sudo python setup.py install, I see:
***********************************************************
WARNING: One of the following dependencies is unresolved:
pp >=1.6.4.4
pyre ==0.8.2.0-pathos
dill >=0.2.2
pox >=0.2.1
processing ==0.52-pathos
***********************************************************
Pathos relies on modified distributions of 'processing', 'pp', and 'pyre'.
Please download and install unresolved dependencies here:
http://dev.danse.us/packages/
or from the "external" directory included in the pathos source distribution.
but as far as I can tell, everything is installed:
$ pip freeze | grep processing
processing==0.52-pathos
$ pip freeze | grep pp
pp==1.6.4.4
$ pip freeze | grep pox
pox==0.2.1
$ pip freeze | grep dill
dill==0.2.2
$ pip freeze | grep pyre
pyre==0.8.2.0-pathos
pp-1.6.4.3 was moved to pp-1.6.4.4/python3, as it has slightly reduced functionality in python2 as compared to pp-1.6.4.1. pp-1.6.4.1 was slightly updated and moved to pp-1.6.4.4/python2. Some examples of the reduced functionality can be seen in issue #32.
Most examples work with pp-1.6.4.3 as well as pp-1.6.4.1. However, a few do not. These examples only work with pp-1.6.4.1, and will hang using pp-1.6.4.3:
(on MacOSX, using python 2.7.8 and 2.6.9)
this would help reduce cleanup when things go badly
On a linux cluster with python 2.6.4, pathos/examples/test_mpmap.py fails (and hangs):
Evaluate 5 items on 10 proc:
Traceback (most recent call last):
  File "test_mpmap.py", line 24, in <module>
    pool.ncpus = 10
  File "/home/mmckerns/lib/python2.6/site-packages/pathos-0.2a1.dev-py2.6.egg/pathos/multiprocessing.py", line 142, in __set_nodes
    __STATE['pool'] = Pool(nodes)
  File "/home/mmckerns/lib/python2.6/site-packages/processing-0.52_pathos-py2.6-linux-x86_64.egg/processing/pool.py", line 92, in __init__
    w.start()
  File "/home/mmckerns/lib/python2.6/site-packages/processing-0.52_pathos-py2.6-linux-x86_64.egg/processing/process.py", line 96, in start
    self._popen = Popen(self)
  File "/home/mmckerns/lib/python2.6/site-packages/processing-0.52_pathos-py2.6-linux-x86_64.egg/processing/forking.py", line 52, in __init__
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
However, on mac OSX with python 2.7.8 and python 2.6.9, it works fine.
Kindly help...
I have this simple script that runs on Linux (Ubuntu) without any problem, but gives AuthenticationError: digest sent was rejected on Windows. I know it's something to do with multiprocessing.Manager, but don't know how to fix it yet:
Here is more details: http://stackoverflow.com/questions/26600561/python-multiprocessing-pathos-authenticationerror-digest-sent-was-rejected
I would like to know why there aren't any pool.close(); pool.join() methods. Are they missing?
recent updates to pathos.core tend to cause scripts to hang, with kill -9 cleanup to do.
pathos/examples2/optimize_cheby_powell_mpimap.py works as expected on linux_cluster with python 2.6.4. On mac OSX with python 2.7.8 and python 2.6.9, this example also runs to completion with the correct result; however the print to stdout occurs N times when the code is run N-way parallel.
Hi,
I am trying to use pathos with pypy; however, I am getting errors related to ctypes, which is used by dill:
Traceback (most recent call last):
  File "app_main.py", line 75, in run_toplevel
  File "adjust_logs_parser.py", line 17, in <module>
    import pathos.multiprocessing as mp
  File "/usr/local/lib/python2.7/dist-packages/pathos-0.2a.dev_20130811-py2.7.egg/pathos/__init__.py", line 40, in <module>
    import multiprocessing
  File "/usr/local/lib/python2.7/dist-packages/pathos-0.2a.dev_20130811-py2.7.egg/pathos/multiprocessing.py", line 69, in <module>
    from pathos.helpers.mp_helper import starargs as star
  File "/usr/local/lib/python2.7/dist-packages/pathos-0.2a.dev_20130811-py2.7.egg/pathos/helpers/__init__.py", line 1, in <module>
    import pp_helper
  File "/usr/local/lib/python2.7/dist-packages/pathos-0.2a.dev_20130811-py2.7.egg/pathos/helpers/pp_helper.py", line 12, in <module>
    import dill as pickle
  File "/usr/local/lib/python2.7/dist-packages/dill/__init__.py", line 26, in <module>
    from .dill import dump, dumps, load, loads, dump_session, load_session,
  File "/usr/local/lib/python2.7/dist-packages/dill/dill.py", line 528, in <module>
    ctypes.pythonapi.PyCell_New.restype = ctypes.py_object
AttributeError: 'module' object has no attribute 'pythonapi'
Is pathos not compatible with pypy? Is there a workaround for the error?
Thanks,
Apparently it's possible to use multiprocessing for distributed computing -- this should be looked at as an alternative to pp.
what to do with the Server abstraction? is it needed? could it be utilized for pp and other backends? What about the existing XMLRPCserver?
As the fork of parallel python was updated to use dill and run on python 2.x or 3.x, it now requires six. However, pp does not install dependencies, so if six is missing, it is not installed. This breaks the pathos install.
some corner cases are broken with pp for pathos; see pathos/tests/test_pp.py (e.g. math.sin or __builtin__.abs)
I often accidentally import the wrong pool…
from pathos.multiprocessing import Pool
This currently is multiprocessing.Pool, and not
from pathos.multiprocessing import ProcessingPool
which is the pathos version that's the intended import. Some cleanup and strategic import restructuring is probably needed.
migrate fork from 0.52 to latest 2.x (and 3.x) series
could leverage shared memory using ctypes
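A minimal sketch of what ctypes-backed shared memory could look like, using the stdlib's multiprocessing.sharedctypes. Pathos does not currently expose this, so the wiring is an assumption about how it might be leveraged:

```python
# stdlib ctypes-backed shared memory; illustrative, not a pathos feature
import ctypes
from multiprocessing.sharedctypes import RawArray

# allocate a shared array of doubles; forked workers see the same buffer,
# so large arrays need not be pickled and copied per task
shared = RawArray(ctypes.c_double, 8)
shared[0] = 3.14  # writes are visible to all processes sharing the buffer
```

RawArray skips the lock that `multiprocessing.Array` carries, so concurrent writers would need their own synchronization.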