Comments (4)
from ipyparallel.
Got it. I don't think there's a feature request here, but perhaps a need for an example or some docs. I don't know much about torch or what DDP is, but if you can provide some examples of how workers are started with it, I may be able to give a hint on how to get them into IPython Parallel. Usually the easiest way is to start IPython Parallel and then run whatever startup code the workers use in the IPython session.
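As a rough illustration of that first approach, here is a minimal sketch that starts a local cluster and then runs the usual `torch.distributed` startup on each engine. It assumes a single machine, the `gloo` backend, and a free port 29500; `ddp_env`, `init_ddp`, and `main` are hypothetical helper names, not part of either library.

```python
def ddp_env(rank, world_size, master_addr="127.0.0.1", master_port=29500):
    """Build the environment variables torch.distributed reads for env:// init.

    The address and port defaults are assumptions for a single-machine run.
    """
    return {
        "MASTER_ADDR": master_addr,
        "MASTER_PORT": str(master_port),
        "RANK": str(rank),
        "WORLD_SIZE": str(world_size),
    }


def init_ddp(env, backend="gloo"):
    """Run on each engine: join the process group and return this engine's rank."""
    # Imports live inside the function because it executes in the engine process.
    import os
    import torch.distributed as dist

    os.environ.update(env)
    dist.init_process_group(backend=backend)  # uses env:// by default
    return dist.get_rank()


def main(n=2):
    """Start n local engines and initialize a process group across them."""
    import ipyparallel as ipp

    with ipp.Cluster(n=n) as rc:
        # Submit asynchronously: init_process_group blocks until every rank joins.
        pending = [
            rc[engine_id].apply_async(init_ddp, ddp_env(rank, n))
            for rank, engine_id in enumerate(rc.ids)
        ]
        print([ar.get() for ar in pending])
        # While the cluster is up, rc[:] is a handle for running more code
        # on every engine, e.g. rc[:].apply_sync(some_function).
```

Calling `main()` from a script or notebook would start the engines; in a notebook you would more likely keep the client around instead of using the `with` block, so the engines stay alive for interactive poking.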
The second way is to use some code injection to launch an IPython interpreter in each worker after they are launched in whatever way your tool usually does. That's all the information I can give without knowing anything about the situation.
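For the second approach, one possible sketch is a small hook that each worker calls once it is running. It uses `IPython.embed_kernel`, which starts a plain IPython kernel in the process rather than an IPython Parallel engine. The `DEBUG_EMBED_RANK` variable and the `maybe_embed_kernel` name are made up for illustration; `RANK` is assumed to be set by the launcher, as `torchrun` does.

```python
import os


def maybe_embed_kernel(local_ns, environ=os.environ, flag="DEBUG_EMBED_RANK"):
    """Start an IPython kernel in this worker if its rank matches the flag.

    `flag` names a hypothetical environment variable chosen for this sketch.
    Returns False without importing IPython when the worker is not selected.
    """
    target = environ.get(flag)
    if target is None or environ.get("RANK") != target:
        return False
    import IPython  # imported lazily so unselected workers do not need it

    # Blocks this worker until the kernel is shut down.
    IPython.embed_kernel(local_ns=local_ns)
    return True
```

A worker script might call `maybe_embed_kernel(locals())` at the point of interest, and you could then attach with `jupyter console --existing`. Keep in mind that the selected rank stops there, so any collective operation on the other ranks will wait for it.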
from ipyparallel.
Is there a task here? I'm not sure why this has been opened as an issue on this repo.
You can use IPython Parallel for a certain kind of debugging in parallel (it's not a debugger and certainly not a parallel debugger, which is very challenging), but it can be used to get an interactive interpreter in each of your worker processes for poking around.
from ipyparallel.
Having an interactive interpreter during parallel training is already quite appealing, especially now that deep learning increasingly relies on a significant amount of resources. This could be a scenario where ipyparallel can shine.
This issue is primarily a feature request or question, because initiating torch DDP requires some additional setup, which remains challenging for regular users like myself.
from ipyparallel.