Comments (11)
Jiajun and I talked, and we have a theory of the cause of the bug. It also seems to be confirmed by an experiment.
We believe that the MVAPICH libraries have constructors that create sockets even before main(). Because DMTCP is first in library search order, it is initialized last, so the MVAPICH constructors run before DMTCP's.

Later, the DMTCP constructor (dmtcpWorker) calls scanForPreexisting(), which looks for things like named pipes, UNIX domain sockets (corresponding to filenames on disk), pty's, etc. These are declared as preexisting devices. So, at the time of DMTCP launch, the MVAPICH sockets are declared as preexisting.

Much later, during restart, DMTCP remembers that these are preexisting devices, and so the leader from leader election refuses to believe that these fd's are shared fd's. These fd's are therefore removed from outgoingCons. Meanwhile, the non-leader processes list these fd's in missingCons, and block while waiting for the leader to send these shared fd's.
We proved this theory by modifying scanForPreexisting() to exclude any device whose name begins with "socket" from the preexisting set. Then these fd's are no longer considered preexisting, and the bug goes away.
So, now we need to find a more robust bug fix. Maybe some devices from /proc/*/fd truly are preexisting, even if their names are of the form "socket:[...]". What is the best way to distinguish between the fd's created by the MVAPICH constructor and the truly preexisting ones?
from dmtcp.
I think the fix should be to somehow capture the call to socket() originating from the MVAPICH constructors. Can you check whether the socket() calls land in our wrappers? The question of truly preexisting sockets becomes moot if we use this fix.
If we can't do that, it's much harder to figure out a way to restore the socket in a generic way. We can probably find a hack for MVAPICH, but it would be nice to have a more generic solution.
Unfortunately, I wasn't able to capture the creation of the sockets. I wonder if there are other ways to create the sockets? I put JNOTE in all functions inside socketwrapper.cpp, but none of them was invoked.
That's interesting. Then it seems like the constructor inside MVAPICH is using a library call other than socket(). Could it be socketpair()?

Jiajun, since we have access to the MVAPICH developers, why not ask them where file descriptors 3 and 4 come from, while explaining that it seems to happen in a constructor before main(), probably as part of the MPI initialization.

I agree with Kapil's observation that if we can capture the original creation of the socket, then we will have a more robust way of determining preexisting sockets.
On Fri, Apr 10, 2015 at 12:26:57AM -0700, jiajuncao wrote:
> Unfortunately, I wasn't able to capture the creation of the sockets. I wonder if there are other ways to create the sockets? I put JNOTE in all functions inside socketwrapper.cpp, but none of them was invoked.
One possibility is to use strace to see the order of syscalls and get an idea of where the socket is created.
Actually, the sockets are there before dmtcp_prepare_wrappers() is called. Have we encountered this scenario before?
I think the sockets are inherited from the parent process. So we cannot assume that pre-existing sockets are not shared.
Do you mean the sockets are present within the dmtcp_launch process? This is less likely (although I have seen situations where the shell itself has a socket connection).

If the socket is not present in the dmtcp_launch process, then it must have been created during the application launch. The only way to figure that out is to use strace and do some analysis. You might also want to look for an odd syscall that creates a socket as a side-effect.
I think it's the former case: the sockets are present in the dmtcp_launch process. In fact, the way we invoke dmtcp at Stampede is as follows: ibrun dmtcp_launch a.out. Here ibrun is a wrapper around mpirun_rsh. It launches an MPI spawn process on each node, which then forks the real computing processes. Only the computing processes run under dmtcp. I think the sockets are passed down from the spawn process.
In this case, we need to figure out more information about those sockets, and maybe write a separate plugin to handle checkpoint/restore.
Commit 6503a5f is labelled as a fix for this issue. @jiajuncao: if this issue is truly now fixed, could you close this? Thanks.