Comments (2)

gc00 commented on June 28, 2024

@GoodKairos,
We don't support Open MPI right now. We have a new implementation here:
https://github.com/mpickpt/mana

It has been tested on MPICH for CentOS. We do intend to support Open MPI in the future, but probably not in the near term unless there is enough demand.

antoinetran commented on June 28, 2024

Linked to #910

Hi @gc00, thank you for the answer.

However, I was led to believe that DMTCP officially supports Open MPI, because this is what the documentation says:
https://dmtcp.sourceforge.io/FAQ.html#mpiCkpt

Does DMTCP support checkpointing of MPI programs?
Yes. And to restart on different hosts, edit the 'ssh' lines in dmtcp_restart_script.sh . DMTCP operates by checkpointing the sockets (or InfiniBand connections, if using --infiniband) created by the MPI library. Hence, it is transparent to MPI and doesn't require any particular MPI configuration or hooks. In principle, DMTCP should run on any MPI over TCP/IP. We usually test on [Open MPI](http://www.open-mpi.org/), [MVAPICH-2](http://mvapich.cse.ohio-state.edu/) and [MPICH-2](http://www.mcs.anl.gov/research/projects/mpich2/). If you find an MPI that we don't support, this is a bug in DMTCP. We would be appreciative if you can file a bug report. For further details on using MPI, see [QUICK-START.md](http://github.com/dmtcp/dmtcp/tree/master/QUICK-START.md).

    As of DMTCP-2.4, DMTCP should offer robust support for popular implementations of MPI, along with support for the SLURM batch queue. (See the [example SLURM scripts for using DMTCP](https://github.com/dmtcp/dmtcp/tree/master/plugin/batch-queue/job_examples).)

I can also see example Slurm batch scripts that launch DMTCP with Open MPI, so I assumed it would work. I spent quite a lot of time trying, only to find out that I should focus on MANA instead. Could someone fix the official documentation? Thank you.
