mpi's Issues

Drop MPI1 usage

Compilation of code that uses Boost.MPI fails with OpenMPI 4.0. For example:

[  1%] Building CXX object src/core/cluster_analysis/CMakeFiles/cluster_analysis.dir/Cluster.cpp.o
cd /builddir/build/BUILD/espresso-4.0.0/openmpi_build/src/core/cluster_analysis && /usr/bin/c++  -DH5XX_USE_MPI -Dcluster_analysis_EXPORTS -I/builddir/build/BUILD/espresso-4.0.0/src -I/builddir/build/BUILD/espresso-4.0.0/openmpi_build/src -I/builddir/build/BUILD/espresso-4.0.0/src/core -I/builddir/build/BUILD/espresso-4.0.0/openmpi_build/src/core -isystem /usr/include/openmpi-x86_64  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches   -m64 -mtune=generic -pthread -Wall -Wno-sign-compare -Wno-unused-function -Wno-unused-variable -DNDEBUG -fPIC   -std=c++11 -o CMakeFiles/cluster_analysis.dir/Cluster.cpp.o -c /builddir/build/BUILD/espresso-4.0.0/src/core/cluster_analysis/Cluster.cpp
BUILDSTDERR: In file included from /usr/include/boost/mpi/communicator.hpp:17:0,
BUILDSTDERR:                  from /builddir/build/BUILD/espresso-4.0.0/src/core/communication.hpp:56,
BUILDSTDERR:                  from /builddir/build/BUILD/espresso-4.0.0/src/core/grid.hpp:47,
BUILDSTDERR:                  from /builddir/build/BUILD/espresso-4.0.0/src/core/cluster_analysis/Cluster.cpp:19:
BUILDSTDERR: /usr/include/boost/mpi/detail/mpi_datatype_primitive.hpp: In constructor 'boost::mpi::detail::mpi_datatype_primitive::mpi_datatype_primitive(const void*)':
BUILDSTDERR: /usr/include/boost/mpi/detail/mpi_datatype_primitive.hpp:52:7: error: 'MPI_Address' was not declared in this scope
BUILDSTDERR:        BOOST_MPI_CHECK_RESULT(MPI_Address,(const_cast<void*>(orig), &origin));
BUILDSTDERR:        ^
BUILDSTDERR: /usr/include/boost/mpi/detail/mpi_datatype_primitive.hpp: In member function 'ompi_datatype_t* boost::mpi::detail::mpi_datatype_primitive::get_mpi_datatype()':
BUILDSTDERR: /usr/include/boost/mpi/detail/mpi_datatype_primitive.hpp:75:9: error: 'MPI_Type_struct' was not declared in this scope
BUILDSTDERR:          BOOST_MPI_CHECK_RESULT(MPI_Type_struct,
BUILDSTDERR:          ^
BUILDSTDERR: /usr/include/boost/mpi/detail/mpi_datatype_primitive.hpp: In member function 'void boost::mpi::detail::mpi_datatype_primitive::save_impl(const void*, MPI_Datatype, int)':
BUILDSTDERR: /usr/include/boost/mpi/detail/mpi_datatype_primitive.hpp:108:7: error: 'MPI_Address' was not declared in this scope
BUILDSTDERR:        BOOST_MPI_CHECK_RESULT(MPI_Address,(const_cast<void*>(p), &a));
BUILDSTDERR:        ^
BUILDSTDERR: make[2]: *** [src/core/cluster_analysis/CMakeFiles/cluster_analysis.dir/Cluster.cpp.o] Error 1

Add CMake testing

When building with CMake, we should have CTest enabled too.

If only because the CTest framework makes it easier to run multi-process MPI tests.

[question] When will develop be merged into master?

I made a PR a while ago and it was merged into develop. A few weeks ago Boost 1.80 was released, and the changes in Boost.MPI are not mentioned. It seems the Boost super-project updates its submodules only when there are changes on master. Is a merge of develop into master missing?

@aminiussi

Nonblocking communication with boost::mpi::any_source not working for serialized communication

Posting more than two irecvs with boost::mpi::any_source results in message truncation errors (Boost 1.67.0, g++ 7.3, OpenMPI 2.1.1 and newer).

Code:

#include <vector>
#include <iostream>
#include <iterator>
#include <boost/mpi.hpp>
#include <boost/serialization/vector.hpp>

namespace mpi = boost::mpi;

int main(int argc, char **argv)
{
    mpi::environment env(argc, argv);
    mpi::communicator comm_world;
    auto rank = comm_world.rank();
    if (rank == 0) {
        std::vector<boost::mpi::request> req;
        std::vector<std::vector<int>> data(comm_world.size() - 1);
        for (int i = 1; i < comm_world.size(); ++i) {
            req.push_back(comm_world.irecv(mpi::any_source, 0, data[i - 1]));
            //auto req = comm_world.irecv(mpi::any_source, 0, data[i - 1]);
            //req.wait();
        }
        boost::mpi::wait_all(std::begin(req), std::end(req));

        for (int i = 0; i < comm_world.size() - 1; ++i) {
            std::cout << "Process 0 received:" << std::endl;
            std::copy(std::begin(data[i]), std::end(data[i]), std::ostream_iterator<int>(std::cout, " "));
            std::cout << std::endl;
        }

    } else {
        std::vector<int> vec = {1, 2, 3, 4, 5};
        auto req = comm_world.isend(0, 0, vec);
        req.wait();
    }
}

Symptoms:

$ mpic++ serialized-anysource.cc -o serialized-anysource -lboost_mpi -lboost_serialization
$ mpiexec -n 3 ./serialized-anysource
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::mpi::exception> >'
  what():  MPI_Test: MPI_ERR_TRUNCATE: message truncated
[lapsgs17:04854] *** Process received signal ***
[lapsgs17:04854] Signal: Aborted (6)
[lapsgs17:04854] Signal code:  (-6)
[lapsgs17:04854] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x3ef20)[0x7ff37f056f20]
[lapsgs17:04854] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0xc7)[0x7ff37f056e97]
[lapsgs17:04854] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x141)[0x7ff37f058801]
[lapsgs17:04854] [ 3] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x8c8fb)[0x7ff37f6ad8fb]
[lapsgs17:04854] [ 4] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x92d3a)[0x7ff37f6b3d3a]
[lapsgs17:04854] [ 5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x92d95)[0x7ff37f6b3d95]
[lapsgs17:04854] [ 6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x92fe8)[0x7ff37f6b3fe8]
[lapsgs17:04854] [ 7] ./serialized-anysource(_ZN5boost15throw_exceptionINS_3mpi9exceptionEEEvRKT_+0x84)[0x5597ecc8415f]
[lapsgs17:04854] [ 8] ./serialized-anysource(+0x19c49)[0x5597ecc87c49]
[lapsgs17:04854] [ 9] /usr/lib/x86_64-linux-gnu/libboost_mpi.so.1.65.1(_ZN5boost3mpi7request4testEv+0x35)[0x7ff38011d595]
[lapsgs17:04854] [10] ./serialized-anysource(+0x16ae1)[0x5597ecc84ae1]
[lapsgs17:04854] [11] ./serialized-anysource(+0xfa2c)[0x5597ecc7da2c]
[lapsgs17:04854] [12] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7ff37f039b97]
[lapsgs17:04854] [13] ./serialized-anysource(+0xf79a)[0x5597ecc7d79a]
[lapsgs17:04854] *** End of error message ***
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 0 on node lapsgs17 exited on signal 6 (Aborted).
--------------------------------------------------------------------------

If one uses the commented-out code instead of pushing back the request, i.e. waits for each request directly instead of deferring the wait, the code works. Also, again, the code works properly for PODs.
The symptoms could be explained easily if the count and data messages use the same (user-provided) tag. Then, because of any_source, count and data messages can get mixed up (one irecv receives both counts and the other receives both data payloads). But this is only speculation. Again, I can do some investigation as soon as I find the time.

[improvement proposal] I/O

The MPI standard provides a model for accessing files. The C++ bindings for those features are missing from Boost.MPI.

Is anyone already planning to work on providing the I/O bindings for Boost.MPI?
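
For reference, here is a minimal sketch of the underlying MPI-3 C API that such bindings would wrap (plain MPI, not an existing Boost.MPI interface): each rank writes its own block of a shared file.

#include <mpi.h>
#include <vector>

int main(int argc, char** argv)
{
  MPI_Init(&argc, &argv);
  int rank = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  std::vector<int> block(100, rank);  // data owned by this rank

  MPI_File fh;
  MPI_File_open(MPI_COMM_WORLD, "data.bin",
                MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

  // Each rank writes its block at a disjoint offset of the shared file.
  MPI_Offset offset = static_cast<MPI_Offset>(rank) * block.size() * sizeof(int);
  MPI_File_write_at(fh, offset, block.data(), static_cast<int>(block.size()),
                    MPI_INT, MPI_STATUS_IGNORE);

  MPI_File_close(&fh);
  MPI_Finalize();
  return 0;
}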

Build error with MPItrampoline

I want to build Boost.MPI with MPItrampoline, an MPI implementation that merely forwards MPI calls to another MPI implementation. This MPI implementation stores most MPI constants in global variables that are initialized at startup, i.e. they are not compile-time constants. I believe the MPI standard (e.g. section 2.5.4 in the standard for MPI 3.1) allows this explicitly.

Boost.MPI's definition of the enum threading::level leads to compile-time errors; e.g. single = MPI_THREAD_SINGLE is then not a valid initializer for an enumerator.
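
To make the failure mode concrete, here is a simplified sketch (not the exact Boost.MPI code): an enumerator initializer must be an integral constant expression, so the first variant breaks when MPI_THREAD_SINGLE is a runtime-initialized global, while something along the lines of the second variant, which maps the values at runtime, would still compile.

#include <mpi.h>

// Variant 1: the pattern at issue. It requires MPI_THREAD_SINGLE etc. to be
// integral constant expressions, which MPItrampoline does not guarantee.
namespace threading_v1 {
  enum level {
    single     = MPI_THREAD_SINGLE,
    funneled   = MPI_THREAD_FUNNELED,
    serialized = MPI_THREAD_SERIALIZED,
    multiple   = MPI_THREAD_MULTIPLE
  };
}

// Variant 2: hypothetical alternative with library-defined enumerators and a
// runtime conversion to whatever values the MPI implementation provides.
namespace threading_v2 {
  enum level { single, funneled, serialized, multiple };

  inline int to_mpi(level l) {
    switch (l) {
      case single:     return MPI_THREAD_SINGLE;
      case funneled:   return MPI_THREAD_FUNNELED;
      case serialized: return MPI_THREAD_SERIALIZED;
      default:         return MPI_THREAD_MULTIPLE;
    }
  }
}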

Why is MPI_Bsend unsupported in Boost.MPI?

Hi, I used to use OpenMPI directly in my project. Recently I found that Boost's serialization is great, so I want to use Boost.MPI in my project. Sadly, I find that MPI_Bsend is unsupported in Boost.MPI. This causes my code to crash when I transfer large amounts of data. So, is there a switch that enables buffered ("cache") mode in Boost.MPI?
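
For context, this is what buffered ("cache") mode looks like with the plain MPI C API, which Boost.MPI currently has no equivalent for (a minimal sketch, assuming a single large message):

#include <mpi.h>
#include <vector>

int main(int argc, char** argv)
{
  MPI_Init(&argc, &argv);
  int rank = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  std::vector<double> payload(1 << 20, 3.14);

  // Attach a user buffer large enough for the message plus MPI's bookkeeping.
  int buffer_size = static_cast<int>(payload.size() * sizeof(double)) + MPI_BSEND_OVERHEAD;
  std::vector<char> buffer(buffer_size);
  MPI_Buffer_attach(buffer.data(), buffer_size);

  if (rank == 0)
    MPI_Bsend(payload.data(), static_cast<int>(payload.size()), MPI_DOUBLE,
              1, 0, MPI_COMM_WORLD);
  else if (rank == 1)
    MPI_Recv(payload.data(), static_cast<int>(payload.size()), MPI_DOUBLE,
             0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

  // Detach blocks until all buffered messages have been delivered.
  void* detached = nullptr;
  int detached_size = 0;
  MPI_Buffer_detach(&detached, &detached_size);

  MPI_Finalize();
  return 0;
}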

collective operations with large data sizes

The library crashes when performing a collective operation like gather when the size of the objects is very large, yet still manageable on a supercomputer. The issue is not easy to reproduce because it requires quite a lot of available memory.

The following program illustrates this:

#include <iostream>
#include <vector>
#include <boost/serialization/vector.hpp>
#include <boost/mpi/collectives.hpp>

struct huge {
  std::vector<unsigned char> data;
  huge() : data(2ull << 30ull, 0) { }

  template <class Archive>
  void serialize(Archive& ar, const unsigned int version)
  {
    ar & data;
  }
};

int main()
{
  boost::mpi::environment env;
  boost::mpi::communicator world;

  huge a{};

  std::cout << world.rank() << " huge created "  << std::endl;
  world.barrier();

  if (world.rank() == 0)
    {
      std::vector<huge> all;
      boost::mpi::gather(world, a, all, 0);
    }
  else
    {
      boost::mpi::gather(world, a, 0);
    }

  return 0;
}

The program creates an object holding 2 GiB of data (2ull << 30ull bytes). The struct huge is defined so as to force the library to use a non-primitive MPI type. When run with only 2 tasks, it crashes giving

terminate called after throwing an instance of 'std::length_error'
  what():  vector::_M_range_insert

On the JUWELS supercomputer, which has Boost 1.69, the error is:

mpi: /gpfs/software/juwels/stages/2019a/software/Boost/1.69.0-gpsmpi-2019a-Python-2.7.16/include/boost/mpi/allocator.hpp:142: T* boost::mpi::allocator<T>::allocate(boost::mpi::allocator<T>::size_type, boost::mpi::allocator<void>::const_pointer) [with T = char; boost::mpi::allocator<T>::pointer = char*; boost::mpi::allocator<T>::size_type = long unsigned int; boost::mpi::allocator<void>::const_pointer = const void*]: Assertion `_check_result == 0' failed.

It appears that the program crashes around this line of gather.hpp

oa << in_values[i];

The same crash occurs even when running with 1 task.

Reducing the size of huge as

struct huge {
  std::vector<unsigned char> data;
  huge() : data(2ull << 29ull, 0) { }

   ....

again makes the program crash; this time the crash appears to happen at this line of gather.hpp

   packed_iarchive::buffer_type recv_buffer(is_root ? std::accumulate(oasizes.begin(), oasizes.end(), 0) : 0);

My impression is that the sizes are sometimes stored as int when they should be size_t. For example, in the line above, oasizes is a std::vector of int: even if each single-object size fits into an int, the total buffer of gathered objects could exceed 2^31 bytes.
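
A small self-contained sketch of the suspected failure mode (the sizes below are made up, not taken from gather.hpp): each per-rank archive size fits into an int, but summing them with an int accumulator wraps past 2^31.

#include <cstdint>
#include <iostream>
#include <numeric>
#include <vector>

int main()
{
  // Two archives of 1.5 GiB each: each individual size fits in an int...
  std::vector<int> oasizes = { 1610612736, 1610612736 };

  // ...but accumulating into an int (the literal 0 fixes the accumulator type)
  // overflows (undefined behaviour; in practice it wraps to a negative value),
  // so the receive buffer ends up far too small.
  int total_bad = std::accumulate(oasizes.begin(), oasizes.end(), 0);

  // Accumulating into a 64-bit type gives the intended total of 3 GiB.
  std::int64_t total_ok =
      std::accumulate(oasizes.begin(), oasizes.end(), std::int64_t{0});

  std::cout << "int accumulator:   " << total_bad << '\n'
            << "int64 accumulator: " << total_ok << '\n';
  return 0;
}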

Reduce with Lambdas

In the standard library, lambdas can be used for reduction operations, e.g. in std::accumulate.
Is there any reason why boost::mpi::all_reduce does not accept lambda functions as a custom reduction operation?

Example:

#include <boost/mpi/environment.hpp>
#include <boost/mpi/communicator.hpp>
#include <boost/mpi/collectives.hpp>
namespace mpi = boost::mpi;

#include <iostream>
#include <algorithm>
#include <functional>
#include <numeric>
#include <vector>

int main()
{
    mpi::environment env;
    mpi::communicator world;

    {
        int a=1;
        int s = mpi::all_reduce(world,a,std::plus<int>()); // ok
        
        std::vector<int> v{1,2,3};
        int s2 = std::accumulate(v.begin(),v.end(),0,std::plus<int>()); // ok
    }
    
    {
        int a=1;
        //int s = mpi::all_reduce(world,a,[](int x,int y){return x+y;}); // error: deleted default constructor for lambdas
        
        std::vector<int> v{1,2,3};
        int s2 = std::accumulate(v.begin(),v.end(),0,[](int x,int y){return x+y;}); // ok
    }
    return 0;
}
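
As a workaround until lambdas are supported (a sketch, not a documented Boost.MPI recipe), the operation can be wrapped in a named, default-constructible function object, which all_reduce does accept:

#include <boost/mpi/environment.hpp>
#include <boost/mpi/communicator.hpp>
#include <boost/mpi/collectives.hpp>
#include <iostream>
namespace mpi = boost::mpi;

// Default-constructible function object standing in for the lambda.
struct my_plus {
    int operator()(int x, int y) const { return x + y; }
};

int main()
{
    mpi::environment env;
    mpi::communicator world;

    int a = 1;
    int s = mpi::all_reduce(world, a, my_plus{});
    if (world.rank() == 0)
        std::cout << "sum = " << s << std::endl;
    return 0;
}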

Resolve cyclic dependency between mpi and property_map

property_map depends on mpi:

property_map/include/boost/property_map/parallel/detail/untracked_pair.hpp:#include <boost/mpi/datatype.hpp>
property_map/include/boost/property_map/parallel/unsafe_serialize.hpp:#include <boost/mpi/datatype.hpp>

mpi depends on property_map:

mpi/include/boost/mpi/graph_communicator.hpp:#include <boost/property_map/property_map.hpp>

Direct dependency cycles should be avoided to prepare boost for a future where each repository releases independently.

include/boost/mpi/seq.hpp on master only?

In the course of adding Travis support to mpi (so that I can later add it to graph_parallel) I looked at whether master and develop were in a state in which I could merge the Travis changes to master. They aren't because on develop, there's unmerged work from September. But there's something else too; the file include/boost/mpi/seq.hpp sits on master but is not on develop, and I'm not sure why.

(I tried to linearize the history by merging master into develop, an operation which should typically not change the contents of the repo, and this file appeared all of a sudden.)

missing use of detail namespace

I am using Boost 1.71.0. In mpi/collectives/scatterv.hpp, scatterv_impl is called unqualified, but no scatterv_impl function is implemented outside the "detail" namespace.

boost/include/boost/mpi/collectives/scatterv.hpp:104:16: error: 'scatterv_impl' was not declared in this scope; did you mean 'boost::mpi::detail::scatterv_impl'?

Assertion inside scatterv failed when input == nullptr

Open MPI v4.1.2
Boost 1.76.0
Arch Linux
c++ (GCC) 11.1.0

The Open MPI C API works fine. A debug build using the Boost API crashes at runtime due to an assertion:

boost-example: /usr/include/boost/mpi/collectives/scatterv.hpp:32:
void boost::mpi::detail::scatterv_impl(const boost::mpi::communicator&, const T*, T*, int, const int*, const int*, int, mpl_::true_) [with
 T = int; mpl_::true_ = mpl_::bool_<true>]:
Assertion `bool(sizes) == bool(in_values)' failed.
*** Process received signal ***
Signal: Aborted (6)
Signal code:  (-6)

Abort occurs in this file.
Does this assert conform to the MPI 3.1 standard? Page 161, par. 5.6 (SCATTER) says:

MPI_SCATTERV
IN -- sendbuf -- address of send buffer (choice, significant only at root)

Later, on page 162:

The send buffer is ignored for all non-root processes.

Minimal source files and logs attached.

main.cpp:

#include <algorithm>
#include <iostream>
#include <mpi.h>
#include <numeric>
#include <vector>

std::vector<int> get_sendcounts(int N, int m) {
  if (m == N) {
    std::vector<int> ret;
    ret.resize(m, 1);
    return ret;
  } else if (m < N) {
    std::vector<int> ret;
    const int q = N / m;
    const int r = N % m;
    ret.resize(m, q);
    std::fill(ret.begin(), ret.begin() + r, q + 1);
    return ret;
  } else {
    std::vector<int> ret;
    ret.resize(m, 0);
    std::fill(ret.begin(), ret.begin() + N, 1);
    return ret;
  }
}

std::vector<int> get_displs(const std::vector<int> &sendcounts) {
  std::vector<int> displs;
  displs.reserve(sendcounts.size());
  displs.push_back(0);
  std::partial_sum(sendcounts.begin(), sendcounts.end() - 1,
                   std::back_inserter(displs));
  return displs;
}

int main(int argc, char *argv[]) {
  MPI_Init(&argc, &argv);

  int rank = 0, size = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  int *input = nullptr;
  constexpr int INPUT_SIZE = 10;
  if (rank == 0) {
    input = new int[INPUT_SIZE];
    std::iota(input, input + INPUT_SIZE, 0);
  }
  const std::vector<int> sendcounts = get_sendcounts(INPUT_SIZE, size);
  const std::vector<int> displs = get_displs(sendcounts);

  std::vector<int> local_data(sendcounts[rank]);
  MPI_Scatterv(input, sendcounts.data(), displs.data(), MPI_INT,
               local_data.data(), local_data.size(), MPI_INT, 0,
               MPI_COMM_WORLD);
  std::cout << "Process " << rank << " local_data = ";
  for (auto x : local_data)
    std::cout << x << ' ';
  std::cout << std::endl;

  MPI_Finalize();
  return 0;
}

main-boost.cpp:

#include <algorithm>
#include <boost/mpi/collectives.hpp>
#include <boost/mpi/communicator.hpp>
#include <boost/mpi/environment.hpp>
#include <iostream>
#include <numeric>
#include <vector>

std::vector<int> get_sendcounts(int N, int m) {
  if (m == N) {
    std::vector<int> ret;
    ret.resize(m, 1);
    return ret;
  } else if (m < N) {
    std::vector<int> ret;
    const int q = N / m;
    const int r = N % m;
    ret.resize(m, q);
    std::fill(ret.begin(), ret.begin() + r, q + 1);
    return ret;
  } else {
    std::vector<int> ret;
    ret.resize(m, 0);
    std::fill(ret.begin(), ret.begin() + N, 1);
    return ret;
  }
}

std::vector<int> get_displs(const std::vector<int> &sendcounts) {
  std::vector<int> displs;
  displs.reserve(sendcounts.size());
  displs.push_back(0);
  std::partial_sum(sendcounts.begin(), sendcounts.end() - 1,
                   std::back_inserter(displs));
  return displs;
}

int main(int argc, char *argv[]) {
  boost::mpi::environment env{argc, argv};
  boost::mpi::communicator world;

  int *input = nullptr;
  constexpr int INPUT_SIZE = 10;
  if (world.rank() == 0) {
    input = new int[INPUT_SIZE];
    std::iota(input, input + INPUT_SIZE, 0);
  }
  const std::vector<int> sendcounts = get_sendcounts(INPUT_SIZE, world.size());
  const std::vector<int> displs = get_displs(sendcounts);

  std::vector<int> local_data(sendcounts[world.rank()]);
  boost::mpi::scatterv(world, input, sendcounts, displs, local_data.data(),
                       local_data.size(), 0);

  std::cout << "Process " << world.rank() << " local_data = ";
  for (auto x : local_data)
    std::cout << x << ' ';
  std::cout << std::endl;

  MPI_Finalize();
  return 0;
}

CMakeLists.txt:

cmake_minimum_required(VERSION 3.5)

project(a LANGUAGES CXX)

find_package(MPI REQUIRED)
add_executable(openmpi-example main.cpp)
target_link_libraries(openmpi-example ${MPI_CXX_LIBRARIES})

find_package(Boost REQUIRED COMPONENTS mpi)
add_executable(boost-example main-boost.cpp)
target_link_libraries(boost-example ${Boost_LIBRARIES})

$ cmake -GNinja .. -DCMAKE_BUILD_TYPE=Debug -DCMAKE_VERBOSE_MAKEFILE=1
-- The CXX compiler identification is GNU 11.1.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found MPI_CXX: /usr/lib/openmpi/libmpi_cxx.so (found version "3.1")
-- Found MPI: TRUE (found version "3.1")
-- Found Boost: /usr/lib64/cmake/Boost-1.76.0/BoostConfig.cmake (found version "1.76.0") found components: mpi
-- Configuring done
-- Generating done
-- Build files have been written to: /tmp/test-boost-mpi/build

$ cmake --build .
[1/4] /usr/bin/c++   -g -MD -MT CMakeFiles/openmpi-example.dir/main.cpp.o -MF CMakeFiles/openmpi-example.dir/main.cpp.o.d -o CMakeFiles/openmpi-example.dir/main.cpp.o -c /tmp/test-boost-mpi/main.cpp
[2/4] : && /usr/bin/c++ -g  CMakeFiles/openmpi-example.dir/main.cpp.o -o openmpi-example  -Wl,-rpath,/usr/lib/openmpi  /usr/lib/openmpi/libmpi_cxx.so  /usr/lib/openmpi/libmpi.so && :
[3/4] /usr/bin/c++ -DBOOST_ALL_NO_LIB -DBOOST_MPI_DYN_LINK -DBOOST_SERIALIZATION_DYN_LINK  -g -pthread -MD -MT CMakeFiles/boost-example.dir/main-boost.cpp.o -MF CMakeFiles/boost-example.dir/main-boost.cpp.o.d -o CMakeFiles/boost-example.dir/main-boost.cpp.o -c /tmp/test-boost-mpi/main-boost.cpp
[4/4] : && /usr/bin/c++ -g -Wl,-rpath -Wl,/usr/lib/openmpi -Wl,--enable-new-dtags -pthread CMakeFiles/boost-example.dir/main-boost.cpp.o -o boost-example  -Wl,-rpath,/usr/lib/openmpi  /usr/lib/libboost_mpi.so.1.76.0  /usr/lib/libboost_serialization.so.1.76.0  /usr/lib/openmpi/libmpi_cxx.so  /usr/lib/openmpi/libmpi.so && :

$ mpirun -n 5 ./openmpi-example
Process 0 local_data = 0 1
Process 1 local_data = 2 3
Process 2 local_data = 4 5
Process 4 local_data = 8 9
Process 3 local_data = 6 7

$ mpirun -n 5 ./boost-example
boost-example: /usr/include/boost/mpi/collectives/scatterv.hpp:32: void boost::mpi::detail::scatterv_impl(const boost::mpi::communicator&, const T*, T*, int, const int*, const int*, int, mpl_::true_) [with
 T = int; mpl_::true_ = mpl_::bool_<true>]: Assertion `bool(sizes) == bool(in_values)' failed.
*** Process received signal ***
Signal: Aborted (6)
Signal code:  (-6)
...
(call stack truncated)

Both release versions run as expected:

$ cmake -GNinja .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_VERBOSE_MAKEFILE=1
$ cmake --build .
[1/4] /usr/bin/c++   -O3 -DNDEBUG -MD -MT CMakeFiles/openmpi-example.dir/main.cpp.o -MF CMakeFiles/openmpi-example.dir/main.cpp.o.d -o CMakeFiles/openmpi-example.dir/main.cpp.o -c /tmp/test-boost-mpi/main.cpp
[2/4] : && /usr/bin/c++ -O3 -DNDEBUG  CMakeFiles/openmpi-example.dir/main.cpp.o -o openmpi-example  -Wl,-rpath,/usr/lib/openmpi  /usr/lib/openmpi/libmpi_cxx.so  /usr/lib/openmpi/libmpi.so && :
[3/4] /usr/bin/c++ -DBOOST_ALL_NO_LIB -DBOOST_MPI_DYN_LINK -DBOOST_SERIALIZATION_DYN_LINK  -O3 -DNDEBUG -pthread -MD -MT CMakeFiles/boost-example.dir/main-boost.cpp.o -MF CMakeFiles/boost-example.dir/main-boost.cpp.o.d -o CMakeFiles/boost-example.dir/main-boost.cpp.o -c /tmp/test-boost-mpi/main-boost.cpp
[4/4] : && /usr/bin/c++ -O3 -DNDEBUG -Wl,-rpath -Wl,/usr/lib/openmpi -Wl,--enable-new-dtags -pthread CMakeFiles/boost-example.dir/main-boost.cpp.o -o boost-example  -Wl,-rpath,/usr/lib/openmpi  /usr/lib/libboost_mpi.so.1.76.0  /usr/lib/libboost_serialization.so.1.76.0  /usr/lib/openmpi/libmpi_cxx.so  /usr/lib/openmpi/libmpi.so && :

(mpirun output is very similar to the one above.)

'make_offsets' Unresolved Symbol Issue in msvc143

On Windows, when using the shared Boost.MPI library with the MSVC v143 toolset, a link error occurs for the 'scatterv' function. The error message indicates that 'scatterv' calls 'make_offsets', but the definition of 'make_offsets' cannot be found.

To reproduce this issue, simply invoke any version of the 'scatterv' function using the shared library. It's worth noting that Boost MPI test cases pass because they use the static library. I have tested the latest libraries from both vcpkg (with MSMPI) and from a custom compilation (with Intel MPI), and the issue persists in both cases.

As a potential solution, I found that adding the export directive BOOST_MPI_DECL before the 'make_offsets' declaration in 'offsets.hpp' resolves this issue. However, I have limited knowledge of this library's design and am unsure if this is the proper approach. Please advise on whether this solution is appropriate or if there is an alternative fix.
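
For context, the usual Windows export-macro mechanism is sketched below with made-up names (MYLIB_DECL and friends are hypothetical; Boost.MPI's real macro is BOOST_MPI_DECL). A free function compiled into a DLL is only visible to client code if its declaration carries dllexport/dllimport, which is why the static-library test runs do not notice the missing directive.

// Hypothetical export-macro sketch, not the actual Boost.MPI configuration.
#if defined(_WIN32) && defined(MYLIB_DYN_LINK)
#  ifdef MYLIB_SOURCE
#    define MYLIB_DECL __declspec(dllexport)   // building the DLL itself
#  else
#    define MYLIB_DECL __declspec(dllimport)   // consuming the DLL
#  endif
#else
#  define MYLIB_DECL                           // static library: nothing needed
#endif

// Without MYLIB_DECL the function still exists inside the DLL, but it is not
// exported, so callers linking against the import library get an
// unresolved-symbol error.
MYLIB_DECL void some_internal_helper();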

mpi-python3 appears to be bust in string (unicode) vs bytes

$ cat demo2.py 
# https://www.boost.org/doc/libs/1_67_0/doc/html/mpi/tutorial.html#mpi.python

import boost.mpi as mpi

if mpi.world.rank == 0:
    mpi.world.send(1, 0, 'Hello')
    msg = mpi.world.recv(1, 1)
    print(msg, '!')
else:
    msg = mpi.world.recv(0, 0)
    print((msg + ', '), end=' ')
    mpi.world.send(0, 1, 'world')
$ mpirun --oversubscribe --allow-run-as-root -np 2 python3 ./demo2.py
Traceback (most recent call last):
  File "./demo2.py", line 6, in <module>
    mpi.world.send(1, 0, 'Hello')
TypeError: Expecting an object of type str; got an object of type bytes instead

It seems like the Python unicode string is encoded into bytes instead of being processed as a Python unicode string?!

Serialized p2p in 1 message

Right now, serialized data is transmitted point to point as 2 messages.
Using MPI_Probe could be faster and might also solve #63.
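
For illustration, a single-message receive of serialized (packed) data would roughly follow the standard probe idiom sketched below (plain MPI, not existing Boost.MPI code): probe for the pending message, query its size, then post one receive of exactly that size.

#include <mpi.h>
#include <vector>

// Receive a packed message of a priori unknown size in a single message,
// instead of a separate size message followed by the payload.
std::vector<char> recv_packed(MPI_Comm comm, int source, int tag)
{
  MPI_Status status;
  MPI_Probe(source, tag, comm, &status);       // wait until a message is pending

  int count = 0;
  MPI_Get_count(&status, MPI_PACKED, &count);  // size of the pending message

  std::vector<char> buffer(count);
  MPI_Recv(buffer.data(), count, MPI_PACKED,
           status.MPI_SOURCE, status.MPI_TAG, comm, MPI_STATUS_IGNORE);
  return buffer;
}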

Mixing blocking and nonblocking does not work for serialized communication

Mixing blocking and nonblocking sends and receives (send + irecv and isend + recv) does not work for serialized communication (boost 1.67.0, g++ 7.3 and OpenMPI 2.1.1, also for newer versions of OpenMPI). The MPI Standard explicitly allows this (§3.7).

Take this example:

#include <vector>
#include <iostream>
#include <iterator>
#include <boost/mpi.hpp>
#include <boost/serialization/vector.hpp>

namespace mpi = boost::mpi;

int main(int argc, char **argv)
{
    bool iswap = argc > 1 && argv[1] == std::string("iswap");

    mpi::environment env(argc, argv);
    mpi::communicator comm_world;
    auto rank = comm_world.rank();

    if (rank == 0) {
        std::vector<int> data;
        if (iswap) {
            auto req = comm_world.irecv(1, 0, data);
            req.wait();
        } else {
            comm_world.recv(1, 0, data);
        }
        std::cout << "Process 0 received:" << std::endl;
        std::copy(std::begin(data), std::end(data), std::ostream_iterator<int>(std::cout, " "));
        std::cout << std::endl;

    } else if (rank == 1) {
        std::vector<int> vec = {1, 2, 3, 4, 5};
        if (iswap) {
            comm_world.send(0, 0, vec);
        } else {
            auto req = comm_world.isend(0, 0, vec);
            req.wait();
        }
    } 
}

It matches an isend with a recv (or a send with an irecv if you pass iswap as an argument). What happens is this:

$ mpic++ serialized-mixture.cc -o serialized-mixture -lboost_mpi -lboost_serialization
$ mpiexec -n 2 ./serialized-mixture
Process 0 received:
5 0 1 2 3 4 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
$ mpiexec -n 2 ./serialized-mixture iswap
[lapsgs17:32755] *** Process received signal ***
[lapsgs17:32755] Signal: Aborted (6)
[lapsgs17:32755] Signal code:  (-6)
[lapsgs17:32755] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0xterminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::mpi::exception> >'
  what():  MPI_Wait: MPI_ERR_TRUNCATE: message truncated
3ef20)[0x7f1b3f93bf20]
[lapsgs17:32755] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0xc7)[0x7f1b3f93be97]
[lapsgs17:32755] [ 2] /lib/x86_64-linux-gnu/libc.so.6(abort+0x141)[0x7f1b3f93d801]
[lapsgs17:32755] [ 3] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x8c8fb)[0x7f1b3ff928fb]
[lapsgs17:32755] [ 4] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x92d3a)[0x7f1b3ff98d3a]
[lapsgs17:32755] [ 5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x92d95)[0x7f1b3ff98d95]
[lapsgs17:32755] [ 6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x92fe8)[0x7f1b3ff98fe8]
[lapsgs17:32755] [ 7] ./serialized-mixture(_ZN5boost15throw_exceptionINS_3mpi9exceptionEEEvRKT_+0x84)[0x55fc8d374c07]
[lapsgs17:32755] [ 8] ./serialized-mixture(+0x187d9)[0x55fc8d3777d9]
[lapsgs17:32755] [ 9] /usr/lib/x86_64-linux-gnu/libboost_mpi.so.1.65.1(_ZN5boost3mpi7request4waitEv+0x32)[0x7f1b40a02402]
[lapsgs17:32755] [10] ./serialized-mixture(+0xf9a2)[0x55fc8d36e9a2]
[lapsgs17:32755] [11] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f1b3f91eb97]
[lapsgs17:32755] [12] ./serialized-mixture(+0xf75a)[0x55fc8d36e75a]
[lapsgs17:32755] *** End of error message ***
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 0 on node lapsgs17 exited on signal 6 (Aborted).
--------------------------------------------------------------------------

Note that if you do this with PODs, it works (and I suspect it will also work if boost::mpi can map the data directly to an MPI datatype, but I have not tried this case). Also, of course it works if you match isends with irecvs, etc. (change one of the if (iswap) to if (!iswap)).

I could not find a spot in the boost::mpi documentation that mentions that boost::mpi does not support mixing blocking and nonblocking calls. Nevertheless, I think such an incompatibility with MPI itself should be avoided.

When I have time, I can do some digging into this matter, for now I just wanted it to be documented.

remove c++ mpi dependency with open mpi

We do not use the C++ bindings, but Open MPI forces them on us (and we then force them on the end user).
This can be avoided by setting the OMPI_SKIP_MPICXX definition in the config file.
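
Concretely, the macro has to be defined before <mpi.h> is first included; a sketch of what that could look like in a config header (OMPI_SKIP_MPICXX is Open MPI's own opt-out macro, MPICH_SKIP_MPICXX is the MPICH-family equivalent):

// Keep the MPI implementation from pulling in its deprecated C++ bindings.
#ifndef OMPI_SKIP_MPICXX
#  define OMPI_SKIP_MPICXX 1
#endif
#ifndef MPICH_SKIP_MPICXX
#  define MPICH_SKIP_MPICXX 1
#endif
#include <mpi.h>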

Would the boost(1_73_0)::mpi::all_gatherv() result in memory leakage?

Hi,

I ran into an issue in my C++ code.
I tried to gather information as follows:
"......
std::vector<CPdPoint *> * m_pPdPointList;
std::vector<CPdPoint *> * s_pPdPointList;
.......
boost::mpi::all_gatherv < CPdPoint * >(CWorld, *s_pPdPointList, *m_pPdPointList, m_iSendNum);
......."
The gathering WORKS but the MEMORY used by the program kept INCREASING.

I then tried to replace the pointers and changed the code to:
"......
std::vector< CPdPoint > m_P;
std::vector< CPdPoint > s_P;
.......
boost::mpi::all_gatherv < CPdPoint > (CWorld, s_P, m_P, m_iSendNum);
......."

After the change, the gathering doesn't WORK and errors are thrown as follows:
"......
[0] application aborted
aborting MPI_COMM_WORLD (comm=0x44000000), error -1, comm rank 0
[1-2] terminated
......"

Any help or discussion is greatly appreciated.

Peng

wait_all might not return

When using Boost.MPI 1.71, the wait_all function may not return if some of the isends have already completed (I think).

What I'm doing is roughly this:
I have a producer that sends workunits to multiple consumers, using isend.
When the producer is through, it sends a final message to the consumers indicating that the data is over.
To clean up the isends, I'm calling wait_all on the vector where the isend requests are stored, to ensure all the requests have completed (I think it should be a no-op since I have already received all the data from the consumers).

As a workaround, I've written this (which returns, as expected):
for (auto current = pending_isends.begin(); current != pending_isends.end(); ++current) {
    if (!current->active()) {
        current->wait();
    }
}

Reported number of non-native objects sent always 1

On line 182 of detail/request_handlers.hpp, the boost::mpi::status object has its count hard-coded to 1. The deserialize method at line 260 takes no boost::mpi::status object. The legacy deserialize method takes a boost::mpi::status object, keeps track of the number of objects that have been deserialized, and sets the count to that number.

Sequential MPI

Some code exists in both parallel and sequential form. There are various ways to deal with that: one is to #ifdef all the MPI calls, the other is to provide a dummy/empty MPI implementation.

I think it would be a good idea to provide a sequential version of Boost MPI so that users could have both parallel and sequential versions with no source code modifications.

One consequence is that we could provide that version on all platforms, even those with no MPI implementation, as users might still want the sequential version. This could even help catch some compilation problems.

I expect some bjam problems though...
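
For comparison, the #ifdef approach mentioned above typically ends up looking like the sketch below at every call site (HAVE_MPI is a hypothetical build flag); a dummy sequential Boost.MPI build would make this noise unnecessary.

#include <functional>
#ifdef HAVE_MPI
#include <boost/mpi/communicator.hpp>
#include <boost/mpi/collectives.hpp>
#endif

double global_sum(double local)
{
#ifdef HAVE_MPI
  boost::mpi::communicator world;
  return boost::mpi::all_reduce(world, local, std::plus<double>());
#else
  return local;  // sequential build: the local value is already the result
#endif
}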

Non blocking request progress.

This is a copy of SVN ticket 12829 by @tilsche

There is an inconspicuous note in the point-to-point documentation:

Moreover, the MPI standard does not guarantee that the receive makes any progress before a call to "wait" or "test", although most implementations of the C MPI do allow receives to progress before the call to "wait" or "test". Boost.MPI, on the other hand, generally requires "test" or "wait" calls to make progress.

That sounds as if Boost.MPI adds no restrictions beyond the MPI standard and is merely stricter than other MPI implementations. That is not correct. The MPI standard explicitly requires a correct MPI implementation to complete a send once the other process "posts the matching (nonblocking) receive even if process one has not yet reached the completing wait call" (Section 3.7.4).

The MPI standard gives an example that should not deadlock; translating the provided example to boost.mpi with serialized datatypes (see attachment), it hangs. This is due to the two-phase transfer, where the matching receive is actually only posted in request::wait.

Unfortunately I don't see any way around that limitation. However, the documentation should be improved. It should clearly state that this is a limitation compared to the MPI standard progress guarantee. I believe the wording should also be more clear, i.e. "A synchronous send may not complete until the matching (nonblocking) receive has reached the completing wait call."

Use of MPI_INTEGER - should be MPI_INT

There are six uses of MPI_INTEGER in the following three files:
boost/mpi/collectives/gather.hpp
boost/mpi/collectives/all_gather.hpp
boost/mpi/collectives/scatter.hpp

These should be replaced by the C-type MPI_INT, as MPI_INTEGER is a Fortran type only.

I encountered this as a bug when compiling against an installation of OpenMPI v3.1.2 which was compiled without Fortran support. In this case, any use of MPI_INTEGER from a C/C++ program leads to an "invalid datatype" MPI error. Presumably OpenMPI aliases MPI_INTEGER to MPI_INT and all is well, unless MPI_INTEGER is not supported due to lack of Fortran.
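
To make the distinction explicit: from C/C++ a buffer of int must be described with MPI_INT, while MPI_INTEGER describes a Fortran INTEGER and is only guaranteed to work when the MPI library was built with Fortran support. A sketch of the intended kind of call (not the literal code from gather.hpp):

#include <mpi.h>
#include <vector>

// Gather one int per rank onto `root`.
std::vector<int> gather_sizes(MPI_Comm comm, int local_size, int root)
{
  int nprocs = 0;
  MPI_Comm_size(comm, &nprocs);
  std::vector<int> sizes(nprocs);

  // MPI_INT is the handle for a C/C++ int buffer; using MPI_INTEGER here
  // compiles, but fails with "invalid datatype" on an MPI built without
  // Fortran support.
  MPI_Gather(&local_size, 1, MPI_INT,
             sizes.data(), 1, MPI_INT, root, comm);
  return sizes;
}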

Boost.Python no longer has separate targets for python 2 / 3

The symptoms are:

error: Unable to find file or target named
error:     '/boost/python//boost_python3'
error: referred to from project at
error:     'libs/mpi/build'

And indeed boost_python3 does not exist anymore in the Jamfile

I'll try to do a PR to fix that. I am not sure why this passed the tests though.

valgrind error with reduce of std::vector

Running the following program with valgrind

#include <algorithm>
#include <functional>
#include <iostream>
#include <vector>
#include <boost/mpi/environment.hpp>
#include <boost/mpi/communicator.hpp>
#include <boost/mpi/collectives.hpp>
#include <boost/mpi/operations.hpp>
#include <boost/serialization/vector.hpp>

struct sum_vec {
  std::vector<int> operator()(const std::vector<int>& a, const std::vector<int>& b) const {
    std::vector<int> res(a.size());
    std::transform(a.begin(), a.end(), b.begin(), res.begin(), std::plus<int>{});
    return res;
  }
};

int main()
{
  boost::mpi::environment env;
  boost::mpi::communicator world;

  std::vector<std::vector<int>> v = { std::vector<int>{ world.rank() + 1, 2 * (world.rank() + 1) },
				      std::vector<int>{ 3 * (world.rank() + 1), 4 * (world.rank() + 1) }  } ;
  if (world.rank() == 0)
    {
      std::vector<std::vector<int>> v_reduced(v.size());
      boost::mpi::reduce(world, &v.front(), v.size(), &v_reduced.front(), sum_vec{}, 0);
      std::cout << "reduced vector:" << std::endl;
      for (auto const& el : v_reduced)
	{
	  for (auto const& i : el)
	    std::cout << i << ' ';
	  std::cout << std::endl;
	}
    }
  else
    {
      boost::mpi::reduce(world, &v.front(), v.size(), sum_vec{}, 0);
    }

  return 0;
}

gives the error

==408229== Thread 1:
==408229== Uninitialised byte(s) found during client check request
==408229== at 0x485221E: check_mem_is_defined_untyped (libmpiwrap.c:953)
==408229== by 0x485221E: PMPI_Get_count (libmpiwrap.c:1132)
==408229== by 0x4915A97: boost::mpi::detail::packed_archive_recv(boost::mpi::communicator const&, int, int, boost::mpi::packed_iarchive&, ompi_status_public_t&) (in /usr/lib/libboost_mpi.so.1.72.0)
==408229== by 0x12149E: void boost::mpi::detail::tree_reduce_impl<std::vector<int, std::allocator<int> >, sum_vec>(boost::mpi::communicator const&, std::vector<int, std::allocator<int> > const*, int, std::vector<int, std::allocator<int> >*, sum_vec, int, mpl_::bool_) (in /tmp/reduce_vec)
==408229== by 0x120142: void boost::mpi::detail::reduce_impl<std::vector<int, std::allocator<int> >, sum_vec>(boost::mpi::communicator const&, std::vector<int, std::allocator<int> > const*, int, std::vector<int, std::allocator<int> >*, sum_vec, int, mpl_::bool_, mpl_::bool_) (in /tmp/reduce_vec)
==408229== by 0x11EDC1: void boost::mpi::reduce<std::vector<int, std::allocator<int> >, sum_vec>(boost::mpi::communicator const&, std::vector<int, std::allocator<int> > const*, int, std::vector<int, std::allocator<int> >*, sum_vec, int) (in /tmp/reduce_vec)
==408229== by 0x118220: main (in /tmp/reduce_vec)
==408229== Address 0x1ffefff3f8 is on thread 1's stack
==408229== in frame #2, created by void boost::mpi::detail::tree_reduce_impl<std::vector<int, std::allocator<int> >, sum_vec>(boost::mpi::communicator const&, std::vector<int, std::allocator<int> > const*, int, std::vector<int, std::allocator<int> >*, sum_vec, int, mpl_::bool_) (???:)
==408229==

The error persists if, say, v_reduced is fully initialized as

  std::vector<std::vector<int>> v_reduced(v.size(), std::vector<int>(2, 1));

The error also remains when using a C-style array of std::vector<int> instead of a vector of vectors:

....
      std::vector<int>* v_reduced = new std::vector<int>[v.size()];
      boost::mpi::reduce(world, &v.front(), v.size(), v_reduced, sum_vec{}, 0);
      std::cout << "reduced vector:" << std::endl;
      for (std::size_t i = 0; i < v.size(); ++i)
	{
	  for (auto const& i : v_reduced[i])
	    std::cout << i << ' ';
	  std::cout << std::endl;
	}
      delete[] v_reduced;
...

I am using
- gcc 9.3.0
- valgrind 3.15.0
- boost 1.72.0
- openmpi 4.0.3 (compiled with support for MPI1)

mpi autodetection fails because b2 doesn't do as requested

I am trying to install Boost.MPI on a system where the MPI compiler executable is called CC. For some reason, adding "using mpi : CC ;" to tools/build/src/user-config.jam doesn't work and b2 says that MPI autodetection failed. If I instead add "using mpi : CC : <define>DAMNITB2JUSTUSECC ;", everything works. Naturally the added definition doesn't affect anything as far as C(++) code goes, so why is it needed? Shouldn't b2 in both cases just use CC when compiling MPI code? Seems like a bug...

Name clash with multiple python impls

On Gentoo, we usually build Boost with multiple python impls. Since 1.72 we are getting name clashes when building with MPI and multiple python impls:

error: Name clash for '<p/var/tmp/portage/dev-libs/boost-1.72.0/work/boost_1_72_0-abi_x86_64.amd64/stage/lib>mpi.so'
error: 
error: Tried to build the target twice, with property sets having 
error: these incompatible properties:
error: 
error:     -  <include>/usr/include/python2.7 <python>2.7 <xdll-path>/var/tmp/portage/dev-libs/boost-1.72.0/work/boost_1_72_0-abi_x86_64.amd64/bin.v2/libs/mpi/build/gcc-9.2/gentoorelease/local-visibility-global/pch-off/python-2.7/threading-multi/visibility-hidden <xdll-path>/var/tmp/portage/dev-libs/boost-1.72.0/work/boost_1_72_0-abi_x86_64.amd64/bin.v2/libs/python/build/gcc-9.2/gentoorelease/pch-off/python-2.7/threading-multi/visibility-hidden
error:     -  <include>/usr/include/python3.6m <python>3.6 <xdll-path>/var/tmp/portage/dev-libs/boost-1.72.0/work/boost_1_72_0-abi_x86_64.amd64/bin.v2/libs/mpi/build/gcc-9.2/gentoorelease/local-visibility-global/pch-off/python-3.6/threading-multi/visibility-hidden <xdll-path>/var/tmp/portage/dev-libs/boost-1.72.0/work/boost_1_72_0-abi_x86_64.amd64/bin.v2/libs/python/build/gcc-9.2/gentoorelease/pch-off/python-3.6/threading-multi/visibility-hidden

This is very likely due to the change in 3ecbf83. @Lastique @pdimov, any ideas on how we could fix this? I tried injecting all sorts of tags into the Jamfile, but to no avail; it always fails with the name clash error above. Our bug report: https://bugs.gentoo.org/703036. Building against a single python impl solves the issue, but isn't a tractable solution for Gentoo users.

Build failure with python extensions

After patching Boost 1.70 with 3ecbf83 to get the mpi python extension built, I came upon a build error that did not happen before.

[  186s] error: Name clash for '<p/home/abuild/rpmbuild/BUILD/boost_1_70_0/python-stage/lib>libboost_serialization.so.1.70.0'
[  186s] error: 
[  186s] error: Tried to build the target twice, with property sets having 
[  186s] error: these incompatible properties:
[  186s] error: 
[  186s] error:     -  <dll-path>/usr/lib64/python2.7
[  186s] error:     -  none

If I remove the boost_serialization dependency from the boost_mpi target, the build fails later anyway, complaining about boost_mpi when building with --with-graph_parallel --with-python.

The full build logs are here,

https://build.opensuse.org/package/live_build_log/home:adamm:boost_test/boost:extra/openSUSE_Tumbleweed/x86_64

There are no build issues before applying the above patch, aside from the fact that the mpi extension is not built.

Missing all_gatherv declaration in collectives.hpp

The header file collectives.hpp doesn't contain any declarations for the all_gatherv functions (introduced in version 1.67), so any use of boost::mpi::all_gatherv has to explicitly include the relevant header file,
#include <boost/mpi/collectives/all_gatherv.hpp>
Would it be possible to include the declarations of the all_gatherv functions in collectives.hpp so that a single
#include <boost/mpi.hpp>
suffices for software using all_gatherv?

Cancelling request fails for serialized communication

I believe this is a regression from 1.70.

In the following code, an asynchronous recv is initiated and later another thread marks the operation for cancellation (which, according to the spec, should guarantee that wait() returns).

This works for fundamental types (such as int) but not for serialized types.

#include <mpi.h>

#include <iostream>
#include <chrono>
#include <future>
#include <thread>

#include <boost/mpi.hpp>
#include <boost/serialization/serialization.hpp>

using namespace std::literals::chrono_literals;
namespace mpi = boost::mpi;

void async_cancel(boost::mpi::request request)
{
  std::this_thread::sleep_for(1s);

  std::cout << "Before MPI_Cancel" << std::endl;

  request.cancel();

  std::cout << "After MPI_Cancel" << std::endl;
}


struct data
{
  int i;
};

template <typename Archive>
void serialize(Archive& ar, data& t, const unsigned int version)
{
  ar & t.i;
}

int main(int argc, char* argv[])
{
  mpi::environment env(mpi::threading::level::multiple);
  mpi::communicator world;
  
  if (world.rank() == 0)
  {
    //int buffer; // WORKS
    data buffer;  // FAILS
    auto request = world.irecv(0, 0, buffer);
    
    auto res = std::async(std::launch::async, &async_cancel, request);

    std::cout << "Before MPI_Wait" << std::endl;

    auto status = request.wait();

    std::cout << "After MPI_Wait " << std::endl;
  }
  else
    std::this_thread::sleep_for(2s);

  return 0;
}

The expected result is:

Before MPI_Wait
Before MPI_Cancel
After MPI_Cancel
After MPI_Wait

mpi::reduce hangs on intel MPI

The following program hangs in the call to boost::mpi::reduce when run in an Intel MPI environment.

#include <algorithm>
#include <iostream>
#include <vector>
#include <boost/mpi/collectives.hpp>
#include <boost/mpi/operations.hpp>
#include <boost/serialization/vector.hpp>

struct sum_vec_vec {
  std::vector<double> operator()(const std::vector<double>& a, const std::vector<double>&b) const
  {
    std::vector<double> res(a.size());
    std::transform(a.begin(), a.end(), b.begin(), res.begin(), [](double x, double y) { return x + y; });
    return res;
  }
};

namespace boost {
  namespace mpi {
    template <>
    struct is_commutative<sum_vec_vec, std::vector<double>> : mpl::true_ { };
  }
}

int main()
{
  namespace mpi = boost::mpi;

  mpi::environment env;
  mpi::communicator world;

  std::size_t size = 1000;
  std::size_t L = 32;

  std::vector<std::vector<double>> correlations;

  // Fill up with some data
  for (std::size_t i = 0; i < size; ++i)
    {
      int l = 0;

      std::vector<double> corr(L);
      for (auto&x : corr)
        x = l++;
      correlations.emplace_back(std::move(corr));
    }

  std::vector<std::vector<double>> av_correlations(correlations.size());
  std::cout << "Ready for mpi::reduce" << std::endl;
  boost::mpi::reduce(world, &correlations.front(), correlations.size(), &av_correlations.front(), sum_vec_vec{}, 0);

  return 0;
}

The specifics of the MPI environment are:

MPI_Get_library_version: Intel(R) MPI Library 2019 Update 6 for Linux* OS
MPI_VERSION: 3
I_MPI_NUMVERSION: 20190006300

Boost version is 1.74.0, and it defines:

BOOST_MPI_VERSION: 3
BOOST_MPI_USE_IMPROBE: 1

The program above very often hangs when run for example with >6 tasks, and always hangs when, say, Ntasks=192.

The reason apparently lies in the use of the MPI_Mprobe routines in point_to_point.cpp.
After recompiling the library with the flag BOOST_MPI_USE_IMPROBE disabled in config.hpp, the program finishes without issues. Also, the program runs without problems on OpenMPI with BOOST_MPI_USE_IMPROBE enabled.

All in all, I suspect that the bug mentioned here is still there.
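
For reference, BOOST_MPI_USE_IMPROBE switches the point-to-point receive path over to MPI-3 matched probes, roughly the pattern sketched below (plain MPI, not the actual point_to_point.cpp code); disabling the flag falls back to the older probe/receive path, which is what avoids the hang here.

#include <mpi.h>
#include <vector>

// Matched-probe receive: the probed message is claimed through an MPI_Message
// handle, so no other receive can intercept it between probe and receive.
std::vector<char> mrecv_packed(MPI_Comm comm, int source, int tag)
{
  MPI_Message msg;
  MPI_Status status;
  MPI_Mprobe(source, tag, comm, &msg, &status);

  int count = 0;
  MPI_Get_count(&status, MPI_PACKED, &count);

  std::vector<char> buffer(count);
  MPI_Mrecv(buffer.data(), count, MPI_PACKED, &msg, &status);
  return buffer;
}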

regression with scatter and intel19

capture-output ../../../bin.v2/libs/mpi/test/scatter_test-7.test/intel-linux-std11/debug/visibility-hidden/scatter_test-7-run
====== BEGIN OUTPUT ======
Scattering integers from root 0...
Scattering integers from root 1...
Scattering integers from root 2...
Scattering integers from root 3...
Scattering integers from root 4...
Scattering integers from root 5...
Scattering integers from root 6...
Scattering GPS positions from root 0...
Scattering GPS positions from root 1...
Scattering GPS positions from root 2...
Scattering GPS positions from root 3...
Scattering GPS positions from root 4...
Scattering GPS positions from root 5...
Scattering GPS positions from root 6...
Scattering string from root 0...
Scattering string from root 1...
../../../boost/test/minimal.hpp(136): exception "memory access violation at address: 0x75d631c3: no mapping at fault address" caught in function: 'int main(int, char **)'

It seems that process zero gets a slightly shifted archive buffer, just enough to make the size wrong.

Issue 03708514 was filed with Intel.

UB in destruction order with global mpi::environment

I noticed a problem where mpi::environment calls mpi_datatype_map::clear() on an already destroyed mpi_datatype_map. Consider the following working program:

#include <array>
#include <vector>
#include <boost/mpi/environment.hpp>
#include <boost/mpi/communicator.hpp>
#include <boost/mpi/collectives.hpp>

int main()
{
    boost::mpi::environment env;
    boost::mpi::communicator comm;

    using Type = std::array<int, 3>;
    Type p = {{0, 0, 0}};
    std::vector<Type> ps;
    // Using all_gather here, so there is no requirement on the number of processes.
    boost::mpi::all_gather(comm, p, ps);
}

The relevant thing this program does is send std::array<int, 3>s around. This creates an entry in the datatype_map via MPI_Type_create_struct.
Now, let's move the environment out of main's scope, i.e.:

#include <array>
#include <vector>
#include <boost/mpi/environment.hpp>
#include <boost/mpi/communicator.hpp>
#include <boost/mpi/collectives.hpp>

boost::mpi::environment env;

int main()
{
    boost::mpi::communicator comm;

    using Type = std::array<int, 3>;
    Type p = {{0, 0, 0}};
    std::vector<Type> ps;
    boost::mpi::all_gather(comm, p, ps);
}

The documentation [1] says: "In the vast majority of Boost.MPI programs, an instance of mpi::environment will be declared in main at the very beginning of the program." It does not forbid an environment in global scope. Neither does the documentation of environment [2].

However, the second program segfaults on my machine (boost 1.70.0, OpenMPI 3.1.3) with:

*** Process received signal ***
Signal: Segmentation fault (11)
Signal code:  (128)
Failing at address: (nil)
[ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x43f60)[0x7f4603819f60]
[ 1] /usr/lib/x86_64-linux-gnu/libmpi.so.40(MPI_Type_free+0x3b)[0x7f4603c485bb]
[ 2] /usr/lib/x86_64-linux-gnu/libboost_mpi.so.1.67.0(_ZN5boost3mpi6detail16mpi_datatype_map5clearEv+0x59)[0x7f4603d44d39]
[ 3] /usr/lib/x86_64-linux-gnu/libboost_mpi.so.1.67.0(_ZN5boost3mpi11environmentD1Ev+0x5d)[0x7f4603d42c1d]
[ 4] /lib/x86_64-linux-gnu/libc.so.6(+0x472ac)[0x7f460381d2ac]
[ 5] /lib/x86_64-linux-gnu/libc.so.6(+0x473da)[0x7f460381d3da]
[ 6] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf2)[0x7f46037fcb72]
[ 7] ./test2(+0xcc8a)[0x55bf187efc8a]
*** End of error message ***

After some debugging, I find that the difference is the order of destructors of the environment and the datatype_cache defined in mpi_datatype_cache.cpp:

mpi_datatype_map& mpi_datatype_cache()
{
    static mpi_datatype_map cache;
    return cache;
}

For version 1 it is:

  1. environment::~environment()
  2. mpi_datatype_map::~mpi_datatype_map()

And for version 2 it is:

  1. mpi_datatype_map::~mpi_datatype_map()
  2. environment::~environment()

So the static gets destructed before the global environment and, thus, environment::~environment() invokes UB. I suspect the reason for this is that the static is initialized on the first call to mpi_datatype_cache() which is after the initialization of the global environment variable in version 2. The reversed destruction, then, causes the above problem.

I did a quick test for a quick fix, namely pulling the mpi_datatype_map out of mpi_datatype_cache()'s scope, i.e. changing mpi_datatype_cache() to this:

static mpi_datatype_map cache;
mpi_datatype_map& mpi_datatype_cache()
{
    return cache;
}

(Although its semantics are now different, I kept the static because the fix should not require external linkage.)
With this, the segfault is gone and the destruction order in version 2 is the same as in version 1.

I did not open a pull request yet because I wanted to discuss the fix first. Maybe someone has a better idea?

[1] https://www.boost.org/doc/libs/1_70_0/doc/html/mpi/tutorial.html
[2] https://www.boost.org/doc/libs/1_70_0/doc/html/boost/mpi/environment.html
