Comments (15)
I'll need to dig up an Intel license, but in the meantime I tried using the intel_mpi building block:
Stage0 += baseimage(image='ubuntu:18.04')
Stage0 += gnu()
Stage0 += intel_mpi(eula=True)
Stage0 += copy(src='sources/mpi-hello.c', dest='/mpi-hello.c')
Stage0 += shell(commands=['. /opt/intel/compilers_and_libraries/linux/mpi/intel64/bin/mpivars.sh intel64',
'mpicc -o /mpi-hello-c /mpi-hello.c'])
With this, I am able to run in Docker and in Singularity using either a converted Docker image or a Singularity built one.
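For reference, feeding that recipe through hpccm with --format docker produces a Dockerfile roughly like the following (a sketch: the layers emitted by gnu() and intel_mpi() are abbreviated, and their exact contents depend on the hpccm version):

```dockerfile
FROM ubuntu:18.04

# ... layers generated by gnu() and intel_mpi(eula=True) ...

COPY sources/mpi-hello.c /mpi-hello.c

RUN . /opt/intel/compilers_and_libraries/linux/mpi/intel64/bin/mpivars.sh intel64 && \
    mpicc -o /mpi-hello-c /mpi-hello.c
```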
$ singularity shell intel_mpi.sif
Singularity> source /etc/bash.bashrc
smcmillan@smcmillan-dev:~$ mpirun -np 4 /mpi-hello-c
rank 0 of 4 on smcmillan-dev.client.nvidia.com
rank 2 of 4 on smcmillan-dev.client.nvidia.com
rank 3 of 4 on smcmillan-dev.client.nvidia.com
rank 1 of 4 on smcmillan-dev.client.nvidia.com
Can you please give this recipe a try and let me know how it works for your environment?
I am using Singularity 3.5.3 and the image has Intel MPI 2019 Update 6 Build 20191024.
from hpc-container-maker.
Thanks @samcmill for the quick response. You have nicely managed to get to the crux of the problem without having to deal with the time-consuming install of the Intel PSXE. I built a Singularity image with your recipe file and I'm getting this - a little different than before, but still a problem:
ubuntu@ip-172-31-87-130:/$ mpirun -np 4 mpi-hello-c
[proxy:0:0@ip-172-31-87-130] HYD_spawn (../../../../../src/pm/i_hydra/libhydra/spawn/intel/hydra_spawn.c:129): [proxy:0:0@ip-172-31-87-130] HYD_spawn (../../../../../src/pm/i_hydra/libhydra/spawn/intel/hydra_spawn.c:129): execvp error on file mpi-hello-c (No such file or directory)
execvp error on file mpi-hello-c (No such file or directory)
[mpiexec@ip-172-31-87-130] wait_proxies_to_terminate (../../../../../src/pm/i_hydra/mpiexec/intel/i_mpiexec.c:532): downstream from host ip-172-31-87-130 was killed by signal 9 (Killed)
[mpiexec@ip-172-31-87-130] main (../../../../../src/pm/i_hydra/mpiexec/mpiexec.c:2084): assert (exitcodes != NULL) failed
However - I just noticed a potential problem - I'm running on an ubuntu 16.04 base system and it looks like it's using the default python, namely 2.7. Could this be an issue?
Sorry - silly mistake - '.' wasn't in my path - I take that back - this works.
Inspired by your answer I'm trying to do a multi-stage build where I install intel_mpi in the second stage. How do I copy /usr/local/bin/hello_world_mpi from Stage0 to Stage1?
The multi-stage recipe would be (manually entered in so there may be some typos):
Stage0 += baseimage(image='ubuntu:18.04', _as='build')
Stage0 += gnu()
Stage0 += intel_mpi(eula=True)
Stage0 += copy(src='sources/mpi-hello.c', dest='/mpi-hello.c')
Stage0 += shell(commands=['. /opt/intel/compilers_and_libraries/linux/mpi/intel64/bin/mpivars.sh intel64',
'mpicc -o /mpi-hello-c /mpi-hello.c'])
Stage1 += baseimage(image='ubuntu:18.04')
Stage1 += Stage0.runtime()
Stage1 += copy(_from='build', src='/mpi-hello-c', dest='/mpi-hello-c')
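In Docker terms, the _as='build' name and the copy(_from='build', ...) line map onto a standard multi-stage Dockerfile (a sketch; the gnu()/intel_mpi() and runtime layers are abbreviated):

```dockerfile
FROM ubuntu:18.04 AS build
# ... gnu() and intel_mpi(eula=True) layers, COPY of mpi-hello.c, RUN mpicc ...

FROM ubuntu:18.04
# ... runtime layers emitted by Stage0.runtime() ...
COPY --from=build /mpi-hello-c /mpi-hello-c
```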
It's good that the intel_mpi building block is working, but it would be nice to understand what's happening with the intel_psxe-based install. Can you try bisecting some of the differences to help root cause it? The first thing might be to use the same version of Intel MPI?
@samcmill - I've gotten a lot of suggestions on this problem - yours was the only one that worked!
Here is my recipe file - I know the baselibs stuff is extraneous and I'm not sure I entered the copy bit right - before I got your email I generated a Dockerfile and then edited it manually to copy hello_world_mpi over. I'm having trouble with the _as= bit - probably because I'm using Python 2.7 - I need to update that to Python 3 and clean up this recipe file a bit.
But - the point is - it worked! Using the intel_mpi() building block in the second stage was the key:
# Base image
Stage0.name = 'devel'
Stage0.baseimage(image='ubuntu:18.04')
baselibs = apt_get(ospackages=['build-essential','tcsh','csh','ksh','git',
'openssh-server','libncurses-dev','libssl-dev',
'libx11-dev','less','man-db','tk','tcl','swig',
'bc','file','flex','bison','libexpat1-dev',
'libxml2-dev','unzip','wish','curl','wget',
'libcurl4-openssl-dev','nano','screen', 'libasound2',
'libgtk2.0-common','software-properties-common',
'libpango-1.0.0','xserver-xorg','dirmngr',
'gnupg2','lsb-release','vim'])
Stage0 += baselibs
# Install Intel compilers, mpi, and mkl
ilibs = intel_psxe(eula=True, license=os.getenv('INTEL_LICENSE_FILE',default='intel_license/COM_L___LXMW-67CW6CHW.lic'),
tarball=os.getenv('INTEL_TARBALL',default='intel_tarballs/parallel_studio_xe_2019_update5_cluster_edition.tgz'))
Stage0 += ilibs
# Install application
Stage0 += copy(src='hello_world_mpi.c', dest='/root/jedi/hello_world_mpi.c')
Stage0 += shell(commands=['export COMPILERVARS_ARCHITECTURE=intel64',
'. /opt/intel/compilers_and_libraries/linux/bin/compilervars.sh',
'cd /root/jedi','mpiicc hello_world_mpi.c -o /usr/local/bin/hello_world_mpi -lstdc++'])
# Runtime container
Stage1.baseimage(image='ubuntu:18.04')
Stage1 += baselibs
Stage1 += intel_mpi(eula=True)
Stage1 += copy(_from='devel', src='/usr/local/bin/hello_world_mpi', dest='/usr/local/bin/hello_world_mpi')
ubuntu@ip-172-31-87-130:~/jedi/charliecloud$ singularity shell -e $CNAME
Singularity intel19-multi-hello:~/jedi/charliecloud> source /etc/bash.bashrc
ubuntu@ip-172-31-87-130:~/jedi/charliecloud$ mpirun -np 4 hello_world_mpi
Hello from rank 1 of 4 running on ip-172-31-87-130
Hello from rank 2 of 4 running on ip-172-31-87-130
Hello from rank 3 of 4 running on ip-172-31-87-130
Hello from rank 0 of 4 running on ip-172-31-87-130
Oh - however - I was also having a problem with the .runtime() method for intel_psxe, getting an error like this:
Warning: apt-key output should not be parsed (stdout is not a terminal)
gpg: no valid OpenPGP data found.
I had to do something more like this:
mkdir -p /root/tmp
cd /root/tmp
wget https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB
apt-key add GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB
rm GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB
sh -c 'echo deb https://apt.repos.intel.com/mkl all main > /etc/apt/sources.list.d/intel-mkl.list'
sh -c 'echo deb https://apt.repos.intel.com/mpi all main > /etc/apt/sources.list.d/intel-mpi.list'
sh -c 'echo deb https://apt.repos.intel.com/tbb all main > /etc/apt/sources.list.d/intel-tbb.list'
sh -c 'echo deb https://apt.repos.intel.com/ipp all main > /etc/apt/sources.list.d/intel-ipp.list'
apt-get update
apt-get install intel-mpi-rt-2019.6-166
apt-get install intel-mkl-2019.6-166
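As a workaround, those manual steps could also live in the recipe itself via the shell primitive, so the runtime stage stays reproducible (a sketch of the same commands; the package versions are the ones shown above and will need updating over time):

```python
# Hypothetical workaround for the failing intel_psxe runtime(): add the Intel
# apt repositories and install the MPI/MKL runtime packages by hand.
Stage1 += shell(commands=[
    'wget https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB',
    'apt-key add GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB',
    'rm GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB',
    'echo "deb https://apt.repos.intel.com/mpi all main" > /etc/apt/sources.list.d/intel-mpi.list',
    'echo "deb https://apt.repos.intel.com/mkl all main" > /etc/apt/sources.list.d/intel-mkl.list',
    'apt-get update -y',
    'apt-get install -y intel-mpi-rt-2019.6-166 intel-mkl-2019.6-166'])
```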
Yeah - I'm getting this, even though I did a pip3 install:
ubuntu@ip-172-31-87-130:~/jedi/charliecloud$ python --version
Python 3.5.2
ubuntu@ip-172-31-87-130:~/jedi/charliecloud$ hpccm --recipe itest.py --format docker > Dockerfile.itest
ERROR: baseimage() got an unexpected keyword argument '_as'
This is the command it's complaining about
Stage0.baseimage(image='ubuntu:18.04',_as='devel')
I'm using an up-to-date version of hpccm
ubuntu@ip-172-31-87-130:~/jedi/hpc-container-maker$ git branch
* (HEAD detached at v20.2.0)
master
ubuntu@ip-172-31-87-130:~/jedi/hpc-container-maker$ sudo -H pip3 install hpccm
Requirement already satisfied: hpccm in /usr/local/lib/python3.5/dist-packages (20.2.0)
Requirement already satisfied: six in /usr/lib/python3/dist-packages (from hpccm) (1.10.0)
Requirement already satisfied: enum34 in /usr/local/lib/python3.5/dist-packages (from hpccm) (1.1.10)
Ah - I see. This isn't a Python 2 vs 3 issue.
There are 2 ways to specify the base image:
Stage0.name = 'devel'
Stage0.baseimage(image='ubuntu:18.04')
and
Stage0 += baseimage(image='ubuntu:18.04', _as='devel')
These two are equivalent. But you can't use the _as option, or any of the other baseimage primitive options, with the first one. I mostly default to the second one nowadays, but the first is valid too.
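Either way, the generated Dockerfile begins with the same instruction; naming the stage just adds the AS clause that a later COPY --from can reference:

```dockerfile
FROM ubuntu:18.04 AS devel
```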
Thanks @samcmill, but I'm not using both - just this fails:
ubuntu@ip-172-31-87-130:~/jedi/charliecloud$ hpccm --recipe itest2.py --format docker > Dockerfile.itest2
ERROR: baseimage() got an unexpected keyword argument '_as'
ubuntu@ip-172-31-87-130:~/jedi/charliecloud$ cat itest2.py
# Base image
Stage0.baseimage(image='ubuntu:18.04',_as='devel')
Stage0 += gnu()
Sorry for not being clearer. You can't use the _as option with Stage0.baseimage(), only with Stage0 += baseimage(). Use Stage0.name = '...' when using Stage0.baseimage().
Ahhh - got it - no, I should apologize for not looking closely at what you wrote. Thanks again.
The HPCCM related aspects of this issue seem to be resolved, so closing. Please reopen or start a new issue if there are further questions.
I am resurrecting this thread because I wonder if there is a more elegant way to automatically add the command
'. /opt/intel/compilers_and_libraries/linux/mpi/intel64/bin/mpivars.sh intel64'
so that all the blocks following the installation of the Intel compiler can use it. (At the moment I manually edit the Dockerfile or Singularity .def to add it every time the compiler is needed.)
See the documentation for the mpivars option in the intel_mpi building block.
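If I read the intel_mpi documentation correctly, setting mpivars=False tells the building block to set the MPI environment variables directly in the image instead of relying on sourcing mpivars.sh, which makes the environment visible to subsequent build steps and at runtime without the manual source line (a sketch; please verify the option's exact semantics for your hpccm version):

```python
Stage0 += intel_mpi(eula=True, mpivars=False)
```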