nvidia / hpc-container-maker
HPC Container Maker
License: Apache License 2.0
By default, when installing Anaconda, the architecture is Linux-x86_64 (with Miniconda3-py38_4.8.3-Linux-x86_64.sh).
How can I change it to Linux-ppc64le (which would require, for example, Miniconda3-py38_4.9.2-Linux-ppc64le.sh)?
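As an aside on the mechanics: the installer filename encodes the target architecture, so the selection boils down to a lookup keyed on the CPU architecture. A minimal, purely illustrative sketch (not hpccm's actual implementation; the dictionary and function names are mine):

```python
# Illustrative only: map a target CPU architecture to the matching
# Miniconda installer filename (versions taken from the question above).
MINICONDA_INSTALLERS = {
    'x86_64': 'Miniconda3-py38_4.8.3-Linux-x86_64.sh',
    'ppc64le': 'Miniconda3-py38_4.9.2-Linux-ppc64le.sh',
}

def miniconda_installer(arch='x86_64'):
    """Return the Miniconda installer filename for the given architecture."""
    try:
        return MINICONDA_INSTALLERS[arch]
    except KeyError:
        raise ValueError('unsupported architecture: {}'.format(arch))
```

In hpccm itself the architecture is normally selected globally, if I read the code correctly, e.g. via the --cpu-target command line option or the _arch argument to baseimage, which the conda building block then consults.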
The default Slurm version, v20.11.7, has been removed from the Slurm website in favor of v20.11.9.
Newer Dockerfiles support:
COPY --chmod=0755 src dest
A simple addition of the code below to the copy primitive would enable it:
if self.__chmod:
    base_inst = base_inst + '--chmod={} '.format(self.__chmod)
When you generate an environment.yml file, the first line gives the name of your conda environment. If your environment name isn't the default base, then the Docker image produced by hpccm doesn't conda activate the new environment before launching the Jupyter notebook. As a result, you don't have access to any of the conda packages you installed.
@samcmill As discussed during our conf call, one thing currently blocking us from leveraging hpccm in EasyBuild (cfr. #20) is that it currently requires Python 3.x; EasyBuild is not compatible with Python 3.x yet, but we're working on it (see easybuilders/easybuild-framework#133).
I took a quick look at this, by enabling tests for Python 2.7 in Travis (cfr. #24). Here's what I ran into:
the use of from enum import Enum implies that enum34 needs to be installed (https://pypi.org/project/enum34/), which is fine imho
with enum34 installed, a bunch of tests fail with:
File "hpccm/recipe.py", line 75
raise e from e
^
SyntaxError: invalid syntax
Changing raise e from e back to just raise e fixes that, but I'm not sure if that's OK to do?
With the above changes, just two tests still fail on Python 2.7:
----------------------------------------------------------------------
Traceback (most recent call last):
File "/tmp/hpc-container-maker/test/test_sed.py", line 37, in test_basic
r's/FOO = BAR/FOO = BAZ/g']),
File "hpccm/sed.py", line 54, in sed_step
quoted_patterns = ['-e {}'.format(shlex.quote(patterns[0]))]
AttributeError: 'module' object has no attribute 'quote'
======================================================================
ERROR: test_verify (test_git.Test_git)
git with verification enabled
----------------------------------------------------------------------
Traceback (most recent call last):
File "/tmp/hpc-container-maker/test/test_git.py", line 96, in test_verify
branch=valid_branch, verify=True),
File "hpccm/git.py", line 104, in clone_step
fatal=fatal)
File "hpccm/git.py", line 53, in __verify
p = subprocess.Popen(command, shell=True, stdout=subprocess.DEVNULL,
AttributeError: 'module' object has no attribute 'DEVNULL'
----------------------------------------------------------------------
I'm happy to help out with restoring Python 2 compatibility if that's desired. If so, it would be useful to have #24 merged first, so we can rely on Travis to check Python 2.7 compatibility, and to ensure the codebase stays compatible with both Python 2.7 & 3.x.
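For what it's worth, both remaining failures come from APIs that only exist on Python 3.3+ (shlex.quote and subprocess.DEVNULL). A sketch of the usual compatibility shims, assuming pipes.quote and an open handle on os.devnull are acceptable Python 2 fallbacks:

```python
import os
import subprocess

# shlex.quote() exists only on Python 3.3+; pipes.quote is the
# long-standing Python 2 equivalent.
try:
    from shlex import quote
except ImportError:  # Python 2
    from pipes import quote

# subprocess.DEVNULL exists only on Python 3.3+; on Python 2 an open
# file object on os.devnull can be passed as stdout/stderr instead.
try:
    DEVNULL = subprocess.DEVNULL
except AttributeError:  # Python 2
    DEVNULL = open(os.devnull, 'wb')
```

With shims like these, sed.py and git.py could use quote() and DEVNULL unconditionally on both interpreter versions.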
The NetCDF tarballs are no longer available at https://www.unidata.ucar.edu/downloads/netcdf/ftp; they moved to GitHub and changed their names to no longer include the netcdf-c-, netcdf-cxx4-, or netcdf-fortran- prefix. Instead, all tarballs just use vX.Y.Z, e.g. for netcdf-c-4.6.3 the tarball URL is now https://github.com/Unidata/netcdf-c/archive/v4.6.3.tar.gz
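The new scheme can be captured in a small helper; this is just an illustration of the URL pattern above (the function name is mine):

```python
def netcdf_tarball_url(component, version):
    """Build the GitHub tarball URL for a NetCDF component.

    component: 'netcdf-c', 'netcdf-cxx4', or 'netcdf-fortran'
    version:   e.g. '4.6.3'
    """
    return ('https://github.com/Unidata/{0}/archive/v{1}.tar.gz'
            .format(component, version))
```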
What is the correct syntax to install netcdf with pnetcdf support using hpc-container-maker?
Assuming that pnetcdf is installed in /usr/local/pnetcdf, it does not seem sufficient to add
enable_pnetcdf='/usr/local/pnetcdf'
Whatever I tried, nc-config --all always shows --has-pnetcdf -> no
This is both a question and maybe a feature request. I wanted to confirm that the current intel_psxe building block doesn't work for Intel oneAPI, right? If not, I might try to make a PR for it... but my Python is not good. :)
Environment files are not working for Singularity. I get the error "EnvironmentFileNotFound: '/var/tmp/environment.yml' file not found". If I go into the generated ".def" file and change the location of the environment.yml file to be /environment.yml, it works. I'm not sure why, but the location /var/tmp does not work.
Minimum example:
"""
Conda Error Min Example
"""
from __future__ import absolute_import
from __future__ import unicode_literals
from __future__ import print_function
import hpccm
if __name__ == '__main__':
    ### Create Stage
    stage = hpccm.Stage()
    stage += hpccm.primitives.baseimage(image='nvidia/cuda:11.1-cudnn8-devel-centos8')
    ### Install Conda Python with environment
    stage += hpccm.building_blocks.conda(environment="environment.yml",
                                         eula=True,
                                         ospackages=['wget', 'ca-certificates', 'git'])
    ### Set container specification output format
    hpccm.config.set_container_format("singularity")
    ### Output container specification
    print(stage)
I get a missing library file error, which I think is because hpccm copies only the libraries in the REDIST folder, as in:
COPY --from=0 /opt/pgi/linux86-64/18.4/REDIST/*.so /opt/pgi/linux86-64/18.4/lib/
I think my app would work if hpccm generated the following line instead, copying all the library files:
COPY --from=0 /opt/pgi/linux86-64/18.4/lib/*.so /opt/pgi/linux86-64/18.4/lib/
====================
[root@8b7645d85a7f em_les]# ./ideal.exe
./ideal.exe: error while loading shared libraries: libcudaforwrapblas.so: cannot open shared object file: No such file or directory
[root@8b7645d85a7f em_les]# ls /opt/pgi/linux86-64/18.4/lib
libaccapi.so libaccgmp.so libblas.so libcudafor2.so libcudapgi.so libpgc.so libpgf90_rpm1_p.so libpgnod_prof.so libpgnod_prof_mpi3.so libpgnod_prof_time.so
libaccapimp.so libaccn.so libcublasemu.so libcudafor80.so libhugetlbfs_pgi.so libpgf90.so libpgf90rtl.so libpgnod_prof_g.so libpgnod_prof_mvapich.so libpgnuma.so
libaccg.so libaccnc.so libcudacemu.so libcudafor90.so liblapack.so libpgf902.so libpgftnrtl.so libpgnod_prof_inst.so libpgnod_prof_mvapich2.so
libaccg2.so libaccncmp.so libcudadevice.so libcudafor91.so libnuma.so libpgf90_prof.so libpgmath.so libpgnod_prof_mpi.so libpgnod_prof_openmpi.so
libaccg2mp.so libaccnmp.so libcudafor.so libcudaforblas.so libpgatm.so libpgf90_rpm1.so libpgmp.so libpgnod_prof_mpi2.so libpgnod_prof_pfo.so
==================
You can see libcudaforwrapblas.so is not in the PGI lib path.
However, if I go through only the first stage (Stage0), everything works perfectly:
===================
[root@4e464d1dcac4 em_les]# ./ideal.exe
IDEAL V3.8.1 PREPROCESSOR
DYNAMICS OPTION: Eulerian Mass Coordinate
alloc_space_field: domain 1 , 2772348328 bytes allocated
pi is 3.141593
[root@4e464d1dcac4 /]# ls /usr/local/cuda/lib64/
libOpenCL.so libcudart.so libcuinj64.so.9.0.176 libnppc.so libnppicom.so.9.0.176 libnppim.so libnppitc.so.9.0.176 libnvgraph.so.9.0
libOpenCL.so.1 libcudart.so.9.0 libculibos.a libnppc.so.9.0 libnppicom_static.a libnppim.so.9.0 libnppitc_static.a libnvgraph.so.9.0.176
libOpenCL.so.1.0 libcudart.so.9.0.176 libcurand.so libnppc.so.9.0.176 libnppidei.so libnppim.so.9.0.176 libnpps.so libnvgraph_static.a
libOpenCL.so.1.0.0 libcudart_static.a libcurand.so.9.0 libnppc_static.a libnppidei.so.9.0 libnppim_static.a libnpps.so.9.0 libnvrtc-builtins.so
libaccinj64.so libcufft.so libcurand.so.9.0.176 libnppial.so libnppidei.so.9.0.176 libnppist.so libnpps.so.9.0.176 libnvrtc-builtins.so.9.0
libaccinj64.so.9.0 libcufft.so.9.0 libcurand_static.a libnppial.so.9.0 libnppidei_static.a libnppist.so.9.0 libnpps_static.a libnvrtc-builtins.so.9.0.176
libaccinj64.so.9.0.176 libcufft.so.9.0.176 libcusolver.so libnppial.so.9.0.176 libnppif.so libnppist.so.9.0.176 libnvToolsExt.so libnvrtc.so
libcublas.so libcufft_static.a libcusolver.so.9.0 libnppial_static.a libnppif.so.9.0 libnppist_static.a libnvToolsExt.so.1 libnvrtc.so.9.0
libcublas.so.9.0 libcufftw.so libcusolver.so.9.0.176 libnppicc.so libnppif.so.9.0.176 libnppisu.so libnvToolsExt.so.1.0.0 libnvrtc.so.9.0.176
libcublas.so.9.0.176 libcufftw.so.9.0 libcusolver_static.a libnppicc.so.9.0 libnppif_static.a libnppisu.so.9.0 libnvblas.so stubs
libcublas.so.9.0.425 libcufftw.so.9.0.176 libcusparse.so libnppicc.so.9.0.176 libnppig.so libnppisu.so.9.0.176 libnvblas.so.9.0
libcublas_device.a libcufftw_static.a libcusparse.so.9.0 libnppicc_static.a libnppig.so.9.0 libnppisu_static.a libnvblas.so.9.0.176
libcublas_static.a libcuinj64.so libcusparse.so.9.0.176 libnppicom.so libnppig.so.9.0.176 libnppitc.so libnvblas.so.9.0.425
libcudadevrt.a libcuinj64.so.9.0 libcusparse_static.a libnppicom.so.9.0 libnppig_static.a libnppitc.so.9.0 libnvgraph.so
======================
Hello,
I'm trying to add the upstream repository for llvm to the GROMACS docker image build (https://github.com/gromacs/gromacs/blob/master/admin/containers/scripted_gmx_docker_builds.py), but I'm running into issues with hpccm adding the Ubuntu Xenial repository by default for the upstream.
Changes to the python script to fetch the upstream
diff --git a/admin/containers/scripted_gmx_docker_builds.py b/admin/containers/scripted_gmx_docker_builds.py
index 3ed8cb7020..9e149c69b6 100755
--- a/admin/containers/scripted_gmx_docker_builds.py
+++ b/admin/containers/scripted_gmx_docker_builds.py
@@ -245,7 +245,7 @@ def get_compiler(args, compiler_build_stage: hpccm.Stage = None) -> bb_base:
raise RuntimeError('No TSAN compiler build stage!')
# Build the default compiler if we don't need special support
else:
- compiler = hpccm.building_blocks.llvm(extra_repository=True, version=args.llvm)
+ compiler = hpccm.building_blocks.llvm(extra_repository=True, upstream=True, version=args.llvm)
elif args.oneapi is not None:
if compiler_build_stage is not Non
Log from building:
Step 3/80 : RUN wget -qO - https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add - && echo "deb http://apt.llvm.org/xenial/ llvm-toolchain-xenial main" >> /etc/apt/sources.list.d/hpccm.list && echo "deb-src http://apt.llvm.org/xenial/ llvm-toolchain-xenial main" >> /etc/apt/sources.list.d/hpccm.list && apt-get update -y && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends clang-12 libomp-12-dev && rm -rf /var/lib/apt/lists/*
As you can see, instead of the current Ubuntu version 20.04 (Focal), the old Xenial release is picked up, and thus fails later to install clang-12.
Can this be changed in the config to pick up a different version, or to change the default?
Cheers
Paul
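For what it's worth, building blocks appear to derive the apt repository from the distribution release they detect for the base image. A hedged sketch of a recipe-side override, assuming baseimage accepts a _distro argument (the parameter name and the 'ubuntu20' value are my assumptions, not confirmed from the docs):

```python
# Sketch only: declare the distro release of the base image so that
# repositories added by building blocks (e.g. apt.llvm.org) match it.
# The '_distro' argument and 'ubuntu20' value are assumptions.
Stage0 += baseimage(image='ubuntu:20.04', _distro='ubuntu20')
compiler = llvm(upstream=True, version='12')
Stage0 += compiler
```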
I was referred to this (seemingly quite new) project by @gppezzi, which is quite interesting to us EasyBuilders since we have recently added support in EasyBuild (http://easybuilders.github.io/easybuild/) to generate Singularity definition files, and (optionally) also call out to Singularity to build the container image as well, see http://easybuild.readthedocs.io/en/latest/Containers.html .
It seems like it would be interesting and mutually beneficial to look into integration between HPCCM & EasyBuild, and I think this can be done in two main ways:
HPCCM can be enhanced to provide an easybuild function, which can be used in recipe files to generate EasyBuild commands to install software, sort of similar to the openmpi function that is already supported
EasyBuild can be enhanced to leverage HPCCM rather than implementing its own functionality to create Docker/Singularity definition files
It's unclear to me whether HPCCM is currently ready to be used as a library rather than a command line tool, but if it's not we can probably contribute to making that work.
Maybe we should set up a conf call to discuss this further?
Not sure there is something to change on the HPCCM side, but I am having a problem building a container image based on Ubuntu because of the NVIDIA repository:
W: GPG error: https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A4B469963BF863CC
E: The repository 'https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 InRelease' is not signed.
W: GPG error: https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A4B469963BF863CC
E: The repository 'https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 InRelease' is not signed.
I want to understand the difference between "recipe files" and "module files"
I believe the differences are described here
But there isn't much detail. What is the difference between code written as a recipe and code written as a module? I can guess at a few. But, for instance, does the hpccm tool auto import hpccm before loading the recipe files?
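To illustrate the distinction as I understand it: a recipe file is evaluated by the hpccm tool, which pre-populates the namespace with the building blocks, primitives, Stage0/Stage1, and USERARG, so no imports are needed; a module (library-style) script must import hpccm itself and is responsible for setting the output format and printing the stages. A hedged sketch of the two styles:

```python
# Recipe style (run as: hpccm --recipe recipe.py --format docker).
# No imports: hpccm injects baseimage, gnu, Stage0, etc. into the namespace.
Stage0 += baseimage(image='ubuntu:20.04')
Stage0 += gnu()
```

```python
# Module/library style: explicit imports, format selection, and output.
import hpccm

hpccm.config.set_container_format('docker')
stage = hpccm.Stage()
stage += hpccm.primitives.baseimage(image='ubuntu:20.04')
stage += hpccm.building_blocks.gnu()
print(stage)
```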
Hi, I'm trying to run Relion and get this error:
ERROR: CUDA driver version is insufficient for CUDA runtime version in /var/tmp/relion/src/ml_optimiser.cpp at line 1143 (error-code 35)
in: /var/tmp/relion/src/acc/cuda/cuda_settings.h, line 67
Please help. Thanks.
ADD some.tar.gz /data
would extract the tarball to the destination, but COPY only copies the file, with no extraction.
In my experience, extracting source code to compile, or extracting binary files into the container, is common usage when building Docker images. I hope this primitive can be added to hpccm.
(This might be a docker specific problem.)
All,
I'm trying to learn how to use Docker, etc., and figured I'd lean on NVIDIA's experts in creating non-trivial build images/containers (I still am not fluent with the lingo) for things like a GCC 9/Open MPI 4.0.2 setup.
So, I created a recipe cribbed from the hpcbase-gnu-openmpi one, but using CentOS 8, to try to get close to a Dockerfile a more knowledgeable coworker created by hand. To wit:
Stage0 += comment(__doc__, reformat=False)
Stage0 += baseimage(image="centos:8")
# Python
Stage0 += python()
# GNU compilers
compiler = gnu(version='9')
#compiler = gnu(extra_repository=True, version='9')
#compiler = gnu(extra_repository=False, version='9')
Stage0 += compiler
# OpenMPI
Stage0 += openmpi(cuda=False, infiniband=False,
version='4.0.2', toolchain=compiler.toolchain)
However, when I try to build this:
$ docker build -t fortran/gcc9-openmpi402:v1.0.0 -f Dockerfile .
Sending build context to Docker daemon 6.656kB
Step 1/7 : FROM centos:8
---> 470671670cac
Step 2/7 : RUN yum install -y python2 python3 && rm -rf /var/cache/yum/*
---> Using cache
---> 02444ced632f
Step 3/7 : RUN yum install -y centos-release-scl && yum install -y devtoolset-9-gcc devtoolset-9-gcc-c++ devtoolset-9-gcc-gfortran && rm -rf /var/cache/yum/*
---> Running in ef980a85cdcd
Last metadata expiration check: 0:02:39 ago on Wed Feb 12 14:46:59 2020.
No match for argument: centos-release-scl
Error: Unable to find a match: centos-release-scl
The command '/bin/sh -c yum install -y centos-release-scl && yum install -y devtoolset-9-gcc devtoolset-9-gcc-c++ devtoolset-9-gcc-gfortran && rm -rf /var/cache/yum/*' returned a non-zero code: 1
I've tried all the variants seen above:
compiler = gnu(version='9')
compiler = gnu(extra_repository=True, version='9')
compiler = gnu(extra_repository=False, version='9')
but none of them seem to work (which seems to track with the "if you set version you get extra" note in the docs).
I suppose the main question is: what have I done wrong? I'm sure it's simple, but I'm a bit lost right now.
Currently, adding the llvm building block only adds the base compiler to the image.
It would be great if it were also possible to add the corresponding tools, to allow the generated image to be used for source code linting.
env:
sudo pip install hpccm
script:
import hpccm
# Use appropriate container base images based on the CPU architecture
arch = 'x86_64'
default_build_image = 'nvidia/cuda:10.1-devel-ubuntu18.04'
default_runtime_image = 'nvidia/cuda:10.1-base-ubuntu18.04'
########
# Build stage (Stage 0)
########
# Base image
Stage0 += baseimage(image=USERARG.get('build_image', default_build_image),
                    _arch=arch, _as='build')
Stage0 += ucx(
    enable_devel-headers=True,
    gdrcopy='/usr/local/gdrcopy',
    knem='/usr/local/knem',
    without_java=True,
    ofed=True,
    ldconfig=True,
    version='1.7.0',
)
cmd:
hpccm --recipe test.py --format singularity --singularity-version=3.2 > test.def
msg:
ERROR: keyword can't be an expression (test.py, line 17)
Other building block packages have the same issue with enable_FEATURE keywords.
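The underlying cause is that Python keyword arguments must be valid identifiers, so enable_devel-headers=True parses as the expression enable_devel - headers and is rejected by the parser. A small self-contained demonstration (configure here is a stand-in, not an hpccm function); the underscore spelling is, if I understand the building block toggles correctly, what hpccm expects and translates into the hyphenated configure flag:

```python
def configure(**kwargs):
    """Stand-in for a building block that accepts arbitrary keywords."""
    return kwargs

# A hyphen is not legal in a keyword name:
#   configure(enable_devel-headers=True)   # SyntaxError
# ...although ** unpacking accepts arbitrary string keys:
opts = configure(**{'enable_devel-headers': True})
# The underscore spelling is the usual convention:
opts.update(configure(enable_devel_headers=True))
```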
I tried to build Boost with the building block but I get error code 8.
I found out that wget fails:
wget -nc --no-check-certificate -P /tmp https://dl.bintray.com/boostorg/release/1.74.0/source/boost_1_74_0.tar.bz2
--2021-06-14 11:43:32-- https://dl.bintray.com/boostorg/release/1.74.0/source/boost_1_74_0.tar.bz2
Resolving dl.bintray.com (dl.bintray.com)... 3.127.93.119, 18.196.33.98
Connecting to dl.bintray.com (dl.bintray.com)|3.127.93.119|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2021-06-14 11:43:32 ERROR 403: Forbidden.
If I go to the boost site and copy the download link I get another one, which is working:
wget https://boostorg.jfrog.io/artifactory/main/release/1.74.0/source/boost_1_74_0.tar.bz2
--2021-06-14 11:47:22-- https://boostorg.jfrog.io/artifactory/main/release/1.74.0/source/boost_1_74_0.tar.bz2
Resolving boostorg.jfrog.io (boostorg.jfrog.io)... 35.80.249.196, 54.148.141.177, 52.43.90.32, ...
Connecting to boostorg.jfrog.io (boostorg.jfrog.io)|35.80.249.196|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://jfrog-prod-usw2-shared-oregon-main.s3.amazonaws.com/aol-boostorg/filestore/f8/f82c0d8685b4d0e3971e8e2a8f9ef1551412c125?x-jf-traceId=49786281f31d0f64&response-content-disposition=attachment%3Bfilename%3D%22boost_1_74_0.tar.bz2%22&response-content-type=application%2Fx-bzip2&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20210614T094723Z&X-Amz-SignedHeaders=host&X-Amz-Expires=60&X-Amz-Credential=AKIASG3IHPL63WBBRCUD%2F20210614%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Signature=226ca0520d266f89c911b5be749c29fac47a8106a33ee4efc0e4b8b00517a364 [following]
--2021-06-14 11:47:23-- https://jfrog-prod-usw2-shared-oregon-main.s3.amazonaws.com/aol-boostorg/filestore/f8/f82c0d8685b4d0e3971e8e2a8f9ef1551412c125?x-jf-traceId=49786281f31d0f64&response-content-disposition=attachment%3Bfilename%3D%22boost_1_74_0.tar.bz2%22&response-content-type=application%2Fx-bzip2&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20210614T094723Z&X-Amz-SignedHeaders=host&X-Amz-Expires=60&X-Amz-Credential=AKIASG3IHPL63WBBRCUD%2F20210614%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Signature=226ca0520d266f89c911b5be749c29fac47a8106a33ee4efc0e4b8b00517a364
Resolving jfrog-prod-usw2-shared-oregon-main.s3.amazonaws.com (jfrog-prod-usw2-shared-oregon-main.s3.amazonaws.com)... 52.218.169.83
Connecting to jfrog-prod-usw2-shared-oregon-main.s3.amazonaws.com (jfrog-prod-usw2-shared-oregon-main.s3.amazonaws.com)|52.218.169.83|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 109600630 (105M) [application/x-bzip2]
Saving to: ‘boost_1_74_0.tar.bz2’
boost_1_74_0.tar.bz2 100%[============================================================================================================================================================================>] 104,52M 9,49MB/s in 13s
2021-06-14 11:47:36 (8,09 MB/s) - ‘boost_1_74_0.tar.bz2’ saved [109600630/109600630]
Hi guys, I'm having trouble installing NetCDF with HPCCM. The issue is that the default version of the netcdf-c library is now 4.7.3, and the later releases are not available via ftp anymore, see ftp://ftp.unidata.ucar.edu/pub/netcdf/
Specifying explicit version='4.7.3' works fine.
A different solution would also be to change the FTP web-address to the one directly from github releases (there the version archive is still maintained): https://github.com/Unidata/netcdf-c/releases
Exactly the same versioning issue applies to netcdf-fortran and netcdf-cxx.
Cheers!
Hello,
I am trying to test the multistage capability of an hpccm recipe made for Singularity. I started with a simple example made for Singularity 3.4. I want to build an image with the PGI compiler starting from a Docker devel image in Stage0, make Stage1, and move files from Stage0 to Stage1 according to the default recipe
hpcbase-pgi-1910-openmpi.py.txt
recipe_hpcbase-pgi-1910-openmpi_runtime.ref.txt
Using the following commands (in a server with Centos 7)
hpccm --recipe hpcbase-pgi-1910-openmpi.py --format singularity --singularity-version=3.4 > recipe_hpcbase-pgi-1910-openmpi_runtime.ref
sudo -E SINGULARITY_TMPDIR=/home/simone/Documents/sing_tmpdir SINGULARITY_CACHEDIR=/home/simone/Documents/sing_cachedir singularity -d img_hpcbase-pgi-1910-openmpi_runtime.sif recipe_hpcbase-pgi-1910-openmpi_runtime.ref
I got this error when passing from Stage0 to Stage1 (I used the -d option when running the command):
: [/home/simone/Documents/sing_tmpdir/sbuild-855637488 /home/simone/Documents/sing_tmpdir/sbuild-042428559]
FATAL [U=0,P=13655] run() While performing build: unable to copy files a stage to container fs: stage 0 was not found
I put my hpccm recipe and the corresponding Singularity recipe in the attachment.
Could you please help me understand what is going wrong?
Thank you
The intel_psxe building block installs Intel Parallel Studio XE and requires a license file and a tarball:
Stage0 += baseimage(image='ubuntu:{}'.format(18.04), _as='build')
#Stage0 += intel_psxe(eula=True, license='license.lic', tarball='parallel_studio_xe_2020_update4_cluster_edition.tgz')
When building a singularity container these files are apparently copied into a temporary folder (whose name changes every time):
INFO: Copying parallel_studio_xe_2020_update4_cluster_edition.tgz to /tmp/build-temp-795513904/rootfs/var/tmp/parallel_studio_xe_2020_update4_cluster_edition.tgz
INFO: Copying license.lic to /tmp/build-temp-795513904/rootfs/var/tmp/license.lic
which means that the build fails when they are not found:
I tried to copy the files in other locations but they always end up in temporary folders and are never found during the build
Can you help me sort this out?
Hello,
When trying to build a Singularity container with nvshmem v2.2.1 (using the nvshmem building block along with other dependency building blocks), it seems that nvshmem does not actually build when the hydra=True parameter is set: the nvshmem include and lib directories are not present in /usr/local/nvshmem/, and attempts to compile nvshmem programs with nvcc fail because of this. However, the hydra launcher installs successfully and is seemingly able to launch pre-compiled nvshmem programs.
After removing the flag, nvshmem seems to build normally within the container.
I am not sure if I am just doing something wrong in my recipe or if this is the intended behavior, but I thought I would go ahead and post just in case!
Hello,
for my projects I need some open-source libraries, which I build from scratch. So I need to run make install for these libraries. Unfortunately, I can't find a Python function that generates the make install command. Did I miss something? If not, can we add an install_step() function to CMakeBuild? It would improve the readability and maintainability of my recipes. At the moment I'm using shell() functions.
Thank you for your help,
Simeon
The gfortran runtime library for Ubuntu based images is set to libgfortran3:
hpc-container-maker/hpccm/building_blocks/gnu.py
Lines 409 to 411 in adbe43f
This is correct on ubuntu:16.04. However, on ubuntu:18.04 this should be libgfortran4, and libgfortran5 on ubuntu:20.04.
Would you be interested in a PR?
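The selection described above could be sketched as a simple mapping on the Ubuntu release (illustrative only, not the actual gnu.py code):

```python
# Sketch of the version selection described above, keyed on the
# Ubuntu release of the base image (as a (major, minor) tuple).
def gfortran_runtime(ubuntu_version):
    """Return the libgfortran runtime package for an Ubuntu release."""
    if ubuntu_version >= (20, 4):
        return 'libgfortran5'
    elif ubuntu_version >= (18, 4):
        return 'libgfortran4'
    return 'libgfortran3'
```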
For releases can you please create and upload a universal wheel in addition to the source distribution? You can create them with:
python setup.py bdist_wheel --universal
I am trying to run hpccm in JupyterLite and installation in this environment requires a wheel:
import micropip
await micropip.install("hpccm")
...
ValueError: Couldn't find a pure Python 3 wheel for 'hpccm'
I want to use gromacs.py to generate a GROMACS container, but at step 2/23 I have met the following problem:
Step 2/23 : RUN apt-get update -y && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends python && rm -rf /var/lib/apt/lists/*
---> Running in 87c834e52e38
Get:1 http://archive.ubuntu.com/ubuntu xenial InRelease [247 kB]
Get:2 http://security.ubuntu.com/ubuntu xenial-security InRelease [109 kB]
Get:3 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages [1019 kB]
Ign:4 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64 InRelease
Get:5 http://archive.ubuntu.com/ubuntu xenial-updates InRelease [109 kB]
Ign:6 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64 InRelease
Get:7 http://archive.ubuntu.com/ubuntu xenial-backports InRelease [107 kB]
Get:8 http://security.ubuntu.com/ubuntu xenial-security/restricted amd64 Packages [12.7 kB]
Get:9 http://archive.ubuntu.com/ubuntu xenial/main amd64 Packages [1558 kB]
Get:10 http://security.ubuntu.com/ubuntu xenial-security/universe amd64 Packages [592 kB]
Get:11 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64 Release [169 B]
Get:12 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64 Release [169 B]
Get:13 http://security.ubuntu.com/ubuntu xenial-security/multiverse amd64 Packages [6280 B]
Get:14 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64 Release.gpg [169 B]
Get:15 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64 Release.gpg [169 B]
Get:16 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64 Packages [252 kB]
Get:17 http://archive.ubuntu.com/ubuntu xenial/restricted amd64 Packages [14.1 kB]
Get:18 http://archive.ubuntu.com/ubuntu xenial/universe amd64 Packages [9827 kB]
Get:19 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64 Packages [73.4 kB]
Err:19 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64 Packages
Hash Sum mismatch
Get:20 http://archive.ubuntu.com/ubuntu xenial/multiverse amd64 Packages [176 kB]
Get:21 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages [1396 kB]
Get:22 http://archive.ubuntu.com/ubuntu xenial-updates/restricted amd64 Packages [13.1 kB]
Get:23 http://archive.ubuntu.com/ubuntu xenial-updates/universe amd64 Packages [996 kB]
Get:24 http://archive.ubuntu.com/ubuntu xenial-updates/multiverse amd64 Packages [19.3 kB]
Get:25 http://archive.ubuntu.com/ubuntu xenial-backports/main amd64 Packages [7942 B]
Get:26 http://archive.ubuntu.com/ubuntu xenial-backports/universe amd64 Packages [8807 B]
Fetched 16.5 MB in 1min 13s (225 kB/s)
Reading package lists...
E: Failed to fetch https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64/Packages.gz Hash Sum mismatch
E: Some index files failed to download. They have been ignored, or old ones used instead.
The command '/bin/sh -c apt-get update -y && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends python && rm -rf /var/lib/apt/lists/*' returned a non-zero code: 100
It would be nice to have a way to install Nsight Compute only, without the rest of the HPC SDK.
I want to specify the components of Boost to be built:
boost(
    prefix='/opt/boost/1.73.0',
    version="1.73.0",
    bootstrap_opts=["--with-libraries=atomic,chrono"]
)
The output is
./bootstrap.sh --prefix=/opt/boost1.73.0 --with-libraries=atomic,chrono --without-libraries=python
which is not allowed by Boost because you cannot use --with-libraries and --without-libraries together. The workaround is to set python=True in the hpccm code, which is semantically incorrect because it means to build Python bindings, which it does not.
For our CI we install different versions of Clang/LLVM in parallel in a container. Unfortunately there is a conflict between libomp5-7 and libomp5-8:
Step 20/28 : RUN apt-get update -y && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends clang-8 libomp-8-dev && rm -rf /var/lib/apt/lists/*
---> Running in c82cf74f02b3
Get:2 http://archive.ubuntu.com/ubuntu bionic InRelease [242 kB]
Get:1 https://apt.llvm.org/bionic llvm-toolchain-bionic InRelease [4232 B]
Get:3 https://apt.llvm.org/bionic llvm-toolchain-bionic-10 InRelease [4232 B]
Get:4 https://apt.llvm.org/bionic llvm-toolchain-bionic-11 InRelease [4232 B]
Get:5 http://archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]
Get:6 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]
Get:7 http://archive.ubuntu.com/ubuntu bionic-backports InRelease [74.6 kB]
Get:8 https://apt.llvm.org/bionic llvm-toolchain-bionic/main Sources [2193 B]
Get:9 https://apt.llvm.org/bionic llvm-toolchain-bionic/main amd64 Packages [10.6 kB]
Get:10 http://archive.ubuntu.com/ubuntu bionic/restricted amd64 Packages [13.5 kB]
Get:11 http://archive.ubuntu.com/ubuntu bionic/main amd64 Packages [1344 kB]
Get:12 http://archive.ubuntu.com/ubuntu bionic/universe amd64 Packages [11.3 MB]
Get:13 https://apt.llvm.org/bionic llvm-toolchain-bionic-10/main Sources [1665 B]
Get:14 https://apt.llvm.org/bionic llvm-toolchain-bionic-10/main amd64 Packages [8762 B]
Get:15 http://archive.ubuntu.com/ubuntu bionic/multiverse amd64 Packages [186 kB]
Get:16 https://apt.llvm.org/bionic llvm-toolchain-bionic-11/main Sources [1666 B]
Get:17 https://apt.llvm.org/bionic llvm-toolchain-bionic-11/main amd64 Packages [8738 B]
Get:18 http://archive.ubuntu.com/ubuntu bionic-updates/multiverse amd64 Packages [44.6 kB]
Get:19 http://archive.ubuntu.com/ubuntu bionic-updates/restricted amd64 Packages [220 kB]
Get:20 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages [2110 kB]
Get:21 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 Packages [2095 kB]
Get:22 http://archive.ubuntu.com/ubuntu bionic-backports/main amd64 Packages [11.3 kB]
Get:23 http://archive.ubuntu.com/ubuntu bionic-backports/universe amd64 Packages [11.4 kB]
Get:24 http://security.ubuntu.com/ubuntu bionic-security/universe amd64 Packages [1332 kB]
Get:25 http://security.ubuntu.com/ubuntu bionic-security/restricted amd64 Packages [193 kB]
Get:26 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages [1693 kB]
Get:27 http://security.ubuntu.com/ubuntu bionic-security/multiverse amd64 Packages [14.6 kB]
Fetched 21.1 MB in 2s (8756 kB/s)
Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
The following additional packages will be installed:
libclang-common-8-dev libclang1-8 libllvm8 libomp5-8
Suggested packages:
clang-8-doc libomp-8-doc
Recommended packages:
llvm-8-dev python
The following NEW packages will be installed:
clang-8 libclang-common-8-dev libclang1-8 libllvm8 libomp-8-dev libomp5-8
0 upgraded, 6 newly installed, 0 to remove and 18 not upgraded.
Need to get 31.9 MB of archives.
After this operation, 173 MB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libllvm8 amd64 1:8-3~ubuntu18.04.2 [13.6 MB]
Get:2 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 libclang-common-8-dev amd64 1:8-3~ubuntu18.04.2 [3802 kB]
Get:3 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 libclang1-8 amd64 1:8-3~ubuntu18.04.2 [6225 kB]
Get:4 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 clang-8 amd64 1:8-3~ubuntu18.04.2 [7940 kB]
Get:5 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 libomp5-8 amd64 1:8-3~ubuntu18.04.2 [299 kB]
Get:6 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 libomp-8-dev amd64 1:8-3~ubuntu18.04.2 [56.2 kB]
debconf: delaying package configuration, since apt-utils is not installed
Fetched 31.9 MB in 1s (35.2 MB/s)
Selecting previously unselected package libllvm8:amd64.
(Reading database ... 14420 files and directories currently installed.)
Preparing to unpack .../0-libllvm8_1%3a8-3~ubuntu18.04.2_amd64.deb ...
Unpacking libllvm8:amd64 (1:8-3~ubuntu18.04.2) ...
Selecting previously unselected package libclang-common-8-dev.
Preparing to unpack .../1-libclang-common-8-dev_1%3a8-3~ubuntu18.04.2_amd64.deb ...
Unpacking libclang-common-8-dev (1:8-3~ubuntu18.04.2) ...
Selecting previously unselected package libclang1-8.
Preparing to unpack .../2-libclang1-8_1%3a8-3~ubuntu18.04.2_amd64.deb ...
Unpacking libclang1-8 (1:8-3~ubuntu18.04.2) ...
Selecting previously unselected package clang-8.
Preparing to unpack .../3-clang-8_1%3a8-3~ubuntu18.04.2_amd64.deb ...
Unpacking clang-8 (1:8-3~ubuntu18.04.2) ...
Selecting previously unselected package libomp5-8:amd64.
Preparing to unpack .../4-libomp5-8_1%3a8-3~ubuntu18.04.2_amd64.deb ...
Unpacking libomp5-8:amd64 (1:8-3~ubuntu18.04.2) ...
dpkg: error processing archive /tmp/apt-dpkg-install-04VwC3/4-libomp5-8_1%3a8-3~ubuntu18.04.2_amd64.deb (--unpack):
trying to overwrite '/usr/lib/x86_64-linux-gnu/libomp.so.5', which is also in package libomp5-7:amd64 1:7-3~ubuntu0.18.04.1
Selecting previously unselected package libomp-8-dev.
Preparing to unpack .../5-libomp-8-dev_1%3a8-3~ubuntu18.04.2_amd64.deb ...
Unpacking libomp-8-dev (1:8-3~ubuntu18.04.2) ...
Errors were encountered while processing:
/tmp/apt-dpkg-install-04VwC3/4-libomp5-8_1%3a8-3~ubuntu18.04.2_amd64.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)
The command '/bin/sh -c apt-get update -y && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends clang-8 libomp-8-dev && rm -rf /var/lib/apt/lists/*' returned a non-zero code: 100
For us it is not necessary to install different OpenMP versions in parallel. Could you please make the OpenMP installation optional?
Cheers,
Simeon
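A minimal sketch of what an optional OpenMP install could look like: toggle the runtime packages with a hypothetical openmp keyword (the parameter name and package list are illustrative, not the actual hpccm API):

```python
# Hypothetical sketch: make the OpenMP runtime optional via an 'openmp'
# keyword instead of always installing libomp-N-dev.
def clang_ospackages(version='8', openmp=True):
    """Return the apt package list for a clang install."""
    packages = ['clang-{}'.format(version)]
    if openmp:
        # libomp-8-dev is what collides with an already-present libomp5-7
        packages.append('libomp-{}-dev'.format(version))
    return packages
```

With openmp=False the building block would install only the compiler and sidestep the dpkg file conflict above.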
After updating from 20.8.0 to 20.9.0, my scripts fail to build NetCDF 4.7.3 with the following error:
configure: error: netcdf-c version 4.7.4 or greater is required
Before the update, NetCDF 4.7.3 meant netcdf-c/4.7.3 with netcdf-cxx4/4.3.1 and netcdf-fortran/4.5.2;
after the update it tries to compile with netcdf-fortran/4.5.3, which seems to be the cause of the error.
The following recipe.py builds without issues on 20.8.0 and triggers the error on 20.9.0:
Stage0 += baseimage(image="ubuntu:18.04", _as='build')
# gCC & gFortran
compiler = gnu()
Stage0 += compiler
# OpenMPI
Stage0 += openmpi(version="4.0.2", infiniband=False, cuda=False, toolchain=compiler.toolchain)
# HDF5
Stage0 += hdf5(version="1.10.5", toolchain=compiler.toolchain)
# NetCDF
Stage0 += netcdf(version="4.7.3", toolchain=compiler.toolchain)
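One way to make the companion library versions explicit, rather than implicitly tracking the latest release, is to pin them per netcdf-c version. The compatibility table below is an illustrative sketch built only from the two combinations mentioned in this report, not hpccm's actual logic:

```python
# Illustrative pin table: netcdf-c version -> compatible companion versions.
NETCDF_COMPANIONS = {
    '4.7.3': {'cxx4': '4.3.1', 'fortran': '4.5.2'},
    '4.7.4': {'cxx4': '4.3.1', 'fortran': '4.5.3'},
}

def companions(netcdf_c_version):
    """Return the pinned netcdf-cxx4/netcdf-fortran versions, or None."""
    return NETCDF_COMPANIONS.get(netcdf_c_version)
```

A building block consulting such a table would keep netcdf-fortran/4.5.2 paired with netcdf-c/4.7.3 instead of silently moving to 4.5.3.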
When installing both nsys and nsight-compute in a CentOS container, it issues the following commands:
RUN rpm --import https://developer.download.nvidia.com/devtools/repos/rhel7/x86_64/nvidia.pub && \
yum install -y yum-utils && \
yum-config-manager --add-repo https://developer.download.nvidia.com/devtools/repos/rhel7/x86_64 && \
yum install -y \
nsight-systems-cli-2021.1.1 && \
rm -rf /var/cache/yum/*
# NVIDIA Nsight Compute 2020.2.1
RUN rpm --import https://developer.download.nvidia.com/devtools/repos/rhel7/x86_64/nvidia.pub && \
yum install -y yum-utils && \
yum-config-manager --add-repo https://developer.download.nvidia.com/devtools/repos/rhel7/x86_64 && \
yum install -y \
nsight-compute-2020.2.1 && \
rm -rf /var/cache/yum/*
This errors out because yum-config-manager does not allow the same repo to be added twice:
adding repo from: https://developer.download.nvidia.com/devtools/repos/rhel7/x86_64
Cannot add repo from https://developer.download.nvidia.com/devtools/repos/rhel7/x86_64 as is a duplicate of an existing repo
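A hedged workaround sketch: guard the yum-config-manager call so adding the same repo a second time becomes a no-op. The grep-based check is an assumption about how the generated .repo file references the URL, not hpccm's actual fix:

```python
def add_repo_once(url):
    """Build a shell command that only adds the repo if no file under
    /etc/yum.repos.d/ already references it."""
    return ("grep -qr '{0}' /etc/yum.repos.d/ || "
            "yum-config-manager --add-repo {0}".format(url))
```

Emitting this guarded form in both RUN blocks would let the second install reuse the repo added by the first.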
When generating Singularity definition files, HPCCM makes use of multiple %post, %environment, etc. sections. There is a bug in Singularity 3.x where only one section is recognized (see sylabs/singularity#2349).
The workarounds are:
Many thanks for developing this amazing tool.
I just want to use the hcoll library inside HPC-X to compile with my OpenMPI.
I tried to use the hpcx building block, but it outputs the entire HPC-X environment (including the OpenMPI bundled with HPC-X).
Is it possible to disable the output of the hpcx environment?
Thank you!
If I try to use the older PGI compiler version 17.10, it still copies the latest PGI community edition compiler.
The following comment in the code makes me think it is not supposed to work with anything but the latest version:
# The version is fragile since the latest version is
# automatically downloaded, which may not match this default.
self.__version = kwargs.get('version', '18.4')
self.__wd = '/var/tmp' # working directory
This is a feature request.
Singularity stores its definition file in the resulting image, where it can be extracted using "singularity inspect -d". This lets you see how the image was created, edit the definition, and create a new image.
I don't like editing Singularity definition files or Dockerfiles directly (they can be long and wildly complicated). I prefer editing HPCCM recipe files. However, they are not stored with the images.
I would like to see a way to store HPCCM recipes in Singularity or Docker images (similar to how Singularity stores its definition file).
For example, Singularity and Docker both have the capability of storing metadata in a section like "%label". It would be useful to store the HPCCM recipe in the %label section of the images.
Here is a simple recipe:
Stage0 += baseimage(image='nvidia/cuda:10.1-base-ubuntu18.04')
Stage0 += shell(commands=['apt-get update'])
Stage0 += shell(commands=['apt-get install -y octave'])
Stage0 += pgi(eula=True, mpi=True)
Using HPCCM to "process" this definition file,
$ hpccm --recipe test2.py --format singularity > test2.def
Produces a definition file that looks like,
BootStrap: docker
From: nvidia/cuda:10.1-base-ubuntu18.04
%post
. /.singularity.d/env/10-docker*.sh
%post
cd /
apt-get update
%post
cd /
apt-get install -y octave
%post
apt-get update -y
DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends
g++
gcc
libnuma1
openssh-client
perl
wget
rm -rf /var/lib/apt/lists/*
%post
cd /
mkdir -p /var/tmp && wget -q -nc --no-check-certificate -O /var/tmp/pgi-community-linux-x64-latest.tar.gz --referer https://www.pgroup.com/products/community.htm?utm_source=hpccm&utm_medium=wgt&utm_campaign=CE&nvid=nv-int-14-39155 -P /var/tmp https://www.pgroup.com/support/downloader.php?file=pgi-community-linux-x64
mkdir -p /var/tmp/pgi && tar -x -f /var/tmp/pgi-community-linux-x64-latest.tar.gz -C /var/tmp/pgi -z
cd /var/tmp/pgi && PGI_ACCEPT_EULA=accept PGI_INSTALL_DIR=/opt/pgi PGI_INSTALL_MPI=true PGI_INSTALL_NVIDIA=true PGI_MPI_GPU_SUPPORT=true PGI_SILENT=true ./install
echo "variable LIBRARY_PATH is environment(LIBRARY_PATH);" >> /opt/pgi/linux86-64/19.10/bin/siterc
echo "variable library_path is default($if($LIBRARY_PATH,$foreach(ll,$replace($LIBRARY_PATH,":",), -L$ll)));" >> /opt/pgi/linux86-64/19.10/bin/siterc
echo "append LDLIBARGS=$library_path;" >> /opt/pgi/linux86-64/19.10/bin/siterc
ln -sf /usr/lib/x86_64-linux-gnu/libnuma.so.1 /opt/pgi/linux86-64/19.10/lib/libnuma.so
ln -sf /usr/lib/x86_64-linux-gnu/libnuma.so.1 /opt/pgi/linux86-64/19.10/lib/libnuma.so.1
rm -rf /var/tmp/pgi-community-linux-x64-latest.tar.gz /var/tmp/pgi
%environment
export LD_LIBRARY_PATH=/opt/pgi/linux86-64/19.10/mpi/openmpi-3.1.3/lib:/opt/pgi/linux86-64/19.10/lib:$LD_LIBRARY_PATH
export PATH=/opt/pgi/linux86-64/19.10/mpi/openmpi-3.1.3/bin:/opt/pgi/linux86-64/19.10/bin:$PATH
%post
export LD_LIBRARY_PATH=/opt/pgi/linux86-64/19.10/mpi/openmpi-3.1.3/lib:/opt/pgi/linux86-64/19.10/lib:$LD_LIBRARY_PATH
export PATH=/opt/pgi/linux86-64/19.10/mpi/openmpi-3.1.3/bin:/opt/pgi/linux86-64/19.10/bin:$PATH
I would like that definition file to look like the following.
BootStrap: docker
From: nvidia/cuda:10.1-base-ubuntu18.04
%post
. /.singularity.d/env/10-docker*.sh
%post
cd /
apt-get update
%post
cd /
apt-get install -y octave
%post
apt-get update -y
DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends
g++
gcc
libnuma1
openssh-client
perl
wget
rm -rf /var/lib/apt/lists/*
%post
cd /
mkdir -p /var/tmp && wget -q -nc --no-check-certificate -O /var/tmp/pgi-community-linux-x64-latest.tar.gz --referer https://www.pgroup.com/products/community.htm?utm_source=hpccm&utm_medium=wgt&utm_campaign=CE&nvid=nv-int-14-39155 -P /var/tmp https://www.pgroup.com/support/downloader.php?file=pgi-community-linux-x64
mkdir -p /var/tmp/pgi && tar -x -f /var/tmp/pgi-community-linux-x64-latest.tar.gz -C /var/tmp/pgi -z
cd /var/tmp/pgi && PGI_ACCEPT_EULA=accept PGI_INSTALL_DIR=/opt/pgi PGI_INSTALL_MPI=true PGI_INSTALL_NVIDIA=true PGI_MPI_GPU_SUPPORT=true PGI_SILENT=true ./install
echo "variable LIBRARY_PATH is environment(LIBRARY_PATH);" >> /opt/pgi/linux86-64/19.10/bin/siterc
echo "variable library_path is default($if($LIBRARY_PATH,$foreach(ll,$replace($LIBRARY_PATH,":",), -L$ll)));" >> /opt/pgi/linux86-64/19.10/bin/siterc
echo "append LDLIBARGS=$library_path;" >> /opt/pgi/linux86-64/19.10/bin/siterc
ln -sf /usr/lib/x86_64-linux-gnu/libnuma.so.1 /opt/pgi/linux86-64/19.10/lib/libnuma.so
ln -sf /usr/lib/x86_64-linux-gnu/libnuma.so.1 /opt/pgi/linux86-64/19.10/lib/libnuma.so.1
rm -rf /var/tmp/pgi-community-linux-x64-latest.tar.gz /var/tmp/pgi
%environment
export LD_LIBRARY_PATH=/opt/pgi/linux86-64/19.10/mpi/openmpi-3.1.3/lib:/opt/pgi/linux86-64/19.10/lib:$LD_LIBRARY_PATH
export PATH=/opt/pgi/linux86-64/19.10/mpi/openmpi-3.1.3/bin:/opt/pgi/linux86-64/19.10/bin:$PATH
%post
export LD_LIBRARY_PATH=/opt/pgi/linux86-64/19.10/mpi/openmpi-3.1.3/lib:/opt/pgi/linux86-64/19.10/lib:$LD_LIBRARY_PATH
export PATH=/opt/pgi/linux86-64/19.10/mpi/openmpi-3.1.3/bin:/opt/pgi/linux86-64/19.10/bin:$PATH
%label
Stage0 += baseimage(image='nvidia/cuda:10.1-base-ubuntu18.04')
Stage0 += shell(commands=['apt-get update'])
Stage0 += shell(commands=['apt-get install -y octave'])
Stage0 += pgi(eula=True, mpi=True)
Notice how the HPCCM recipe is stored in the %label section. Now I can extract it from an image created with the definition file and edit that recipe.
Thanks!
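A sketch of how a recipe could be packed into image metadata: read the recipe source and emit it as a single label value, escaping newlines so it survives the one-line label format. The label key hpccm.recipe is hypothetical:

```python
import json

def recipe_as_label(recipe_text, key='hpccm.recipe'):
    """Pack recipe source into one metadata entry; json.dumps escapes
    newlines and quotes so the value stays on a single line."""
    return {key: json.dumps(recipe_text)}

def label_to_recipe(labels, key='hpccm.recipe'):
    """Recover the original recipe text from the label value."""
    return json.loads(labels[key])
```

After extracting the labels with singularity inspect or docker inspect, label_to_recipe would return the editable recipe.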
Currently you are forced to compile CMake from source for any architecture other than x86_64. I'm having some issues getting CMake to compile for one of my recipes, which I could avoid by using the pre-compiled binaries. As far as I can tell, aarch64 binaries have been available since the 3.20.0 release, and in my experience the installation procedure is the same as for x86_64.
The --recipe argument should just be a required argument with no default.
The current behavior is confusing for new users; running hpccm with no arguments should show the help instead:
/ # hpccm
ERROR: [Errno 2] No such file or directory: 'recipes/hpcbase-gnu-openmpi.py'
/ # find / -name hpcbase-gnu-openmpi.py
/ #
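A minimal sketch of the suggested behavior using argparse: with --recipe required, running with no arguments exits with a usage message instead of failing on a missing default recipe path (flags beyond --recipe and --format are omitted):

```python
import argparse

def build_parser():
    # Making --recipe required means `hpccm` with no arguments prints
    # usage and exits rather than trying to open a default recipe file.
    parser = argparse.ArgumentParser(prog='hpccm')
    parser.add_argument('--recipe', required=True,
                        help='path to the recipe file')
    parser.add_argument('--format', choices=['docker', 'singularity'],
                        default='docker')
    return parser
```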
The behavior of the Singularity %files directive changed beginning in version 3.6. See apptainer/singularity#5514.
For this Singularity definition file:
Bootstrap: docker
From: ubuntu:18.04
%files
foo /var/tmp
%post
ls -l /var/tmp/foo
Singularity 3.5.3:
# singularity --version
singularity version 3.5.3
# singularity build foo.sif Singularity.def
INFO: Starting build...
INFO: Copying foo to /tmp/rootfs-f4ee8c55-7790-11eb-b63f-0242ac110002/var/tmp
INFO: Running post scriptlet
+ ls -l /var/tmp/foo
-rw-r--r--. 1 root root 0 Feb 25 17:43 /var/tmp/foo
INFO: Creating SIF file...
INFO: Build complete: foo.sif
Singularity 3.6.4:
# singularity --version
singularity version 3.6.4
# singularity build foo.sif Singularity.def
INFO: Starting build...
INFO: Copying foo to /tmp/build-temp-392680558/rootfs/var/tmp
INFO: Running post scriptlet
+ ls -l /var/tmp/foo
ls: cannot access '/var/tmp/foo': No such file or directory
Changing the path from /var/tmp
to /opt
works with either version:
INFO: Copying foo to /tmp/build-temp-697021000/rootfs/opt
INFO: Running post scriptlet
+ ls -l /opt/foo
-rw-r--r--. 1 root root 0 Feb 25 17:41 /opt/foo
INFO: Creating SIF file...
INFO: Build complete: foo.sif
The --working-directory=/other/path option can be used to generate a Singularity definition file that works around this behavior change.
A better solution might be to use %setup to stage files into the container build environment. Alternatively, use a default location other than /var/tmp or /tmp when copying files from the host (such as the package option in many building blocks).
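The version-dependent behavior above can be sketched as a helper that picks a %files destination that still exists in %post. The /opt fallback mirrors the workaround shown; the version parsing is a simplification:

```python
def files_staging_dir(singularity_version, preferred='/var/tmp'):
    """Return a destination for %files that survives into %post.
    Singularity >= 3.6 scrubs /tmp and /var/tmp between the two steps."""
    major, minor = (int(p) for p in singularity_version.split('.')[:2])
    if (major, minor) >= (3, 6) and preferred in ('/tmp', '/var/tmp'):
        return '/opt'
    return preferred
```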
According to the nvhpc docs, gcc/glibc >= 10.1 is needed for C++20 library support in the new nvhpc 22.3.
Neither
Stage0 += baseimage(image='nvcr.io/nvidia/nvhpc:22.3-devel-cuda11.6-ubuntu20.04')
Stage0 += gnu(version='10')
nor
Stage0 += baseimage(image='ubuntu:20.04')
Stage0 += gnu(version='10')
Stage0 += nvhpc(eula=True, cuda_multi=False, version='22.3')
seem to do the trick. The second one even actively installs gcc 9 again instead of using the provided gcc 10 toolchain.
I tested the support by trying to compile a source file with #include <ranges> via nvc++ -std=c++20 inside the container (which works when using the installed g++ 10.3.0).
Maybe the nvhpc building block needs a toolchain argument?
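Outside of hpccm, NVHPC's own mechanism for pointing nvc++ at a different GCC is makelocalrc. A hedged sketch of generating that command; the install path and flag spelling are assumptions about a default nvhpc 22.3 layout, so check makelocalrc's help on your install:

```python
def makelocalrc_cmd(gcc_major=10,
                    bindir='/opt/nvidia/hpc_sdk/Linux_x86_64/22.3/compilers/bin'):
    # Rebuild nvhpc's localrc so nvc++ picks up the gcc-10 headers and
    # libstdc++, which the C++20 library support requires.
    return ('{0}/makelocalrc {0} -x -gcc gcc-{1} -gpp g++-{1} '
            '-g77 gfortran-{1}'.format(bindir, gcc_major))
```

Run inside the container (e.g. via a shell primitive) after both gnu(version='10') and nvhpc are installed.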
Hi all!
The nv_hpc_sdk building block does not seem to support downloading the package, and I am having trouble creating a recipe that does it automatically for my CI workflow.
I tried to implement it myself this way:
d = downloader(url='https://xxxxxxxxxxxxxxxx/nvhpc_2020_207_Linux_x86_64_cuda_multi.tar.gz')
Stage0 += d.download_step()
Stage0 += nv_hpc_sdk(eula=True, mpi=False, system_cuda=True,
tarball='nvhpc_2020_207_Linux_x86_64_cuda_multi.tar.gz')
Unfortunately I get a Docker error when building:
error building image: parsing dockerfile: Dockerfile parse error line 57: unknown instruction: MKDIR
Does anyone have an idea for me?
Thanks in advance!
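The MKDIR error suggests the download command string is being emitted as a bare Dockerfile line, so Docker reads its first word as an instruction. A sketch of the distinction (illustrative, not hpccm internals): shell commands have to be folded under a single RUN instruction:

```python
def as_run_instruction(commands):
    """Join shell commands into one Dockerfile RUN instruction.
    Emitting them bare makes Docker parse 'mkdir' as an instruction."""
    return 'RUN ' + ' && \\\n    '.join(commands)
```

In recipe terms, wrapping the commands in the shell primitive (e.g. Stage0 += shell(commands=[...]) instead of adding the raw string) may avoid the parse error, assuming download_step() returns a shell command string.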
I adapted the gromacs.py recipe into a recipe for Singularity. My intention is to try hpccm and also create a Docker image for testing Singularity.
I am having two problems:
build_cmds = ['export VERSION=1.11.4 OS=linux ARCH=amd64 && \
wget -O /tmp/go${VERSION}.${OS}-${ARCH}.tar.gz https://dl.google.com/go/go${VERSION}.${OS}-${ARCH}.tar.gz && \
tar -C /usr/local -xzf /tmp/go${VERSION}.${OS}-${ARCH}.tar.gz',
"echo 'export GOPATH=${HOME}/go' >> ~/.bashrc && \
echo 'export PATH=/usr/local/go/bin:${PATH}:${GOPATH}/bin' >> ~/.bashrc && \
. ~/.bashrc ",
"mkdir -p ${GOPATH}/src/github.com/sylabs && \
cd ${GOPATH}/src/github.com/sylabs && \
git clone https://github.com/sylabs/singularity.git && \
cd singularity ",
"git checkout v{}".format(singularity_version),
"cd ${GOPATH}/src/github.com/sylabs/singularity && \
./mconfig &&\
cd ./builddir && \
make && \
make install"
]
Stage0 += shell(commands=build_cmds)
the singularity command is not found, even though the build seems successful:
.....
# excerpt of docker build log
GEN etc/bash_completion.d/singularity
CNI PLUGIN tuning
CNI PLUGIN vlan
INSTALL /usr/local/bin/singularity
INSTALL /usr/local/etc/bash_completion.d/singularity
INSTALL /usr/local/etc/singularity/singularity.conf
INSTALL /usr/local/libexec/singularity/bin/starter
INSTALL /usr/local/var/singularity/mnt/session
INSTALL /usr/local/bin/run-singularity
INSTALL /usr/local/etc/singularity/capability.json
INSTALL /usr/local/etc/singularity/ecl.toml
....
I guess this has something to do with the multi-stage build.
This is the complete content of my recipe:
r"""
Generate Dockerfile for Singularity
Contents:
Ubuntu 18.04
CUDA version 10.0
GNU compilers (upstream)
OFED (upstream)
Singularity
"""
import os
from hpccm.templates.CMakeBuild import CMakeBuild
from hpccm.templates.git import git
singularity_version = USERARG.get('SINGULARITY_VERSION', '3.1.1')
Stage0 += comment(__doc__.strip(), reformat=False)
Stage0.name = 'devel'
Stage0 += baseimage(image='nvidia/cuda:10.0-devel-ubuntu18.04', _as=Stage0.name)
Stage0 += python(python3=True)
Stage0 += gnu(fortran=False)
Stage0 += packages(ospackages=['ca-certificates', 'cmake', 'git','build-essential', 'libssl-dev', 'uuid-dev', 'libgpgme11-dev', 'libseccomp-dev', 'pkg-config', 'squashfs-tools'])
Stage0 += ofed()
Stage0 += openmpi(configure_opts=['--enable-mpi-cxx'],
prefix="/opt/openmpi", version='3.0.0')
cm = CMakeBuild()
build_cmds = ['export VERSION=1.11.4 OS=linux ARCH=amd64 && \
wget -O /tmp/go${VERSION}.${OS}-${ARCH}.tar.gz https://dl.google.com/go/go${VERSION}.${OS}-${ARCH}.tar.gz && \
tar -C /usr/local -xzf /tmp/go${VERSION}.${OS}-${ARCH}.tar.gz',
"echo 'export GOPATH=${HOME}/go' >> ~/.bashrc && \
echo 'export PATH=/usr/local/go/bin:${PATH}:${GOPATH}/bin' >> ~/.bashrc && \
. ~/.bashrc ",
"mkdir -p ${GOPATH}/src/github.com/sylabs && \
cd ${GOPATH}/src/github.com/sylabs && \
git clone https://github.com/sylabs/singularity.git && \
cd singularity ",
"git checkout v{}".format(singularity_version),
"cd ${GOPATH}/src/github.com/sylabs/singularity && \
./mconfig &&\
cd ./builddir && \
make && \
make install"
]
Stage0 += shell(commands=build_cmds)
# Include examples if they exist in the build context
# if os.path.isdir('recipes/singularity/examples'):
# Stage0 += copy(src='recipes/singularity/examples', dest='/workspace/examples')
# Stage0 += environment(variables={'PATH': '$PATH:/singularity/install/bin'})
Stage0 += label(metadata={'io.sylabs.singularity.version': singularity_version})
Stage0 += workdir(directory='/workspace')
######
# Runtime image stage
######
Stage1.baseimage('nvidia/cuda:10.0-runtime-ubuntu18.04')
Stage1 += Stage0.runtime(_from=Stage0.name)
# Stage1 += copy(_from=Stage0.name, src='${GOPATH}/src/github.com/sylabs/singularity/install',
# dest='/singularity/install')
# Include examples if they exist in the build context
# if os.path.isdir('recipes/singularity/examples'):
# Stage1 += copy(src='recipes/singularity/examples', dest='/workspace/examples')
# Stage1 += environment(variables={'PATH': '$PATH:/singularity/install/bin'})
Stage1 += label(metadata={'io.sylabs.singularity.version': singularity_version})
Stage1 += workdir(directory='/workspace')
I want to build my Boost with C++14, so I need to pass the flag cxxflags="-std=c++14" to the ./b2 build process: ./b2 cxxflags="-std=c++14" -j$(nproc) -q install. There is currently no way to pass arguments to ./b2:
The Ubuntu default cmake might not always be the latest and greatest.
CMake binaries are readily available at https://cmake.org/download/. The .sh files unpack into the default system location.
E.g., in my Singularity files I use
# CMAKE
wget https://cmake.org/files/v3.10/cmake-3.10.0-Linux-x86_64.sh
sh cmake-3.10.0-Linux-x86_64.sh --skip-license
rm cmake-3.10.0-Linux-x86_64.sh
to install cmake.
I might be able to figure out on my own how to do it but would need to find some time.
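A parameterized sketch of the same install, hedged: the GitHub release URL pattern and the lowercase "linux" in installer names from 3.20.0 onward are assumptions worth double-checking against cmake.org's download page (older releases such as 3.10.0 used "Linux"):

```python
def cmake_install_cmds(version='3.20.0', arch='x86_64'):
    """Commands to install a pre-built CMake (aarch64 available >= 3.20.0)."""
    installer = 'cmake-{}-linux-{}.sh'.format(version, arch)
    url = ('https://github.com/Kitware/CMake/releases/download/'
           'v{}/{}'.format(version, installer))
    return ['wget -q {}'.format(url),
            'sh {} --skip-license --prefix=/usr/local'.format(installer),
            'rm {}'.format(installer)]
```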
The link to the HPC-X package is no longer valid: http://www.mellanox.com/downloads/hpc/hpc-x/v2.8/hpcx-v2.8.1-gcc-MLNX_OFED_LINUX-4.7-1.0.0.1-ubuntu18.04-x86_64.tbz
First, let me take this opportunity to thank you for providing this extremely useful tool. I'm a big fan. We use this for generating a variety of containers.
However, currently I am focusing on Intel Singularity containers and I cannot get them to work. After having problems with more sophisticated applications, I went back to a simple recipe file that includes a "hello world" MPI application:
"""Intel/impi Development container
"""
import os
# Base image
Stage0.baseimage('ubuntu:18.04')
Stage0 += apt_get(ospackages=['build-essential','tcsh','csh','ksh','git',
'openssh-server','libncurses-dev','libssl-dev',
'libx11-dev','less','man-db','tk','tcl','swig',
'bc','file','flex','bison','libexpat1-dev',
'libxml2-dev','unzip','wish','curl','wget',
'libcurl4-openssl-dev','nano','screen', 'libasound2',
'libgtk2.0-common','software-properties-common',
'libpango-1.0.0','xserver-xorg','dirmngr',
'gnupg2','lsb-release','vim'])
# Install Intel compilers, mpi, and mkl
Stage0 += intel_psxe(eula=True, license=os.getenv('INTEL_LICENSE_FILE',default='intel_license/****.lic'), tarball=os.getenv('INTEL_TARBALL',default='intel_tarballs/parallel_studio_xe_2019_update5_cluster_edition.tgz'))
# Install application
Stage0 += copy(src='hello_world_mpi.c', dest='/root/jedi/hello_world_mpi.c')
Stage0 += shell(commands=['export COMPILERVARS_ARCHITECTURE=intel64',
'. /opt/intel/compilers_and_libraries/linux/bin/compilervars.sh',
'cd /root/jedi','mpiicc hello_world_mpi.c -o /usr/local/bin/hello_world_mpi -lstdc++'])
Stage0 += runscript(commands=['/bin/bash -l'])
If I build a docker image with this, it works fine:
CNAME=intel19-impi-hello
hpccm --recipe $CNAME.py --format docker > Dockerfile.$CNAME
sudo docker image build -f Dockerfile.${CNAME} -t jedi-${CNAME} .
ubuntu@ip-172-31-87-130:~/jedi$ sudo docker run --rm -it jedi-intel19-impi-hello:latest
root@1dfdbccc1110:/# mpirun -np 4 hello_world_mpi
Hello from rank 1 of 4 running on 1dfdbccc1110
Hello from rank 2 of 4 running on 1dfdbccc1110
Hello from rank 0 of 4 running on 1dfdbccc1110
Hello from rank 3 of 4 running on 1dfdbccc1110
But, if I try to build a singularity image I get this:
hpccm --recipe $CNAME.py --format singularity > Singularity.$CNAME
sudo singularity build $CNAME.sif Singularity.$CNAME
ubuntu@ip-172-31-87-130:~/jedi$ singularity shell -e intel19-impi-hello.sif
Singularity intel19-impi-hello.sif:~/jedi> source /etc/bash.bashrc
ubuntu@ip-172-31-87-130:~/jedi$ mpirun -np 4 hello_world_mpi
[mpiexec@ip-172-31-87-130] enqueue_control_fd (../../../../../src/pm/i_hydra/libhydra/bstrap/src/intel/i_hydra_bstrap.c:70): assert (!closed) failed
[mpiexec@ip-172-31-87-130] launch_bstrap_proxies (../../../../../src/pm/i_hydra/libhydra/bstrap/src/intel/i_hydra_bstrap.c:517): error enqueuing control fd
[mpiexec@ip-172-31-87-130] HYD_bstrap_setup (../../../../../src/pm/i_hydra/libhydra/bstrap/src/intel/i_hydra_bstrap.c:714): unable to launch bstrap proxy
[mpiexec@ip-172-31-87-130] main (../../../../../src/pm/i_hydra/mpiexec/mpiexec.c:1919): error setting up the boostrap proxies
I get the same thing if I try to create the Singularity image from the docker image:
sudo singularity build intel19-impi-hello.sif docker-daemon:jedi-intel19-impi-hello:latest
For much more detail, please see the corresponding issue on the Sylabs GitHub site.
I just wanted to see if anyone here had any tips on building a working Singularity container with the intel_psxe building block. Thanks!
Hi,
for the moment our workflow needs a Python 3.7 version of conda (nothing I can change on our side). Until the last release, I was able to retrieve it by passing 'py37_4.8.3' as the version argument, but now the format of the version argument is strictly checked, and py38 (or py27) is forced.
Could the __python_version be modified by the user as well? That would be useful for us.
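A sketch of a looser version check that accepts any pyXX prefix instead of forcing py38/py27. The accepted pattern is an assumption modeled on Miniconda installer names like Miniconda3-py37_4.8.3-Linux-x86_64.sh:

```python
import re

# Accept an optional pyXX_ prefix followed by a dotted version number.
_VERSION_RE = re.compile(r'^(py\d{2}_)?\d+(\.\d+)+$')

def valid_conda_version(version):
    """Accept '4.8.3', 'py37_4.8.3', 'py38_4.9.2', etc."""
    return bool(_VERSION_RE.match(version))
```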
I use sudo docker build -t hpccm -f Dockerfile . to generate the HPCCM container, but ran into a problem:
Sending build context to Docker daemon 6.144kB
Step 1/3 : FROM python:3-slim
---> ca7f9e245002
Step 2/3 : RUN pip install --no-cache-dir hpccm
---> Running in b34d42cd6c00
Collecting hpccm
Downloading https://files.pythonhosted.org/packages/b8/bd/a88fe1e1fa0bf5b17f3a2d4d001050bdb1e2c89cd9bf5bb24c47b314b33b/hpccm-19.5.0.tar.gz (133kB)
ERROR: Complete output from command python setup.py egg_info:
ERROR: Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-install-biiwov0l/hpccm/setup.py", line 10, in <module>
from hpccm.version import __version__
File "/tmp/pip-install-biiwov0l/hpccm/hpccm/__init__.py", line 22, in <module>
from hpccm.Stage import Stage
File "/tmp/pip-install-biiwov0l/hpccm/hpccm/Stage.py", line 25, in <module>
from hpccm.primitives.baseimage import baseimage
File "/tmp/pip-install-biiwov0l/hpccm/hpccm/primitives/__init__.py", line 24, in <module>
from hpccm.primitives.runscript import runscript
File "/tmp/pip-install-biiwov0l/hpccm/hpccm/primitives/runscript.py", line 23, in <module>
from six.moves import shlex_quote
ModuleNotFoundError: No module named 'six'
----------------------------------------
ERROR: Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-biiwov0l/hpccm/
The command '/bin/sh -c pip install --no-cache-dir hpccm' returned a non-zero code: 1