nvidia / hpc-container-maker
HPC Container Maker
License: Apache License 2.0
By default, when installing Anaconda, the architecture is Linux-x86_64 (with Miniconda3-py38_4.8.3-Linux-x86_64.sh).
How can I change it to Linux-ppc64le (which would require, for example, Miniconda3-py38_4.9.2-Linux-ppc64le.sh)?
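As an aside on the mechanics: the installer filename encodes the target architecture, so the selection boils down to a lookup keyed on the CPU architecture. A minimal, purely illustrative sketch (not hpccm's actual implementation; the dictionary and function names are mine):

```python
# Illustrative only: map a target CPU architecture to the matching
# Miniconda installer filename (versions taken from the question above).
MINICONDA_INSTALLERS = {
    'x86_64': 'Miniconda3-py38_4.8.3-Linux-x86_64.sh',
    'ppc64le': 'Miniconda3-py38_4.9.2-Linux-ppc64le.sh',
}

def miniconda_installer(arch='x86_64'):
    """Return the Miniconda installer filename for the given architecture."""
    try:
        return MINICONDA_INSTALLERS[arch]
    except KeyError:
        raise ValueError('unsupported architecture: {}'.format(arch))
```

In hpccm itself the architecture is normally selected globally, if I read the code correctly, e.g. via the --cpu-target command line option or the _arch argument to baseimage, which the conda building block then consults.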
The default Slurm version, v20.11.7, has been removed from the Slurm website in favor of v20.11.9.
Newer Dockerfiles support:
COPY --chmod=0755 src dest
A simple addition of the code below to the copy primitive would enable it:
if self.__chmod:
    base_inst = base_inst + '--chmod={} '.format(self.__chmod)
When you generate an environment.yml file, the first line gives the name of your conda environment. If your environment name isn't the default base, then the Docker image produced by hpccm doesn't conda activate the new environment before launching the Jupyter notebook. As a result, you don't have access to any of the conda packages you installed.
@samcmill As discussed during our conf call, one thing currently blocking us from leveraging hpccm in EasyBuild (cfr. #20) is that it currently requires Python 3.x; EasyBuild is not compatible with Python 3.x yet, but we're working on it (see easybuilders/easybuild-framework#133).
I took a quick look at this, by enabling tests for Python 2.7 in Travis (cfr. #24). Here's what I ran into:
the use of from enum import Enum implies that enum34 needs to be installed (https://pypi.org/project/enum34/), which is fine imho
with enum34 installed, a bunch of tests fail with:
File "hpccm/recipe.py", line 75
raise e from e
^
SyntaxError: invalid syntax
Changing raise e from e back to just raise e fixes that, but I'm not sure if that's OK to do?
With the above changes, just two tests still fail on Python 2.7:
----------------------------------------------------------------------
Traceback (most recent call last):
File "/tmp/hpc-container-maker/test/test_sed.py", line 37, in test_basic
r's/FOO = BAR/FOO = BAZ/g']),
File "hpccm/sed.py", line 54, in sed_step
quoted_patterns = ['-e {}'.format(shlex.quote(patterns[0]))]
AttributeError: 'module' object has no attribute 'quote'
======================================================================
ERROR: test_verify (test_git.Test_git)
git with verification enabled
----------------------------------------------------------------------
Traceback (most recent call last):
File "/tmp/hpc-container-maker/test/test_git.py", line 96, in test_verify
branch=valid_branch, verify=True),
File "hpccm/git.py", line 104, in clone_step
fatal=fatal)
File "hpccm/git.py", line 53, in __verify
p = subprocess.Popen(command, shell=True, stdout=subprocess.DEVNULL,
AttributeError: 'module' object has no attribute 'DEVNULL'
----------------------------------------------------------------------
I'm happy to help out with restoring Python 2 compatibility if that's desired. If so, it would be useful to have #24 merged first, so we can rely on Travis to check Python 2.7 compatibility, and to ensure the codebase stays compatible with both Python 2.7 & 3.x.
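For what it's worth, both remaining failures come from APIs that only exist on Python 3.3+ (shlex.quote and subprocess.DEVNULL). A sketch of the usual compatibility shims, assuming pipes.quote and an open handle on os.devnull are acceptable Python 2 fallbacks:

```python
import os
import subprocess

# shlex.quote() exists only on Python 3.3+; pipes.quote is the
# long-standing Python 2 equivalent.
try:
    from shlex import quote
except ImportError:  # Python 2
    from pipes import quote

# subprocess.DEVNULL exists only on Python 3.3+; on Python 2 an open
# file object on os.devnull can be passed as stdout/stderr instead.
try:
    DEVNULL = subprocess.DEVNULL
except AttributeError:  # Python 2
    DEVNULL = open(os.devnull, 'wb')
```

With shims like these, sed.py and git.py could use quote() and DEVNULL unconditionally on both interpreter versions.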
The NetCDF tarballs are no longer available at https://www.unidata.ucar.edu/downloads/netcdf/ftp; they moved to GitHub and changed their names to no longer include the netcdf-c-, netcdf-cxx4-, or netcdf-fortran- prefix. Instead, all tarballs just use vX.Y.Z, e.g. for netcdf-c-4.6.3 the tarball URL is now https://github.com/Unidata/netcdf-c/archive/v4.6.3.tar.gz
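The new scheme can be captured in a small helper; this is just an illustration of the URL pattern above (the function name is mine):

```python
def netcdf_tarball_url(component, version):
    """Build the GitHub tarball URL for a NetCDF component.

    component: 'netcdf-c', 'netcdf-cxx4', or 'netcdf-fortran'
    version:   e.g. '4.6.3'
    """
    return ('https://github.com/Unidata/{0}/archive/v{1}.tar.gz'
            .format(component, version))
```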
What is the correct syntax to install netcdf with pnetcdf support using hpc-container-maker?
Assuming that pnetcdf is installed in /usr/local/pnetcdf, it does not seem sufficient to add
enable_pnetcdf='/usr/local/pnetcdf'
Whatever I tried, nc-config --all always shows --has-pnetcdf -> no
This is both a question and maybe a feature request. I wanted to confirm that the current intel_psxe building block doesn't work for Intel oneAPI, right? If not, I might try to make a PR for it... but my Python is not good. :)
Environment files are not working for Singularity. I get the error "EnvironmentFileNotFound: '/var/tmp/environment.yml' file not found". If I go into the generated ".def" file and change the location of the environment.yml file to be /environment.yml, it works. I'm not sure why, but the location /var/tmp does not work.
Minimum example:
"""
Conda Error Min Example
"""
from __future__ import absolute_import
from __future__ import unicode_literals
from __future__ import print_function
import hpccm
if __name__ == '__main__':
    ### Create Stage
    stage = hpccm.Stage()
    stage += hpccm.primitives.baseimage(image='nvidia/cuda:11.1-cudnn8-devel-centos8')
    ### Install Conda Python with environment
    stage += hpccm.building_blocks.conda(environment="environment.yml",
                                         eula=True,
                                         ospackages=['wget', 'ca-certificates', 'git'])
    ### Set container specification output format
    hpccm.config.set_container_format("singularity")
    ### Output container specification
    print(stage)
I get a missing library file error, which I think is because hpccm copies only the libraries in the REDIST folder, as in:
COPY --from=0 /opt/pgi/linux86-64/18.4/REDIST/*.so /opt/pgi/linux86-64/18.4/lib/
I think my app would work if hpccm generated the following line instead, copying all the library files:
COPY --from=0 /opt/pgi/linux86-64/18.4/lib/*.so /opt/pgi/linux86-64/18.4/lib/
====================
[root@8b7645d85a7f em_les]# ./ideal.exe
./ideal.exe: error while loading shared libraries: libcudaforwrapblas.so: cannot open shared object file: No such file or directory
[root@8b7645d85a7f em_les]# ls /opt/pgi/linux86-64/18.4/lib
libaccapi.so libaccgmp.so libblas.so libcudafor2.so libcudapgi.so libpgc.so libpgf90_rpm1_p.so libpgnod_prof.so libpgnod_prof_mpi3.so libpgnod_prof_time.so
libaccapimp.so libaccn.so libcublasemu.so libcudafor80.so libhugetlbfs_pgi.so libpgf90.so libpgf90rtl.so libpgnod_prof_g.so libpgnod_prof_mvapich.so libpgnuma.so
libaccg.so libaccnc.so libcudacemu.so libcudafor90.so liblapack.so libpgf902.so libpgftnrtl.so libpgnod_prof_inst.so libpgnod_prof_mvapich2.so
libaccg2.so libaccncmp.so libcudadevice.so libcudafor91.so libnuma.so libpgf90_prof.so libpgmath.so libpgnod_prof_mpi.so libpgnod_prof_openmpi.so
libaccg2mp.so libaccnmp.so libcudafor.so libcudaforblas.so libpgatm.so libpgf90_rpm1.so libpgmp.so libpgnod_prof_mpi2.so libpgnod_prof_pfo.so
==================
You can see libcudaforwrapblas.so is not in the PGI lib path.
However, if I go through only the first stage (Stage0), everything works perfectly:
===================
[root@4e464d1dcac4 em_les]# ./ideal.exe
IDEAL V3.8.1 PREPROCESSOR
DYNAMICS OPTION: Eulerian Mass Coordinate
alloc_space_field: domain 1 , 2772348328 bytes allocated
pi is 3.141593
[root@4e464d1dcac4 /]# ls /usr/local/cuda/lib64/
libOpenCL.so libcudart.so libcuinj64.so.9.0.176 libnppc.so libnppicom.so.9.0.176 libnppim.so libnppitc.so.9.0.176 libnvgraph.so.9.0
libOpenCL.so.1 libcudart.so.9.0 libculibos.a libnppc.so.9.0 libnppicom_static.a libnppim.so.9.0 libnppitc_static.a libnvgraph.so.9.0.176
libOpenCL.so.1.0 libcudart.so.9.0.176 libcurand.so libnppc.so.9.0.176 libnppidei.so libnppim.so.9.0.176 libnpps.so libnvgraph_static.a
libOpenCL.so.1.0.0 libcudart_static.a libcurand.so.9.0 libnppc_static.a libnppidei.so.9.0 libnppim_static.a libnpps.so.9.0 libnvrtc-builtins.so
libaccinj64.so libcufft.so libcurand.so.9.0.176 libnppial.so libnppidei.so.9.0.176 libnppist.so libnpps.so.9.0.176 libnvrtc-builtins.so.9.0
libaccinj64.so.9.0 libcufft.so.9.0 libcurand_static.a libnppial.so.9.0 libnppidei_static.a libnppist.so.9.0 libnpps_static.a libnvrtc-builtins.so.9.0.176
libaccinj64.so.9.0.176 libcufft.so.9.0.176 libcusolver.so libnppial.so.9.0.176 libnppif.so libnppist.so.9.0.176 libnvToolsExt.so libnvrtc.so
libcublas.so libcufft_static.a libcusolver.so.9.0 libnppial_static.a libnppif.so.9.0 libnppist_static.a libnvToolsExt.so.1 libnvrtc.so.9.0
libcublas.so.9.0 libcufftw.so libcusolver.so.9.0.176 libnppicc.so libnppif.so.9.0.176 libnppisu.so libnvToolsExt.so.1.0.0 libnvrtc.so.9.0.176
libcublas.so.9.0.176 libcufftw.so.9.0 libcusolver_static.a libnppicc.so.9.0 libnppif_static.a libnppisu.so.9.0 libnvblas.so stubs
libcublas.so.9.0.425 libcufftw.so.9.0.176 libcusparse.so libnppicc.so.9.0.176 libnppig.so libnppisu.so.9.0.176 libnvblas.so.9.0
libcublas_device.a libcufftw_static.a libcusparse.so.9.0 libnppicc_static.a libnppig.so.9.0 libnppisu_static.a libnvblas.so.9.0.176
libcublas_static.a libcuinj64.so libcusparse.so.9.0.176 libnppicom.so libnppig.so.9.0.176 libnppitc.so libnvblas.so.9.0.425
libcudadevrt.a libcuinj64.so.9.0 libcusparse_static.a libnppicom.so.9.0 libnppig_static.a libnppitc.so.9.0 libnvgraph.so
======================
Hello,
I'm trying to add the upstream repository for llvm to the GROMACS docker image build (https://github.com/gromacs/gromacs/blob/master/admin/containers/scripted_gmx_docker_builds.py), but I'm running into issues with hpccm adding the Ubuntu Xenial repository by default for the upstream.
Changes to the python script to fetch the upstream
diff --git a/admin/containers/scripted_gmx_docker_builds.py b/admin/containers/scripted_gmx_docker_builds.py
index 3ed8cb7020..9e149c69b6 100755
--- a/admin/containers/scripted_gmx_docker_builds.py
+++ b/admin/containers/scripted_gmx_docker_builds.py
@@ -245,7 +245,7 @@ def get_compiler(args, compiler_build_stage: hpccm.Stage = None) -> bb_base:
raise RuntimeError('No TSAN compiler build stage!')
# Build the default compiler if we don't need special support
else:
- compiler = hpccm.building_blocks.llvm(extra_repository=True, version=args.llvm)
+ compiler = hpccm.building_blocks.llvm(extra_repository=True, upstream=True, version=args.llvm)
elif args.oneapi is not None:
if compiler_build_stage is not Non
Log from building:
Step 3/80 : RUN wget -qO - https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add - && echo "deb http://apt.llvm.org/xenial/ llvm-toolchain-xenial main" >> /etc/apt/sources.list.d/hpccm.list && echo "deb-src http://apt.llvm.org/xenial/ llvm-toolchain-xenial main" >> /etc/apt/sources.list.d/hpccm.list && apt-get update -y && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends clang-12 libomp-12-dev && rm -rf /var/lib/apt/lists/*
As you can see, instead of the current Ubuntu version 20.04 (Focal), the old Xenial release is picked up, and thus fails later to install clang-12.
Can this be changed in the config to pick up a different version, or to change the default?
Cheers
Paul
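For what it's worth, building blocks appear to derive the apt repository from the distribution release they detect for the base image. A hedged sketch of a recipe-side override, assuming baseimage accepts a _distro argument (the parameter name and the 'ubuntu20' value are my assumptions, not confirmed from the docs):

```python
# Sketch only: declare the distro release of the base image so that
# repositories added by building blocks (e.g. apt.llvm.org) match it.
# The '_distro' argument and 'ubuntu20' value are assumptions.
Stage0 += baseimage(image='ubuntu:20.04', _distro='ubuntu20')
compiler = llvm(upstream=True, version='12')
Stage0 += compiler
```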
I was referred to this (seemingly quite new) project by @gppezzi, which is quite interesting to us EasyBuilders since we have recently added support in EasyBuild (http://easybuilders.github.io/easybuild/) to generate Singularity definition files, and (optionally) also call out to Singularity to build the container image as well, see http://easybuild.readthedocs.io/en/latest/Containers.html .
It seems like it would be interesting and mutually beneficial to look into integration between HPCCM & EasyBuild, and I think this can be done in two main ways:
HPCCM can be enhanced to provide an easybuild function, which can be used in recipe files to generate EasyBuild commands to install software, sort of similar to the openmpi function that is already supported
EasyBuild can be enhanced to leverage HPCCM rather than implementing its own functionality to create Docker/Singularity definition files
It's unclear to me whether HPCCM is currently ready to be used as a library rather than a command line tool, but if it's not we can probably contribute to making that work.
Maybe we should set up a conf call to discuss this further?
Not sure there is something to change on the HPCCM side, but I am having a problem building a container image based on Ubuntu because of the NVIDIA repository:
W: GPG error: https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A4B469963BF863CC
E: The repository 'https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 InRelease' is not signed.
W: GPG error: https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A4B469963BF863CC
E: The repository 'https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 InRelease' is not signed.
I want to understand the difference between "recipe files" and "module files"
I believe the differences are described here
But there isn't much detail. What is the difference between code written as a recipe and code written as a module? I can guess at a few. But, for instance, does the hpccm tool auto import hpccm before loading the recipe files?
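To illustrate the distinction as I understand it: a recipe file is evaluated by the hpccm tool, which pre-populates the namespace with the building blocks, primitives, Stage0/Stage1, and USERARG, so no imports are needed; a module (library-style) script must import hpccm itself and is responsible for setting the output format and printing the stages. A hedged sketch of the two styles:

```python
# Recipe style (run as: hpccm --recipe recipe.py --format docker).
# No imports: hpccm injects baseimage, gnu, Stage0, etc. into the namespace.
Stage0 += baseimage(image='ubuntu:20.04')
Stage0 += gnu()
```

```python
# Module/library style: explicit imports, format selection, and output.
import hpccm

hpccm.config.set_container_format('docker')
stage = hpccm.Stage()
stage += hpccm.primitives.baseimage(image='ubuntu:20.04')
stage += hpccm.building_blocks.gnu()
print(stage)
```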
Hi, I'm trying to run Relion and get this error:
ERROR: CUDA driver version is insufficient for CUDA runtime version in /var/tmp/relion/src/ml_optimiser.cpp at line 1143 (error-code 35)
in: /var/tmp/relion/src/acc/cuda/cuda_settings.h, line 67
Please help. Thanks.
ADD some.tar.gz /data
would extract the tarball to the destination, but COPY only copies the file, with no extraction.
In my experience, extracting source code to compile, or extracting binary files into the container, is common usage when building Docker images. I hope this primitive can be added to hpccm.
(This might be a docker specific problem.)
All,
I'm trying to learn how to use Docker, etc., and figured I'd lean on NVIDIA's experts in creating non-trivial build images/containers (I still am not fluent with the lingo) for things like a GCC 9/Open MPI 4.0.2 setup.
So, I created a recipe cribbed from the hpcbase-gnu-openmpi one, but using CentOS 8, to try to get close to a Dockerfile a more knowledgeable coworker created by hand. To wit:
Stage0 += comment(__doc__, reformat=False)
Stage0 += baseimage(image="centos:8")
# Python
Stage0 += python()
# GNU compilers
compiler = gnu(version='9')
#compiler = gnu(extra_repository=True, version='9')
#compiler = gnu(extra_repository=False, version='9')
Stage0 += compiler
# OpenMPI
Stage0 += openmpi(cuda=False, infiniband=False,
version='4.0.2', toolchain=compiler.toolchain)
However, when I try to build this:
$ docker build -t fortran/gcc9-openmpi402:v1.0.0 -f Dockerfile .
Sending build context to Docker daemon 6.656kB
Step 1/7 : FROM centos:8
---> 470671670cac
Step 2/7 : RUN yum install -y python2 python3 && rm -rf /var/cache/yum/*
---> Using cache
---> 02444ced632f
Step 3/7 : RUN yum install -y centos-release-scl && yum install -y devtoolset-9-gcc devtoolset-9-gcc-c++ devtoolset-9-gcc-gfortran && rm -rf /var/cache/yum/*
---> Running in ef980a85cdcd
Last metadata expiration check: 0:02:39 ago on Wed Feb 12 14:46:59 2020.
No match for argument: centos-release-scl
Error: Unable to find a match: centos-release-scl
The command '/bin/sh -c yum install -y centos-release-scl && yum install -y devtoolset-9-gcc devtoolset-9-gcc-c++ devtoolset-9-gcc-gfortran && rm -rf /var/cache/yum/*' returned a non-zero code: 1
I've tried all the variants seen above:
compiler = gnu(version='9')
compiler = gnu(extra_repository=True, version='9')
compiler = gnu(extra_repository=False, version='9')
but none of them seem to work (which seems to track with the "if you set version you get extra" note in the docs).
I suppose the main question is: what have I done wrong? I'm sure it's simple, but I'm a bit lost right now.
Currently, adding the llvm building block only adds the base compiler to the image.
It would be great if it were also possible to add the corresponding tools, to allow the generated image to be used for source code linting.
env:
sudo pip install hpccm
script:
import hpccm
# Use appropriate container base images based on the CPU architecture
arch = 'x86_64'
default_build_image = 'nvidia/cuda:10.1-devel-ubuntu18.04'
default_runtime_image = 'nvidia/cuda:10.1-base-ubuntu18.04'
########
# Build stage (Stage 0)
########
# Base image
Stage0 += baseimage(image=USERARG.get('build_image', default_build_image),
                    _arch=arch, _as='build')
Stage0 += ucx(
    enable_devel-headers=True,
    gdrcopy='/usr/local/gdrcopy',
    knem='/usr/local/knem',
    without_java=True,
    ofed=True,
    ldconfig=True,
    version='1.7.0',
)
cmd:
hpccm --recipe test.py --format singularity --singularity-version=3.2 > test.def
msg:
ERROR: keyword can't be an expression (test.py, line 17)
Other building block packages have the same issue with enable_FEATURE keywords.
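The underlying cause is that Python keyword arguments must be valid identifiers, so enable_devel-headers=True parses as the expression enable_devel - headers and is rejected by the parser. A small self-contained demonstration (configure here is a stand-in, not an hpccm function); the underscore spelling is, if I understand the building block toggles correctly, what hpccm expects and translates into the hyphenated configure flag:

```python
def configure(**kwargs):
    """Stand-in for a building block that accepts arbitrary keywords."""
    return kwargs

# A hyphen is not legal in a keyword name:
#   configure(enable_devel-headers=True)   # SyntaxError
# ...although ** unpacking accepts arbitrary string keys:
opts = configure(**{'enable_devel-headers': True})
# The underscore spelling is the usual convention:
opts.update(configure(enable_devel_headers=True))
```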
I tried to build Boost with the building block but I get error code 8.
I found out that wget fails:
wget -nc --no-check-certificate -P /tmp https://dl.bintray.com/boostorg/release/1.74.0/source/boost_1_74_0.tar.bz2
--2021-06-14 11:43:32-- https://dl.bintray.com/boostorg/release/1.74.0/source/boost_1_74_0.tar.bz2
Resolving dl.bintray.com (dl.bintray.com)... 3.127.93.119, 18.196.33.98
Connecting to dl.bintray.com (dl.bintray.com)|3.127.93.119|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2021-06-14 11:43:32 ERROR 403: Forbidden.
If I go to the boost site and copy the download link I get another one, which is working:
wget https://boostorg.jfrog.io/artifactory/main/release/1.74.0/source/boost_1_74_0.tar.bz2
--2021-06-14 11:47:22-- https://boostorg.jfrog.io/artifactory/main/release/1.74.0/source/boost_1_74_0.tar.bz2
Resolving boostorg.jfrog.io (boostorg.jfrog.io)... 35.80.249.196, 54.148.141.177, 52.43.90.32, ...
Connecting to boostorg.jfrog.io (boostorg.jfrog.io)|35.80.249.196|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://jfrog-prod-usw2-shared-oregon-main.s3.amazonaws.com/aol-boostorg/filestore/f8/f82c0d8685b4d0e3971e8e2a8f9ef1551412c125?x-jf-traceId=49786281f31d0f64&response-content-disposition=attachment%3Bfilename%3D%22boost_1_74_0.tar.bz2%22&response-content-type=application%2Fx-bzip2&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20210614T094723Z&X-Amz-SignedHeaders=host&X-Amz-Expires=60&X-Amz-Credential=AKIASG3IHPL63WBBRCUD%2F20210614%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Signature=226ca0520d266f89c911b5be749c29fac47a8106a33ee4efc0e4b8b00517a364 [following]
--2021-06-14 11:47:23-- https://jfrog-prod-usw2-shared-oregon-main.s3.amazonaws.com/aol-boostorg/filestore/f8/f82c0d8685b4d0e3971e8e2a8f9ef1551412c125?x-jf-traceId=49786281f31d0f64&response-content-disposition=attachment%3Bfilename%3D%22boost_1_74_0.tar.bz2%22&response-content-type=application%2Fx-bzip2&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20210614T094723Z&X-Amz-SignedHeaders=host&X-Amz-Expires=60&X-Amz-Credential=AKIASG3IHPL63WBBRCUD%2F20210614%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Signature=226ca0520d266f89c911b5be749c29fac47a8106a33ee4efc0e4b8b00517a364
Resolving jfrog-prod-usw2-shared-oregon-main.s3.amazonaws.com (jfrog-prod-usw2-shared-oregon-main.s3.amazonaws.com)... 52.218.169.83
Connecting to jfrog-prod-usw2-shared-oregon-main.s3.amazonaws.com (jfrog-prod-usw2-shared-oregon-main.s3.amazonaws.com)|52.218.169.83|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 109600630 (105M) [application/x-bzip2]
Saving to: ‘boost_1_74_0.tar.bz2’
boost_1_74_0.tar.bz2 100%[============================================================================================================================================================================>] 104,52M 9,49MB/s in 13s
2021-06-14 11:47:36 (8,09 MB/s) - ‘boost_1_74_0.tar.bz2’ saved [109600630/109600630]
Hi guys, I'm having trouble installing NetCDF with HPCCM. The issue is that the default version of the netcdf-c library is now 4.7.3, and the later releases are not available via ftp anymore, see ftp://ftp.unidata.ucar.edu/pub/netcdf/
Specifying explicit version='4.7.3' works fine.
A different solution would also be to change the FTP web-address to the one directly from github releases (there the version archive is still maintained): https://github.com/Unidata/netcdf-c/releases
Exactly the same versioning issue applies to netcdf-fortran and netcdf-cxx.
Cheers!
Hello,
I am trying to test the multistage capability of an hpccm recipe made for Singularity. I started with a simple example made for Singularity 3.4. I want to build an image with the PGI compiler starting from a Docker devel image in Stage0, make Stage1, and move files from Stage0 to Stage1 according to the default recipe
hpcbase-pgi-1910-openmpi.py.txt
recipe_hpcbase-pgi-1910-openmpi_runtime.ref.txt
Using the following commands (in a server with Centos 7)
hpccm --recipe hpcbase-pgi-1910-openmpi.py --format singularity --singularity-version=3.4 > recipe_hpcbase-pgi-1910-openmpi_runtime.ref
sudo -E SINGULARITY_TMPDIR=/home/simone/Documents/sing_tmpdir SINGULARITY_CACHEDIR=/home/simone/Documents/sing_cachedir singularity -d img_hpcbase-pgi-1910-openmpi_runtime.sif recipe_hpcbase-pgi-1910-openmpi_runtime.ref
I got this error when passing from Stage0 to Stage1 (I used the -d option when running the command):
: [/home/simone/Documents/sing_tmpdir/sbuild-855637488 /home/simone/Documents/sing_tmpdir/sbuild-042428559]
FATAL [U=0,P=13655] run() While performing build: unable to copy files a stage to container fs: stage 0 was not found
I put my hpccm recipe and the corresponding Singularity recipe in the attachment.
Could you please help me understand what is going wrong?
Thank you
The intel_psxe building block installs Intel Parallel Studio XE and requires a license file and a tarball:
Stage0 += baseimage(image='ubuntu:{}'.format(18.04), _as='build')
#Stage0 += intel_psxe(eula=True, license='license.lic', tarball='parallel_studio_xe_2020_update4_cluster_edition.tgz')
When building a singularity container these files are apparently copied into a temporary folder (whose name changes every time):
INFO: Copying parallel_studio_xe_2020_update4_cluster_edition.tgz to /tmp/build-temp-795513904/rootfs/var/tmp/parallel_studio_xe_2020_update4_cluster_edition.tgz
INFO: Copying license.lic to /tmp/build-temp-795513904/rootfs/var/tmp/license.lic
which means that the build fails when they are not found:
I tried to copy the files in other locations but they always end up in temporary folders and are never found during the build
Can you help me sort this out?
Hello,
When trying to build a Singularity container with nvshmem v2.2.1 (using the nvshmem building block along with other dependency building blocks), it seems that nvshmem does not actually build when the hydra=True parameter is set: the nvshmem include and lib directories are not present in /usr/local/nvshmem/, and attempts to compile nvshmem programs with nvcc fail because of this. However, the hydra launcher installs successfully and is seemingly able to launch pre-compiled nvshmem programs.
After removing the flag, nvshmem seems to build normally within the container.
I am not sure if I am just doing something wrong in my recipe or if this is the intended behavior, but I thought I would go ahead and post just in case!
Hello,
for my projects I need some open-source libraries, which I build from scratch. So I need to run make install for these libraries. Unfortunately, I can't find a Python function that generates the make install command. Did I miss something? If not, can we add an install_step() function to CMakeBuild? It would improve the readability and maintainability of my recipes. At the moment I'm using shell() functions.
Thank you for your help,
Simeon
The gfortran runtime library for Ubuntu based images is set to libgfortran3:
hpc-container-maker/hpccm/building_blocks/gnu.py
Lines 409 to 411 in adbe43f
This is correct on ubuntu:16.04. However, on ubuntu:18.04 this should be libgfortran4, and libgfortran5 on ubuntu:20.04.
Would you be interested in a PR?
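The selection described above could be sketched as a simple mapping on the Ubuntu release (illustrative only, not the actual gnu.py code):

```python
# Sketch of the version selection described above, keyed on the
# Ubuntu release of the base image (as a (major, minor) tuple).
def gfortran_runtime(ubuntu_version):
    """Return the libgfortran runtime package for an Ubuntu release."""
    if ubuntu_version >= (20, 4):
        return 'libgfortran5'
    elif ubuntu_version >= (18, 4):
        return 'libgfortran4'
    return 'libgfortran3'
```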
For releases can you please create and upload a universal wheel in addition to the source distribution? You can create them with:
python setup.py bdist_wheel --universal
I am trying to run hpccm in JupyterLite and installation in this environment requires a wheel:
import micropip
await micropip.install("hpccm")
...
ValueError: Couldn't find a pure Python 3 wheel for 'hpccm'
I want to use gromacs.py to generate a GROMACS container, but at step 2/23 I have met the following problem:
Step 2/23 : RUN apt-get update -y && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends python && rm -rf /var/lib/apt/lists/*
---> Running in 87c834e52e38
Get:1 http://archive.ubuntu.com/ubuntu xenial InRelease [247 kB]
Get:2 http://security.ubuntu.com/ubuntu xenial-security InRelease [109 kB]
Get:3 http://security.ubuntu.com/ubuntu xenial-security/main amd64 Packages [1019 kB]
Ign:4 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64 InRelease
Get:5 http://archive.ubuntu.com/ubuntu xenial-updates InRelease [109 kB]
Ign:6 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64 InRelease
Get:7 http://archive.ubuntu.com/ubuntu xenial-backports InRelease [107 kB]
Get:8 http://security.ubuntu.com/ubuntu xenial-security/restricted amd64 Packages [12.7 kB]
Get:9 http://archive.ubuntu.com/ubuntu xenial/main amd64 Packages [1558 kB]
Get:10 http://security.ubuntu.com/ubuntu xenial-security/universe amd64 Packages [592 kB]
Get:11 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64 Release [169 B]
Get:12 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64 Release [169 B]
Get:13 http://security.ubuntu.com/ubuntu xenial-security/multiverse amd64 Packages [6280 B]
Get:14 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64 Release.gpg [169 B]
Get:15 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64 Release.gpg [169 B]
Get:16 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64 Packages [252 kB]
Get:17 http://archive.ubuntu.com/ubuntu xenial/restricted amd64 Packages [14.1 kB]
Get:18 http://archive.ubuntu.com/ubuntu xenial/universe amd64 Packages [9827 kB]
Get:19 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64 Packages [73.4 kB]
Err:19 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64 Packages
Hash Sum mismatch
Get:20 http://archive.ubuntu.com/ubuntu xenial/multiverse amd64 Packages [176 kB]
Get:21 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages [1396 kB]
Get:22 http://archive.ubuntu.com/ubuntu xenial-updates/restricted amd64 Packages [13.1 kB]
Get:23 http://archive.ubuntu.com/ubuntu xenial-updates/universe amd64 Packages [996 kB]
Get:24 http://archive.ubuntu.com/ubuntu xenial-updates/multiverse amd64 Packages [19.3 kB]
Get:25 http://archive.ubuntu.com/ubuntu xenial-backports/main amd64 Packages [7942 B]
Get:26 http://archive.ubuntu.com/ubuntu xenial-backports/universe amd64 Packages [8807 B]
Fetched 16.5 MB in 1min 13s (225 kB/s)
Reading package lists...
E: Failed to fetch https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64/Packages.gz Hash Sum mismatch
E: Some index files failed to download. They have been ignored, or old ones used instead.
The command '/bin/sh -c apt-get update -y && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends python && rm -rf /var/lib/apt/lists/*' returned a non-zero code: 100
It would be nice to have a way to install Nsight Compute only, without the rest of the HPC SDK.
I want to specify the components of Boost to be built:
boost(
    prefix='/opt/boost/1.73.0',
    version="1.73.0",
    bootstrap_opts=["--with-libraries=atomic,chrono"]
)
The output is
./bootstrap.sh --prefix=/opt/boost1.73.0 --with-libraries=atomic,chrono --without-libraries=python
which is not allowed by Boost because you cannot use --with-libraries and --without-libraries together. The workaround is to set python=True in the hpccm code, which is semantically incorrect because it means to build Python bindings, which it does not.
For our CI we install different versions of Clang/LLVM in parallel in a container. Unfortunately there is a conflict between libomp5-7 and libomp5-8:
Step 20/28 : RUN apt-get update -y && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends clang-8 libomp-8-dev && rm -rf /var/lib/apt/lists/*
---> Running in c82cf74f02b3
Get:2 http://archive.ubuntu.com/ubuntu bionic InRelease [242 kB]
Get:1 https://apt.llvm.org/bionic llvm-toolchain-bionic InRelease [4232 B]
Get:3 https://apt.llvm.org/bionic llvm-toolchain-bionic-10 InRelease [4232 B]
Get:4 https://apt.llvm.org/bionic llvm-toolchain-bionic-11 InRelease [4232 B]
Get:5 http://archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]
Get:6 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]
Get:7 http://archive.ubuntu.com/ubuntu bionic-backports InRelease [74.6 kB]
Get:8 https://apt.llvm.org/bionic llvm-toolchain-bionic/main Sources [2193 B]
Get:9 https://apt.llvm.org/bionic llvm-toolchain-bionic/main amd64 Packages [10.6 kB]
Get:10 http://archive.ubuntu.com/ubuntu bionic/restricted amd64 Packages [13.5 kB]
Get:11 http://archive.ubuntu.com/ubuntu bionic/main amd64 Packages [1344 kB]
Get:12 http://archive.ubuntu.com/ubuntu bionic/universe amd64 Packages [11.3 MB]
Get:13 https://apt.llvm.org/bionic llvm-toolchain-bionic-10/main Sources [1665 B]
Get:14 https://apt.llvm.org/bionic llvm-toolchain-bionic-10/main amd64 Packages [8762 B]
Get:15 http://archive.ubuntu.com/ubuntu bionic/multiverse amd64 Packages [186 kB]
Get:16 https://apt.llvm.org/bionic llvm-toolchain-bionic-11/main Sources [1666 B]
Get:17 https://apt.llvm.org/bionic llvm-toolchain-bionic-11/main amd64 Packages [8738 B]
Get:18 http://archive.ubuntu.com/ubuntu bionic-updates/multiverse amd64 Packages [44.6 kB]
Get:19 http://archive.ubuntu.com/ubuntu bionic-updates/restricted amd64 Packages [220 kB]
Get:20 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages [2110 kB]
Get:21 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 Packages [2095 kB]
Get:22 http://archive.ubuntu.com/ubuntu bionic-backports/main amd64 Packages [11.3 kB]
Get:23 http://archive.ubuntu.com/ubuntu bionic-backports/universe amd64 Packages [11.4 kB]
Get:24 http://security.ubuntu.com/ubuntu bionic-security/universe amd64 Packages [1332 kB]
Get:25 http://security.ubuntu.com/ubuntu bionic-security/restricted amd64 Packages [193 kB]
Get:26 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages [1693 kB]
Get:27 http://security.ubuntu.com/ubuntu bionic-security/multiverse amd64 Packages [14.6 kB]
Fetched 21.1 MB in 2s (8756 kB/s)
Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
The following additional packages will be installed:
libclang-common-8-dev libclang1-8 libllvm8 libomp5-8
Suggested packages:
clang-8-doc libomp-8-doc
Recommended packages:
llvm-8-dev python
The following NEW packages will be installed:
clang-8 libclang-common-8-dev libclang1-8 libllvm8 libomp-8-dev libomp5-8
0 upgraded, 6 newly installed, 0 to remove and 18 not upgraded.
Need to get 31.9 MB of archives.
After this operation, 173 MB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 libllvm8 amd64 1:8-3~ubuntu18.04.2 [13.6 MB]
Get:2 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 libclang-common-8-dev amd64 1:8-3~ubuntu18.04.2 [3802 kB]
Get:3 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 libclang1-8 amd64 1:8-3~ubuntu18.04.2 [6225 kB]
Get:4 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 clang-8 amd64 1:8-3~ubuntu18.04.2 [7940 kB]
Get:5 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 libomp5-8 amd64 1:8-3~ubuntu18.04.2 [299 kB]
Get:6 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 libomp-8-dev amd64 1:8-3~ubuntu18.04.2 [56.2 kB]
debconf: delaying package configuration, since apt-utils is not installed
Fetched 31.9 MB in 1s (35.2 MB/s)
Selecting previously unselected package libllvm8:amd64.
(Reading database ... 14420 files and directories currently installed.)
Preparing to unpack .../0-libllvm8_1%3a8-3~ubuntu18.04.2_amd64.deb ...
Unpacking libllvm8:amd64 (1:8-3~ubuntu18.04.2) ...
Selecting previously unselected package libclang-common-8-dev.
Preparing to unpack .../1-libclang-common-8-dev_1%3a8-3~ubuntu18.04.2_amd64.deb ...
Unpacking libclang-common-8-dev (1:8-3~ubuntu18.04.2) ...
Selecting previously unselected package libclang1-8.
Preparing to unpack .../2-libclang1-8_1%3a8-3~ubuntu18.04.2_amd64.deb ...
Unpacking libclang1-8 (1:8-3~ubuntu18.04.2) ...
Selecting previously unselected package clang-8.
Preparing to unpack .../3-clang-8_1%3a8-3~ubuntu18.04.2_amd64.deb ...
Unpacking clang-8 (1:8-3~ubuntu18.04.2) ...
Selecting previously unselected package libomp5-8:amd64.
Preparing to unpack .../4-libomp5-8_1%3a8-3~ubuntu18.04.2_amd64.deb ...
Unpacking libomp5-8:amd64 (1:8-3~ubuntu18.04.2) ...
dpkg: error processing archive /tmp/apt-dpkg-install-04VwC3/4-libomp5-8_1%3a8-3~ubuntu18.04.2_amd64.deb (--unpack):
trying to overwrite '/usr/lib/x86_64-linux-gnu/libomp.so.5', which is also in package libomp5-7:amd64 1:7-3~ubuntu0.18.04.1
Selecting previously unselected package libomp-8-dev.
Preparing to unpack .../5-libomp-8-dev_1%3a8-3~ubuntu18.04.2_amd64.deb ...
Unpacking libomp-8-dev (1:8-3~ubuntu18.04.2) ...
Errors were encountered while processing:
/tmp/apt-dpkg-install-04VwC3/4-libomp5-8_1%3a8-3~ubuntu18.04.2_amd64.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)
The command '/bin/sh -c apt-get update -y && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends clang-8 libomp-8-dev && rm -rf /var/lib/apt/lists/*' returned a non-zero code: 100
For us it is not necessary to install different OpenMP versions in parallel. Could you please make the OpenMP installation optional?
Cheers,
Simeon
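A minimal sketch of what an optional OpenMP install could look like: toggle the runtime packages with a hypothetical openmp keyword (the parameter name and package list are illustrative, not the actual hpccm API):

```python
# Hypothetical sketch: make the OpenMP runtime optional via an 'openmp'
# keyword instead of always installing libomp-N-dev.
def clang_ospackages(version='8', openmp=True):
    """Return the apt package list for a clang install."""
    packages = ['clang-{}'.format(version)]
    if openmp:
        # libomp-8-dev is what collides with an already-present libomp5-7
        packages.append('libomp-{}-dev'.format(version))
    return packages
```

With openmp=False the building block would install only the compiler and sidestep the dpkg file conflict above.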
After updating from 20.8.0 to 20.9.0, my scripts fail to build NetCDF 4.7.3 with the following error:
configure: error: netcdf-c version 4.7.4 or greater is required
Before the update, NetCDF 4.7.3 meant netcdf-c/4.7.3 with netcdf-cxx4/4.3.1 and netcdf-fortran/4.5.2;
after the update it tries to compile with netcdf-fortran/4.5.3, which seems to be the cause of the error.
The following recipe.py builds without issues on 20.8.0 and triggers the error on 20.9.0:
Stage0 += baseimage(image="ubuntu:18.04", _as='build')
# gCC & gFortran
compiler = gnu()
Stage0 += compiler
# OpenMPI
Stage0 += openmpi(version="4.0.2", infiniband=False, cuda=False, toolchain=compiler.toolchain)
# HDF5
Stage0 += hdf5(version="1.10.5", toolchain=compiler.toolchain)
# NetCDF
Stage0 += netcdf(version="4.7.3", toolchain=compiler.toolchain)
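One way to make the companion library versions explicit, rather than implicitly tracking the latest release, is to pin them per netcdf-c version. The compatibility table below is an illustrative sketch built only from the two combinations mentioned in this report, not hpccm's actual logic:

```python
# Illustrative pin table: netcdf-c version -> compatible companion versions.
NETCDF_COMPANIONS = {
    '4.7.3': {'cxx4': '4.3.1', 'fortran': '4.5.2'},
    '4.7.4': {'cxx4': '4.3.1', 'fortran': '4.5.3'},
}

def companions(netcdf_c_version):
    """Return the pinned netcdf-cxx4/netcdf-fortran versions, or None."""
    return NETCDF_COMPANIONS.get(netcdf_c_version)
```

A building block consulting such a table would keep netcdf-fortran/4.5.2 paired with netcdf-c/4.7.3 instead of silently moving to 4.5.3.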
When installing both nsys and nsight-compute in a CentOS container, it issues the following commands:
RUN rpm --import https://developer.download.nvidia.com/devtools/repos/rhel7/x86_64/nvidia.pub && \
yum install -y yum-utils && \
yum-config-manager --add-repo https://developer.download.nvidia.com/devtools/repos/rhel7/x86_64 && \
yum install -y \
nsight-systems-cli-2021.1.1 && \
rm -rf /var/cache/yum/*
# NVIDIA Nsight Compute 2020.2.1
RUN rpm --import https://developer.download.nvidia.com/devtools/repos/rhel7/x86_64/nvidia.pub && \
yum install -y yum-utils && \
yum-config-manager --add-repo https://developer.download.nvidia.com/devtools/repos/rhel7/x86_64 && \
yum install -y \
nsight-compute-2020.2.1 && \
rm -rf /var/cache/yum/*
This errors out because yum-config-manager does not allow the same repo to be added twice:
adding repo from: https://developer.download.nvidia.com/devtools/repos/rhel7/x86_64
Cannot add repo from https://developer.download.nvidia.com/devtools/repos/rhel7/x86_64 as is a duplicate of an existing repo
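A hedged workaround sketch: guard the yum-config-manager call so adding the same repo a second time becomes a no-op. The grep-based check is an assumption about how the generated .repo file references the URL, not hpccm's actual fix:

```python
def add_repo_once(url):
    """Build a shell command that only adds the repo if no file under
    /etc/yum.repos.d/ already references it."""
    return ("grep -qr '{0}' /etc/yum.repos.d/ || "
            "yum-config-manager --add-repo {0}".format(url))
```

Emitting this guarded form in both RUN blocks would let the second install reuse the repo added by the first.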
When generating Singularity definition files, HPCCM makes use of multiple %post, %environment, etc. sections. There is a bug in Singularity 3.x where only one section is recognized (see sylabs/singularity#2349).
The workarounds are:
Many thanks for developing this amazing tool.
I just want to use the hcoll library inside HPC-X to compile with my OpenMPI.
I tried to use the hpcx building block, but it outputs the entire HPC-X environment (including the OpenMPI bundled with HPC-X).
Is it possible to disable the output of the hpcx environment?
Thank you!
If I try to use the older PGI compiler version 17.10, it still copies the latest PGI community edition compiler.
The following comment in the code makes me think it is not supposed to work with anything but the latest version:
# The version is fragile since the latest version is
# automatically downloaded, which may not match this default.
self.__version = kwargs.get('version', '18.4')
self.__wd = '/var/tmp' # working directory
This is a feature request.
Singularity stores its definition file in the resulting image, where it can be extracted using "singularity inspect -d". This lets you see how the image was created, edit the definition, and create a new image.
I don't like editing Singularity definition files or Dockerfiles directly (they can be long and wildly complicated). I prefer editing HPCCM recipe files. However, they are not stored with the images.
I would like to see a way to store HPCCM recipes in Singularity or Docker images (similar to how Singularity stores its definition file).
For example, Singularity and Docker both have the capability of storing metadata in a section like "%label". It would be useful to store the HPCCM recipe in the %label section of the images.
Here is a simple recipe:
Stage0 += baseimage(image='nvidia/cuda:10.1-base-ubuntu18.04')
Stage0 += shell(commands=['apt-get update'])
Stage0 += shell(commands=['apt-get install -y octave'])
Stage0 += pgi(eula=True, mpi=True)
Using HPCCM to "process" this definition file,
$ hpccm --recipe test2.py --format singularity > test2.def
Produces a definition file that looks like,
BootStrap: docker
From: nvidia/cuda:10.1-base-ubuntu18.04
%post
. /.singularity.d/env/10-docker*.sh
%post
cd /
apt-get update
%post
cd /
apt-get install -y octave
%post
apt-get update -y
DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends
g++
gcc
libnuma1
openssh-client
perl
wget
rm -rf /var/lib/apt/lists/*
%post
cd /
mkdir -p /var/tmp && wget -q -nc --no-check-certificate -O /var/tmp/pgi-community-linux-x64-latest.tar.gz --referer https://www.pgroup.com/products/community.htm?utm_source=hpccm&utm_medium=wgt&utm_campaign=CE&nvid=nv-int-14-39155 -P /var/tmp https://www.pgroup.com/support/downloader.php?file=pgi-community-linux-x64
mkdir -p /var/tmp/pgi && tar -x -f /var/tmp/pgi-community-linux-x64-latest.tar.gz -C /var/tmp/pgi -z
cd /var/tmp/pgi && PGI_ACCEPT_EULA=accept PGI_INSTALL_DIR=/opt/pgi PGI_INSTALL_MPI=true PGI_INSTALL_NVIDIA=true PGI_MPI_GPU_SUPPORT=true PGI_SILENT=true ./install
echo "variable LIBRARY_PATH is environment(LIBRARY_PATH);" >> /opt/pgi/linux86-64/19.10/bin/siterc
echo "variable library_path is default($if($LIBRARY_PATH,$foreach(ll,$replace($LIBRARY_PATH,":",), -L$ll)));" >> /opt/pgi/linux86-64/19.10/bin/siterc
echo "append LDLIBARGS=$library_path;" >> /opt/pgi/linux86-64/19.10/bin/siterc
ln -sf /usr/lib/x86_64-linux-gnu/libnuma.so.1 /opt/pgi/linux86-64/19.10/lib/libnuma.so
ln -sf /usr/lib/x86_64-linux-gnu/libnuma.so.1 /opt/pgi/linux86-64/19.10/lib/libnuma.so.1
rm -rf /var/tmp/pgi-community-linux-x64-latest.tar.gz /var/tmp/pgi
%environment
export LD_LIBRARY_PATH=/opt/pgi/linux86-64/19.10/mpi/openmpi-3.1.3/lib:/opt/pgi/linux86-64/19.10/lib:$LD_LIBRARY_PATH
export PATH=/opt/pgi/linux86-64/19.10/mpi/openmpi-3.1.3/bin:/opt/pgi/linux86-64/19.10/bin:$PATH
%post
export LD_LIBRARY_PATH=/opt/pgi/linux86-64/19.10/mpi/openmpi-3.1.3/lib:/opt/pgi/linux86-64/19.10/lib:$LD_LIBRARY_PATH
export PATH=/opt/pgi/linux86-64/19.10/mpi/openmpi-3.1.3/bin:/opt/pgi/linux86-64/19.10/bin:$PATH
I would like that definition file to look like the following.
BootStrap: docker
From: nvidia/cuda:10.1-base-ubuntu18.04
%post
. /.singularity.d/env/10-docker*.sh
%post
cd /
apt-get update
%post
cd /
apt-get install -y octave
%post
apt-get update -y
DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends
g++
gcc
libnuma1
openssh-client
perl
wget
rm -rf /var/lib/apt/lists/*
%post
cd /
mkdir -p /var/tmp && wget -q -nc --no-check-certificate -O /var/tmp/pgi-community-linux-x64-latest.tar.gz --referer https://www.pgroup.com/products/community.htm?utm_source=hpccm&utm_medium=wgt&utm_campaign=CE&nvid=nv-int-14-39155 -P /var/tmp https://www.pgroup.com/support/downloader.php?file=pgi-community-linux-x64
mkdir -p /var/tmp/pgi && tar -x -f /var/tmp/pgi-community-linux-x64-latest.tar.gz -C /var/tmp/pgi -z
cd /var/tmp/pgi && PGI_ACCEPT_EULA=accept PGI_INSTALL_DIR=/opt/pgi PGI_INSTALL_MPI=true PGI_INSTALL_NVIDIA=true PGI_MPI_GPU_SUPPORT=true PGI_SILENT=true ./install
echo "variable LIBRARY_PATH is environment(LIBRARY_PATH);" >> /opt/pgi/linux86-64/19.10/bin/siterc
echo "variable library_path is default($if($LIBRARY_PATH,$foreach(ll,$replace($LIBRARY_PATH,":",), -L$ll)));" >> /opt/pgi/linux86-64/19.10/bin/siterc
echo "append LDLIBARGS=$library_path;" >> /opt/pgi/linux86-64/19.10/bin/siterc
ln -sf /usr/lib/x86_64-linux-gnu/libnuma.so.1 /opt/pgi/linux86-64/19.10/lib/libnuma.so
ln -sf /usr/lib/x86_64-linux-gnu/libnuma.so.1 /opt/pgi/linux86-64/19.10/lib/libnuma.so.1
rm -rf /var/tmp/pgi-community-linux-x64-latest.tar.gz /var/tmp/pgi
%environment
export LD_LIBRARY_PATH=/opt/pgi/linux86-64/19.10/mpi/openmpi-3.1.3/lib:/opt/pgi/linux86-64/19.10/lib:$LD_LIBRARY_PATH
export PATH=/opt/pgi/linux86-64/19.10/mpi/openmpi-3.1.3/bin:/opt/pgi/linux86-64/19.10/bin:$PATH
%post
export LD_LIBRARY_PATH=/opt/pgi/linux86-64/19.10/mpi/openmpi-3.1.3/lib:/opt/pgi/linux86-64/19.10/lib:$LD_LIBRARY_PATH
export PATH=/opt/pgi/linux86-64/19.10/mpi/openmpi-3.1.3/bin:/opt/pgi/linux86-64/19.10/bin:$PATH
%label
Stage0 += baseimage(image='nvidia/cuda:10.1-base-ubuntu18.04')
Stage0 += shell(commands=['apt-get update'])
Stage0 += shell(commands=['apt-get install -y octave'])
Stage0 += pgi(eula=True, mpi=True)
Notice how the HPCCM recipe is stored in the %label section. Now I can extract it from an image created with the definition file and edit that recipe.
Thanks!
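A sketch of how a recipe could be packed into image metadata: read the recipe source and emit it as a single label value, escaping newlines so it survives the one-line label format. The label key hpccm.recipe is hypothetical:

```python
import json

def recipe_as_label(recipe_text, key='hpccm.recipe'):
    """Pack recipe source into one metadata entry; json.dumps escapes
    newlines and quotes so the value stays on a single line."""
    return {key: json.dumps(recipe_text)}

def label_to_recipe(labels, key='hpccm.recipe'):
    """Recover the original recipe text from the label value."""
    return json.loads(labels[key])
```

After extracting the labels with singularity inspect or docker inspect, label_to_recipe would return the editable recipe.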
Currently you are forced to compile CMake from source for any architecture other than x86_64. I'm having some issues getting CMake to compile for one of my recipes, which I could avoid by using the pre-compiled binaries. As far as I can tell, aarch64 binaries have been available since the 3.20.0 release, and in my experience the installation procedure is the same as for x86_64.
The --recipe argument should just be a required argument with no default.
The current behavior is confusing for new users; running hpccm with no arguments should show the help instead:
/ # hpccm
ERROR: [Errno 2] No such file or directory: 'recipes/hpcbase-gnu-openmpi.py'
/ # find / -name hpcbase-gnu-openmpi.py
/ #
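A minimal sketch of the suggested behavior using argparse: with --recipe required, running with no arguments exits with a usage message instead of failing on a missing default recipe path (flags beyond --recipe and --format are omitted):

```python
import argparse

def build_parser():
    # Making --recipe required means `hpccm` with no arguments prints
    # usage and exits rather than trying to open a default recipe file.
    parser = argparse.ArgumentParser(prog='hpccm')
    parser.add_argument('--recipe', required=True,
                        help='path to the recipe file')
    parser.add_argument('--format', choices=['docker', 'singularity'],
                        default='docker')
    return parser
```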
The behavior of the Singularity %files directive changed beginning in version 3.6. See apptainer/singularity#5514.
For this Singularity definition file:
Bootstrap: docker
From: ubuntu:18.04
%files
foo /var/tmp
%post
ls -l /var/tmp/foo
Singularity 3.5.3:
# singularity --version
singularity version 3.5.3
# singularity build foo.sif Singularity.def
INFO: Starting build...
INFO: Copying foo to /tmp/rootfs-f4ee8c55-7790-11eb-b63f-0242ac110002/var/tmp
INFO: Running post scriptlet
+ ls -l /var/tmp/foo
-rw-r--r--. 1 root root 0 Feb 25 17:43 /var/tmp/foo
INFO: Creating SIF file...
INFO: Build complete: foo.sif
Singularity 3.6.4:
# singularity --version
singularity version 3.6.4
# singularity build foo.sif Singularity.def
INFO: Starting build...
INFO: Copying foo to /tmp/build-temp-392680558/rootfs/var/tmp
INFO: Running post scriptlet
+ ls -l /var/tmp/foo
ls: cannot access '/var/tmp/foo': No such file or directory
Changing the path from /var/tmp
to /opt
works with either version:
INFO: Copying foo to /tmp/build-temp-697021000/rootfs/opt
INFO: Running post scriptlet
+ ls -l /opt/foo
-rw-r--r--. 1 root root 0 Feb 25 17:41 /opt/foo
INFO: Creating SIF file...
INFO: Build complete: foo.sif
The --working-directory=/other/path option can be used to generate a Singularity definition file that works around this behavior change.
A better solution might be to use %setup to stage files into the container build environment. Alternatively, use a default location other than /var/tmp or /tmp when copying files from the host (such as the package option in many building blocks).
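The version-dependent behavior above can be sketched as a helper that picks a %files destination that still exists in %post. The /opt fallback mirrors the workaround shown; the version parsing is a simplification:

```python
def files_staging_dir(singularity_version, preferred='/var/tmp'):
    """Return a destination for %files that survives into %post.
    Singularity >= 3.6 scrubs /tmp and /var/tmp between the two steps."""
    major, minor = (int(p) for p in singularity_version.split('.')[:2])
    if (major, minor) >= (3, 6) and preferred in ('/tmp', '/var/tmp'):
        return '/opt'
    return preferred
```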
According to the nvhpc docs, gcc/glibc >= 10.1 is needed for C++20 library support in the new nvhpc 22.3.
Neither
Stage0 += baseimage(image='nvcr.io/nvidia/nvhpc:22.3-devel-cuda11.6-ubuntu20.04')
Stage0 += gnu(version='10')
nor
Stage0 += baseimage(image='ubuntu:20.04')
Stage0 += gnu(version='10')
Stage0 += nvhpc(eula=True, cuda_multi=False, version='22.3')
seem to do the trick. The second one even actively installs gcc 9 again instead of using the provided gcc 10 toolchain.
I tested the support by trying to compile a source file with #include <ranges> via nvc++ -std=c++20 inside the container (which works when using the installed g++ 10.3.0).
Maybe the nvhpc building block needs a toolchain argument?
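Outside of hpccm, NVHPC's own mechanism for pointing nvc++ at a different GCC is makelocalrc. A hedged sketch of generating that command; the install path and flag spelling are assumptions about a default nvhpc 22.3 layout, so check makelocalrc's help on your install:

```python
def makelocalrc_cmd(gcc_major=10,
                    bindir='/opt/nvidia/hpc_sdk/Linux_x86_64/22.3/compilers/bin'):
    # Rebuild nvhpc's localrc so nvc++ picks up the gcc-10 headers and
    # libstdc++, which the C++20 library support requires.
    return ('{0}/makelocalrc {0} -x -gcc gcc-{1} -gpp g++-{1} '
            '-g77 gfortran-{1}'.format(bindir, gcc_major))
```

Run inside the container (e.g. via a shell primitive) after both gnu(version='10') and nvhpc are installed.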
Hi all!
The nv_hpc_sdk building block does not seem to support downloading the package, and I am having trouble creating a recipe that does it automatically for my CI workflow.
I tried to implement it myself this way:
d = downloader(url='https://xxxxxxxxxxxxxxxx/nvhpc_2020_207_Linux_x86_64_cuda_multi.tar.gz')
Stage0 += d.download_step()
Stage0 += nv_hpc_sdk(eula=True, mpi=False, system_cuda=True,
tarball='nvhpc_2020_207_Linux_x86_64_cuda_multi.tar.gz')
Unfortunately I get a Docker error when building:
error building image: parsing dockerfile: Dockerfile parse error line 57: unknown instruction: MKDIR
Does anyone have an idea for me?
Thanks in advance!
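The MKDIR error suggests the download command string is being emitted as a bare Dockerfile line, so Docker reads its first word as an instruction. A sketch of the distinction (illustrative, not hpccm internals): shell commands have to be folded under a single RUN instruction:

```python
def as_run_instruction(commands):
    """Join shell commands into one Dockerfile RUN instruction.
    Emitting them bare makes Docker parse 'mkdir' as an instruction."""
    return 'RUN ' + ' && \\\n    '.join(commands)
```

In recipe terms, wrapping the commands in the shell primitive (e.g. Stage0 += shell(commands=[...]) instead of adding the raw string) may avoid the parse error, assuming download_step() returns a shell command string.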
I adapted the gromacs.py recipe into a recipe for Singularity. My intention is to try hpccm and also create a Docker image for testing Singularity.
I am having two problems:
build_cmds = ['export VERSION=1.11.4 OS=linux ARCH=amd64 && \
wget -O /tmp/go${VERSION}.${OS}-${ARCH}.tar.gz https://dl.google.com/go/go${VERSION}.${OS}-${ARCH}.tar.gz && \
tar -C /usr/local -xzf /tmp/go${VERSION}.${OS}-${ARCH}.tar.gz',
"echo 'export GOPATH=${HOME}/go' >> ~/.bashrc && \
echo 'export PATH=/usr/local/go/bin:${PATH}:${GOPATH}/bin' >> ~/.bashrc && \
. ~/.bashrc ",
"mkdir -p ${GOPATH}/src/github.com/sylabs && \
cd ${GOPATH}/src/github.com/sylabs && \
git clone https://github.com/sylabs/singularity.git && \
cd singularity ",
"git checkout v{}".format(singularity_version),
"cd ${GOPATH}/src/github.com/sylabs/singularity && \
./mconfig &&\
cd ./builddir && \
make && \
make install"
]
Stage0 += shell(commands=build_cmds)
the singularity command is not found, even though the build seems successful:
.....
# excerpt of docker build log
GEN etc/bash_completion.d/singularity
CNI PLUGIN tuning
CNI PLUGIN vlan
INSTALL /usr/local/bin/singularity
INSTALL /usr/local/etc/bash_completion.d/singularity
INSTALL /usr/local/etc/singularity/singularity.conf
INSTALL /usr/local/libexec/singularity/bin/starter
INSTALL /usr/local/var/singularity/mnt/session
INSTALL /usr/local/bin/run-singularity
INSTALL /usr/local/etc/singularity/capability.json
INSTALL /usr/local/etc/singularity/ecl.toml
....
I guess this has something to do with the multi-stage build.
This is the complete content of my recipe:
r"""
Generate Dockerfile for Singularity
Contents:
Ubuntu 18.04
CUDA version 10.0
GNU compilers (upstream)
OFED (upstream)
Singularity
"""
import os
from hpccm.templates.CMakeBuild import CMakeBuild
from hpccm.templates.git import git
singularity_version = USERARG.get('SINGULARITY_VERSION', '3.1.1')
Stage0 += comment(__doc__.strip(), reformat=False)
Stage0.name = 'devel'
Stage0 += baseimage(image='nvidia/cuda:10.0-devel-ubuntu18.04', _as=Stage0.name)
Stage0 += python(python3=True)
Stage0 += gnu(fortran=False)
Stage0 += packages(ospackages=['ca-certificates', 'cmake', 'git','build-essential', 'libssl-dev', 'uuid-dev', 'libgpgme11-dev', 'libseccomp-dev', 'pkg-config', 'squashfs-tools'])
Stage0 += ofed()
Stage0 += openmpi(configure_opts=['--enable-mpi-cxx'],
prefix="/opt/openmpi", version='3.0.0')
cm = CMakeBuild()
build_cmds = ['export VERSION=1.11.4 OS=linux ARCH=amd64 && \
wget -O /tmp/go${VERSION}.${OS}-${ARCH}.tar.gz https://dl.google.com/go/go${VERSION}.${OS}-${ARCH}.tar.gz && \
tar -C /usr/local -xzf /tmp/go${VERSION}.${OS}-${ARCH}.tar.gz',
"echo 'export GOPATH=${HOME}/go' >> ~/.bashrc && \
echo 'export PATH=/usr/local/go/bin:${PATH}:${GOPATH}/bin' >> ~/.bashrc && \
. ~/.bashrc ",
"mkdir -p ${GOPATH}/src/github.com/sylabs && \
cd ${GOPATH}/src/github.com/sylabs && \
git clone https://github.com/sylabs/singularity.git && \
cd singularity ",
"git checkout v{}".format(singularity_version),
"cd ${GOPATH}/src/github.com/sylabs/singularity && \
./mconfig &&\
cd ./builddir && \
make && \
make install"
]
Stage0 += shell(commands=build_cmds)
# Include examples if they exist in the build context
# if os.path.isdir('recipes/singularity/examples'):
# Stage0 += copy(src='recipes/singularity/examples', dest='/workspace/examples')
# Stage0 += environment(variables={'PATH': '$PATH:/singularity/install/bin'})
Stage0 += label(metadata={'io.sylabs.singularity.version': singularity_version})
Stage0 += workdir(directory='/workspace')
######
# Runtime image stage
######
Stage1.baseimage('nvidia/cuda:10.0-runtime-ubuntu18.04')
Stage1 += Stage0.runtime(_from=Stage0.name)
# Stage1 += copy(_from=Stage0.name, src='${GOPATH}/src/github.com/sylabs/singularity/install',
# dest='/singularity/install')
# Include examples if they exist in the build context
# if os.path.isdir('recipes/singularity/examples'):
# Stage1 += copy(src='recipes/singularity/examples', dest='/workspace/examples')
# Stage1 += environment(variables={'PATH': '$PATH:/singularity/install/bin'})
Stage1 += label(metadata={'io.sylabs.singularity.version': singularity_version})
Stage1 += workdir(directory='/workspace')
I want to build my Boost with C++14, so I need to pass the flag cxxflags="-std=c++14" to the ./b2 build process: ./b2 cxxflags="-std=c++14" -j$(nproc) -q install. There is currently no way to pass arguments to ./b2:
The Ubuntu default cmake might not always be the latest and greatest.
CMake binaries are readily available at https://cmake.org/download/. The .sh files unpack into the default system location.
E.g., in my Singularity files I use
# CMAKE
wget https://cmake.org/files/v3.10/cmake-3.10.0-Linux-x86_64.sh
sh cmake-3.10.0-Linux-x86_64.sh --skip-license
rm cmake-3.10.0-Linux-x86_64.sh
to install cmake.
I might be able to figure out on my own how to do it but would need to find some time.
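A parameterized sketch of the same install, hedged: the GitHub release URL pattern and the lowercase "linux" in installer names from 3.20.0 onward are assumptions worth double-checking against cmake.org's download page (older releases such as 3.10.0 used "Linux"):

```python
def cmake_install_cmds(version='3.20.0', arch='x86_64'):
    """Commands to install a pre-built CMake (aarch64 available >= 3.20.0)."""
    installer = 'cmake-{}-linux-{}.sh'.format(version, arch)
    url = ('https://github.com/Kitware/CMake/releases/download/'
           'v{}/{}'.format(version, installer))
    return ['wget -q {}'.format(url),
            'sh {} --skip-license --prefix=/usr/local'.format(installer),
            'rm {}'.format(installer)]
```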
The link to the HPC-X package is no longer valid: http://www.mellanox.com/downloads/hpc/hpc-x/v2.8/hpcx-v2.8.1-gcc-MLNX_OFED_LINUX-4.7-1.0.0.1-ubuntu18.04-x86_64.tbz
First, let me take this opportunity to thank you for providing this extremely useful tool. I'm a big fan. We use this for generating a variety of containers.
However, currently I am focusing on Intel Singularity containers and I cannot get them to work. After having problems with more sophisticated applications, I went back to a simple recipe file that includes a "hello world" MPI application:
"""Intel/impi Development container
"""
import os
# Base image
Stage0.baseimage('ubuntu:18.04')
Stage0 += apt_get(ospackages=['build-essential','tcsh','csh','ksh','git',
'openssh-server','libncurses-dev','libssl-dev',
'libx11-dev','less','man-db','tk','tcl','swig',
'bc','file','flex','bison','libexpat1-dev',
'libxml2-dev','unzip','wish','curl','wget',
'libcurl4-openssl-dev','nano','screen', 'libasound2',
'libgtk2.0-common','software-properties-common',
'libpango-1.0.0','xserver-xorg','dirmngr',
'gnupg2','lsb-release','vim'])
# Install Intel compilers, mpi, and mkl
Stage0 += intel_psxe(eula=True, license=os.getenv('INTEL_LICENSE_FILE',default='intel_license/****.lic'), tarball=os.getenv('INTEL_TARBALL',default='intel_tarballs/parallel_studio_xe_2019_update5_cluster_edition.tgz'))
# Install application
Stage0 += copy(src='hello_world_mpi.c', dest='/root/jedi/hello_world_mpi.c')
Stage0 += shell(commands=['export COMPILERVARS_ARCHITECTURE=intel64',
'. /opt/intel/compilers_and_libraries/linux/bin/compilervars.sh',
'cd /root/jedi','mpiicc hello_world_mpi.c -o /usr/local/bin/hello_world_mpi -lstdc++'])
Stage0 += runscript(commands=['/bin/bash -l'])
If I build a docker image with this, it works fine:
CNAME=intel19-impi-hello
hpccm --recipe $CNAME.py --format docker > Dockerfile.$CNAME
sudo docker image build -f Dockerfile.${CNAME} -t jedi-${CNAME} .
ubuntu@ip-172-31-87-130:~/jedi$ sudo docker run --rm -it jedi-intel19-impi-hello:latest
root@1dfdbccc1110:/# mpirun -np 4 hello_world_mpi
Hello from rank 1 of 4 running on 1dfdbccc1110
Hello from rank 2 of 4 running on 1dfdbccc1110
Hello from rank 0 of 4 running on 1dfdbccc1110
Hello from rank 3 of 4 running on 1dfdbccc1110
But, if I try to build a singularity image I get this:
hpccm --recipe $CNAME.py --format singularity > Singularity.$CNAME
sudo singularity build $CNAME.sif Singularity.$CNAME
ubuntu@ip-172-31-87-130:~/jedi$ singularity shell -e intel19-impi-hello.sif
Singularity intel19-impi-hello.sif:~/jedi> source /etc/bash.bashrc
ubuntu@ip-172-31-87-130:~/jedi$ mpirun -np 4 hello_world_mpi
[mpiexec@ip-172-31-87-130] enqueue_control_fd (../../../../../src/pm/i_hydra/libhydra/bstrap/src/intel/i_hydra_bstrap.c:70): assert (!closed) failed
[mpiexec@ip-172-31-87-130] launch_bstrap_proxies (../../../../../src/pm/i_hydra/libhydra/bstrap/src/intel/i_hydra_bstrap.c:517): error enqueuing control fd
[mpiexec@ip-172-31-87-130] HYD_bstrap_setup (../../../../../src/pm/i_hydra/libhydra/bstrap/src/intel/i_hydra_bstrap.c:714): unable to launch bstrap proxy
[mpiexec@ip-172-31-87-130] main (../../../../../src/pm/i_hydra/mpiexec/mpiexec.c:1919): error setting up the boostrap proxies
I get the same thing if I try to create the Singularity image from the docker image:
sudo singularity build intel19-impi-hello.sif docker-daemon:jedi-intel19-impi-hello:latest
For much more detail, please see the corresponding issue on the Sylabs GitHub site.
I just wanted to see if anyone here had any tips on building a working Singularity container with the intel_psxe building block. Thanks!
Hi,
for the moment our workflow needs a Python 3.7 version of conda (nothing I can change on our side). Until the last release, I was able to retrieve it by passing 'py37_4.8.3' as the version argument, but now the format of the version argument is strictly checked, and py38 (or py27) is forced.
Could the __python_version be modified by the user as well? That would be useful for us.
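A sketch of a looser version check that accepts any pyXX prefix instead of forcing py38/py27. The accepted pattern is an assumption modeled on Miniconda installer names like Miniconda3-py37_4.8.3-Linux-x86_64.sh:

```python
import re

# Accept an optional pyXX_ prefix followed by a dotted version number.
_VERSION_RE = re.compile(r'^(py\d{2}_)?\d+(\.\d+)+$')

def valid_conda_version(version):
    """Accept '4.8.3', 'py37_4.8.3', 'py38_4.9.2', etc."""
    return bool(_VERSION_RE.match(version))
```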
I use sudo docker build -t hpccm -f Dockerfile . to generate the HPCCM container, but ran into a problem:
Sending build context to Docker daemon 6.144kB
Step 1/3 : FROM python:3-slim
---> ca7f9e245002
Step 2/3 : RUN pip install --no-cache-dir hpccm
---> Running in b34d42cd6c00
Collecting hpccm
Downloading https://files.pythonhosted.org/packages/b8/bd/a88fe1e1fa0bf5b17f3a2d4d001050bdb1e2c89cd9bf5bb24c47b314b33b/hpccm-19.5.0.tar.gz (133kB)
ERROR: Complete output from command python setup.py egg_info:
ERROR: Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-install-biiwov0l/hpccm/setup.py", line 10, in <module>
from hpccm.version import __version__
File "/tmp/pip-install-biiwov0l/hpccm/hpccm/__init__.py", line 22, in <module>
from hpccm.Stage import Stage
File "/tmp/pip-install-biiwov0l/hpccm/hpccm/Stage.py", line 25, in <module>
from hpccm.primitives.baseimage import baseimage
File "/tmp/pip-install-biiwov0l/hpccm/hpccm/primitives/__init__.py", line 24, in <module>
from hpccm.primitives.runscript import runscript
File "/tmp/pip-install-biiwov0l/hpccm/hpccm/primitives/runscript.py", line 23, in <module>
from six.moves import shlex_quote
ModuleNotFoundError: No module named 'six'
----------------------------------------
ERROR: Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-biiwov0l/hpccm/
The command '/bin/sh -c pip install --no-cache-dir hpccm' returned a non-zero code: 1