
Dockerfile templates for creating RAPIDS Docker Images

License: Apache License 2.0


Introduction

rapidsai/docker

This repository contains the end-user docker images for RAPIDS.

Image types

There are two image types: base (rapidsai/base) and notebooks (rapidsai/notebooks).

Base image

This image can be found here: https://hub.docker.com/r/rapidsai/base

It contains a basic installation of RAPIDS and dask-sql. By default it starts an IPython REPL.

Notebooks image

This image can be found here: https://hub.docker.com/r/rapidsai/notebooks

It extends the base image to include the RAPIDS notebooks and a JupyterLab server, which starts automatically.

Image tags

Tags for both base and notebooks images take the form of ${RAPIDS_VER}-cuda${CUDA_VER}-py${PYTHON_VER}.

There is no latest tag.
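For example, to pull a specific build (tag values here are illustrative; check Docker Hub for the tags that actually exist):

docker pull rapidsai/base:24.04-cuda12.0-py3.10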

Environment Variables

The following environment variables can be passed to the docker run commands for each image:

  • EXTRA_CONDA_PACKAGES - used to install additional conda packages in the container. Use a space-separated list of conda version specs
  • CONDA_TIMEOUT - how long (in seconds) the conda install should wait before exiting
  • EXTRA_PIP_PACKAGES - used to install additional pip packages in the container. Use a space-separated list of pip version specs
  • PIP_TIMEOUT - how long (in seconds) the pip install should wait before exiting
  • UNQUOTE - whether the command line args to docker run are executed with or without quoting. Defaults to false; it is unlikely that you need to change this. An example is shown below.
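For example, a minimal sketch of installing extra packages at container start (the tag is hypothetical; the variable names are as documented above):

docker run --gpus all --rm -it \
    -e EXTRA_CONDA_PACKAGES="jq" \
    -e EXTRA_PIP_PACKAGES="beautifulsoup4" \
    rapidsai/base:24.04-cuda12.0-py3.10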

Bind Mounts

Mounting files/folders to the locations specified below provides additional functionality for the images.

  • /home/rapids/environment.yml - a conda YAML environment file that contains a list of dependencies that will be installed. The file should look like:
dependencies:
  - beautifulsoup4
  - jq
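For example, a minimal sketch of providing such a file at run time (host path and tag are hypothetical; the mount point is as documented above):

docker run --gpus all --rm -it \
    -v $(pwd)/environment.yml:/home/rapids/environment.yml \
    rapidsai/notebooks:24.04-cuda12.0-py3.10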

Contributing

Please see CONTRIBUTING.md for details on how to contribute to this repo.

People

Contributors

ajschmidt8, aleksficek, ayodeawe, ayushdg, bdice, bradreeswork, bsuryadevara, charlesbluca, dantegd, dependabot[bot], dillon-cullinan, divyegala, efajardo-nv, github-actions[bot], gputester, jacobtomlinson, jakirkham, jjacobelli, jolorunyomi, mattf, miguelusque, mike-wendt, mluukkainen, msadang, okoskinen, quasiben, raydouglass, renovate[bot], riebart, rlratzel


Issues

[DOC] Update README.md to include description of clone step, and others

Report needed documentation
buildDockerImage automates several steps that are not included in the help text (which is intended to be the primary source of documentation), including cloning the RAPIDS sources from GitHub. Not every step needs to be documented, but the clone step is important enough that it should be.

Describe the documentation you'd like
Describe that buildDockerImage will clone RAPIDS sources based on the settings in the config file, and where the cloned sources will reside afterwards.

Steps taken to search for needed documentation
Asked reviewers what was missing, they mentioned these items.

[BUG] pytables dependency not available in RAPIDS container

Describe the bug
When trying to export to HDF5 file format, the following error is displayed:

/opt/conda/envs/rapids/lib/python3.6/site-packages/cudf/io/hdf.py:26: UserWarning: Using CPU via Pandas to write HDF dataset, this may be GPU accelerated in the future
  "Using CPU via Pandas to write HDF dataset, this may "
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
/opt/conda/envs/rapids/lib/python3.6/site-packages/pandas/io/pytables.py in __init__(self, path, mode, complevel, complib, fletcher32, **kwargs)
    465         try:
--> 466             import tables  # noqa
    467         except ImportError as ex:  # pragma: no cover

ModuleNotFoundError: No module named 'tables'

During handling of the above exception, another exception occurred:

ImportError                               Traceback (most recent call last)
<ipython-input-1-4cfff73d1cdf> in <module>
      6 df.head()
      7 
----> 8 df.to_hdf("hello.hdf5", key="tmp")

/opt/conda/envs/rapids/lib/python3.6/site-packages/cudf/core/dataframe.py in to_hdf(self, path_or_buf, key, *args, **kwargs)
   3781         import cudf.io.hdf as hdf
   3782 
-> 3783         hdf.to_hdf(path_or_buf, key, self, *args, **kwargs)
   3784 
   3785     @ioutils.doc_to_dlpack()

/opt/conda/envs/rapids/lib/python3.6/site-packages/cudf/io/hdf.py in to_hdf(path_or_buf, key, value, *args, **kwargs)
     28     )
     29     pd_value = value.to_pandas()
---> 30     pd.io.pytables.to_hdf(path_or_buf, key, pd_value, *args, **kwargs)

/opt/conda/envs/rapids/lib/python3.6/site-packages/pandas/io/pytables.py in to_hdf(path_or_buf, key, value, mode, complevel, complib, append, **kwargs)
    271     if isinstance(path_or_buf, string_types):
    272         with HDFStore(path_or_buf, mode=mode, complevel=complevel,
--> 273                       complib=complib) as store:
    274             f(store)
    275     else:

/opt/conda/envs/rapids/lib/python3.6/site-packages/pandas/io/pytables.py in __init__(self, path, mode, complevel, complib, fletcher32, **kwargs)
    467         except ImportError as ex:  # pragma: no cover
    468             raise ImportError('HDFStore requires PyTables, "{ex!s}" problem '
--> 469                               'importing'.format(ex=ex))
    470 
    471         if complib is not None and complib not in tables.filters.all_complibs:

ImportError: HDFStore requires PyTables, "No module named 'tables'" problem importing

Steps/Code to reproduce bug

$ docker pull rapidsai/rapidsai:cuda10.0-runtime-ubuntu18.04 
$ docker run --runtime=nvidia --rm -it -v /raid/miguelm:/rapids/notebooks/miguelm  -p 8888:8888 -p 8787:8787 -p 8786:8786 rapidsai/rapidsai:cuda10.0-runtime-ubuntu18.04 

Create a notebook with the following code:

import cudf

df = cudf.DataFrame([('a', list(range(20))),
                     ('b', list(reversed(range(20)))),
                     ('c', list(range(20)))])
df.head()

df.to_hdf("hello.hdf5", key="tmp")

Expected behaviour
The example DataFrame should be stored in an HDF5 file.

I think the issue is that the container does not include the pytables dependency.

Hope it helps!
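Until the dependency is bundled, a likely workaround (a sketch, using the package name as published on the standard conda channels, not verified against every image) is to install PyTables inside the container's rapids env:

conda install pytables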

[BUG] Jupyter cannot block and logs are sent to /dev/null

Describe the bug

When working on the RAPIDS Helm Chart we have a pod which will be running Jupyter as a daemon. Ideally this pod would simply run Jupyter as the main blocking process.

The current config for the runtime image is for Jupyter to be run during the entrypoint with nohup and all stdout/stderr is sent to /dev/null.

The default command is bash which just exits if the container is not running in interactive mode.

In order for us to block on the Jupyter process, we currently have to override the entrypoint and command with new ones that run Jupyter in the foreground. This duplicates the Jupyter startup command in the helm chart. Alternatively, we could tail the nohup output as the command in order to block, but this isn't possible because all output goes to /dev/null.

Steps/Code to reproduce bug

docker run --rm 0.14-cuda10.1-runtime-ubuntu18.04-py3.7

Container just exits.

Expected behavior

I would expect to be able to run the container with Jupyter as the foreground process.

A few ways this could be achieved:

  • Pipe the output to somewhere like /var/log/jupyter.log so that we can override the command as tail -f /var/log/jupyter.log
  • Run Jupyter in the foreground by default without nohup and output redirection
  • Split the start_jupyter.sh script into a foreground and background version, allowing us to override the entrypoint/command to the foreground version.
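For illustration, a minimal foreground variant along the lines of the last bullet (the file name is hypothetical; the Jupyter command and activation path are taken from elsewhere in these issues):

#!/bin/bash
# start-jupyter-foreground.sh (hypothetical): run JupyterLab as the blocking
# foreground process instead of nohup with output redirected to /dev/null
source /opt/conda/bin/activate rapids
exec jupyter-lab --allow-root --ip=0.0.0.0 --no-browser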

Environment details (please complete the following information):
Any

Additional context
NA

cc @quasiben

[FEA] Add a table-style description of dependencies to install

Is your feature request related to a problem? Please describe.
Users would like to specify the versions of dependencies (libs, tools, etc.) being installed and/or built with in a table, similar to this. The same table could be output to users to show a summary of what the build contains. The table could be provided as a file of some sort and managed by a CM tool and shared by different users to more easily reproduce builds.

Describe the solution you'd like
Provide a file to rapidstool.sh that allows for setting parameters that influence the build dependencies. The CLI args will override the file parameters. rapidstool.sh will output a summary of the final values of every build param - which is also logged - so the user can see exactly what's in the build.

Describe alternatives you've considered
Using environment variables as a way to set build params was also discussed, but the highly-visible nature of a file that allows for CLI overrides seemed better for traceability (eg. "did I remember to set or unset that env var?"). The "output a summary of the final value of each param" would alleviate some concerns with env vars, but the ability to save and easily share a file made env vars even less attractive in comparison.

Additional context
The motivation for this feature was this table and how well-received it is with users.

There are multiple user stories in this feature request (maybe I should split it up?). They are:

  • As a build user, I would like to be able to specify dependency versions in an easy to work with file which I can save or share with others.
  • As a build user, I would like to be able to override parameters set in the file with args to the rapidstool.sh command.
  • As a build user, I need to know the final value of all the parameters that influence the build after the parameter file is read and all CLI overrides are applied.

[DOC] - Website configurator is incorrect

Report incorrect documentation

The configurator documentation at https://rapids.ai/start.html#get-rapids is no longer quite correct
These are tiny points, but might put a newcomer off:

https://rapids.ai/start.html#get-rapids

Describe the problems or issues found in the documentation
The https://rapids.ai/start.html#get-rapids page says "The copied Docker command above should auto-run a notebook server. If it does not, run the following command within the Docker container to launch the notebook server.
bash /rapids/notebooks/utils/start-jupyter.sh
"

Steps taken to verify documentation is incorrect
I can confirm that neither cuda10.2-runtime-ubuntu18.04 nor cuda10.2-runtime-ubuntu18.04-py3.7 auto-runs a notebook server (they drop you to the command line).
Also, the start_jupyter.sh script is now located at /rapids/utils.

Suggested fix for documentation
Correct the Docker images to auto-run a notebook server, and update the documentation so the location of the start_jupyter.sh script is /rapids/utils (not the current /rapids/notebooks/utils/start-jupyter.sh).

Report needed documentation
Current documentation is wrong.

Describe the documentation you'd like
Options are laid out above.

Steps taken to search for needed documentation
Not applicable.

[QST] Access NVIDA Runtime on JupyterHUB

What is your question?

This is a general question about deploying GPU-aware containers. Has the RAPIDS team deployed JupyterHub on their hardware such that single-user notebooks were able to access GPUs? Or is it the case that engineers operate from a single-user notebook environment to access GPUs and use RAPIDS? I am trying to set up a hub on a DGX, but spawning notebooks that have access to the NVIDIA runtime is problematic.

[BUG] GPU Dashboards blank on dev container due to old jupyterlab-nvdashboard version

Describe the bug
The current RAPIDS nightly dev container, pulled and run with

docker pull rapidsai/rapidsai-dev:0.12-cuda10.0-devel-ubuntu18.04-py3.7
docker run --gpus all --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 \
    rapidsai/rapidsai-dev:0.12-cuda10.0-devel-ubuntu18.04-py3.7

is using an old version of jupyterlab-nvdashboard that causes the GPU dashboards to be blank.

Updating the version resolves the issue. This guidance was given in this issue:
rapidsai/jupyterlab-nvdashboard#24 (comment)
and the resolution steps came from this issue:
rapidsai/jupyterlab-nvdashboard#49
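For reference, a hedged sketch of applying that fix inside the container (package name as published on PyPI; for the JupyterLab versions of that era the labextension also had to be installed separately):

pip install --upgrade jupyterlab-nvdashboard
jupyter labextension install jupyterlab-nvdashboard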

Steps/Code to reproduce bug
Run the container using

docker pull rapidsai/rapidsai-dev:0.12-cuda10.0-devel-ubuntu18.04-py3.7
docker run --gpus all --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 \
    rapidsai/rapidsai-dev:0.12-cuda10.0-devel-ubuntu18.04-py3.7

Then go to any of the GPU dashboards; they will be blank.
Expected behavior
The gpu dashboards show the expected output.

Additional context
Testing was done on a dgx station.

[FEA] Add openpyxl library to Docker containers to permit export as Excel via Pandas

When writing the output of a cudf groupby query to Excel (via Pandas) I got the following:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<timed exec> in <module>

/opt/conda/envs/rapids/lib/python3.6/site-packages/pandas/io/excel/_openpyxl.py in __init__(self, path, engine, mode, **engine_kwargs)
     17     def __init__(self, path, engine=None, mode="w", **engine_kwargs):
     18         # Use the openpyxl module as the Excel writer.
---> 19         from openpyxl.workbook import Workbook
     20 
     21         super().__init__(path, mode=mode, **engine_kwargs)

ModuleNotFoundError: No module named 'openpyxl'

As a workaround, I manually added openpyxl with:

conda install openpyxl

Please can this Pandas dependency be added to future versions of the containers?
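With the current images described in the README above, this can also be handled per-container via the EXTRA_CONDA_PACKAGES variable (tag hypothetical):

docker run --gpus all --rm -it \
    -e EXTRA_CONDA_PACKAGES="openpyxl" \
    rapidsai/notebooks:24.04-cuda12.0-py3.10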

[FEA] rapidsdevtool.sh needs to warn users if RAPIDS dependencies are not being met

Is your feature request related to a problem? Please describe.
Some RAPIDS components depend on others for build and test (cuML requires cuDF). If a user only selects cuML, they will get build errors unless cuDF is also available.

Describe the solution you'd like
rapidsdevtool.sh should print a warning, or error out, if dependencies are not being met. The tool could also have a flag to ignore these problems in the case where a user knows how to satisfy the dependency some other way.

Describe alternatives you've considered
The tool could automatically select the required dependencies, but that could lead to problems if the user really doesn't want them (eg. they already have a local copy of cuDF elsewhere).

Additional context
n/a

Adding Requirement to rapidsai

I am having issues adding Python libraries to my Dockerfile when I use rapidsai as the base image.
This is what my Dockerfile looks like:

FROM rapidsai/rapidsai:cuda10.0-runtime-ubuntu18.04-py3.7
COPY * /ai/
WORKDIR /ai/

RUN python3 -m pip install --upgrade pip
RUN python3 -m pip install -r requirements.txt

CMD ["python3", "test.py"]

And my requirements.txt file looks like this:

kivy
beautifulsoup4
sympy

my test.py file:

import kivy
print('Kivy version: ', kivy.__version__)

import sympy
print('Sympy version: ', sympy.__version__)

import platform
print('Python version: ', platform.python_version())

The result I get after building and running my docker image is:

Traceback (most recent call last):
File "test.py", line 1, in
import kivy
ModuleNotFoundError: No module named 'kivy'

And I get similar results for the other libraries in my requirements.txt file. Why does my Docker image build and install the libraries, yet not find them when I run the image? How am I supposed to add the Python libraries I need to the rapidsai/rapidsai:cuda10.0-runtime-ubuntu18.04-py3.7 image?
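A likely cause, consistent with the conda-env issue reported further down ("Docker images do not start the RAPIDS conda env"): the RUN steps install into the base conda environment, while the image's runtime uses the rapids env. A hedged Dockerfile sketch that installs and runs inside that env, mirroring the source-activate pattern used in the BlazingSQL workaround below:

FROM rapidsai/rapidsai:cuda10.0-runtime-ubuntu18.04-py3.7
COPY * /ai/
WORKDIR /ai/

# install into the rapids conda env instead of the base env
RUN source activate rapids \
 && python3 -m pip install --upgrade pip \
 && python3 -m pip install -r requirements.txt

# make sure the script also runs inside the rapids env
CMD ["bash", "-c", "source activate rapids && python3 test.py"]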

[BUG] Demonstration Notebook missing requirements

Describe the bug
Periodicity_Detection requires s3fs which isn't included

Steps/Code to reproduce bug
Follow the Docker cuda setup commands then try to run the periodicity detection notebook

docker pull rapidsai/rapidsai:cuda10.2-runtime-ubuntu18.04-py3.8
docker run --gpus all --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 \
    rapidsai/rapidsai:cuda10.2-runtime-ubuntu18.04-py3.8

Expected behavior
The notebook should be able to run in full. Or a requirements.txt of some sort should be provided for each notebook, with instructions on how to install it.

Environment details (please complete the following information):

  • Environment location: Docker
  • Method of cuDF install: See above

pip install s3fs fixes it, but either the package should be bundled or a requirements.txt should be provided.

[DOC] Missing Usage Document for Ports

Report needed documentation

Reference: https://hub.docker.com/r/rapidsai/rapidsai-dev, Usage section

The running container exposes ports 8787 and 8786, according to the Dockerfile. And the docker run command binds those ports to localhost. But there's no information on how to use those ports. Is anything even listening on them? Are there servers on the image that can be started?

I did an apt update && apt install mlocate && updatedb && locate rstudio, since the RStudio Server conventionally listens on port 8787. But it's not there.

Describe the documentation you'd like
If there's nothing listening on those ports, delete them from the docker run command. If there is something listening, document how to use it.
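For what it's worth, a hedged reading based on Jupyter and Dask defaults (not the image's documentation): 8888 is JupyterLab, 8787 the Dask diagnostic dashboard, and 8786 the Dask scheduler:

# 8888: JupyterLab, 8787: Dask dashboard, 8786: Dask scheduler (assumed
# from Jupyter/Dask defaults -- verify against the image you are running)
docker run --gpus all --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 \
    rapidsai/rapidsai-dev:<tag>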

[FEA] Allow a user to specify specific versions of RAPIDS components

Is your feature request related to a problem? Please describe.
rapidstool.sh should allow a user to build a container (and/or, eventually, a bare-metal build) for only a specific RAPIDS component at a specific version. This will allow the tool to support the use case common to RAPIDS component devs.

Describe the solution you'd like
A config file or CLI arg (or both) which lets a user specify the RAPIDS comp name and version to clone and build. Ideally - for consistency - this wouldn't be too different from how a user specifies 3rd-party lib dependencies, as described here

Describe alternatives you've considered
Current workaround is to edit clone.sh and the copy_and_build_rapids template, which could be a reasonable final solution if necessary.

[FEA] Add cuSpatial repo/package to all Docker images

This repo and corresponding conda package needs to be added to the appropriate Docker images:
https://github.com/rapidsai/cuspatial
https://anaconda.org/rapidsai-nightly/cuspatial

devel images will have the new repo added alongside the other repos in /rapids and will have it built and installed from sources (using standard build.sh)

  • NOTE: the current cuspatial build process requires that it be built after cudf, since it depends on cudf headers and libs.

base will need cuspatial added to the conda install line (and therefore runtime will automatically have it included)

[FEA] Allow Docker build args to be passed from the rapidsdevtool.sh command line

Is your feature request related to a problem? Please describe.
Jenkins jobs in particular need to be able to easily pass args to the docker build command (--build-arg, etc.), and at the moment the only way to do that is to generate the Dockerfile and run docker build manually.

Describe the solution you'd like
Add the ability to pass arbitrary options to docker build from the rapidsdevtool.sh command. This includes options such as --build-arg and --squash

Describe alternatives you've considered
Somehow embrace that users just want to run docker build themselves directly and work with that, not sure exactly what that would entail though.

Additional context
n/a

[FEA] Add support for doing bare-metal builds

Is your feature request related to a problem? Please describe.
Users should be able to use the same rapidstool.sh script to do a local bare-metal build just as they do for container builds. This will provide additional build consistency for more build types.

Describe the solution you'd like
rapidstool.sh could generate a build.sh script in a similar way to how it generates a Dockerfile. This would provide a consistent UX between both build types and leverage the same code generation and template-writing mechanisms.

Describe alternatives you've considered
The script could just issue the local commands to build instead of generating a script to do the same thing. This might be nice, and worth discussing further.

dask-setup.sh cannot be used to setup multi-nodes environment

Reporting a bug

I ran into some issues setting up multi-node dask environments with the dask-setup.sh script included in the container (nvcr.io/nvidia/rapidsai/rapidsai:ubuntu1604_cuda92_py35). The dask-setup.sh file is located at /rapids/utils/ inside the container. I checked the file and found that workers are registered by the following command:

 dask-worker $MASTER_IPADDR:$DASK_SCHED_PORT \
                             --host=${MY_IPADDR[0]} --no-nanny \
                             --nprocs=1 --nthreads=1 \
                             --memory-limit=0 --name ${MY_IPADDR[0]}_gpu_$worker_id \
                             --local-directory $DASK_LOCAL_DIR/$name

where ${MY_IPADDR} is defined by:

MY_IPADDR=($(hostname --all-ip-addresses))

It works fine for a single-node dask task, as different workers are distinguished by $worker_id. However, for multiple nodes, ${MY_IPADDR} is the same across nodes. This creates a name clash and workers cannot be registered properly.

Another notable issue I noticed is that the script takes an input argument for BOKEN_PORT; however, this information is not used when setting up the scheduler and workers.

I changed the script a bit to make it work for me, and I have attached it here.
dask-setup.zip
To use it , in master node run

./dask-setup $NUM_GPUS PORT_SCHEDULE PORT_BOKEN IP_SCHEDULER MASTER

In worker node run

./dask-setup $NUM_GPUS PORT_SCHEDULE PORT_BOKEN IP_SCHEDULER WORKER{0-999}
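For reference, a minimal sketch of one way to avoid the name clash: make the worker name unique per host rather than per IP (variables as defined in the quoted script):

dask-worker $MASTER_IPADDR:$DASK_SCHED_PORT \
    --host=${MY_IPADDR[0]} --no-nanny \
    --nprocs=1 --nthreads=1 \
    --memory-limit=0 --name $(hostname)_gpu_$worker_id \
    --local-directory $DASK_LOCAL_DIR/$name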

[DOC] Add developer documentation

Report needed documentation

There's currently no developer documentation. A new maintainer of the build repo will have to reverse-engineer the sources in order to maintain them, which is not ideal and is error-prone.

Describe the documentation you'd like
Either an addition to the top-level README.md, additional README.md files in different subdirs, or both, describing things like:

  • The directory layout and what files are where.
  • Description of the code generators.

Steps taken to search for needed documentation
This gap was uncovered as part of a design review.

[FEA] Make it easier to customize the install step for local builds

Is your feature request related to a problem? Please describe.
Builds performed by build.sh install to a specific location, but some use cases require alternate locations.

Describe the solution you'd like
An option to allow users to customize install locations without editing build.sh or even build.sh.template.

Describe alternatives you've considered
Documenting how a user can edit the generated build.sh script, or how they can create a new template and support generating build.sh from different templates.

Additional context
n/a

[BUG] - Jupyter lab is not started after new release

Describe the bug
Jupyter lab is not started after new release

Steps/Code to reproduce bug
We are using https://github.com/rapidsai/helm-chart. It was working fine, but after the new release it is not working. It was using the following image:

image:
  repository: "rapidsai/rapidsai"
  tag: cuda10.0-runtime-ubuntu16.04

The start command is:

- bash
- '/rapids/notebooks/utils/start-jupyter.sh'

Even after changing the path to the new utils location it is not working:

- bash
- '/rapids/utils/start-jupyter.sh'

Expected behavior
Rapids Helm package should work with rapidsai docker image.

Environment details (please complete the following information):
Using the Azure Kubernetes Service for deploying with GPU instances.


[FEA] Add interactive config file generator

Is your feature request related to a problem? Please describe.
Some users may need more guidance than others when it comes to configuring rapidsdevtool.sh, and the config file may not communicate exactly what needs to be changed for a particular use case.

Describe the solution you'd like
Something like an interactive mode which "interviews" a client and generates/updates the config based on common use cases would help.

Describe alternatives you've considered
Settings based on "modes" or use cases, which would essentially be configuration macros, could also help. For example, a config setting that allows a user to specify the RAPIDS component they are building, like "buildFor=cuDF", with all the other settings overridden automatically.

Additional context
n/a

[FEA] Allow for builds that contain only specific RAPIDS components

Is your feature request related to a problem? Please describe.
All builds done by rapidstool.sh include every RAPIDS component, when many use cases (mainly dev builds) only require one or two. Builds for specific RAPIDS components like this are done by one-off scripts or Dockerfiles provided by the components. Ideally, rapidstool.sh replaces these and provides a consistent way to build any RAPIDS component.

Describe the solution you'd like
Some mechanism (optional CLI args and/or input file) that allows for the build to include only the RAPIDS component(s) specified. Dependencies between RAPIDS components should be satisfied and communicated automatically (eg. specifying cuML should also include cuDF automatically). This will require either modifications to the clone.sh script or using some other script(s) altogether.
rapidsdevtool.sh should also print a warning, or error out, if dependencies are not being met. The tool could also have a flag to ignore these problems in the case where a user knows how to satisfy the dependency some other way.

Describe alternatives you've considered
For the dependency-requirements aspect, the tool could automatically select the required dependencies, but that could lead to problems if the user really doesn't want them (eg. they already have a local copy of cuDF elsewhere).

Additional context
Build scripts and/or Dockerfiles written by and included with the individual RAPIDS components, such as this one.
User stories:

  • As a build user, I would like a build (container, bare metal, whatever) that only contains the component I'm working with (eg. cuDF) and nothing else.
  • As a build user, I need all RAPIDS components that the component I'm working with built automatically. As an example, if I specify cuML, cuDF should also be built.

[FEA] Add cugraph to default config (priority is for containers)

Is your feature request related to a problem? Please describe.
Containers need cugraph. Everything else could also get it, but containers specifically need it.

Describe the solution you'd like
Add CUGRAPH_REPO/BRANCH to config, add build-cugraph.sh to utils, incorporate both accordingly.

Describe alternatives you've considered
n/a

Additional context
n/a

Feature request / regression: py3.7 in production docker releases to support BlazingSQL

Issue:

Trying BlazingSQL via a container is difficult, and then taking it to production is difficult, because the RAPIDS docker images no longer have py3.7 and Blazing requires py3.7. This change happened about two months ago.

For new users, navigating dependencies is non-trivial, and for heavy users, the sizes involved make the process slow.

Workaround:

We created an interim container:

FROM gpuci/miniconda-cuda-rapidsenv:10.0-devel-ubuntu18.04-py3.7

RUN source activate rapids \
 && conda install -c rapidsai -c nvidia -c conda-forge -c defaults rapids=0.10 python=3.7

=> https://hub.docker.com/r/graphistry/graphistry-nvidia

cc @felipeblazing @mike-wendt

[FEA] Add extra packages in docker entrypoint

I would like to use the rapids docker image more interchangeably with the dask docker image.

Describe the solution you'd like
The dask image (along with others) implements an entrypoint script which installs extra packages from environment variables (EXTRA_APT_PACKAGES, EXTRA_CONDA_PACKAGES, EXTRA_PIP_PACKAGES). This allows users to customize their containers without having to build their own.

Describe alternatives you've considered
The alternative to this is for people to build their own docker images backed on the rapids image and host it somewhere like Docker Hub. It would be good if users could get started with the docker image without prior knowledge of docker.
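For illustration, a minimal entrypoint sketch in the style of the dask image described above (simplified; error handling and timeouts omitted):

#!/bin/bash
# install user-requested extras, then hand off to the main process
if [ -n "$EXTRA_APT_PACKAGES" ]; then
    apt-get update && apt-get install -y $EXTRA_APT_PACKAGES
fi
if [ -n "$EXTRA_CONDA_PACKAGES" ]; then
    conda install -y $EXTRA_CONDA_PACKAGES
fi
if [ -n "$EXTRA_PIP_PACKAGES" ]; then
    pip install $EXTRA_PIP_PACKAGES
fi
exec "$@"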

[FEA] source activate rapids on docker entrypoint

Is your feature request related to a problem? Please describe.
Hi,

I have noticed that the RAPIDS container automatically activates the rapids environment only when docker run is executed as root.

That is due to the fact that the source activate rapids command is in a root startup script.

Might I suggest a different approach (invoking a script in the entrypoint)? That way, it would work with any user passed to docker via the --user flag.

https://stackoverflow.com/a/44079215/3687310

Thanks!
Miguel

[BUG] Docker images do not start the RAPIDS conda env when used with `docker exec`

@rlratzel reported:

docker run to start an interactive bash shell, or to run an individual command, automatically executes in the rapids conda env. However, when a container is started with --detach and has commands executed in it with docker exec, the commands do not run in the rapids conda env.

This affects jobs using the Jenkins Docker plugin.
The workaround is to issue a source /opt/conda/bin/activate rapids command in the script they define in order to use packages in the rapids env.
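For example, with that workaround applied (activation path as stated above; container name is a placeholder):

docker exec <container> bash -c 'source /opt/conda/bin/activate rapids && python -c "import cudf"'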

[BUG] cuGraph Force-Atlas2.ipynb can't execute, holoviews not installed in rapidsai:cuda10.2-runtime-ubuntu18.04, possibly others

Describe the bug
holoviews not installed in rapidsai:cuda10.2-runtime-ubuntu18.04, possibly other builds.
The cuGraph Force-Atlas2.ipynb demo relies on this.

Steps/Code to reproduce bug
execute /rapids/notebooks/cugraph/layout/Force-Atlas2.ipynb

Expected behavior
The demo should run with no errors, instead it produces:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-13-cdaa39fc78bd> in <module>
      3 import numpy as np
      4 import pandas as pd
----> 5 import holoviews as hv
      6 
      7 from colorcet import fire

ModuleNotFoundError: No module named 'holoviews'

Environment details (please complete the following information):

  • Environment location: [Bare-metal, Docker, Cloud(specify cloud provider)]
    Docker
  • Method of cuDF install: [conda, Docker, or from source]
    NGC rapidsai:cuda10.2-runtime-ubuntu18.04
    • If method of install is [Docker], provide docker pull & docker run commands used
docker pull nvcr.io/nvidia/rapidsai/rapidsai:cuda10.2-runtime-ubuntu18.04
      docker run --gpus all --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 \
          -v /bulk:/rapids/notebooks/my_data \
          --shm-size=8G \
          rapidsai:cuda10.2-runtime-ubuntu18.04
  • Please run and attach the output of the cudf/print_env.sh script to gather relevant environment details

print_env.sh-output.txt


[BUG] Upgrade ipython to version 7.7

Describe the bug
Some NVIDIA DLIs make use of RAPIDS containers.

In those notebooks, they use the following code in order to terminate the Jupyter notebook, to release resources for the next notebook.

import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)

There is a known bug (ipython/ipython#11646) when using that command in any version of ipython prior to 7.7.

I have worked around it by manually updating the ipython version to v7.7.

conda install ipython=7.7

Steps/Code to reproduce bug
(screenshot)

Expected behaviour
No error message displayed when executing the cells containing the code above.

[FEA] Add jupyterlab-nvdashboard to docker runtime containers

Is your feature request related to a problem? Please describe.
Hi,

I am not sure if this is the right place to add this Feature request.

I think it would be great to add the jupyterlab-nvdashboard extension to RAPIDS runtime containers.

More info here:
https://medium.com/rapids-ai/gpu-dashboards-in-jupyter-lab-757b17aae1d5

Describe the solution you'd like
To have jupyterlab-nvdashboard preinstalled on RAPIDS runtime containers.

Describe alternatives you've considered
Install it myself.

[BUG] Default rapids user can't use yum

Describe the bug
In the latest rapidsai/rapidsai-nightly:0.16-cuda10.1-runtime-centos7-py3.7 image, yum install vim throws a permissions error as shown in the screenshot below.

(screenshot)

[BUG] pytorch is not included even though its used by example notebook

Describe the bug
When using the cusignal notebook /rapids/notebooks/cusignal/E2E_Example.ipynb from a Docker container, it tries to import torch, which is not installed.

Steps/Code to reproduce bug

  1. Run docker container rapidsai_cuda10.1-runtime-centos7 within Singularity.
  2. Start Jupyter server with singularity run --nv rapidsai_cuda10.1-runtime-centos7.sif
  3. Access Jupyter on localhost:8888
  4. Try running the cusignal/E2E_Example.ipynb example.

Expected behavior
Notebook completes without failure.

Environment details (please complete the following information):

  • Environment location: Singularity on local HPC system
  • Method of cuDF install: Docker container
    • If method of install is [Docker], provide docker pull & docker run commands used
      $ singularity pull docker://nvcr.io/nvidia/rapidsai/rapidsai:cuda10.1-runtime-centos7
      $ singularity run --nv rapidsai/rapidsai_cuda10.1-runtime-centos7.sif
  • Please run and attach the output of the cudf/print_env.sh script to gather relevant environment details

rapids.log

Additional context
Using Singularity on an HPC system, so there could be issues with that.

[FEA] Add info message to container entrypoint

From @miguelangel :

Just a quick observation about changes from the 0.8 to the 0.9 container.
More specifically, I have noticed that, from the 0.9 container on, a JupyterLab instance is executed when you docker run the container.
That is very handy but, as a side effect, it took me a while to understand why my locally mapped folder was not visible anymore.
As a quick workaround, I have mapped my folder under /rapids/notebooks, to make it visible to JupyterLab.

docker run --runtime=nvidia --rm -it -v /raid/miguelm:/rapids/notebooks/miguelm -p 8888:8888 -p 8787:8787 -p 8786:8786 nvcr.io/nvidia/rapidsai/rapidsai:0.9-cuda10.0-runtime-ubuntu18.04

Might I suggest explicitly mentioning the above somewhere? Apologies if it is already detailed somewhere, but I couldn't find it.

What about an echo message when running the container?

@rmccorm4 added:

...NGC DL containers have some common info they output at entrypoint - something like this might be a good place to put that
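For illustration, a minimal banner sketch along those lines (wording hypothetical), printed from the container entrypoint before JupyterLab starts:

cat <<'EOF'
A JupyterLab server is running on port 8888, serving /rapids/notebooks.
Mount host folders under /rapids/notebooks (e.g. -v /raid/me:/rapids/notebooks/me)
to make them visible in the JupyterLab file browser.
EOF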

[FEA] Generated Dockerfiles should be ready to use

Is your feature request related to a problem? Please describe.
If I try to use any of the Dockerfiles in /generatedDockerfiles, they are missing scripts and other items needed by the COPY commands. I encountered this when trying to quickly modify a generated file to test/update the pandas version, but I was unable to because of the missing files.

Describe the solution you'd like
If the current generated scripts and other files for the current branch were included, it would be easier to run docker build -f <file> . from the directory and build the image without needing the tool.

Describe alternatives you've considered
Modifying the templates and using the script generator, but for these smaller, rapid iterations it felt like more work compared to being able to edit one file.

Additional context
N/A

[FEA] rapidsdevtool.sh should print a warning when RAPIDS deps are not met

Is your feature request related to a problem? Please describe.
Now that rapidsdevtool.sh allows users to specify which RAPIDS comps they want to build, there's a possibility a user could select a comp that needs a dep, and not also select the dep. For example, selecting cuML but not cuDF.

Describe the solution you'd like
Have rapidsdevtool.sh simply print a warning that explains the missing dependency. An option to fail instead of warn could also be provided.

Describe alternatives you've considered
rapidsdevtool.sh could automatically add the missing dependency, but that seemed like it could be problematic or too magical for some users.

Additional context
This is the remaining work from issue #2

[FEA] add doc building tools to `devel` image

The devel image should include doc building tools for at least two reasons:

  • Devs may need to (should be?) build docs occasionally to ensure their changes are compatible with the current documentation, as well as to view docs they may have modified. Much like the other dev tools in devel, this will help devs be more efficient.
  • nightly doc building jobs use devel containers, and currently have to install doc building tools each time. Having them in the container will speed up doc builds without impacting devel image builds much at all.

The following tools should be added to the conda install line (in the conda_install_devel template):

sphinx sphinx_rtd_theme numpydoc sphinxcontrib-websupport nbsphinx 'pandoc<2.0.0' recommonmark doxygen

cc @dillon-cullinan @mike-wendt

[BUG] Dask and nvdashboard Jupyter Lab extentions are not working

Describe the bug
When running the latest docker images the Dask and nvdashboard Jupyter Lab extensions do not display.

Steps/Code to reproduce bug

docker run -p 8888:8888 --rm -it rapidsai/rapidsai-nightly:cuda10.0-runtime-ubuntu16.04 bash

Then visit https://host:8888. Neither extension will be available from the sidebar.

Running conda upgrade jupyterlab (which updates to 1.2.3) and restarting Jupyter Lab seems to resolve this issue.

However, once upgraded, pynvml (needed for nvdashboard) appears to be missing some dependencies.

Traceback (most recent call last):
  File "/opt/conda/envs/rapids/lib/python3.6/site-packages/jupyterlab_nvdashboard/server.py", line 7, in <module>
    from jupyterlab_nvdashboard import apps
  File "/opt/conda/envs/rapids/lib/python3.6/site-packages/jupyterlab_nvdashboard/apps/__init__.py", line 2, in <module>
    from . import gpu
  File "/opt/conda/envs/rapids/lib/python3.6/site-packages/jupyterlab_nvdashboard/apps/gpu.py", line 17, in <module>
    pynvml.nvmlInit()
  File "/opt/conda/envs/rapids/lib/python3.6/site-packages/pynvml/nvml.py", line 742, in nvmlInit
    _load_nvml_library()
  File "/opt/conda/envs/rapids/lib/python3.6/site-packages/pynvml/nvml.py", line 736, in _load_nvml_library
    check_return(NVML_ERROR_LIBRARY_NOT_FOUND)
  File "/opt/conda/envs/rapids/lib/python3.6/site-packages/pynvml/nvml.py", line 366, in check_return
    raise NVMLError(ret)
pynvml.nvml.NVMLError_LibraryNotFound: NVML Shared Library Not Found
Traceback (most recent call last):
  File "/opt/conda/envs/rapids/lib/python3.6/site-packages/pynvml/nvml.py", line 734, in _load_nvml_library
    nvml_lib = CDLL("libnvidia-ml.so.1")
  File "/opt/conda/envs/rapids/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libnvidia-ml.so.1: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/envs/rapids/lib/python3.6/site-packages/jupyterlab_nvdashboard/server.py", line 7, in <module>
    from jupyterlab_nvdashboard import apps
  File "/opt/conda/envs/rapids/lib/python3.6/site-packages/jupyterlab_nvdashboard/apps/__init__.py", line 2, in <module>
    from . import gpu
  File "/opt/conda/envs/rapids/lib/python3.6/site-packages/jupyterlab_nvdashboard/apps/gpu.py", line 17, in <module>
    pynvml.nvmlInit()
  File "/opt/conda/envs/rapids/lib/python3.6/site-packages/pynvml/nvml.py", line 742, in nvmlInit
    _load_nvml_library()
  File "/opt/conda/envs/rapids/lib/python3.6/site-packages/pynvml/nvml.py", line 736, in _load_nvml_library
    check_return(NVML_ERROR_LIBRARY_NOT_FOUND)
  File "/opt/conda/envs/rapids/lib/python3.6/site-packages/pynvml/nvml.py", line 366, in check_return
    raise NVMLError(ret)
pynvml.nvml.NVMLError_LibraryNotFound: NVML Shared Library Not Found

Expected behavior
Both extensions should be available in Jupyter Lab.

Environment details (please complete the following information):

  • Docker rapidsai/rapidsai-nightly:cuda10.0-runtime-ubuntu16.04

[FEA] Add an option to squash for buildDockerImage

Is your feature request related to a problem? Please describe.
I'd like to take advantage of fine-grained caching when building Docker images, but I don't want all the layers.

Describe the solution you'd like
Simply add an option to squash or not to the config file.

Describe alternatives you've considered
This came up specifically around the RAPIDS build steps in the Dockerfile and whether they should be RUN commands for building each individual component, or a single "build everything" RUN command. We could instead have an option around that, but the squash option serves (mostly) the same purpose while being useful elsewhere as well.

Additional context
n/a

[DOC] README.md needs to include examples

Report needed documentation

README.md needs a section containing common examples with descriptions of what they do. Without these, users may have a harder time understanding which command and/or option they should use.

Describe the documentation you'd like
A section in README.md, possibly just called "examples", that describes at least the following:

  • How to use buildDockerImage with one of the default templates, where the resulting files go, and what steps are automated.
  • How to use genBuildScript.sh to get a build.sh for building locally.
  • How to change the config to build a specific set of RAPIDS components.
  • How to change the config to update the version of a specific dependency.
  • How to create a new Dockerfile template and use it.

Steps taken to search for needed documentation
Reviewers pointed out that this was needed.

Cannot open Jupyter Notebooks in Docker installation

Hi everyone,

I'm using the Docker version of RAPIDS. I have successfully pulled the image and run it. However, there's no Jupyter notebook available. I tried to run ./utils/start_jupyter.sh and I get this:

jupyter-lab --allow-root --ip=0.0.0.0 --no-browser --NotebookApp.token=''

[W 17:40:04.014 LabApp] All authentication is disabled.  Anyone who can connect to this server will be able to run code.
[I 17:40:04.014 LabApp] The port 8888 is already in use, trying another port.
[I 17:40:04.743 LabApp] JupyterLab extension loaded from /opt/conda/envs/rapids/lib/python3.6/site-packages/jupyterlab
[I 17:40:04.743 LabApp] JupyterLab application directory is /opt/conda/envs/rapids/share/jupyter/lab
[I 17:40:04.920 LabApp] Serving notebooks from local directory: /rapids/notebooks
[I 17:40:04.920 LabApp] The Jupyter Notebook is running at:
[I 17:40:04.920 LabApp] http://4434ae38bd85:8889/
[I 17:40:04.920 LabApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

However, when I click on the http link, it doesn't reach anything. I suspect this has something to do with the message "The port 8888 is already in use, trying another port.", which I find strange because the port shouldn't already be in use.
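A likely explanation, consistent with the log above: the container already auto-starts a JupyterLab server on 8888, so the manually started one fell back to 8889, which docker run does not publish. Instead of starting a second server, publish port 8888 and connect to the server that is already running (tag placeholder, as the report does not name one):

docker run --runtime=nvidia --rm -it -p 8888:8888 rapidsai/rapidsai:<tag>
# then browse to http://localhost:8888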
