containers's Introduction

Databricks Container Services - Example Containers

NOTE: The latest tags have been removed from most images in favor of runtime-specific tags, with the exception of the databricksruntime/standard image. If your build relied on an image tagged latest, please update it to match the runtime version of the cluster.

This repository provides Dockerfiles for use with Databricks Container Services. These Dockerfiles are meant as a reference and a starting point, enabling users to build their own custom images to suit their specific needs.

Warning: Runtime Incompatibility

The Dockerfiles on the master branch are currently not maintained to be backwards compatible with every Databricks Runtime version, and are not always updated for new versions.

Documentation

Images

DockerHub

The Databricks-provided sample images have been published to DockerHub.

How To Contribute to this Repo

  1. Fork and clone this repo locally.
  2. Follow the example Dockerfiles and make sure your Dockerfile has liberal comments explaining each step of your image.
  3. Be specific when you name your image. Example: CentOS7.6RBundle
  4. Test your image and verify it works on a Databricks cluster (see the sketch after this list).
  5. Check it into the experimental directory, in a folder specific to the OS. Example: experimental/centos/CentOS7.6RBundle
  6. Create a pull request, and indicate in it which Databricks Runtime version you tested with.
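
For step 4, a minimal sketch of how an image might be built, pushed, and then referenced from a cluster; the registry and image names below are hypothetical placeholders:

# build the image from its folder in this repo and tag it for your registry (names are examples)
docker build -t myregistry/centos7.6-r-bundle:7.x experimental/centos/CentOS7.6RBundle
docker push myregistry/centos7.6-r-bundle:7.x
# then reference myregistry/centos7.6-r-bundle:7.x in the cluster's Docker container settings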

containers's Issues

Container for specific runtime versions

I am developing Python applications to run on Databricks, and I am looking at Databricks Containers to ensure that I develop in an environment as close to the runtime environment as possible (Spark libraries and connectors, etc.). However, when running the containers I cannot find any of the runtime versions (7.3, 7.4, etc.), and the latest image is not packed with any of the libraries.

Where can I find Docker images pre-loaded with Spark and the associated JAR packages/libraries for, e.g., DB Runtime 7.3 LTS?

Runtime 7.x image updates

My first question: when will you update your Docker images to runtime 7.x?

I will need this very soon, as I have to provide an image for double-digit streams of work and need cluster start times to be optimal; init scripts are not viable for the number of packages I need to install. We do not want the administration overhead of maintaining custom init scripts for each stream of work, nor 20-minute cluster start times. With an ACR image, startup is within 5 minutes.

My other question: do you have comprehensive documentation anywhere on using another tool as good as Ganglia, and on how to configure it in detail for Azure Databricks clusters? In the current images it is not supported, and you plan to remove support for it in the near future.

Thanks
A

hwriterPlus not compatible with new versions of R

If one uses your Dockerfile, one will run into the problem that the newest R versions (3.6.3+) are not compatible with the old hwriterPlus package (and its dependencies).

I suggest that you either hardcode a compatible R version in your Dockerfile, or replace the last two lines with:
RUN R -e "install.packages(c('hwriter', 'TeachingDemos', 'htmltools'))"
RUN R -e "install.packages('https://cran.r-project.org/src/contrib/Archive/hwriterPlus/hwriterPlus_1.0-3.tar.gz', repos=NULL, type='source')"
RUN R -e "install.packages('Rserve', repos='http://rforge.net/', type='source')"

Error in loadNamespace(name)

Hi

I created a Docker container that only has
FROM databricksruntime/rbase:latest in it.
I can start the Databricks cluster using my container, but nothing in the container works from an R point of view.
Even if I run R.Version(), I get the error:

Error in loadNamespace(name) : there is no package called 'htmltools'

Thanks
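
A minimal sketch of a workaround, assuming the missing package only needs to be baked into the image (the CRAN repo URL is an assumption; compare the hwriterPlus issue above):

FROM databricksruntime/rbase:latest
# install htmltools, which the notebook R integration appears to require (based on the error above)
RUN R -e "install.packages('htmltools', repos='https://cran.r-project.org')"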

Spark in container?

Hi there - This seems like the beginning of something very useful. For example, we'd like to make sure our libraries are tested in CI/CD in a Databricks-like container before deploying to production. Two questions:

  1. I get a 404 when trying to see the published images at https://cloud.docker.com/u/databricksruntime/repository/list, is that expected?
  2. I ran the "standard" image as a container (docker run -i -t databricksruntime/standard /bin/bash) but don't seem to see Spark or Scala installed in it (as I would expect in the runtime), is that wrong?

Thank you!

Custom containers built on databricksruntime/standard are still missing required packages?

I am trying to create a compute cluster that uses a customized container to run my notebooks.

I have followed the steps here: https://learn.microsoft.com/en-us/azure/databricks/clusters/custom-containers#step-1-build-your-base

  • Enabled containers for the workspace
  • Used the 13.2 runtime
  • Created and pushed an image based on databricksruntime/standard

I can create the compute resource and connect a notebook to it; however, when I try to execute a notebook cell, I receive an error like this:

Failure starting repl. Try detaching and re-attaching the notebook.
ModuleNotFoundError: No module named 'google'

Detaching and Reattaching had no effect.

Then, instead of my custom image, I tried using databricksruntime/standard:latest directly, but received a similar error about a different package.

Expected

I expected that using databricksruntime/standard would essentially recreate the default environment, and that it would work as if I had not configured the cluster to use a custom container.

Actual

It seems that even when using databricksruntime/standard, there are still packages or configuration steps that need to be added.

Is there documentation about these extra configuration or package installation steps so that I can use a custom container?

Full Error

java.lang.Exception: Unable to start python kernel for ReplId-4eca4-c68d0-068a5-6, kernel exited with exit code 1.
----- stdout -----

------------------
----- stderr -----
Traceback (most recent call last):
  File "/databricks/python_shell/scripts/db_ipykernel_launcher.py", line 43, in <module>
    from dbruntime import UserNamespaceInitializer
  File "/databricks/python_shell/dbruntime/UserNamespaceInitializer.py", line 2, in <module>
    from dbruntime.display import Display, displayHTML
  File "/databricks/python_shell/dbruntime/display.py", line 3, in <module>
    from pyspark.sql.connect.dataframe import DataFrame as ConnectDataFrame
  File "/databricks/spark/python/pyspark/sql/connect/dataframe.py", line 65, in <module>
    import pyspark.sql.connect.plan as plan
  File "/databricks/spark/python/pyspark/sql/connect/plan.py", line 32, in <module>
    import pyspark.sql.connect.proto as proto
  File "/databricks/spark/python/pyspark/sql/connect/proto/__init__.py", line 18, in <module>
    from pyspark.sql.connect.proto.base_pb2_grpc import *
  File "/databricks/spark/python/pyspark/sql/connect/proto/base_pb2_grpc.py", line 21, in <module>
    from pyspark.sql.connect.proto import base_pb2 as spark_dot_connect_dot_base__pb2
  File "/databricks/spark/python/pyspark/sql/connect/proto/base_pb2.py", line 21, in <module>
    from google.protobuf import descriptor as _descriptor
ModuleNotFoundError: No module named 'google'

------------------

	at com.databricks.backend.daemon.driver.IpykernelUtils$.startReplFailure$1(JupyterDriverLocal.scala:1179)
	at com.databricks.backend.daemon.driver.IpykernelUtils$.$anonfun$startIpyKernel$3(JupyterDriverLocal.scala:1189)
	at com.databricks.backend.common.util.TimeUtils$.$anonfun$retryWithExponentialBackoff0$1(TimeUtils.scala:191)
	at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
	at scala.util.Try$.apply(Try.scala:213)
	at com.databricks.backend.common.util.TimeUtils$.retryWithExponentialBackoff0(TimeUtils.scala:191)
	at com.databricks.backend.common.util.TimeUtils$.retryWithExponentialBackoff(TimeUtils.scala:145)
	at com.databricks.backend.common.util.TimeUtils$.retryWithTimeout(TimeUtils.scala:94)
	at com.databricks.backend.daemon.driver.IpykernelUtils$.startIpyKernel(JupyterDriverLocal.scala:1187)
	at com.databricks.backend.daemon.driver.JupyterDriverLocal.$anonfun$startPython$1(JupyterDriverLocal.scala:908)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at scala.util.Try$.apply(Try.scala:213)
	at com.databricks.backend.daemon.driver.JupyterDriverLocal.com$databricks$backend$daemon$driver$JupyterDriverLocal$$withRetry(JupyterDriverLocal.scala:872)
	at com.databricks.backend.daemon.driver.JupyterDriverLocal$$anonfun$com$databricks$backend$daemon$driver$JupyterDriverLocal$$withRetry$1.applyOrElse(JupyterDriverLocal.scala:875)
	at com.databricks.backend.daemon.driver.JupyterDriverLocal$$anonfun$com$databricks$backend$daemon$driver$JupyterDriverLocal$$withRetry$1.applyOrElse(JupyterDriverLocal.scala:872)
	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:38)
	at scala.util.Failure.recover(Try.scala:234)
	at com.databricks.backend.daemon.driver.JupyterDriverLocal.com$databricks$backend$daemon$driver$JupyterDriverLocal$$withRetry(JupyterDriverLocal.scala:872)
	at com.databricks.backend.daemon.driver.JupyterDriverLocal.startPython(JupyterDriverLocal.scala:889)
	at com.databricks.backend.daemon.driver.JupyterDriverLocal.<init>(JupyterDriverLocal.scala:371)
	at com.databricks.backend.daemon.driver.PythonDriverWrapper.instantiateDriver(DriverWrapper.scala:829)
	at com.databricks.backend.daemon.driver.DriverWrapper.setupRepl(DriverWrapper.scala:369)
	at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:258)
	at java.lang.Thread.run(Thread.java:750)
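
The missing 'google' module is google.protobuf, which PySpark's Spark Connect code imports. A minimal, unverified sketch of a workaround is to bake the protobuf/gRPC packages into the image's Python environment; the /databricks/python3 path is an assumption based on the ubuntu/python Dockerfiles in this repository, and the exact package set may vary by runtime:

# pin a runtime-specific tag to match your cluster (see the NOTE at the top of this page)
FROM databricksruntime/standard
# install the packages that pyspark.sql.connect imports (sketch, not an official fix)
RUN /databricks/python3/bin/pip install protobuf grpcio grpcio-status googleapis-common-protos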

docker run has errors on macOS

It does not seem to support macOS.

$ docker run databricksruntime/standard

WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
$

Hardware Overview:

Model Name: MacBook Pro
Model Identifier: MacBookPro18,1
Chip: Apple M1 Pro
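
The warning above means the published images are amd64-only while Apple silicon hosts are arm64. One workaround, assuming emulation is enabled in Docker Desktop, is to request the amd64 platform explicitly:

docker run --platform linux/amd64 -i -t databricksruntime/standard /bin/bash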

Docker Hub builds maintained?

A lot of the Docker Hub builds are quite old (more than a year), while I see more recent commits in this repo. Are these builds still maintained or not? It would be convenient not to have to build the images (and their base images) myself.

Ubuntu 16.04 EoL

Hi there,

since Ubuntu 16.04 is EoL in April, do you intend to offer Ubuntu 18.04 versions of the dockerfiles?
Of course you could switch the apt-get commands to the old-releases.ubuntu.com source, but the file would still need to be changed.

For my project, I tried to combine the experimental ubuntu18.04-minimal image with the ubuntu/python one by just changing the FROM command from 16.04 to 18.04.
Seems like it works, I could use the docker image as normal in a DB cluster, and I could also import code in a notebook from a python package that has been installed to the dcs-minimal conda env in the docker container.

But of course that's not a sophisticated test case.

Autoloader Support

Is Autoloader support possible on the standard runtime container? This would be helpful for unit testing streams that utilize this feature. Currently, readStream.format("cloudFiles") throws java.lang.ClassNotFoundException: Failed to find data source: cloudFiles when using the cloudFiles format.

Possibility for spark init scripts

I'm not sure whether this is the right place, but:

We are using clusters with our own containers to bundle our jars and python libs.
One issue we are facing though: we are using https://sedona.apache.org/ to support geometry datatypes.

In notebooks or jobs this is fine, because we can run the required registration of datatypes and SQL functions ourselves:

SedonaRegistrator.registerAll(spark)

Is there a way to include such a code-snippet as part of the cluster-startup, e.g. for use on clusters that only provide the SQL endpoint?
As I understand, the existing cluster-init scripts run before any spark-context exists, so this can't be placed there.

Support for Ganglia UI

Hi, just opening an issue for the lack of Ganglia support, despite it being mentioned in the docs:

Of course, the minimal requirements listed above do not include Python, R, Ganglia, and many other features that you typically expect in Databricks clusters. To get these features, build off the appropriate base image (that is, databricksruntime/rbase for R), or reference the Dockerfiles in GitHub to determine how to build in support for the specific features you want.

On the other hand the docs for the standard container at least are explicit about the lack of Ganglia support

When can we expect Ganglia to be available in Container Services?

Clean build gpu-pytorch breaks Databricks notebooks environment

I have a pipeline adapted from the gpu-pytorch Dockerfile provided in this repository. However, sometime this week conda must have started choosing a different combination of packages for its environment, since new clean images built from this Dockerfile no longer run on Databricks. The problem is that any command executed in a notebook keeps a Waiting to run... status and then gets cancelled after a minute. The driver logs mention this problem:

Traceback (most recent call last):
  File "/local_disk0/tmp/1600798981779-0/PythonShell.py", line 1732, in <module>
    raise e
  File "/local_disk0/tmp/1600798981779-0/PythonShell.py", line 1729, in <module>
    launch_process()
  File "/local_disk0/tmp/1600798981779-0/PythonShell.py", line 1716, in launch_process
    console_buffer, error_buffer)
  File "/local_disk0/tmp/1600798981779-0/PythonShell.py", line 783, in __init__
    self.shell = self.create_shell()
  File "/local_disk0/tmp/1600798981779-0/PythonShell.py", line 1153, in create_shell
    autocomplete_verbose_output=self.autocomplete_verbose_output()
  File "/local_disk0/pythonVirtualEnvDirs/virtualEnv-8f2ea4ca-f011-4eb0-8d56-13e442652e87/lib/python3.7/site-packages/traitlets/config/configurable.py", line 510, in instance
    inst = cls(*args, **kwargs)
  File "/local_disk0/pythonVirtualEnvDirs/virtualEnv-8f2ea4ca-f011-4eb0-8d56-13e442652e87/lib/python3.7/site-packages/IPython/terminal/embed.py", line 159, in __init__
    super(InteractiveShellEmbed,self).__init__(**kw)
  File "/local_disk0/pythonVirtualEnvDirs/virtualEnv-8f2ea4ca-f011-4eb0-8d56-13e442652e87/lib/python3.7/site-packages/IPython/terminal/interactiveshell.py", line 519, in __init__
    super(TerminalInteractiveShell, self).__init__(*args, **kwargs)
  File "/local_disk0/pythonVirtualEnvDirs/virtualEnv-8f2ea4ca-f011-4eb0-8d56-13e442652e87/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 676, in __init__
    self.init_completer()
  File "/local_disk0/tmp/1600798981779-0/PythonShell.py", line 1003, in init_completer
    if self.use_jedi:
AttributeError: 'IPythonShell' object has no attribute 'use_jedi'

Strangely enough, the latest image of databricksruntime/gpu-pytorch:cuda10.1 on DockerHub (built 3 months ago) still seems to run fine in Databricks. I would like to append extra packages to the existing conda environment, so using FROM databricksruntime/gpu-pytorch:cuda10.1 in my Dockerfile is not really preferred. Also, I was able to build and run my images successfully up until now. Something has changed in how conda resolves the environment upon installation that breaks compatibility with Databricks notebooks. Does anyone know what the issue might be?

can't pull databricksruntime/python-conda:latest

in a docker file I have:

FROM databricksruntime/python-conda:latest

at the top of the docker file, but when building the image I get the following error:

[+] Building 2.8s (4/4) FINISHED => [internal] load build definition from Dockerfile 0.0s => => transferring dockerfile: 1.09kB 0.0s => [internal] load .dockerignore 0.0s => => transferring context: 2B 0.0s => ERROR [internal] load metadata for docker.io/databricksruntime/python-conda:latest 2.7s => [auth] databricksruntime/python-conda:pull token for registry-1.docker.io 0.0s [internal] load metadata for docker.io/databricksruntime/python-conda:latest: failed to solve with frontend dockerfile.v0: failed to create LLB definition: docker.io/databricksruntime/python-conda:latest: not found

any ideas how to solve this or why this is happening?

gpu-conda cuda 11 based dockerfile is broken

I am trying to build an image from the gpu-conda Dockerfile based on CUDA 11, and it seems to be broken. I get an error when it tries to run the Miniconda installation. See the attached screenshot (gpu-conda image build error); the Dockerfile I am trying to run (dockerfile-gpu-conda) is also attached.

Question about Containers

I want to test Databricks using a JDBC tool. Which container do I need?
How should I start it (to make the port reachable), and what are the username and password to connect?

Encoding issue on standard runtime 11.3 with unity catalog

Hi,

We ran into an encoding error:
java.nio.charset.MalformedInputException: Input length = 1

when running a SQL select from PySpark against a Unity Catalog table, on a new cluster with a base Docker image running the newest standard runtime (11.3).

When enforcing UTF8 encoding in the Dockerfile with
ENV JAVA_TOOL_OPTIONS="-Dfile.encoding=UTF8"
we were able to fix the problem.

I don't know where the encoding mismatch happens, but I think it might be a misconfiguration between the encoding of the parsed string from the jvm in the Docker image and the unity catalog.

Custom Jars - Databricks Docker Cluster

BASE IMAGE - databricksruntime/python:10.4-LTS

I successfully installed the Python dependencies, and the tasks that depend on python in the workflow run fine, but I’m struggling to install the Maven and Jar dependencies.

The jar files are in the docker image (databricks/jars) and are visible in the spark environment path when the cluster starts, but when I trigger the workflow, I see a “Java Package not callable error” since the script is unable to use the classes in the jar files.

Missing python

Steps to reproduce:

$ docker run -it databricksruntime/standard bash
root@87ddd3de3929:/#  python --version
bash: python: command not found
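
Python is present in the image but not on PATH; depending on the tag it lives in a conda environment or a virtualenv under /databricks. A quick way to check (the paths are assumptions based on this repository's Dockerfiles):

docker run -it databricksruntime/standard bash
# inside the container:
ls /databricks
/databricks/conda/bin/python --version    # conda-based tags
/databricks/python3/bin/python --version  # virtualenv-based tags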

How to prevent my python module being override by Databricks runtime?

I am trying to use my own version of scikit-learn by providing a custom Docker image.
But I found from the release notes that the Databricks runtime also includes the scikit-learn module by default (for example Databricks Runtime 7.1).

Will databricks runtime override my scikit-learn version when building the final image for docker container? If yes, is there a way to prevent it?

Thanks,
Haifeng

Custom Docker image returning: Spark error: Driver down cause: driver state change

I'm trying to build a custom 12.2 LTS Docker image to support PyTorch 2.0 / PyTorch Lightning. While I can create a valid Docker image, when I push this to Databricks, Spark errors out without any helpful error message or any logs to investigate.

As spark is injected by databricks in a non-transparent way, and the error messages are also not very helpful, I'm not able to get to the bottom of the issue.

Message
Failed to add 1 container to the cluster. Will attempt retry: true. Reason: Spark error
Help

Spark encountered an error on startup. This issue can be caused by invalid Spark configurations or malfunctioning [init scripts](https://docs.databricks.com/clusters/init-scripts.html#global-and-cluster-named-init-script-logs). Please refer to the Spark driver logs to troubleshoot this issue, and contact Databricks if the problem persists.
Internal error message: Spark error: Driver down cause: driver state change

Here is the Dockerfile, it mirrors the same approach as previous versions of GPU images.

Dockerfile


# Base image with CUDA support
FROM nvidia/cuda:11.6.1-base-ubuntu20.04 as cuda_base
# Disable NVIDIA repos to prevent accidental upgrades.
RUN cd /etc/apt/sources.list.d && \
    mv cuda.list cuda.list.disabled && \
    test -f nvidia-ml.list && mv nvidia-ml.list nvidia-ml.list.disabled || true

ENV DEBIAN_FRONTEND=noninteractive

# See https://github.com/databricks/containers/blob/master/ubuntu/minimal/Dockerfile
RUN apt-get update \
  && apt-get -y upgrade \
  && apt-get install --yes \
    openjdk-8-jdk \
    iproute2 \
    bash \
    sudo \
    coreutils \
    procps \
    gcc \
    g++ \
    wget \
    bzip2 \
    tar \
  && /var/lib/dpkg/info/ca-certificates-java.postinst configure \
  && apt-get clean \
  && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# Final image with conda environment and application
FROM cuda_base as final_image

ENV PATH /databricks/conda/bin:$PATH

RUN wget -q https://repo.continuum.io/miniconda/Miniconda3-py38_4.9.2-Linux-x86_64.sh -O miniconda.sh && \
    bash miniconda.sh -b -p /databricks/conda && \
    rm miniconda.sh && \
    # Source conda.sh for all login and interactive shells.
    ln -s /databricks/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh && \
    echo ". /etc/profile.d/conda.sh" >> ~/.bashrc && \
    # Set always_yes for non-interactive shells.
    conda config --system --set always_yes True && \
    conda clean --all

COPY docker/pytorch.yml /tmp/env.yml

RUN --mount=type=cache,target=/databricks/conda/pkgs conda env create --file /tmp/env.yml && \
    rm -f /tmp/env.yml && \
    rm -rf $HOME/.cache/pip/*

SHELL ["conda", "run", "-n", "pytorch", "/bin/bash", "-c"]

ARG MYPACKAGE_SRC_DIR=/opt/package-src

# Copy my package files
COPY . $MYPACKAGE_SRC_DIR

RUN pip install --no-deps $MYPACKAGE_SRC_DIR

# Set environment variables used by Databricks to decide which conda environment to activate by default.
ENV DATABRICKS_ROOT_CONDA_ENV=pytorch
ENV PYSPARK_PYTHON=/databricks/conda/bin/conda

and the corresponding env

pytorch.yaml

name: pytorch
channels:
  - defaults
  - conda-forge
dependencies:
- cudatoolkit=11
- ipykernel>=6.15.3  # required by Python notebooks
- ipython>=8.5.0  # required by Python notebooks
- numpy>=1.21.5  # required by PySpark MLlib
- pandas>=1.4.2  # required by PySpark
- pip>=21.2.4
- python=3.9.5
- six=1.16.0  # required by PySpark
- jedi>=0.18.1  # required by Python notebooks
- matplotlib>=3.5.1  # required by PySpark MLlib
- jinja2>=2.11.3  # required by PySpark
- traitlets>=5.1.1
- pytorch::pytorch=2.0.0
- pyarrow=7.0.0  # required by PySpark
- pytorch-lightning=2.0.1
- transformers=4.27.4
- wandb=0.14
- scikit-learn>=1.2.0

Could you help me understand the issue, or release a 12.2 LTS GPU image as you did for previous runtimes?

`NVTabular` support

Is NVTabular supported in any of the containers listed here? This is a wild shot, but it seems the NVTabular team references a container built by the RAPIDS team that can "easily" be extended to use NVTabular. So far I have not been able to successfully go through the steps described in the source below, but I came across a Databricks RAPIDS container and was wondering whether it would be possible to also support NVTabular, and/or whether anyone has tried that before.

Source: https://github.com/NVIDIA-Merlin/NVTabular/blob/main/docs/source/resources/cloud_integration.md#databricks

P.S. Once again, this is a wild shot.

Add conda to PATH variable in `standard` container

Hi there,

I did a little testing with the databricksruntime/standard container, and while I see that conda is installed, I am not able to call it. I had to create a Dockerfile with the following code

FROM databricksruntime/standard:latest

ENV PATH /databricks/conda/bin:$PATH

It seems a little silly to have conda installed but not accessible. Would it be possible to modify the databricksruntime/standard image (and other related images, e.g. python) to include the ENV PATH /databricks/conda/bin:$PATH line?

Internal error message: Spark failed to start: Timed out after 60 seconds

I created a custom Docker image that contains OpenCV 4, LightGBM, Torch, and a few other libraries compiled for CUDA on a g4dn.xlarge EC2 instance. The image is 17.1 GB according to docker image ls.
I am getting the following error message when starting a g4dn.xlarge cluster: Internal error message: Spark failed to start: Timed out after 60 seconds
This error does not occur if the image runs on a g4dn.2xlarge. There are no init scripts being used. Is there a Spark config or something else that needs to be set to use a g4dn.xlarge cluster without getting a timeout?

Runtime(s): 7.2 (error), 7.3 LTS (error)

some of databricks runtime libraries not present in databricks docker container

IIUC, Databricks runtime libraries are injected while creating the cluster. But some runtime libraries, such as seaborn and matplotlib listed in https://docs.databricks.com/release-notes/runtime/7.6.html, are not present in a Databricks cluster created using the Docker container.

env.yml

name: dcs-minimal
channels:
  - default
dependencies:
  - pip:
    - pyarrow==0.13.0
  - python=3.7.3
  - six=1.12.0
  - nomkl=3
  - ipython=7.4.0
  - numpy=1.16.2
  - pandas=0.24.2

I want to know how to get all the runtime libraries given in the link below.

https://docs.databricks.com/release-notes/runtime/7.6.html
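
As the docs excerpt quoted earlier on this page notes, only a minimal set of features is expected from a custom container, so libraries such as seaborn and matplotlib from the runtime release notes have to be baked into the image. A minimal sketch, assuming a conda-based image whose environment is named dcs-minimal (as in the env.yml above) with conda installed under /databricks/conda:

FROM databricksruntime/standard
# add the extra libraries from the runtime release notes to the image's own environment (sketch)
RUN /databricks/conda/bin/conda install -y -n dcs-minimal matplotlib seaborn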

R dockerfile fails

this dockerfile fails: https://github.com/databricks/containers/blob/master/ubuntu/R/Dockerfile

It expects the Ubuntu 16.04 version of the Databricks minimal base, but that base now contains the Ubuntu 18.04 version.

The logs I get:

The following packages have unmet dependencies:
r-base : Depends: r-base-core (>= 3.6.3-1xenial) but it is not going to be installed
Depends: r-recommended (= 3.6.3-1xenial) but it is not going to be installed
Recommends: r-base-html but it is not going to be installed
r-base-dev : Depends: r-base-core (>= 3.6.3-1xenial) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.
The command '/bin/sh -c apt-get update && apt-get install --yes software-properties-common apt-transport-https && gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9 && gpg -a --export E298A3A825C0D65DFD57CBB651716619E084DAB9 | sudo apt-key add - && add-apt-repository 'deb [arch=amd64,i386] https://cran.rstudio.com/bin/linux/ubuntu xenial-cran35/' && apt-get update && apt-get install --yes libssl-dev r-base r-base-dev && add-apt-repository -r 'deb [arch=amd64,i386]
##[error]The command '/bin/sh -c apt-get update && apt-get install --yes software-properties-common apt-transport-https && gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9 && gpg -a --export E298A3A825C0D65DFD57CBB651716619E084DAB9 | sudo apt-key add - && add-apt-repository 'deb [arch=amd64,i386] https://cran.rstudio.com/bin/linux/ubuntu xenial-cran35/' && apt-get update && apt-get install --yes libssl-dev r-base r-base-dev && add-apt-repository -r 'deb [arch=amd64,i386]
##[error]The process '/usr/bin/docker' failed with exit code 100
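
One possible fix, not taken from the issue and assuming the minimal base image is now Ubuntu 18.04 (bionic), is to point the CRAN apt repository at the bionic suite instead of the xenial one in the failing command:

# in ubuntu/R/Dockerfile, use the bionic CRAN suite when the base image is Ubuntu 18.04 (sketch)
RUN add-apt-repository 'deb [arch=amd64,i386] https://cran.rstudio.com/bin/linux/ubuntu bionic-cran35/'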

Support for arm64 docker images

Hi, I see on Dockerhub there's only support for amd64 architectures. Seeing that arm64 is becoming increasingly popular due to the M1 Macs it would be a great improvement if the Databricks runtime images supported multiple architectures. Is this something that will be available soon?

Python 3.7 support

Hey, we are currently experimenting with Databricks Container Services, and I've noticed that in the databricksruntime/standard image the system Python is 3.5, while the release notes for Databricks Runtime 6.1 list Python 3.7. Since container customization is a >=6.1 feature, we would expect all base Databricks images to have Python 3.7 by default.

Build for arm64

I don't think this would be too hard, and it would allow compatible images to be run both locally (e.g. in a connected development setup) and remotely.

Bug in conda version 4.5.12 leads to corrupt docker image if certain python packages are installed

Hi there,

the current python docker images use conda 4.5.12 as a base.
However, there's a major bug in this version that leads to an error in the conda list command if you've installed certain packages via pip.

I noticed this, when I added the sphinx-autoapi package to my databricks docker image.
I can replicate that installing this package leads to an error when starting a cluster with the associated docker image.

In the logs of the cluster, you can see that a conda list command is run during the startup of the cluster.
And due to the bug mentioned above, the list command raises an error, such that the cluster does not start successfully.

I've upgraded the conda version in the image manually by changing the version from 4.5.12 to 4.7.12 (fixes the bug mentioned above) and the issue is fixed.

However:
Is it safe to upgrade conda in the Docker image in order to get the conda bugfix in 4.7.12, or does this lead to problems (you probably chose this particular conda version on purpose)?
I haven't found any issues so far, but of course I didn't do proper and thorough testing.
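
A minimal sketch of the manual upgrade described above, assuming conda is installed at /databricks/conda as in this repository's Dockerfiles:

# upgrade conda in the base environment from 4.5.12 to 4.7.12 (the reporter's workaround, sketched)
RUN /databricks/conda/bin/conda install -y -n base conda=4.7.12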

Upgrade to Ubuntu 20.04

Is Databricks Runtime 9.x on Ubuntu 20.04 or 18.04?
I see 18.04 in the Dockerfiles, but internally people are saying it is 20.04

If it is 20.04, could the Dockerfiles be updated and rebuilt to reflect this? Otherwise, please let me know when 20.04 will be available.

Ubuntu Docker Image Vulnerabilities

Hello, I'm not sure if this is the right place for this, but there are 4 vulnerabilities in the underlying Ubuntu 18.04 image, or at least those 4 are getting flagged by our scanning tool. Could these be resolved, probably just by running an update on the images?

https://ubuntu.com/security/notices/USN-5189-1
https://ubuntu.com/security/notices/USN-5168-1
https://ubuntu.com/security/notices/USN-5133-1
https://launchpad.net/ubuntu/+source/python3.7/3.7.5-2~18.04.4

base-gpu container issue

I am starting a cluster with this container: databricksruntime/gpu-base:cuda11
It starts fine, but when I attach a notebook and run a cell, I get the error shown in the attached screenshot.

I also checked the driver log and see something like the attached screenshot; I am not sure if this is related, though.

Containers in Databricks Runtime 6

Hello, guys

There is something I could not figure out from testing these containers. I cloned the repository and built the databricks-standard image. I started a bash process inside a container using this image, and I could only find a /conda environment inside the /databricks folder (older versions used to have a /python folder inside, with a "pure" Python environment).

But that's ok for me, except that nothing that I installed in this conda environment is actually being used by the cluster. Even if I set the environment variable DATABRICKS_ROOT_CONDA_ENV to "databricks-minimal", or "databricks-standard", nothing happens.

I did run a /databricks/conda/bin/pip freeze inside a shell cell in a notebook, and the packages I installed are all there, but the cluster seems to be looking for the packages in /databricks/python environment (apparently not supported by the new standard container)

I thought the problem lay with the runtime I used (6.5), so I decided to try 6.5 ML (the one that should use a conda environment), but got an error saying that ML runtimes do not work with custom containers.

So, what am I doing wrong?

Use databricks container service for CI/CD

Hi,
we are thinking about using Container Services and generating the Docker images as the final artifacts, the same as is usually done in Kubernetes or any modern cloud facility supporting Docker.
For that, we create a Dockerfile that copies all of our generated JARs, including dependencies, into /databricks/jars, and we publish a Docker image to a container registry.
For example: our CI pipeline generates a software version 1.2.0-development.aabbbccdd that gets deployed as the Docker image with that version. All of our libraries and dependencies are in /databricks/jars.

If we create a cluster and set the docker container image of that specific version, we can run a scala notebook which has access to our classes, so this is great in terms of CI/CD.

However, we have a "problem". We are launching jobs from Azure Data Factory which, in turn, create Databricks jobs that will run on our cluster. The problem is that it seems mandatory to always add an "additional library". We understand that this requirement comes from the fact that this is emulating a spark-submit and we have to pass the JAR with the main class. However, in our containers the JARs are already inside and on the classpath.
It seems we can add any file (with any extension) as an additional library, as long as it exists in DBFS. So we pass a dummy JAR, and everything works.

We are worried about defining our CI/CD with something that may break in the future but we totally dislike the way databricks handles libraries and would prefer the docker approach.

So we have a few questions:

  • Have you had this problem before, and is it something that is being requested by customers?
  • Is there any alternative?
  • Do you suggest not using containers this way?

best regards
