
🧰 Multi-user development platform for machine learning teams. Simple to set up within minutes.

License: Apache License 2.0


ml-hub's Introduction

ML Hub

Multi-user hub which spawns, manages, and proxies multiple workspace instances.

Highlights • Getting Started • Features & Screenshots • Support • Report a Bug • Contribution

MLHub is based on JupyterHub, with a complete focus on Docker and Kubernetes. MLHub allows you to create and manage multiple workspaces, for example to distribute them to a group of people or within a team. The standard configuration allows a setup within seconds.

Highlights

  • 💫 Create, manage, and access Jupyter notebooks. Use it as an admin to distribute workspaces to other users, use it in self-service mode, or both.
  • 🖊️ Set configuration parameters such as CPU limits for started workspaces.
  • 🖥 Access additional tools within the started workspaces through secured routes.
  • 🎛 Tunnel SSH connections to workspace containers.
  • 🐳 Focused on Docker and Kubernetes with enhanced functionality.

Overview in a Nutshell

  • MLHub can be configured like JupyterHub with a normal JupyterHub configuration, with minor adjustments in the Kubernetes scenario.
  • The documentation provides an overview of how to use and configure it in Docker-local and Kubernetes mode.
  • More information about the Helm chart resources for Kubernetes can be found here.
  • We created two custom Spawners that are based on the official DockerSpawner and KubeSpawner and, hence, support their configurations set via the JupyterHub config.

Getting Started

Prerequisites

  • Docker
  • Kubernetes (for Kubernetes modes)
  • Helm (for easy deployment via our helm chart)

Most parts are identical to the configuration of JupyterHub 1.0.0. One difference is that SSL is not activated on the proxy or hub level, but on our nginx proxy.

Quick Start

The following command starts the hub with the default config.

Start an instance via Docker

docker run \
    -p 8080 \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v jupyterhub_data:/data \
    mltooling/ml-hub:latest

To persist the hub data, such as started workspaces and created users, mount a directory to /data. Any given name (--name) will be overruled by the environment variable HUB_NAME.

Start an instance via Kubernetes

Via Helm:

RELEASE=mlhub # change if needed
NAMESPACE=$RELEASE # change if needed

helm upgrade --install $RELEASE mlhub-chart-2.0.0.tgz --namespace $NAMESPACE

# In case you just want to use the templating mechanism of Helm without deploying tiller on your cluster
# 1. Use the "helm template ..." command. The template command also accepts flags such as --values and --set-file as described in the respective sections of this documentation.
# 2. kubectl apply -f templates/hub && kubectl apply -f templates/proxy

You can find the chart file attached to the release.

Configuration

Default Login

When using the default config, i.e. leaving the JupyterHub config c.Authenticator.admin_users as it is, a user named admin can access the hub with admin rights. If you use the default NativeAuthenticator as authenticator, you must first register the user admin with a password of your choice before logging in. If you use a different authenticator, you might want to set a different user as the initial admin user; for example, when using OAuth you want to set c.Authenticator.admin_users to a username returned by the OAuth login.

Environment Variables

MLHub is based on SSH Proxy. Check out SSH Proxy for ssh-related configurations. See the Configuration section for details on how to pass them, especially in the Kubernetes setup. Here are the additional environment variables for the hub:

HUB_NAME (default: mlhub)
In Docker-local mode, the container will be (re-)named based on the value of this environment variable. All resources created by the hub take this name into account; hence, you can have multiple hub instances running without naming conflicts. Further, the workspace containers connect to the hub not via its Docker id but via its Docker name. This way, the workspaces can still connect to the hub in case it was deleted and re-created (for example when the hub was updated). The value must be DNS compliant and must be between 1 and 5 characters long.

SSL_ENABLED (default: false)
Enable SSL. If you don't provide an SSL certificate as described in the section "Enable SSL/HTTPS", certificates are generated automatically. As this auto-generated certificate is not signed, you have to trust it in the browser. Without SSL enabled, SSH access won't work, as the container uses a single port and has to tell HTTPS and SSH traffic apart.

EXECUTION_MODE (default: local)
Defines the execution mode the hub runs in. The value is one of [local | k8s]. If you use the Helm chart, the value is already set to k8s.

DYNAMIC_WHITELIST_ENABLED (default: false)
Enables each Authenticator to use a file as a whitelist of usernames. The file must contain one whitelisted username per line and must be mounted to /resources/users/dynamic_whitelist.txt. The file can be modified dynamically; the c.Authenticator.whitelist configuration is not considered. If set to true but the file does not exist, the normal whitelist behavior of JupyterHub is used. Keep in mind that already logged-in users stay authenticated even if removed from the list; they just cannot log in again.

CLEANUP_INTERVAL_SECONDS (default: 3600)
Interval in which expired and unused resources are deleted. Set to -1 to disable the automatic cleanup. Currently disabled in Kubernetes. For more information, see the section Cleanup Service.
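The dynamic whitelist behavior described for DYNAMIC_WHITELIST_ENABLED can be pictured with a short sketch. This is a hypothetical helper, not MLHub's actual code; it re-reads the file on every check so that runtime edits take effect, and signals the JupyterHub fallback when the file is missing:

```python
import os

# Path stated in the docs above.
WHITELIST_PATH = "/resources/users/dynamic_whitelist.txt"

def is_whitelisted(username, path=WHITELIST_PATH):
    """Re-read the whitelist file on every call so runtime edits are picked up.

    Returns None if the file does not exist, signalling that the caller
    should fall back to JupyterHub's normal whitelist behavior.
    """
    if not os.path.exists(path):
        return None
    with open(path) as f:
        allowed = {line.strip() for line in f if line.strip()}
    return username in allowed
```

Note that, as stated above, removing a name from the file only prevents future logins; it does not end existing sessions.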

JupyterHub Config

JupyterHub and the used Spawner are configured via a config.py file as stated in the official documentation. In case of MLHub, a default config file is stored under /resources/jupyterhub_config.py. If you want to override settings or set extra ones, you can put another config file under /resources/jupyterhub_user_config.py.

Important: When setting properties for the Spawner, please use the general form c.Spawner. instead of c.DockerSpawner., c.KubeSpawner. etc. so that they are merged with default values accordingly.

Our custom Spawners support the additional configurations:

  • c.Spawner.workspace_images - set the images that appear in the dropdown menu when a new named server should be created, e.g. c.Spawner.workspace_images = [c.Spawner.image, "mltooling/ml-workspace-gpu:0.8.7", "mltooling/ml-workspace-r:0.8.7"]

The following settings should probably not be overridden:

  • c.Spawner.prefix and c.Spawner.name_template - if you change those, check whether your SSH environment variables permit those names as a target. Also, consider setting c.Authenticator.username_pattern to prevent a user from having a username that is also a valid container name.
  • If you override ip and port connection settings, make sure to use Docker images and an overall setup that can handle those.
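As a hedged illustration of the username_pattern idea (the exact pattern is up to you and not prescribed by MLHub), restricting usernames to lowercase alphanumerics keeps them from colliding with container-name conventions:

```python
import re

# Hypothetical example pattern: lowercase letters and digits only, 1-32 chars.
# In jupyterhub_user_config.py you would set:
#   c.Authenticator.username_pattern = USERNAME_PATTERN
USERNAME_PATTERN = r"[a-z0-9]{1,32}"

def username_ok(name):
    """Check a username the way JupyterHub applies username_pattern (full match)."""
    return re.fullmatch(USERNAME_PATTERN, name) is not None
```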

An example custom config file could look like this:

# jupyterhub_user_config.py
c.Spawner.environment = {"FOO": "BAR"}
c.Spawner.workspace_images = ["mltooling/ml-workspace-r:0.8.7"]

Docker-local

In Docker, mount a custom config like -v /path/to/jupyterhub_user_config.py:/resources/jupyterhub_user_config.py. Have a look at the DockerSpawner properties to see what can be configured.

Kubernetes

When using Helm, you can pass the configuration to the installation command via --set-file userConfig=./jupyterhub_user_config.py. So the complete command could look like helm upgrade --install mlhub mlhub-chart-1.0.1.tgz --namespace mlhub --set-file userConfig=./jupyterhub_user_config.py. Have a look at the KubeSpawner properties to see what can be configured for the Spawner.

In addition to the jupyterhub_user_config.py, which can be used to configure JupyterHub or the KubeSpawner, you can provide a config.yaml for Kubernetes-deployment-specific configuration. Check out the helmchart/ directory for more information.

You can think of it like this: everything that has to be configured for the deployment itself, such as environment variables or volumes for the hub / proxy itself, goes to the config.yaml. Everything related to JupyterHub's way of working such as how to authenticate or what the spawned user pods will mount goes to the jupyterhub_user_config.py.

โ„น๏ธ Some JupyterHub configurations cannot be set in the jupyterhub_user_config.py as they have to be shared between services and, thus, have to be known during deployment. Instead, if you want to specify them, you have to do it in the config.yaml (see below).

A config.yaml where you can set those values could look like the following:
mlhub:
  baseUrl: "/mlhub" # corresponds to c.JupyterHub.base_url
  debug: true # corresponds to c.JupyterHub.debug
  secretToken: <32 characters random string base64 encoded> # corresponds to c.JupyterHub.proxy_auth_token
  env: # used to set environment variables as described in the Section "Environment Variables"
    DYNAMIC_WHITELIST_ENABLED: true

You can pass the file via --values config.yaml. The complete command would look like helm upgrade --install mlhub mlhub-chart-1.0.1.tgz --namespace mlhub --values config.yaml. The --set-file userConfig=./jupyterhub_user_config.py flag can additionally be set. You can find the Helm chart resources, including the values file that contains the default values, in the helmchart/ directory.

Enable SSL/HTTPS

MLHub starts in HTTP mode by default. Note that in HTTP mode, the SSH tunnel feature does not work. You can activate SSL via the environment variable SSL_ENABLED. If you don't provide a certificate, one is generated during startup. This is needed to route SSH connections, as we use nginx to handle HTTPS & SSH on the same port.

Details

If you have your own certificate, mount the certificate and key files as cert.crt and cert.key, respectively, read-only at /resources/ssl, so that the container has access to /resources/ssl/cert.crt and /resources/ssl/cert.key.

Docker-local

For Docker, mount a volume at that path, e.g. -v my-ssl-files:/resources/ssl.

Kubernetes

For Kubernetes, add the following lines to the config.yaml file (based on the setup-manual-https configuration):

mlhub:
  env:
    SSL_ENABLED: true

proxy:
  https:
    hosts:
      - <your-domain-name>
    type: manual
    manual:
      key: |
        -----BEGIN RSA PRIVATE KEY-----
        ...
        -----END RSA PRIVATE KEY-----
      cert: |
        -----BEGIN CERTIFICATE-----
        ...
        -----END CERTIFICATE-----

If you use a (cloud provider) LoadBalancer in your cluster where SSL is already terminated, just do not enable SSL on the hub level and point the LoadBalancer at the hub's regular port.
If you do not have a certificate, for example from your cloud provider, have a look at the Let's Encrypt project for how to generate one. For that, your domain must be publicly reachable. This is not built into the MLHub project, but one idea would be to have a pod that creates & renews certificates for your domain, copies them into the proxy pod, and restarts nginx there.

Spawner

We override DockerSpawner and KubeSpawner for Docker and Kubernetes, respectively. We do so to add convenient labels and environment variables. Further, we return a custom options form to configure the resources of the workspaces. The overridden Spawners can be configured the same way as the base Spawners, as stated in the Configuration section.

All resources created by our custom spawners are labeled (Docker / Kubernetes labels) with the labels mlhub.origin set to the Hub name $ENV_HUB_NAME, mlhub.user set to the JupyterHub user the resources belongs to, and mlhub.server_name to the named server name. For example, if the hub name is "mlhub" and a user named "foo" has a named server "bar", the labels would be mlhub.origin=mlhub, mlhub.user=foo, mlhub.server_name=bar.
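The label scheme above can be reproduced with a small helper. This is a hypothetical sketch mirroring the naming stated in the text, not MLHub's actual code:

```python
import os

def mlhub_labels(user, server_name, hub_name=None):
    """Build the Docker/Kubernetes labels MLHub attaches to spawned resources."""
    # The hub name comes from the HUB_NAME environment variable (default: mlhub).
    hub_name = hub_name or os.environ.get("HUB_NAME", "mlhub")
    return {
        "mlhub.origin": hub_name,
        "mlhub.user": user,
        "mlhub.server_name": server_name,
    }
```

For the example in the text (hub "mlhub", user "foo", named server "bar"), this yields mlhub.origin=mlhub, mlhub.user=foo, mlhub.server_name=bar.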

DockerSpawner

  • We create a separate Docker network for each user, which means that (named) workspaces of the same user can see each other but workspaces of different users cannot see each other. Doing so adds another security layer in case a user starts a service within the own workspace and does not properly secure it.
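Judging from the hub's log output elsewhere in this document (e.g. "Create network mlhub-xxx001 with subnet 172.33.1.0/24"), the per-user network name combines the hub name and the username, with one /24 subnet per user network. A hedged sketch of that naming scheme (hypothetical helpers, inferred from the logs rather than from MLHub's source):

```python
def network_name(hub_name, username):
    # Matches log lines such as "Create network mlhub-xxx001 ...".
    return f"{hub_name}-{username}"

def user_subnet(index, base="172.33"):
    # Hypothetical allocator: one /24 per user network, as the logs suggest.
    return f"{base}.{index}.0/24"
```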

KubeSpawner

  • Create / delete services for a workspace, so that the hub can access them via Kubernetes DNS.

Support

The ML Hub project is maintained by @raethlein and @LukasMasuch. Please understand that we won't be able to provide individual support via email. We also believe that help is much more valuable if it's shared publicly so that more people can benefit from it.

Type Channel
🚨 Bug Reports
🎁 Feature Requests
👩‍💻 Usage Questions
🗯 General Discussion

Features

We have the following three scenarios in mind for the hub and want to point them out as a guideline. These scenarios are meant as inspiration and are based on the default configuration using native-authenticator as the hub authenticator. If you start the hub with a different authenticator or change other settings, you might want to or have to do things differently.

Scenarios

Multi-user hub without self-service

In this scenario, the idea is that just the admin user exists and can access the hub. The admin user then creates workspaces and distributes them to users.

Go to the admin panel (1) and create a new user (2). You can then start the standard workspace for that user or create a new workspace (see second image). Via the ssh access button (3), you can send the user a command to connect to the started workspace via SSH. For more information about the SSH feature in the workspace, check out this documentation section. If you created a workspace for another user, it might be necessary to click access on the workspace and authorize once per user before the ssh-access button can be used. A user can also access the UI by ssh-ing into the workspace, printing the API token via echo $JUPYTERHUB_API_TOKEN, and then opening the hub URL in the browser under /user/<username>/<workspace-name>/tree?token=<jupyterhub-api-token>. The JUPYTERHUB_API_TOKEN gives access to all named servers of a user, so use different users for different persons in this scenario.

โ„น๏ธ Do not create different workspaces for the same Hub user and then give access to them to different persons. Via the $JUPYTERHUB_API_TOKEN you get access to all workspaces of a user. In other words, if you create multiple named workspaces for the user 'admin' and distribute it to different persons, they can access all named workspaces for the 'admin' user.

Picture of admin panel


Multi-user hub with self-service

Also give non-admin users the permission to create named workspaces.

To give users access, the admin just has to authorize registered users.

Picture of admin panel

User hub

Users can login and get a default workspace. No additional workspaces can be created.

To let users log in and get a default workspace, but not let them create new servers, set the config option c.JupyterHub.allow_named_servers to False when starting the hub. Note that this also disables named servers for the admin. Currently, the workaround would be to run a second hub container just for the admin.
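In the jupyterhub_user_config.py override mechanism described earlier, this scenario is a one-line config fragment:

```python
# jupyterhub_user_config.py
c.JupyterHub.allow_named_servers = False
```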

Named Server Options Form

When named servers are allowed and the hub is started with the default config, you can create named servers. When doing so, you can set some configurations for the new workspace, such as resource limits or mounted GPUs. Mounting GPUs is currently not possible in Kubernetes mode. The "Days to live" flag is currently purely informational and can be seen in the admin view; it should help admins keep an overview of workspaces.

Picture of admin panel

Cleanup Service

JupyterHub was originally not created with Docker or Kubernetes in mind, which can result in unfavorable situations, such as containers being stopped but not deleted on the host. Furthermore, our custom spawners might create artifacts that should be cleaned up as well. MLHub contains a cleanup service that is started as a JupyterHub service inside the hub container, both in the Docker and the Kubernetes setup. It can be accessed as a REST API by an admin, and it is also triggered automatically at a regular interval unless disabled (see the config for CLEANUP_INTERVAL_SECONDS). The service enhances the JupyterHub functionality with regard to the Docker and Kubernetes world; "containers" is used interchangeably here for Docker containers and Kubernetes pods. The service has two endpoints, reachable under the hub service URL /services/cleanup-service/* with admin permissions.

  • GET /services/cleanup-service/users: This endpoint currently only has an effect in Docker-local mode. There, it checks for resources of deleted users (users who are no longer in the JupyterHub database) and deletes them. This includes containers, networks, and volumes. This is done by looking for labeled Docker resources that belong to containers started by the hub for those specific users.

  • GET /services/cleanup-service/expired: When starting a named workspace, an expiration date can be assigned to it. This endpoint deletes all containers that are expired. The respective named server is deleted from the JupyterHub database, and the Docker/Kubernetes resource is deleted as well.
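The expiration check behind the /expired endpoint can be sketched like this. This is a hypothetical helper; the 'expiration_timestamp' label name and the container-dict shape are assumptions for illustration, not MLHub's actual implementation:

```python
import time

def expired_containers(containers, now=None):
    """Return containers whose hypothetical 'expiration_timestamp' label lies
    in the past. `containers` is a list of dicts carrying a 'labels' dict,
    standing in for Docker containers or Kubernetes pods."""
    now = now if now is not None else time.time()
    expired = []
    for container in containers:
        ts = container.get("labels", {}).get("expiration_timestamp")
        if ts is not None and float(ts) < now:
            expired.append(container)
    return expired
```

Containers without an expiration label are never selected, matching the idea that the expiration date is optional when starting a named workspace.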

FAQ

How to change the logo shown in the webapp?

If you want to have your own logo in the corner, place it at /usr/local/share/jupyterhub/static/images/jupyter.png inside the hub container.

Do you have an example for Kubernetes?

The following setup is tested and should work. It uses AzureOAuth as the authenticator and has HTTPS enabled.

Command

helm upgrade \
    --install mlhub \
    mlhub-chart-2.0.0.tgz \
    --namespace mlhub \
    --values config.yaml \
    --set-file userConfig=./jupyterhub_user_config.py

Folder structure

 .
  /config.yaml
  /jupyterhub_user_config.py

config.yaml

mlhub:
  env:
    SSL_ENABLED: true
    AAD_TENANT_ID: "<azure-tenant-id>"

proxy:
  https:
    hosts:
      - mydomain.com
    type: manual
    manual:
      key: |
        -----BEGIN RSA PRIVATE KEY-----
        ...
        -----END RSA PRIVATE KEY-----
      cert: |
        -----BEGIN CERTIFICATE-----
        ...
        -----END CERTIFICATE-----

jupyterhub_user_config.py

import os
c.KubeSpawner.environment = {"FOO_TEST": "BAR_TEST"}

c.JupyterHub.authenticator_class = "oauthenticator.azuread.AzureAdOAuthenticator"
c.AzureAdOAuthenticator.oauth_callback_url = "https://mydomain.com:8080/hub/oauth_callback"
c.AzureAdOAuthenticator.client_id = "<id>"
c.AzureAdOAuthenticator.client_secret = "<secret>"
c.AzureAdOAuthenticator.admin_users = ["some-user"]
c.AzureAdOAuthenticator.tenant_id = os.environ.get('AAD_TENANT_ID')
What are the additional environment variables I have seen in the code?

Via the START_* environment variables you can define what is started within the container. This exists because the MLHub image is used in our Kubernetes setup for both the hub and the proxy container; we did not want to split those functionalities into different images for now. They are already configured in the provided Helm chart and, thus, do not have to be configured by you.

START_SSH (default: true)
Start the sshd process, which is used to tunnel SSH to the workspaces.

START_NGINX (default: true)
Whether or not to start the nginx proxy. If the hub should be used without additional tool routing to workspaces, this could be disabled; SSH port 22 would then need to be published separately. This option is built in to work with our Kubernetes Helm chart.

START_JHUB (default: true)
Start the JupyterHub hub.

START_CHP (default: false)
Start the JupyterHub proxy process separately. (The hub should then not start the proxy itself, which can be configured via the JupyterHub config file.) This option is built in to work with our Kubernetes Helm chart, where the image is also used as the configurable-http-proxy (CHP) image. Additional arguments for the CHP start command can be passed to the container via the environment variable ADDITIONAL_ARGS, e.g. --env ADDITIONAL_ARGS="--ip=0.0.0.0 --api-ip=0.0.0.0".

Contribution


Licensed Apache 2.0. Created and maintained with ❤️ by developers from SAP in Berlin.

ml-hub's People

Contributors

clementgautier, dino0633, lukasmasuch, raethlein, skurfuerst


ml-hub's Issues

Could not connect mlhub to the network / spawner fails

  • General

I'm trying to run ml-hub with ml-workspace but having problems starting the spawner. The hub is up and running, and logging in users (LDAP) works fine, but it fails when I start the spawner.

The setup is pretty much standard except that

  1. I'm using podman instead of docker (tried both running as root and rootless)
  2. I'm overriding the jupyterhub_config.py to be able to modify the function that renames the container (which does not work in podman)
  • Technical

Log output

Starting ML Hub
No certificate was provided for SSL/HTTPS.
Generate self-signed certificate for SSL/HTTPS.
Start SSH Daemon service
Start JupyterHub
Warning: If you want to use the SSH feature, you have to start the hub with ssl enabled.
Start nginx
[I 2021-11-02 06:10:54.127 JupyterHub app:2120] Using Authenticator: ldapauthenticator.ldapauthenticator.LDAPAuthenticator-1.3.2
[I 2021-11-02 06:10:54.128 JupyterHub app:2120] Using Spawner: mlhubspawner.mlhubspawner.MLHubDockerSpawner
[I 2021-11-02 06:10:54.136 JupyterHub app:1257] Loading cookie_secret from /data/jupyterhub_cookie_secret
[D 2021-11-02 06:10:54.148 JupyterHub app:1424] Connecting to db: sqlite:////data/jupyterhub.sqlite
[D 2021-11-02 06:10:54.184 JupyterHub orm:749] database schema version found: 4dc2d5a8c53c
[I 2021-11-02 06:10:54.198 JupyterHub proxy:460] Generating new CONFIGPROXY_AUTH_TOKEN
[I 2021-11-02 06:10:54.253 JupyterHub app:1563] Not using whitelist. Any authenticated user will be allowed.
[D 2021-11-02 06:10:54.368 JupyterHub app:1910] Loading state for xxx001 from db
[D 2021-11-02 06:10:54.369 JupyterHub app:1910] Loading state for admin from db
[D 2021-11-02 06:10:54.369 JupyterHub app:1926] Loaded users:
      xxx001 admin
       admin admin
[I 2021-11-02 06:10:54.382 JupyterHub app:2337] Hub API listening on http://0.0.0.0:8081/hub/
[I 2021-11-02 06:10:54.383 JupyterHub app:2339] Private Hub API connect url http://188f0675ac42:8081/hub/
[W 2021-11-02 06:10:54.400 JupyterHub proxy:642] Running JupyterHub without SSL.  I hope there is SSL termination happening somewhere else...
[I 2021-11-02 06:10:54.400 JupyterHub proxy:645] Starting proxy @ http://:8000
[D 2021-11-02 06:10:54.400 JupyterHub proxy:646] Proxy cmd: ['configurable-http-proxy', '--ip', '', '--port', '8000', '--api-ip', '127.0.0.1', '--api-port', '8001', '--error-target', 'http://188f0675ac42:8081/hub/error']
[D 2021-11-02 06:10:54.426 JupyterHub proxy:561] Writing proxy pid file: jupyterhub-proxy.pid
06:10:54.850 [ConfigProxy] info: Proxying http://*:8000 to (no default)
06:10:54.852 [ConfigProxy] info: Proxy API at http://127.0.0.1:8001/api/routes
[D 2021-11-02 06:10:55.331 JupyterHub proxy:681] Proxy started and appears to be up
[I 2021-11-02 06:10:55.331 JupyterHub app:2362] Starting managed service cleanup-service at http://127.0.0.1:9000
[I 2021-11-02 06:10:55.332 JupyterHub service:316] Starting service 'cleanup-service': ['/usr/bin/python3', '/resources/cleanup-service.py']
[I 2021-11-02 06:10:55.339 JupyterHub service:121] Spawning /usr/bin/python3 /resources/cleanup-service.py
[D 2021-11-02 06:10:55.351 JupyterHub spawner:1084] Polling subprocess every 30s
WARNING:tornado.access:404 GET /services/cleanup-service/ (127.0.0.1) 0.55ms
[D 2021-11-02 06:10:57.353 JupyterHub utils:218] Server at http://127.0.0.1:9000/services/cleanup-service/ responded with 404
[D 2021-11-02 06:10:57.353 JupyterHub proxy:314] Fetching routes to check
[D 2021-11-02 06:10:57.354 JupyterHub proxy:765] Proxy: Fetching GET http://127.0.0.1:8001/api/routes
[I 2021-11-02 06:10:57.361 JupyterHub proxy:319] Checking routes
[I 2021-11-02 06:10:57.361 JupyterHub proxy:399] Adding default route for Hub: / => http://188f0675ac42:8081
[W 2021-11-02 06:10:57.362 JupyterHub proxy:373] Adding missing route for cleanup-service (Server(url=http://127.0.0.1:9000/services/cleanup-service/, bind_url=http://127.0.0.1:9000/services/cleanup-service/))
[D 2021-11-02 06:10:57.363 JupyterHub proxy:765] Proxy: Fetching POST http://127.0.0.1:8001/api/routes/
[I 2021-11-02 06:10:57.364 JupyterHub proxy:242] Adding service cleanup-service to proxy /services/cleanup-service/ => http://127.0.0.1:9000
06:10:57.372 [ConfigProxy] info: 200 GET /api/routes
[D 2021-11-02 06:10:57.373 JupyterHub proxy:765] Proxy: Fetching POST http://127.0.0.1:8001/api/routes/services/cleanup-service
06:10:57.375 [ConfigProxy] info: Adding route / -> http://188f0675ac42:8081
06:10:57.376 [ConfigProxy] info: Route added / -> http://188f0675ac42:8081
06:10:57.376 [ConfigProxy] info: 201 POST /api/routes/
06:10:57.377 [ConfigProxy] info: Adding route /services/cleanup-service -> http://127.0.0.1:9000
06:10:57.377 [ConfigProxy] info: Route added /services/cleanup-service -> http://127.0.0.1:9000
06:10:57.377 [ConfigProxy] info: 201 POST /api/routes/services/cleanup-service
[I 2021-11-02 06:10:57.378 JupyterHub app:2422] JupyterHub is now running at http://:8000
[I 2021-11-02 06:11:16.425 JupyterHub log:174] 302 GET / -> /hub/ (@::ffff:10.162.2.18) 1.59ms
[D 2021-11-02 06:11:16.454 JupyterHub base:289] Refreshing auth for xxx001
[I 2021-11-02 06:11:16.455 JupyterHub log:174] 302 GET /hub/ -> /hub/home (xxx001@::ffff:10.162.2.18) 13.83ms
[D 2021-11-02 06:11:16.522 JupyterHub user:240] Creating <class 'mlhubspawner.mlhubspawner.MLHubDockerSpawner'> for xxx001:
[I 2021-11-02 06:11:17.062 JupyterHub log:174] 200 GET /hub/home (xxx001@::ffff:10.162.2.18) 577.46ms
[D 2021-11-02 06:11:17.206 JupyterHub log:174] 200 GET /hub/static/js/home.js?v=20211102061054 (@::ffff:10.162.2.18) 1.56ms
[I 2021-11-02 06:11:17.238 JupyterHub log:174] 200 GET /hub/api/users (xxx001@::ffff:10.162.2.18) 29.34ms
[D 2021-11-02 06:11:17.243 JupyterHub log:174] 200 GET /hub/static/js/jhapi.js?v=20211102061054 (@::ffff:10.162.2.18) 1.67ms
[D 2021-11-02 06:11:17.244 JupyterHub log:174] 200 GET /hub/static/js/utils.js?v=20211102061054 (@::ffff:10.162.2.18) 0.67ms
[D 2021-11-02 06:11:17.246 JupyterHub log:174] 200 GET /hub/static/components/moment/moment.js?v=20211102061054 (@::ffff:10.162.2.18) 6.70ms
[D 2021-11-02 06:11:20.565 JupyterHub log:174] 304 GET /hub/home (xxx001@::ffff:10.162.2.18) 64.74ms
[I 2021-11-02 06:11:20.697 JupyterHub log:174] 200 GET /hub/api/users (xxx001@::ffff:10.162.2.18) 41.45ms
[I 2021-11-02 06:11:23.437 JupyterHub login:43] User logged out: xxx001
[I 2021-11-02 06:11:23.480 JupyterHub log:174] 302 GET /hub/logout -> /hub/login (xxx001@::ffff:10.162.2.18) 54.64ms
[I 2021-11-02 06:11:23.515 JupyterHub log:174] 200 GET /hub/login (@::ffff:10.162.2.18) 6.75ms
[D 2021-11-02 06:11:33.389 JupyterHub ldapauthenticator:379] Attempting to bind xxx001 with CN=xxx001,OU=Users,OU=People,OU=SE,OU=***,DC=******,DC=***,DC=biz
[D 2021-11-02 06:11:34.079 JupyterHub ldapauthenticator:392] Status of user bind xxx001 with CN=xxx001,OU=Users,OU=People,OU=SE,OU=***,DC=******,DC=***,DC=biz : True
[D 2021-11-02 06:11:34.084 JupyterHub base:482] Setting cookie for xxx001: jupyterhub-services
[D 2021-11-02 06:11:34.084 JupyterHub base:478] Setting cookie jupyterhub-services: {'httponly': True, 'path': '/services'}
[D 2021-11-02 06:11:34.085 JupyterHub base:478] Setting cookie jupyterhub-session-id: {'httponly': True}
[D 2021-11-02 06:11:34.085 JupyterHub base:482] Setting cookie for xxx001: jupyterhub-hub-login
[D 2021-11-02 06:11:34.085 JupyterHub base:478] Setting cookie jupyterhub-hub-login: {'httponly': True, 'path': '/hub/'}
[I 2021-11-02 06:11:34.085 JupyterHub base:663] User logged in: xxx001
[I 2021-11-02 06:11:34.086 JupyterHub log:174] 302 POST /hub/login?next= -> /hub/home (xxx001@::ffff:10.162.2.18) 698.01ms
[D 2021-11-02 06:11:34.471 JupyterHub log:174] 304 GET /hub/home (xxx001@::ffff:10.162.2.18) 364.68ms
[I 2021-11-02 06:11:34.587 JupyterHub log:174] 200 GET /hub/api/users (xxx001@::ffff:10.162.2.18) 18.62ms
[D 2021-11-02 06:11:51.696 JupyterHub pages:165] Triggering spawn with default options for xxx001
[D 2021-11-02 06:11:51.696 JupyterHub base:780] Initiating spawn for xxx001
[D 2021-11-02 06:11:51.697 JupyterHub base:787] 0/100 concurrent spawns
[D 2021-11-02 06:11:51.697 JupyterHub base:792] 0 active servers
[D 2021-11-02 06:11:51.718 JupyterHub user:542] Calling Spawner.start for xxx001
[I 2021-11-02 06:11:51.933 JupyterHub mlhubspawner:283] Create network mlhub-xxx001 with subnet 172.33.1.0/24
[E 2021-11-02 06:11:51.949 JupyterHub mlhubspawner:298] Could not connect mlhub to the network and, thus, cannot create the container.
[D 2021-11-02 06:11:52.012 JupyterHub dockerspawner:811] Getting container 'ws-xxx001-mlhub'
[I 2021-11-02 06:11:52.016 JupyterHub dockerspawner:818] Container 'ws-xxx001-mlhub' is gone
[I 2021-11-02 06:11:52.056 JupyterHub mlhubspawner:264] Network mlhub-xxx001 already exists
[E 2021-11-02 06:11:52.060 JupyterHub mlhubspawner:298] Could not connect mlhub to the network and, thus, cannot create the container.
[D 2021-11-02 06:11:52.089 JupyterHub dockerspawner:907] Starting host with config: {'binds': {}, 'links': {}, 'network_mode': 'mlhub-xxx001'}
[I 2021-11-02 06:11:52.161 JupyterHub dockerspawner:1028] Created container ws-xxx001-mlhub (id: 609b2b5) from image localhost/dlab_mlworkspace
[I 2021-11-02 06:11:52.161 JupyterHub dockerspawner:1051] Starting container ws-xxx001-mlhub (id: 609b2b5)
[D 2021-11-02 06:11:52.680 JupyterHub spawner:1084] Polling subprocess every 30s
[I 2021-11-02 06:11:52.726 JupyterHub log:174] 302 GET /hub/spawn/xxx001 -> /hub/spawn-pending/xxx001 (xxx001@::ffff:10.162.2.18) 1035.21ms
[I 2021-11-02 06:11:52.831 JupyterHub pages:303] xxx001 is pending spawn
[I 2021-11-02 06:11:52.836 JupyterHub log:174] 200 GET /hub/spawn-pending/xxx001 (xxx001@::ffff:10.162.2.18) 74.55ms
[W 2021-11-02 06:11:56.881 JupyterHub utils:215] Server at http://172.33.1.2:8080/user/xxx001/ responded with error: 502
[W 2021-11-02 06:11:57.031 JupyterHub utils:215] Server at http://172.33.1.2:8080/user/xxx001/ responded with error: 502
[D 2021-11-02 06:11:57.385 JupyterHub app:1812] Managed service cleanup-service running at http://127.0.0.1:9000
[W 2021-11-02 06:12:01.478 JupyterHub utils:215] Server at http://172.33.1.2:8080/user/xxx001/ responded with error: 502
[D 2021-11-02 06:12:01.701 JupyterHub dockerspawner:811] Getting container 'ws-xxx001-mlhub'
[D 2021-11-02 06:12:01.715 JupyterHub dockerspawner:796] Container 609b2b5 status: {'Dead': False,
     'Error': '',
     'ExitCode': 0,
     'FinishedAt': '0001-01-01T00:00:00Z',
     'Health': {'FailingStreak': 0, 'Log': None, 'Status': ''},
     'OOMKilled': False,
     'Paused': False,
     'Pid': 17099,
     'Restarting': False,
     'Running': True,
     'StartedAt': '2021-11-02T06:11:52.563936822Z',
     'Status': 'running'}
[W 2021-11-02 06:12:01.715 JupyterHub base:932] User xxx001 is slow to become responsive (timeout=10)
[D 2021-11-02 06:12:01.715 JupyterHub base:937] Expecting server for xxx001 at: http://172.33.1.2:8080/user/xxx001/
[W 2021-11-02 06:12:06.486 JupyterHub utils:215] Server at http://172.33.1.2:8080/user/xxx001/ responded with error: 502
[W 2021-11-02 06:12:11.493 JupyterHub utils:215] Server at http://172.33.1.2:8080/user/xxx001/ responded with error: 502
[W 2021-11-02 06:12:16.498 JupyterHub utils:215] Server at http://172.33.1.2:8080/user/xxx001/ responded with error: 502
[W 2021-11-02 06:12:21.502 JupyterHub utils:215] Server at http://172.33.1.2:8080/user/xxx001/ responded with error: 502

Config

"""
Basic configuration file for jupyterhub.
"""

import os
import re
import signal
import socket
import sys

import docker.errors

import json

from traitlets.log import get_logger
logger = get_logger()

from mlhubspawner import utils
from subprocess import call

c = get_config()

c.Application.log_level = 'DEBUG'
c.Spawner.debug = True

# Override the JupyterHub `normalize_username` function to remove problematic characters from the username - independent of the authenticator used.
# E.g. when the username is "lastname, firstname" and the comma and whitespace are not removed, they are encoded by the browser, which can lead to broken URLs,
# especially for the tools part.
# Everybody who starts the hub can override this behavior the same way we do, in a mounted `jupyterhub_user_config.py` (Docker-local) or via the corresponding Helm chart value (Kubernetes).
from jupyterhub.auth import Authenticator
original_normalize_username = Authenticator.normalize_username
def custom_normalize_username(self, username):
    username = original_normalize_username(self, username)
    more_than_one_forbidden_char = False
    for forbidden_username_char in [" ", ",", ";", ".", "-", "@", "_"]:
        # Replace special characters with a non-special character. Cannot just be empty, like "", because then two distinct user names could collapse into the same name.
        # Example: "foo, bar" and "fo, obar" would both become "foobar".
        replace_char = "0"
        # If a forbidden character was already replaced, drop any further ones. Otherwise, "foo, bar" would become "foo00bar" instead of "foo0bar"
        if more_than_one_forbidden_char:
            replace_char = ""
        temp_username = username
        username = username.replace(forbidden_username_char, replace_char, 1)
        if username != temp_username:
            more_than_one_forbidden_char = True

    return username

Authenticator.normalize_username = custom_normalize_username

original_check_whitelist = Authenticator.check_whitelist
def dynamic_check_whitelist(self, username, authentication=None):
    dynamic_whitelist_file = "/resources/users/dynamic_whitelist.txt"

    if os.getenv("DYNAMIC_WHITELIST_ENABLED", "false") == "true":
        # TODO: create the file and warn the user that the user has to go into the hub pod and modify it there
        if not os.path.exists(dynamic_whitelist_file):
            logger.error("The dynamic whitelist has to be mounted to '{}'. Falling back to standard JupyterHub whitelist behavior.".format(dynamic_whitelist_file))
        else:
            with open(dynamic_whitelist_file, "r") as f:
                #whitelisted_users = f.readlines()
                # rstrip() will remove trailing whitespaces or newline characters
                whitelisted_users = [line.rstrip() for line in f]
                return username.lower() in whitelisted_users

    return original_check_whitelist(self, username, authentication)
Authenticator.check_whitelist = dynamic_check_whitelist

### Helper Functions ###

def get_or_init(config: object, config_type: type) -> object:
    if not isinstance(config, config_type):
        return config_type()
    return config

def combine_config_dicts(*configs) -> dict:
    combined_config = {}
    for config in configs:
        if not isinstance(config, dict):
            config = {}
        combined_config.update(config)
    return combined_config

### END HELPER FUNCTIONS###

ENV_NAME_HUB_NAME = 'HUB_NAME'
ENV_HUB_NAME = os.environ[ENV_NAME_HUB_NAME]
ENV_EXECUTION_MODE = os.environ[utils.ENV_NAME_EXECUTION_MODE]

# User containers will access hub by container name on the Docker network
c.JupyterHub.hub_ip = '0.0.0.0'
c.JupyterHub.port = 8000

# Persist hub data on volume mounted inside container
# TODO: should this really be persisted?
data_dir = os.environ.get('DATA_VOLUME_CONTAINER', '/data')
if not os.path.exists(data_dir):
    os.makedirs(data_dir)
c.JupyterHub.cookie_secret_file = os.path.join(data_dir, 'jupyterhub_cookie_secret')
c.JupyterHub.db_url = os.path.join(data_dir, 'jupyterhub.sqlite')
c.JupyterHub.admin_access = True
# prevents directly opening your workspace after login
c.JupyterHub.redirect_to_server = False
c.JupyterHub.allow_named_servers = True

c.Spawner.port = int(os.getenv("DEFAULT_WORKSPACE_PORT", 8080))

# Set default environment variables used by our ml-workspace container
default_env = {"AUTHENTICATE_VIA_JUPYTER": "true", "SHUTDOWN_INACTIVE_KERNELS": "true"}

# Workaround to prevent api problems
#c.Spawner.will_resume = True

# --- Spawner-specific ----
c.JupyterHub.spawner_class = 'mlhubspawner.MLHubDockerSpawner' # override in your config if you want to use a different spawner; if it is this one or inherits from it, the options below still apply


#c.Spawner.image = "mltooling/ml-workspace:0.8.7"
#c.Spawner.workspace_images = [c.Spawner.image, "mltooling/ml-workspace-gpu:0.8.7", "mltooling/ml-workspace-r:0.8.7", "mltooling/ml-workspace-spark:0.8.7"]

c.Spawner.image = 'localhost/dlab_mlworkspace'
c.Spawner.workspace_images = [c.Spawner.image]
c.Spawner.notebook_dir = '/workspace'

# Connect containers to this Docker network
c.Spawner.use_internal_ip = True

c.Spawner.prefix = 'ws'
c.Spawner.name_template = c.Spawner.prefix + '-{username}-' + ENV_HUB_NAME + '{servername}' # override in your config when you want a different name template

# Don't remove containers once they are stopped - persist state
c.Spawner.remove_containers = False

c.Spawner.start_timeout = 600 # should remove errors related to pulling Docker images (see https://github.com/jupyterhub/dockerspawner/issues/293)
c.Spawner.http_timeout = 120

# --- Authenticator ---
c.Authenticator.admin_users = {"xxx001"} # override in your config when needed, for example if you use a different authenticator (e.g. set GitHub usernames)
# Forbid user names that could collide with a named server to prevent security & routing problems
c.Authenticator.username_pattern = '^((?!-hub).)*$'

NATIVE_AUTHENTICATOR_CLASS = 'nativeauthenticator.NativeAuthenticator'
#c.JupyterHub.authenticator_class = NATIVE_AUTHENTICATOR_CLASS # override in your config if you want to use a different authenticator
c.JupyterHub.authenticator_class = 'ldapauthenticator.LDAPAuthenticator'
c.LDAPAuthenticator.server_address = 'ldap://***.***.**'
c.LDAPAuthenticator.lookup_dn = False
c.LDAPAuthenticator.bind_dn_template = [
    'CN={username},OU=Users,OU=People,OU=SE,OU=***,DC=******,DC=***,DC=biz',
    'CN={username},OU=Impersonal Accounts,OU=Admin,OU=Central,OU=***,DC=******,DC=***,DC=biz',
]


# --- Load user config ---
# Allow passing an additional config upon mlhub container startup.
# Enables the user to override all configurations occurring above the load_subconfig command; be careful to not break anything ;)
# An empty config file already exists in case the user does not mount another config file.
# The extra config could look like:
    # jupyterhub_user_config.py
    # > c = get_config()
    # > c.DockerSpawner.extra_create_kwargs.update({'labels': {'foo': 'bar'}})
# See https://traitlets.readthedocs.io/en/stable/config.html#configuration-files-inheritance
load_subconfig("{}/jupyterhub_user_config.py".format(os.getenv("_RESOURCES_PATH")))
c.Spawner.environment = get_or_init(c.Spawner.environment, dict)
c.Spawner.environment.update(default_env)

service_environment = {
    ENV_NAME_HUB_NAME: ENV_HUB_NAME,
    utils.ENV_NAME_EXECUTION_MODE: ENV_EXECUTION_MODE,
    utils.ENV_NAME_CLEANUP_INTERVAL_SECONDS: os.getenv(utils.ENV_NAME_CLEANUP_INTERVAL_SECONDS),
}


# shm_size can only be set for Docker, not Kubernetes (see https://stackoverflow.com/questions/43373463/how-to-increase-shm-size-of-a-kubernetes-contai$
#c.Spawner.extra_host_config = { 'shm_size': '256m' }

client_kwargs = {**get_or_init(c.Spawner.client_kwargs, dict)}
tls_config = {**get_or_init(c.Spawner.tls_config, dict)}

#docker_client = utils.init_docker_client(client_kwargs, tls_config)
#try:
#    container = docker_client.containers.list(filters={"id": socket.gethostname()})[0]
#    if container.name.lower() != ENV_HUB_NAME.lower():
#        container.rename(ENV_HUB_NAME.lower())
#except docker.errors.APIError as e:
#    logger.error("Could not correctly start MLHub container. " + str(e))
#    os.kill(os.getpid(), signal.SIGTERM)

# For cleanup-service
service_environment.update({"DOCKER_CLIENT_KWARGS": json.dumps(client_kwargs), "DOCKER_TLS_CONFIG": json.dumps(tls_config)})
service_host = "127.0.0.1"

# Consider the case where the user-config contains c.DockerSpawner.environment instead of c.Spawner.environment
# c.DockerSpawner.environment = get_or_init(c.DockerSpawner.environment, dict)
# c.Spawner.environment.update(c.DockerSpawner.environment)
#c.MLHubDockerSpawner.hub_name = ENV_HUB_NAME

# Add nativeauthenticator-specific templates
if c.JupyterHub.authenticator_class == NATIVE_AUTHENTICATOR_CLASS:
    import nativeauthenticator
    # if template_paths is not set yet in user_config, it is of type traitlets.config.loader.LazyConfigValue; in other words, it was not initialized yet
    c.JupyterHub.template_paths = get_or_init(c.JupyterHub.template_paths, list)
    # if not isinstance(c.JupyterHub.template_paths, list):
    #     c.JupyterHub.template_paths = []
    c.JupyterHub.template_paths.append("{}/templates/".format(os.path.dirname(nativeauthenticator.__file__)))

# TODO: add env variable to readme
if (os.getenv("IS_CLEANUP_SERVICE_ENABLED", "true") == "true"):
    c.JupyterHub.services = [
        {
            'name': 'cleanup-service',
            'admin': True,
            'url': 'http://{}:9000'.format(service_host),
            'environment': service_environment,
            'command': [sys.executable, '/resources/cleanup-service.py']
        }
    ]
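
The username normalization in the config above is easiest to understand with a concrete run. Below is a standalone re-implementation of the replacement loop from `custom_normalize_username` (an illustrative sketch only, detached from the `Authenticator` class), showing how collision-prone names stay distinct:

```python
# Standalone sketch of the replacement loop in custom_normalize_username.
def normalize(username: str) -> str:
    seen_forbidden_char = False
    for forbidden_char in [" ", ",", ";", ".", "-", "@", "_"]:
        # The first forbidden character becomes "0" so that distinct names
        # such as "foo, bar" and "fo, obar" cannot collapse into the same
        # result; any further forbidden characters are simply dropped.
        replace_char = "" if seen_forbidden_char else "0"
        replaced = username.replace(forbidden_char, replace_char, 1)
        if replaced != username:
            seen_forbidden_char = True
        username = replaced
    return username

print(normalize("foo, bar"))  # -> foo0bar
print(normalize("fo, obar"))  # -> fo0obar
```

Note that, like the original, each forbidden character is replaced at most once per name.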



  • Hub version mltooling/ml-workspace-minimal:0.13.2, mltooling/ml-hub:1.0.0
  • Docker version: podman 3.2.3
  • Host Machine OS (Windows/Linux/Mac): Linux x86
  • Browser (Chrome/Firefox/Safari): Chrome
  • Command used to start the hub:
podman run \
        -p 8000:8000 \
        --name mlhub \
        -e SERVICE_ACCOUNT_USERNAME=$SERVICE_ACCOUNT_USERNAME \
        -e SERVICE_ACCOUNT_PASSWORD=$SERVICE_ACCOUNT_PASSWORD \
        -v /workspace:/data \
        -v /run/podman/podman.sock:/var/run/docker.sock \
        -v ./jupyterhub_config.py:/resources/jupyterhub_config.py \
        localhost/dlab_mlhub

Readiness probe failed when hub pod created

Describe the issue:

An error occurred while deploying mlhub using the Amazon EKS service.

When I installed mlhub on EKS, two pods were created, but one pod never reached the Ready status, so mlhub was unavailable.

Input
kubectl --namespace=default get pod

Output

NAME                     READY   STATUS    RESTARTS   AGE
hub-6b567dbfd8-6k2tn     0/1     Running   0          3h53m
proxy-84f5d55b94-4d9dz   1/1     Running   0          3h53m

When I check the detailed status of the pod hub-6b567dbfd8-6k2tn, I get the following error:

Input

kubectl describe pods hub-6b567dbfd8-6k2tn

Output

Name:         hub-6b567dbfd8-6k2tn
Namespace:    default
Priority:     0
Node:         ip-192-168-10-148.ap-northeast-1.compute.internal/192.168.10.148
Start Time:   Tue, 17 Aug 2021 11:19:23 +0800
Labels:       app=mlhub
              component=hub
              hub.jupyter.org/network-access-proxy-api=true
              hub.jupyter.org/network-access-proxy-http=true
              hub.jupyter.org/network-access-singleuser=true
              pod-template-hash=6b567dbfd8
              release=mlhub
Annotations:  checksum/config-map: d8905435057c013d65412bb01dd6e6ab73a266eabdfd847bf3d85954c013f48c
              checksum/secret: a95bc8602c127abbfd6a9b08d89835f1c10c4e99e3f95af2688dc4e9e4f07ef6
              kubernetes.io/psp: eks.privileged
Status:       Running
IP:           192.168.12.34
IPs:
  IP:           192.168.12.34
Controlled By:  ReplicaSet/hub-6b567dbfd8
Containers:
  hub:
    Container ID:   docker://47056a6bd6af45b9e117b2325de6510bb64ecb8c83a87b246320e2a3faab7193
    Image:          mltooling/ml-hub:1.0.0
    Image ID:       docker-pullable://mltooling/ml-hub@sha256:71fd4787ba74cd0ae5b6d127abf3d8d817f61e4016ee78d17ba6e6ec70b30aec
    Ports:          8081/TCP, 22/TCP
    Host Ports:     0/TCP, 0/TCP
    State:          Running
      Started:      Tue, 17 Aug 2021 11:19:47 +0800
    Ready:          False
    Restart Count:  0
    Requests:
      cpu:      200m
      memory:   512Mi
    Readiness:  http-get http://:hub/mlhub/hub/health delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      ADDITIONAL_ARGS:            --config /resources/jupyterhub_config.py --debug
      START_NGINX:                false
      EXECUTION_MODE:             k8s
      PYTHONUNBUFFERED:           1
      HELM_RELEASE_NAME:          mlhub
      POD_NAMESPACE:              default (v1:metadata.namespace)
      CONFIGPROXY_AUTH_TOKEN:     <set to the key 'proxy.token' in secret 'hub-secret'>  Optional: false
      DYNAMIC_WHITELIST_ENABLED:  true
    Mounts:
      /etc/jupyterhub/config/ from config (rw)
      /etc/jupyterhub/secret/ from secret (rw)
      /resources/jupyterhub_user_config.py from user-config (rw,path="jupyterhub_user_config.py")
      /var/run/secrets/kubernetes.io/serviceaccount from hub-token-lbwq8 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      hub-config
    Optional:  false
  secret:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  hub-secret
    Optional:    false
  user-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      hub-user-config
    Optional:  false
  hub-token-lbwq8:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  hub-token-lbwq8
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                     From     Message
  ----     ------     ----                    ----     -------
  Warning  Unhealthy  68s (x1348 over 3h45m)  kubelet  Readiness probe failed: Get http://192.168.12.34:8081/mlhub/hub/health:  dial tcp 192.168.12.34:8081: connect: connection refused

Technical details:

  • Hub version : 1.0.0
  • Docker version : 19.3.13
  • Host Machine OS (Windows/Linux/Mac): Amazon linux 2
  • Command used to start the hub :
  • Browser (Chrome/Firefox/Safari):Chrome

Adding support for R in ml-hub

As a data scientist, I use both R and Python and would really appreciate it if R (say, version 4.0.3) were part of ML-Hub. Adding R support could be optional (something like docker build . --build-arg R_SUPPORT=true). R can be installed, for instance, using apt-get.

custom image not working

Describe the issue:
I tried to spawn the custom image jupyter/scipy-notebook, but it failed.

After clicking spawn in the above page, a new server is starting up.
However, it failed with a message "Spawn failed: Server at http://192.168.47.61:8080/mlhub/user/admin/customserver/ didn't respond in 120 seconds".


Technical details:

  • Hub version :
  • Docker version :
  • Host Machine OS (Windows/Linux/Mac):
  • Command used to start the hub :
  • Browser (Chrome/Firefox/Safari):

Upgrade to adopt the latest JupyterHub features

Feature description:

Upgrade the helm chart to adopt the latest features that JupyterHub offers, such as Customizing User Storage.

Problem and motivation:

This will ensure that the latest JupyterHub features are always available in ml-hub.

Is this something you're interested in working on?

Yes, I'd love to. But I need some guidance on where to start.

Compile to ARM64 arch

Feature description:
Compile Docker images for the ARM64 arch.

Problem and motivation:

I want to do machine learning on my Raspberry Pi cluster.

Is this something you're interested in working on?

Yes why not =)

Using --hostname <domain> argument of docker/docker-compose leads to config error

Describe the bug:

Using the --hostname argument of docker/docker-compose will lead to the following error

[E 2020-06-16 16:33:50.626 JupyterHub app:2718]
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/jupyterhub/app.py", line 2715, in launch_instance_async
    await self.initialize(argv)
  File "/usr/local/lib/python3.6/dist-packages/jupyterhub/app.py", line 2238, in initialize
    self.load_config_file(self.config_file)
  File "<decorator-gen-5>", line 2, in load_config_file
  File "/usr/local/lib/python3.6/dist-packages/traitlets/config/application.py", line 87, in catch_config_error
    return method(app, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/traitlets/config/application.py", line 602, in load_config_file
    raise_config_file_errors=self.raise_config_file_errors,
  File "/usr/local/lib/python3.6/dist-packages/traitlets/config/application.py", line 563, in _load_config_files
    config = loader.load_config()
  File "/usr/local/lib/python3.6/dist-packages/traitlets/config/loader.py", line 457, in load_config
    self._read_file_as_dict()
  File "/usr/local/lib/python3.6/dist-packages/traitlets/config/loader.py", line 489, in _read_file_as_dict
    py3compat.execfile(conf_filename, namespace)
  File "/usr/local/lib/python3.6/dist-packages/ipython_genutils/py3compat.py", line 198, in execfile
    exec(compiler(f.read(), fname, 'exec'), glob, loc)
  File "/resources/jupyterhub_config.py", line 188, in <module>
    container = docker_client.containers.list(filters={"id": socket.gethostname()})[0]
IndexError: list index out of range

Without setting hostname and without error:

# docker exec -it mlhub /bin/bash
root@4f1f287080b5:/# python
Python 3.6.8 (default, Jan 14 2019, 11:02:34)
[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket
>>> socket.gethostname()
'4f1f287080b5' 

With setting hostname and raising the error above:

# docker exec -it mlhub /bin/bash
root@hub:/# python
Python 3.6.8 (default, Jan 14 2019, 11:02:34)
[GCC 8.0.1 20180414 (experimental) [trunk revision 259383]] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket
>>> socket.gethostname()
'hub.local.domain.de' 

In the first case the hostname equals the (short) container ID, so the filter matches and the resulting list contains an element. In the second case the hostname differs from the container ID, so the resulting list is empty and indexing its first element raises the error above.

I guess this approach is meant to identify the current jupyterhub container?
I don't completely understand its purpose.

Expected behaviour:

I would expect the jupyterhub config to execute successfully regardless of whether an additional hostname is set.

Reproduce the Bug:

Start the ml-hub container with hostname option:

docker run --rm --network rp_backend --hostname hub.local.domain.de -v /var/run/docker.sock:/var/run/docker.sock --name mlhub  ml_hub:latest 

or set hostname: field in docker-compose:

Technical details:

  • Hub version : latest
  • Docker version : 18.09.1
  • Host Machine OS (Windows/Linux/Mac): Debian 4.19.118-2 (2020-04-29)
  • Command used to start the hub : docker run --rm --network rp_backend --hostname hub.local.domain.de -v /var/run/docker.sock:/var/run/docker.sock --name mlhub ml_hub:latest
  • Browser (Chrome/Firefox/Safari): Firefox 77.01

Possible Fix:

  1. Potential Option: Ignore the block completely
docker_client = utils.init_docker_client(client_kwargs, tls_config)
try:
        pass
        #container = docker_client.containers.list(filters={"id": socket.gethostname()})[0]
        #container_name = socket.gethostname()
        #if container_name.lower() != ENV_HUB_NAME.lower():
        #    container.rename(ENV_HUB_NAME.lower())
except docker.errors.APIError as e:
        logger.error("Could not correctly start MLHub container. " + str(e))
        os.kill(os.getpid(), signal.SIGTERM)
  2. Potential Option: Use another method of getting the container ID from inside the container, like reading and grepping the /proc/... info
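
The second option (deriving the container ID from /proc instead of the hostname) might look like the sketch below. The cgroup path format is an assumption that holds on cgroup-v1 hosts; on cgroup v2, or outside a container, the lookup fails and we fall back to the hostname, which matches today's behavior:

```python
import re
import socket

def current_container_id() -> str:
    """Best-effort container ID lookup (illustrative sketch).

    On cgroup-v1 hosts the container ID usually appears as a 64-character
    hex string in /proc/self/cgroup. If it cannot be found (cgroup v2,
    or not running in a container), fall back to the hostname, which is
    what the current config effectively relies on.
    """
    try:
        with open("/proc/self/cgroup") as f:
            for line in f:
                match = re.search(r"([0-9a-f]{64})", line)
                if match:
                    return match.group(1)
    except OSError:
        pass
    return socket.gethostname()

print(current_container_id())
```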

Docker 19.03 GPU support

Describe the issue:
Docker 19.03 and nvidia-docker changed how GPUs are bound to a container. Instead of calling the runtime, all you need to do is pass --gpus 2 to docker run. Would it be possible to support this? My systems are running v19.03. Thanks!

Technical details:

  • Hub version : 1.0.0
  • Docker version : 19.03.10
  • Host Machine OS (Windows/Linux/Mac): Linux
  • Command used to start the hub : sudo docker run --rm -p 8095:8080 -v /var/run/docker.sock:/var/run/docker.sock mltooling/ml-hub:latest
  • Browser (Chrome/Firefox/Safari): Chromium
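
For reference, the --gpus flag is expressed in the Docker Engine API as a DeviceRequests entry in the container's HostConfig. Below is a hedged sketch of what a spawner's extra_host_config might carry, using plain dicts so nothing requires a running daemon; whether dockerspawner forwards device_requests depends on the docker-py version in use:

```python
# Sketch only: the structure Docker's HostConfig expects for `--gpus 2`.
# With docker-py >= 4.3 this could also be built via docker.types.DeviceRequest.
gpu_request = {
    "Driver": "nvidia",
    "Count": 2,                 # -1 would mean "all GPUs"
    "Capabilities": [["gpu"]],
}

# In a jupyterhub_config.py this might then be passed through the spawner:
# c.DockerSpawner.extra_host_config = {"device_requests": [gpu_request]}
extra_host_config = {"device_requests": [gpu_request]}

print(extra_host_config["device_requests"][0]["Count"])  # -> 2
```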

JupyterHub 2.0

Feature description:

There is a new version of JupyterHub: JupyterHub 2.0

Problem and motivation:

There is a new kind of permission control via scopes. Further information in their blog.

Is this something you're interested in working on?

No


Helm chart compatibility with Kubernetes 1.22

Feature description:

The current chart version uses a deprecated API for the Ingress definition that is no longer available on recent versions of Kubernetes.

Problem and motivation:

extensions/v1beta1 has been deprecated since 1.16 and will be removed in 1.22. I suggest using the GA version networking.k8s.io/v1, which has been available since 1.19.

Since this is a breaking change for older versions, I think it's necessary to bump the major version for that.

Is this something you're interested in working on?

Yes, pull request coming very soon

Can not install the k8s-hub

I install k8s-hub via helm upgrade --install mlhub mlhub-chart-1.0.1.tgz --namespace mlhub --set-file userConfig=./jupyterhub_user_config.py, but it fails. Could you update the helm resource, or is there any other way to solve this?

Hub and Proxy running but getting 502 Bad gateway

Describe the bug:

It looks like the proxy sometimes loses its connection to the hub, and we need to kill the proxy pod to force its recreation.

Expected behaviour:

I expect the application to be reachable even after a restart of the hub

Steps to reproduce the issue:

  1. helm install
  2. kubectl delete pod hub-***
  3. You should see 502 Bad Gateway even after the hub pod is shown as running.

EDIT: rebooting the node seems to be the only way to reproduce the issue consistently

Possible Fix:

A quick fix might be to put liveness probes on the proxy pods to ensure the connection still exists, but there might be a better fix.

Using newer version of vs code server

Thank you very much for this nice work! I am very impressed!
I just have one little wish: could you update the VS Code server to the newest version? Somehow I can't update the version in the container via the GUI; it throws an error. The easiest solution would be to use the newest VS Code server version.

Unfortunately I don't have the skill to fix this inside your code myself. But I guess it would be one line of code for you.

Thank you and best regards.

Payment gateway like stripe

Hi,

Hope you are all well !

I would like to create a payment gateway to deploy workspaces. Is that possible?

I found a repository (https://github.com/hssomel/KubeML) for deploying helm charts with a stripe payment gateway in node.js but I was looking for something more integrated to ml-hub for instance.

Do you have any advice on how I could add this payment workflow and deployment with ml-hub ?

Cheers,
X

add authentication for private docker repositories

Feature description:

Add environment variables for authentication against a private docker repository.

Problem and motivation:

We customize the workspace image by installing internal libraries. This image is saved in a private, secured docker repository. Right now I have to manually pull new images after running docker login.

Is this something you're interested in working on?

I could not immediately find a place in the code where this change could easily be made.
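
One possible direction (a sketch, not an existing ml-hub feature): read registry credentials from environment variables and log in via docker-py before images are pulled. The variable names below are hypothetical:

```python
import os

# Hypothetical environment variables for registry credentials;
# the names are illustrative, not part of ml-hub.
registry_auth = {
    "username": os.environ.get("WORKSPACE_REGISTRY_USER", ""),
    "password": os.environ.get("WORKSPACE_REGISTRY_PASSWORD", ""),
    "registry": os.environ.get("WORKSPACE_REGISTRY_URL", ""),
}

# In a jupyterhub_config.py one could then (sketch, requires docker-py
# and a reachable daemon, so left commented out here):
# import docker
# client = docker.from_env()
# if registry_auth["username"]:
#     client.login(**registry_auth)

print(sorted(registry_auth))
```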

Volume Mounting

Feature description:
The ability to mount system folders into spawned containers with the DockerSpawner. Either a predefined list of mounts or a line to specify in the spawner configuration interface.

Problem and motivation:
It would be extremely useful for us to be able to mount external volumes into spawned containers. Our system has multiple NFS volumes mounted on the local filesystem where our data is stored. We would like to be able to mount those into spawned containers. This is useful when the data you want to work on is stored somewhere else on your system.

Is this something you're interested in working on?
Maybe, not much free time available currently
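
DockerSpawner already exposes a volumes trait mapping host paths to container paths, so a minimal sketch of the requested behavior could look as follows (the NFS paths are hypothetical placeholders):

```python
# Hypothetical NFS mounts on the host; adjust the paths to your system.
nfs_mounts = ["/mnt/nfs/datasets", "/mnt/nfs/shared"]

# Build the host-path -> container-path mapping that DockerSpawner's
# `volumes` trait expects, e.g. set in jupyterhub_config.py as:
# c.DockerSpawner.volumes = volumes
volumes = {path: "/workspace" + path[len("/mnt/nfs"):] for path in nfs_mounts}

print(volumes)
```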

endless redirect loop from /hub/user/[username] to /user/[username]

Describe the bug:

I'm simply running ML Hub locally using

docker run \
    -p 8080 \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v jupyterhub_data:/data \
    mltooling/ml-hub:latest

After I create and start a workspace server and wait until it is running, I get a redirect loop between /hub/user/[username] and /user/[username].

Expected behaviour:

No redirect loop :)

Technical details:

  • Hub version :
  • Host Machine OS (Windows/Linux/Mac): OSX
  • Browser (Chrome/Firefox/Safari): Firefox

mlhubspawner uses subnets from the internet

To avoid conflicts with local docker networks, the code uses the subnets from 172.33.0.0 upwards.
These are public IP ranges. The networks only exist locally on each host, but there may be an issue when a spawned container wants to contact a service/IP/resource that resides in the same created subnet. Even though this case may not be very likely, it can theoretically occur.

# we create networks in the range of 172.33-255.0.0/24
# Docker by default uses the range 172.17-32.0.0, so we should be safe using that range
INITIAL_CIDR_FIRST_OCTET = 172
INITIAL_CIDR_SECOND_OCTET = 33
INITIAL_CIDR = "{}.{}.0.0/24".format(INITIAL_CIDR_FIRST_OCTET, INITIAL_CIDR_SECOND_OCTET)

I would recommend using another private subnet range, as specified by https://tools.ietf.org/html/rfc1918 Chapter 3. Since the 192.168.x ranges are often used in private networks and only offer a small number of subnets, I would use a 10.x.x.x range.
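
The recommended 10.x.x.x approach can be sketched with the standard-library ipaddress module, carving one /24 per workspace network out of an RFC 1918 pool (the concrete pool below is an arbitrary example, not what the spawner does today):

```python
import ipaddress

# Carve successive /24 networks out of 10.32.0.0/16, an arbitrary slice of
# the RFC 1918 10.0.0.0/8 range; pick one that is unused on your hosts.
POOL = ipaddress.ip_network("10.32.0.0/16")

def nth_subnet(n: int) -> ipaddress.IPv4Network:
    """Return the n-th /24 inside the pool (n starts at 0)."""
    return list(POOL.subnets(new_prefix=24))[n]

print(nth_subnet(0))  # -> 10.32.0.0/24
print(nth_subnet(5))  # -> 10.32.5.0/24
```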

Support GPUs on multiple machines (via docker-swarm or kubernetes)?

Feature description:

Support docker-swarm (with GPUs support) out-of-the-box.

Problem and motivation:

As described here, it is currently not possible to run ml-hub with GPU support across multiple machines (while each machine may have one or more GPU cards). Since it is not easy to build a Kubernetes cluster with GPU support and management (and I'm not familiar with Kubernetes), maybe a more lightweight solution (like docker-swarm?) would support it more seamlessly (via nvidia-docker).

Is this something you're interested in working on?

Yes

1.0.0 release does not have helm chart attached :)

Hey,

as written in the README, the release should have the helm chart attached - but I have not found it so far.

I'd be glad for a quick pointer where the helm chart could be.

All the best :)
Sebastian

failed to create admin account

Describe the bug:

I'd like to create a hub on a remote server. mlhub is running, but I failed to create any account and got this error:

Expected behaviour:

I'd like to sign up a user called admin.
No error in the server log.

Steps to reproduce the issue:

  1. create empty folder /data/ml-hub-data
  2. run the docker image via docker run -p 8849:8080 -v /var/run/docker.sock:/var/run/docker.sock -v /data/ml-hub-data:/data mltooling/ml-hub:0.1.10
  3. access http://host_ip:8849 to get login page.
  4. click sign up, then type in admin for username and adminadmin for password.
  5. when click the 'Sign up' button, I got server logs below:
[I 2019-12-20 16:02:09.767 JupyterHub log:174] 200 GET /hub/signup (@::ffff:127.0.0.1) 51.35ms
ERROR:asyncio:Future exception was never retrieved
future: <Future finished exception=TypeError('authenticate() takes 3 positional arguments but 4 were given',)>
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tornado/gen.py", line 307, in wrapper
    result = func(*args, **kwargs)
  File "/usr/lib/python3.6/types.py", line 248, in wrapped
    coro = func(*args, **kwargs)
TypeError: authenticate() takes 3 positional arguments but 4 were given
  6. Failed to log in with user admin.
  7. Tried:
  • creating a linux user admin inside and outside the container, and setting a proper password
  • setting PAMAuthenticator inside the ml-hub container, then re-trying linux user creation

Technical details:

  • Hub version ml-hub:0.1.10
  • Docker version :
Client: Docker Engine - Community
 Version:           19.03.5
 API version:       1.40
 Go version:        go1.12.12
 Git commit:        633a0ea
 Built:             Wed Nov 13 07:25:41 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.5
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.12
  Git commit:       633a0ea
  Built:            Wed Nov 13 07:24:18 2019
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.10
  GitCommit:        b34a5c8af56e510852c35414db4c1f4fa6172339
 runc:
  Version:          1.0.0-rc8+dev
  GitCommit:        3e425f80a8c931f88e6d94a8c831b9d5aa481657
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683
  • Host Machine OS (Windows/Linux/Mac): Linux 3.10.0 CentOS 7 (
    Linux version 3.10.0-1062.4.1.el7.x86_64 ([email protected]) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-39) (GCC) ))
  • Command used to start the hub: docker run -p 8849:8080 -v /var/run/docker.sock:/var/run/docker.sock -v /data/ml-hub-data:/data mltooling/ml-hub:0.1.10
  • Browser (Chrome/Firefox/Safari): Chrome

Possible Fix:

The error TypeError: authenticate() takes 3 positional arguments but 4 were given may indicate an inconsistency between the tornado version and the authenticator version.

Additional context:

Here is the tornado version on host machine via conda list | grep tornado:
tornado 4.5.3 pypi_0 pypi

But I guess this shouldn't matter inside a Docker container.
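To illustrate the arity mismatch from the possible fix above, here is a minimal, stdlib-only sketch. BaseAuthenticator is a stand-in for JupyterHub's Authenticator calling convention, not ml-hub's actual code: the hub invokes authenticate(handler, data), i.e. three positionals including self, so any version mismatch that changes the expected parameter count produces exactly this kind of TypeError.

```python
class BaseAuthenticator:
    """Stand-in for JupyterHub's Authenticator calling convention."""

    def get_authenticated_user(self, handler, data):
        # JupyterHub passes (handler, data): 3 positionals including self.
        return self.authenticate(handler, data)


class MatchingAuthenticator(BaseAuthenticator):
    def authenticate(self, handler, data):  # expected arity
        return data.get("username")


class MismatchedAuthenticator(BaseAuthenticator):
    def authenticate(self, handler):  # one parameter short
        return None


print(MatchingAuthenticator().get_authenticated_user(None, {"username": "admin"}))

try:
    MismatchedAuthenticator().get_authenticated_user(None, {"username": "admin"})
except TypeError as err:
    # message like: "authenticate() takes 2 positional arguments but 3 were given"
    print(err)
```

The fix is therefore to align the signature (or the library versions) so caller and authenticator agree on the parameter count.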

Supporting `SystemUserSpawner` and using the `--user $UID:$GID` flags

Feature description:
Broadly: Support for PAMAuthenticator, SystemUserSpawner, and --user $UID:$GID flags.

Tying these together, this would allow ml-hub to take advantage of local system users. The primary benefit of this is that in a setting where each user can log in and spin up their own ml-workspace, they now have a way to tie into their home directory on the host file-system. This allows for a single-location, transportable configuration across multiple workspaces, in the cases where a workspace is used as a "project sandbox" (if you will).

Problem and motivation:

  • Why is this change important to you? I've been using ml-hub for a bit and it's great, but I (and other users on my system) find that we're setting up our shell configurations (and cloning projects) quite a bit.
  • What is the problem this feature would solve?
    1. Transporting user configurations and credentials (e.g. ECDSA keys) between workspaces.
    2. Allowing ml-hub to work with local datasets (e.g. for someone working on YouTube-8M, it's difficult to re-download the entire dataset in a reasonable timeframe).
    3. Users with a file-sync service running on the host can have their changes reflected from ml-hub.
    4. Work within ml-workspaces is more transparently accessible.
  • How would you use it? Personally, all the problems this solves are exactly what I'm looking for. While it's challenging to do things like mount datasets directly, I could solve that with some hard-linking. Though this brings to mind another possible feature for admins of ml-hub โ€“ specify dataset repositories.
  • How can it benefit other users? I'm not too sure how it would benefit other users, explicitly, but I have a general feeling that once ml-hub supports local user mappings, and if there's a way to port this to singularity, HPCs could be interested in using this along with some smaller teams of ML researchers/developers.

Is this something you're interested in working on?
Yea! I was planning to do some digging later this week to figure out how challenging an implementation would be.
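The host-user mapping this feature request describes could look roughly like the following docker invocation. This is a hypothetical config fragment, not a supported ml-hub flag: it sketches running a workspace container as the invoking host user so files created in the mounted home directory keep the correct ownership.

```shell
# Hypothetical sketch (assumed image name; not ml-hub's actual behavior):
# run the workspace as the current host UID:GID and mount the user's home,
# so shell config, SSH keys, and cloned projects survive across workspaces.
docker run -d \
  --user "$(id -u):$(id -g)" \
  -v "$HOME":/home/"$USER" \
  mltooling/ml-workspace:latest
```

A SystemUserSpawner-based implementation would presumably set the equivalent of --user and the home-directory bind mount per spawned container.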

Upstream Error when running via docker

Hi,

Thanks for this awesome tooling-set! Really cool.
I'm running into an issue trying to launch ml-hub via docker.

Expected behaviour:

ML-Hub launches and is able to accept connections.

Steps to reproduce the issue:

From https://github.com/ml-tooling/ml-workspace#multi-user-setup

Run the following docker command:

docker run -p 8080:8080 --name mlhub -v /var/run/docker.sock:/var/run/docker.sock mltooling/ml-hub:latest

Technical details:

  • Hub version: latest
  • Docker version Docker version 19.03.5, build 633a0ea
  • Host Machine OS (Windows/Linux/Mac): Mac
  • Command used to start the hub: docker run -p 8080:8080 --name mlhub -v /var/run/docker.sock:/var/run/docker.sock mltooling/ml-hub:latest
  • Browser (Chrome/Firefox/Safari): Chrome

Possible Fix:
N/A

Additional context:

Output from docker:

Start JupyterHub
2019/11/29 23:12:28 [error] 736#0: *4 connect() failed (111: Connection refused) while connecting to upstream, client: 127.0.0.1, server: , request: "GET / HTTP/1.1", upstream: "http://127.0.0.1:8000/", host: "localhost:8080"
2019/11/29 23:12:28 [warn] 736#0: *4 upstream server temporarily disabled while connecting to upstream, client: 127.0.0.1, server: , request: "GET / HTTP/1.1", upstream: "http://127.0.0.1:8000/", host: "localhost:8080"
2019/11/29 23:12:28 [error] 736#0: *4 connect() failed (111: Connection refused) while connecting to upstream, client: 127.0.0.1, server: , request: "GET / HTTP/1.1", upstream: "http://127.0.0.1:8000/", host: "localhost:8080"
2019/11/29 23:12:28 [warn] 736#0: *4 upstream server temporarily disabled while connecting to upstream, client: 127.0.0.1, server: , request: "GET / HTTP/1.1", upstream: "http://127.0.0.1:8000/", host: "localhost:8080"
2019/11/29 23:12:28 [error] 736#0: *4 open() "/resources/5xx.html" failed (2: No such file or directory), client: 127.0.0.1, server: , request: "GET / HTTP/1.1", upstream: "http://127.0.0.1:8000/", host: "localhost:8080"

Docker images not up to date?

Describe the issue:

The images of ml-hub on Docker Hub are rather outdated and don't include changes that were made quite a while ago (#17 for instance). Can you tag a version that includes those changes? Thanks!

Helm chart configuration change doesn't trigger a pod restart

Describe the bug:

Changing the jupyterhub_config.py user ConfigMap doesn't restart the hub pod, so the config change isn't applied.

Expected behaviour:

Changing the userConfig value should be taken into account during helm upgrade.

Steps to reproduce the issue:

  1. helm install with a userConfig
  2. helm upgrade with another version of the userConfig
  3. See that the configuration isn't applied.

Possible Fix:

Adding a checksum annotation to the deployment template should fix the issue.
I'll provide a pull request ASAP.
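The checksum-annotation approach is a standard Helm pattern. The following is a hedged sketch (the template path and value names are assumptions, not necessarily ml-hub's actual chart layout): hashing the rendered ConfigMap into a pod annotation changes the pod template on every config change, which forces Kubernetes to roll the deployment during helm upgrade.

```yaml
# Config fragment for the hub Deployment template (assumed file names):
spec:
  template:
    metadata:
      annotations:
        checksum/config: {{ include (print $.Template.BasePath "/configmap.yaml") . | sha256sum }}
```

If the ConfigMap is unchanged, the hash (and thus the pod spec) stays the same and no restart is triggered, so upgrades remain cheap.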
