gesiscss / persistent_binderhub

A Helm chart repo to install persistent BinderHub

Home Page: https://gesiscss.github.io/persistent_binderhub/

License: BSD 3-Clause "New" or "Revised" License

Python 39.98% HTML 32.47% Smarty 3.76% CSS 4.66% Dockerfile 0.68% Jupyter Notebook 4.65% JavaScript 13.79%
binderhub persistent-storage helm-chart binder jupyterhub kubernetes

persistent_binderhub's Introduction

This repository is no longer actively maintained!

However, the development of this functionality continues in cooperation with 2i2c.org. For more information, see our blog post:

👉 https://2i2c.org/blog/2022/gesis-2i2c-collaboration-update/

Persistent BinderHub

This is a Helm chart to install a persistent BinderHub. It only configures and extends the BinderHub chart to add persistent storage; it doesn't define any new components. Therefore, before using this chart, you should read through the BinderHub documentation, know how to deploy a standard BinderHub, and be familiar with enabling authentication in BinderHub.

Prerequisites

First of all, create a config.yaml file. Everything explained here goes into that file, which is then used for the installation. Before the installation, User storage and Authentication have to be configured.

User storage

To offer persistent storage to users, your Kubernetes cluster needs a storage class that dynamically provisions persistent volumes. Please follow the user storage documentation of the JupyterHub chart for more information.

Note that any configuration for the JupyterHub chart goes under binderhub.jupyterhub in the config.yaml that you created to install this chart. For example, if you want to specify the storage class, add the following to your config.yaml:

binderhub:
  jupyterhub:
    singleuser:
      storage:
        dynamic:
          storageClass: <storageclass-name>
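
The nesting rule can be illustrated with a small Python sketch. The helper name nest_under_binderhub and the storage class name "standard" are assumptions made for this example; neither comes from the chart. The sketch only shows where values written for the plain JupyterHub chart end up in config.yaml.

```python
def nest_under_binderhub(jupyterhub_values):
    """Wrap JupyterHub chart values under binderhub.jupyterhub (hypothetical helper)."""
    return {"binderhub": {"jupyterhub": jupyterhub_values}}


# Values as they would be written for the plain JupyterHub chart.
# "standard" is a placeholder storage class name.
jupyterhub_values = {
    "singleuser": {"storage": {"dynamic": {"storageClass": "standard"}}}
}

config = nest_under_binderhub(jupyterhub_values)
storage = config["binderhub"]["jupyterhub"]["singleuser"]["storage"]
print(storage["dynamic"]["storageClass"])
```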

Authentication

This chart already includes some of the changes required for enabling authentication, but some pieces have to be configured manually. In your config.yaml:

  1. You have to set oauth_client_id:
binderhub:
  jupyterhub:
    hub:
      services:
        binder:
          # this is the default value
          oauth_client_id: "binder-oauth-client-test"
  2. You have to add the config of your authenticator under binderhub.jupyterhub.hub.config. For more information you can check the authentication guide.
binderhub:
  jupyterhub:
    hub:
      config:
        JupyterHub:
          authenticator_class: dummy

Note that by default the authenticator is DummyAuthenticator and it is recommended to use it only for testing purposes.

Installing the chart

You can find the list of charts here: https://gesiscss.github.io/persistent_binderhub/

The installation consists of two steps. First we install the chart; then we finalize the configuration of the Binder service and upgrade the chart to apply the final changes in the config.

To install the chart with the release name pbhub into namespace pbhub-ns:

# add the persistent_binderhub helm chart repo
helm repo add persistent_binderhub https://gesiscss.github.io/persistent_binderhub/
# update charts
helm repo update

# you can change release name and namespace as you want
RELEASENAME=pbhub
NAMESPACE=pbhub-ns
kubectl create namespace $NAMESPACE
helm upgrade $RELEASENAME persistent_binderhub/persistent_binderhub \
             --version=0.2.0-n919 \
             --install --namespace=$NAMESPACE \
             --debug \
             -f config.yaml

After the first step, run kubectl get service proxy-public --namespace=$NAMESPACE and note down the IP address under EXTERNAL-IP, which is the IP of the JupyterHub. Then run kubectl get service binder --namespace=$NAMESPACE and again note down the IP address under EXTERNAL-IP, which is the IP of the Binder service.

With the IP addresses you just acquired, update your config.yaml:

binderhub:
  jupyterhub:
    hub:
      services:
        binder:
          # where binder runs
          url: "http://<Binder_IP>"
          # when url is set, binder can be reached through JupyterHub
          oauth_redirect_uri: "http://<JupyterHub_IP>/services/binder/oauth_callback"
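
How the two values are derived from the two IP addresses can be sketched in Python. The function name binder_service_urls is hypothetical, and the 203.0.113.x addresses are documentation placeholders standing in for the EXTERNAL-IP values you noted down.

```python
def binder_service_urls(binder_ip, hub_ip):
    """Build the values for binderhub.jupyterhub.hub.services.binder (hypothetical helper)."""
    return {
        # where binder runs
        "url": f"http://{binder_ip}",
        # the OAuth callback is served through JupyterHub, so it uses the hub IP
        "oauth_redirect_uri": f"http://{hub_ip}/services/binder/oauth_callback",
    }


# Placeholder addresses; replace them with the EXTERNAL-IP values of your services.
values = binder_service_urls(binder_ip="203.0.113.10", hub_ip="203.0.113.20")
for key, value in values.items():
    print(f'{key}: "{value}"')
```

Note that url points at the Binder service while oauth_redirect_uri points at the JupyterHub; mixing up the two IPs is an easy mistake to make.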

Finally upgrade the chart to apply this change:

helm upgrade $RELEASENAME persistent_binderhub/persistent_binderhub \
             --version=0.2.0-n919 \
             --install --namespace=$NAMESPACE \
             --debug \
             -f config.yaml

When the installation is done, the persistent BinderHub will be available at "http://<JupyterHub_IP>". There, on the JupyterHub home page, you will see a BinderHub UI customized for persistence, which is where users will mostly interact with the system. The standard BinderHub will be available at "http://<JupyterHub_IP>/services/binder" as a service of JupyterHub.

Known issues

  1. If you don't know the URL of the JupyterHub (binderhub.config.BinderHub.hub_url) and it is not set during the first step of the installation, you will get an error similar to:
    Error: render error in "persistent_binderhub/charts/binderhub/templates/deployment.yaml": template: persistent_binderhub/charts/binderhub/templates/deployment.yaml:98:74: executing "persistent_binderhub/charts/binderhub/templates/deployment.yaml" at <"/">: invalid value; expected string
    To fix it, use a dummy value for hub_url, e.g. "http://127.0.0.1", and replace it with the correct URL of the hub after the first step.
    GitHub issue: #5
    Potential fix: jupyterhub/binderhub#1139
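
The workaround can be sketched as a small Python check on the config before the first install. This is only an illustration on plain dictionaries: the helper name ensure_hub_url is hypothetical, and the sketch does not read or write config.yaml itself.

```python
PLACEHOLDER_HUB_URL = "http://127.0.0.1"  # dummy value suggested above


def ensure_hub_url(config):
    """Set binderhub.config.BinderHub.hub_url to a placeholder if it is missing or empty."""
    binderhub_cfg = (
        config.setdefault("binderhub", {})
        .setdefault("config", {})
        .setdefault("BinderHub", {})
    )
    if not binderhub_cfg.get("hub_url"):
        binderhub_cfg["hub_url"] = PLACEHOLDER_HUB_URL
    return config


# First step: hub_url is unknown, so the placeholder is filled in.
first_step = ensure_hub_url({"binderhub": {"jupyterhub": {}}})

# Second step: the real URL (a placeholder address here) is left untouched.
second_step = ensure_hub_url(
    {"binderhub": {"config": {"BinderHub": {"hub_url": "http://203.0.113.20"}}}}
)
print(first_step["binderhub"]["config"]["BinderHub"]["hub_url"])
print(second_step["binderhub"]["config"]["BinderHub"]["hub_url"])
```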

Uninstalling the chart

# to delete the Helm release (Helm 2 syntax; with Helm 3, run: helm uninstall $RELEASENAME --namespace=$NAMESPACE)
helm delete $RELEASENAME --purge
# to delete the Kubernetes namespace
kubectl delete namespace $NAMESPACE

Customization

As mentioned before, this chart extends the BinderHub chart to add persistence. To do that, it also uses extraConfig from the JupyterHub and BinderHub charts. When using the persistent BinderHub chart, you should use other names for your own extraConfig entries, unless you want to overwrite the defaults of this chart and you know what you are doing. Here is the list of extraConfig entries used:

  • binderhub.extraConfig:

    • 20-launcher
    • 10-repo-providers
  • binderhub.jupyterhub.hub.extraConfig:

    • 20-template-variables
    • 10-project-api
    • 00-binder

For more information check values.yaml.

BinderHub customization

For anything you want to customize in the BinderHub chart, refer to the BinderHub documentation. The only thing you have to pay attention to is that these configs go under binderhub in your config.yaml. For example, if you want to use another version of repo2docker to build repos, add the following to your config.yaml:

binderhub:
  config:
    BinderHub:
      build_image: quay.io/jupyterhub/repo2docker:2021.08.0-8.gf1d01b6

Note: repo2docker:2021.08.0-8.gf1d01b6 is the repo2docker version used in this chart.

Default project

A project is simply a binder-ready repository that you launch in a persistent BinderHub.

By default, the default project is this repo itself (check the .binder folder, which contains the intro_to_persistent_binderhub notebook).

Assuming that you have a binder-ready repo with the following information

  • repo url: https://github.com/user_name/repo_name
  • branch or tag or commit: ref

you can set it as default project by adding the following into your config.yaml:

binderhub:
  jupyterhub:
    custom:
      default_project:
        repo_url: "https://github.com/user_name/repo_name"
        ref: "ref"

Warning: Default project must be a binder-ready repo, e.g. https://github.com/binder-examples/requirements.

Projects limit per user

This is the number of projects that can be stored concurrently per user. The default is 5. For example, to increase it to 10, add the following to your config.yaml:

binderhub:
  extraEnv:
    - name: PROJECTS_LIMIT_PER_USER
      value: "10" # change this value as you wish
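
The value is quoted because Kubernetes environment variable values must be strings. As a rough sketch of how such a variable could be consumed on the Python side, the snippet below parses it with a fallback to the default of 5. This is not the chart's actual code: the function name projects_limit and the fallback behaviour for invalid values are assumptions.

```python
import os

DEFAULT_PROJECTS_LIMIT = 5  # default stated in this README


def projects_limit(environ=os.environ):
    """Read PROJECTS_LIMIT_PER_USER, falling back to the default on bad input (assumed behaviour)."""
    raw = environ.get("PROJECTS_LIMIT_PER_USER", "")
    try:
        limit = int(raw)
    except ValueError:
        # unset, empty, or non-numeric values fall back to the default
        return DEFAULT_PROJECTS_LIMIT
    return limit if limit > 0 else DEFAULT_PROJECTS_LIMIT


print(projects_limit({"PROJECTS_LIMIT_PER_USER": "10"}))
print(projects_limit({}))
```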

Repo providers

Only the following repo providers are supported in this chart:

  • GitHubRepoProvider
  • GistRepoProvider
  • GitLabRepoProvider
  • GitRepoProvider

Other providers (ZenodoProvider, FigshareProvider, HydroshareProvider, DataverseProvider) are currently not supported. If you enable them, persistent BinderHub is not going to work as expected for these providers.

For example, if you want to only enable GitHubRepoProvider and GistRepoProvider, add the following into your config.yaml:

binderhub:
  extraConfig:
    10-repo-providers: |
      from binderhub.repoproviders import GitHubRepoProvider, GistRepoProvider
      c.BinderHub.repo_providers = {
          'gh': GitHubRepoProvider,
          'gist': GistRepoProvider,
      }
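
Before deploying, it can help to check a planned repo_providers mapping against the supported list above. The following is a minimal sketch that compares provider class names as plain strings, so it runs without BinderHub installed. The helper name unsupported_providers and the "zenodo" prefix are chosen for illustration only.

```python
# Provider class names listed as supported in this README.
SUPPORTED_PROVIDERS = {
    "GitHubRepoProvider",
    "GistRepoProvider",
    "GitLabRepoProvider",
    "GitRepoProvider",
}


def unsupported_providers(repo_providers):
    """Return the prefixes whose provider class name is not in the supported set."""
    return sorted(
        prefix
        for prefix, provider_name in repo_providers.items()
        if provider_name not in SUPPORTED_PROVIDERS
    )


# Example mapping of URL prefix to provider class name (illustrative values).
configured = {
    "gh": "GitHubRepoProvider",
    "gist": "GistRepoProvider",
    "zenodo": "ZenodoProvider",
}
print(unsupported_providers(configured))  # prints ['zenodo']
```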

Spawner

This chart uses the PersistentBinderSpawner. If you want to customize it, you can subclass it in extraConfig. For example:

binderhub:
  jupyterhub:
    hub:
      extraConfig:
        00-binder: |
          from persistent_bhub_config import PersistentBinderSpawner
          class MyPersistentBinderSpawner(PersistentBinderSpawner):
              ...
          c.JupyterHub.spawner_class = MyPersistentBinderSpawner

Local development

In the local/minikube folder you can find instructions and a configuration file for installing this chart in minikube.

Migrating from JupyterHub chart

Be aware that this has not been widely tested, but it should be safe to migrate from the JupyterHub chart to this chart.

Here are the differences compared to a fresh installation of this chart:

  • after migration, existing users will initially have no default project
  • files of existing users won't be copied anywhere; existing users can find them under the /projects directory and should manage them manually via a terminal

Limitations

  1. The number of Binder pods (binderhub.replicas) must be 1; otherwise there are authentication errors (jupyterhub/jupyterhub#2841).

Funded by the German Research Foundation (DFG). FKZ/project number: 324867496.

persistent_binderhub's People

Contributors

arnim, bitnik, ctr26, g-braeunlich, larsbonczek, mriduls

persistent_binderhub's Issues

PodUnschedulable Cannot schedule pods: pod has unbound immediate PersistentVolumeClaims.

Hi all, I'm running into the following error while launching a repo. Any ideas on how to resolve this? Thanks
Repo: https://github.com/binder-examples/r
My config.yaml:

binderhub:
  config:
    BinderHub:
      hub_url: http://<hubip>
      use_registry: true
      image_prefix: <dockerusername/binderhub->

  jupyterhub:
    hub:
      services:
        binder:
          # where binder runs
          url: "http://<binderip>"        
          apiToken: <token>
          oauth_redirect_uri: "http://<hubip>/services/binder/oauth_callback"
          oauth_client_id: "binder-oauth-client-test"

      config:
        JupyterHub:
          authenticator_class: dummy      

    proxy:
      secretToken: <token>

    singleuser:
      storage:
        dynamic:
          storageClass: <name for my storage class>

  registry:
    username: <dockerusername>
    password: <password>

Additional resources (GPUs)

Has it been discussed to add a UI option to launch a project with additional resources? Similar to the profile list in JupyterHub.

Preserve custom init_containers

Would it be possible to not override self.init_containers but only add the project-manager init container in persistent_bhub_config.py?
E.g.

if not any(ic["name"] == "project-manager" for ic in self.init_containers):
    self.init_containers.append({
        "name": "project-manager",
        "image": self.image,
        "command": command,
        # volumes is already defined for notebook container (self.volumes)
        "volume_mounts": [projects_volume_mount],
    })
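
The idea in the snippet above can be tried out in isolation with a self-contained sketch that works on a plain list instead of a spawner. The function name ensure_project_manager and all sample values (image, command, mount) are hypothetical; this is not code from persistent_bhub_config.py.

```python
def ensure_project_manager(init_containers, image, command, volume_mount):
    """Append the project-manager init container unless one is already present."""
    if not any(ic["name"] == "project-manager" for ic in init_containers):
        init_containers.append(
            {
                "name": "project-manager",
                "image": image,
                "command": command,
                "volume_mounts": [volume_mount],
            }
        )
    return init_containers


# A custom init container configured by the operator (illustrative values).
containers = [{"name": "custom-init", "image": "busybox"}]
mount = {"name": "projects", "mountPath": "/projects"}

# Calling the helper twice must not add a second project-manager container.
for _ in range(2):
    ensure_project_manager(containers, "example/image:tag", ["true"], mount)

print([ic["name"] for ic in containers])  # prints ['custom-init', 'project-manager']
```

Appending instead of assigning keeps the custom init container in place, and the name check keeps the operation idempotent across repeated spawns.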

Connection Failed

I've deployed pbhub on GKE following the README. I created two dummy users, and an error occurs when the second user launches the same project (screenshot in the original issue).

support all repo providers

### Repo providers
Only the following repo providers are supported in this chart:
- GitHubRepoProvider
- GistRepoProvider
- GitLabRepoProvider
- GitRepoProvider
Other providers (ZenodoProvider, FigshareProvider, HydroshareProvider, DataverseProvider) are currently not supported.
If you enable them, persistent BinderHub is not going to work as expected for these providers.

Here are the pieces to update:

  1. binderhub.extraConfig.10-repo-providers in values.yaml
  2. projects_form.html
  3. PersistentBinderSpawner
  4. persistent_bhub.js

Default image build failure

Launching the default image gesiscss/persistent_binderhub results in

Waiting for build to start...
Picked Git content provider.
Cloning into '/tmp/repo2dockert5t4b9he'...
HEAD is now at da46abd persistent_binderhub-0.2.0-n919.tgz
Using DockerBuildPack builder
Step 1/9 : FROM jupyter/base-notebook:hub-1.4.2
 ---> 3059097d7b03e...
Step 2/9 : COPY intro_to_persistent_binderhub.ipynb ${HOME}/intro_to_persistent_binderhub.ipynb
COPY failed: file not found in build context or excluded by .dockerignore: stat intro_to_persistent_binderhub.ipynb: file does not exist
Built image, launching...
Failed to connect to event stream

on my local machine.

Instructions for securing pbhub with https

I'm trying to follow the BinderHub documentation (https://binderhub.readthedocs.io/en/latest/https.html) for setting up HTTPS with pbhub. After the step https://binderhub.readthedocs.io/en/latest/https.html#ingress-proxy-using-nginx, the external IP address of the nginx ingress is stuck at pending.

Here is the output of kubectl describe. Any suggestions on how to proceed with this? Thanks
W0928 14:05:26.697830 9903 gcp.go:119] WARNING: the gcp auth plugin is deprecated in v1.22+, unavailable in v1.26+; use gcloud instead.
To learn more, consult https://cloud.google.com/blog/products/containers-kubernetes/kubectl-auth-changes-in-gke
Name: pbhub-proxy-nginx-ingress
Namespace: pbhub-ns
Labels: app.kubernetes.io/instance=pbhub-proxy
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=pbhub-proxy-nginx-ingress
helm.sh/chart=nginx-ingress-0.14.1
Annotations: cloud.google.com/neg: {"ingress":true}
meta.helm.sh/release-name: pbhub-proxy
meta.helm.sh/release-namespace: pbhub-ns
Selector: app=pbhub-proxy-nginx-ingress
Type: LoadBalancer
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.16.7.7
IPs: 10.16.7.7
IP: 34.94.96.171
Port: http 80/TCP
TargetPort: 80/TCP
NodePort: http 32137/TCP
Endpoints: 10.12.0.40:80
Port: https 443/TCP
TargetPort: 443/TCP
NodePort: https 30319/TCP
Endpoints: 10.12.0.40:443
Session Affinity: None
External Traffic Policy: Local
HealthCheck NodePort: 32685
Events:
Type     Reason                  Age                From                Message
----     ------                  ----               ----                -------
Normal   EnsuringLoadBalancer    48s (x9 over 16m)  service-controller  Ensuring load balancer
Warning  SyncLoadBalancerFailed  48s (x9 over 16m)  service-controller  Error syncing load balancer: failed to ensure load balancer: requested ip "34.94.96.171" is neither static nor assigned to the LB

Issues installing persistent binderhub chart

Hi folks,

I was trying to have a play with the persistent binderhub today but I couldn't get it installed (following instructions in the readme).

Here's the install command and error:

$ helm install persistent_binderhub/persistent_binderhub --version=0.2.0-n181 --name pbhub --namespace pbhub -f config.yaml --debug
[debug] Created tunnel using local port: '65528'

[debug] SERVER: "127.0.0.1:65528"

[debug] Original chart version: "0.2.0-n181"
[debug] Fetched persistent_binderhub/persistent_binderhub to /Users/sgibson/.helm/cache/archive/persistent_binderhub-0.2.0-n181.tgz

[debug] CHART PATH: /Users/sgibson/.helm/cache/archive/persistent_binderhub-0.2.0-n181.tgz

Error: render error in "persistent_binderhub/charts/binderhub/templates/deployment.yaml": template: persistent_binderhub/charts/binderhub/templates/deployment.yaml:98:74: executing "persistent_binderhub/charts/binderhub/templates/deployment.yaml" at <"/">: invalid value; expected string

And my config file looks like this:

binderhub:
  jupyterhub:
    hub:
      services:
        binder:
          apiToken: "redacted"
    proxy:
      secretToken: "redacted"
    singleuser:
      storage:
        dynamic:
          storageClass: azurefile
  registry:
    username: sgibson91
    password: "redacted"

Looks to me like a templating error somewhere in the binderhub chart, but I can deploy the standard binderhub chart without issue. Also, a / character does not exist anywhere in the redacted secrets, so I'm not quite sure where that error comes from.

Any help would be greatly appreciated, thanks!
