dcos-labs / dcos-jupyterlab-service

JupyterLab Notebook for Mesosphere DC/OS

License: Apache License 2.0

Jupyter Notebook 9.55% Python 26.95% Shell 33.48% HTML 13.76% Lua 3.16% Dockerfile 13.11%
beakerx spark jupyter tensorflow mesos dcos toree xgboost gpu cudnn hdfs marathon keras pandas arrow dask mesosphere ray rllib jupyterlab

dcos-jupyterlab-service's Introduction

dcos-jupyter-service

JupyterLab Notebook Docker Image tailored for Mesosphere DC/OS

Docker images built with the Dockerfile herein will enable support for:

  • Apache Spark: Apache Spark™ is a unified analytics engine for large-scale data processing.
  • BeakerX: BeakerX is a collection of kernels and extensions to the Jupyter interactive computing environment. It provides JVM support, Spark cluster support, polyglot programming, interactive plots, tables, forms, publishing, and more.
  • Dask: Dask is a flexible parallel computing library for analytic computing.
  • Distributed: Dask.distributed is a lightweight library for distributed computing in Python. It extends both the concurrent.futures and dask APIs to moderate sized clusters.
  • JupyterLab: JupyterLab is the next-generation web-based user interface for Project Jupyter.
  • PyTorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration. PyTorch is a deep learning framework for fast, flexible experimentation.
  • Ray: Ray is a flexible, high-performance distributed execution framework.
    • Ray Tune: Hyperparameter Optimization Framework
    • Ray RLlib: Scalable Reinforcement Learning
  • TensorFlow: TensorFlow™ is an open source software library for high performance numerical computation.
  • TensorFlowOnSpark: TensorFlowOnSpark brings TensorFlow programs onto Apache Spark clusters.
  • XGBoost: Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more.

Also includes support for:

  • OpenID Connect Authentication and Authorization based on email address or User Principal Name (UPN) (for Windows Integrated Authentication and AD FS 4.0 with Windows Server 2016)
  • HDFS connectivity
  • S3 connectivity
  • GPUs with the <image>:<tag>-gpu Docker Image variant built from Dockerfile-cuDNN

Mesosphere Jupyter Service Docker Images for Mesosphere DC/OS: https://hub.docker.com/r/mesosphere/mesosphere-jupyter-service/tags/

Built FROM: debian:jessie with Miniconda3

dcos-jupyterlab-service's People

Contributors

deric, fabianbaier, joerg84, vishnu2kmohan

dcos-jupyterlab-service's Issues

Add Additional Git Solution support

Please add support for additional Git solutions to the base image:

jupyterlab-gitlab
jupyterlab-git

It would also be great if a section were added for providing a list of additional packages to install into the base kernel.

Jupyterlab (strict mode) Marathon-lb

I'm running DC/OS 1.11.6 with JupyterLab 1.2.0-DEV.33.7 and Marathon-lb 1.12.3. DC/OS is configured for strict mode, with the necessary secrets and service accounts in place. Marathon-lb is pointing to the container port 8080 and not the VIP port 8888. Health checks are failing and JupyterLab is unreachable through Marathon-lb. All the other JupyterLab containers are showing healthy in Marathon-lb (same container and VIP ports in the marathon.json.mustache file). In the past I fixed this by setting the VIP and container port to the same value (8080), which seems to be the convention with Marathon-lb.
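
The port mismatch described in this issue can be checked mechanically. The following is a minimal sketch, assuming the rendered app definition uses Marathon's `container.portMappings` layout with a `VIP_0`-style label on each mapping. The `find_port_mismatches` helper and the sample app are hypothetical and are not taken from this repository's marathon.json.mustache.

```python
def find_port_mismatches(app):
    """Return (containerPort, vipPort) pairs that differ in a Marathon-style app dict."""
    mismatches = []
    mappings = app.get("container", {}).get("portMappings", [])
    for mapping in mappings:
        container_port = mapping.get("containerPort")
        for label, value in mapping.get("labels", {}).items():
            if not label.startswith("VIP_"):
                continue
            # VIP labels look like "/name:port"; take the text after the last colon.
            vip_port = int(value.rsplit(":", 1)[1])
            if vip_port != container_port:
                mismatches.append((container_port, vip_port))
    return mismatches


# Hypothetical app definition mirroring the two ports mentioned in the issue.
sample_app = {
    "container": {
        "portMappings": [
            {"containerPort": 8080, "labels": {"VIP_0": "/jupyterlab-notebook:8888"}}
        ]
    }
}

print(find_port_mismatches(sample_app))  # [(8080, 8888)]
```

An empty result would mean every VIP port matches its container port, which is the arrangement the reporter says worked in the past.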

Setting storage persistent volume to enabled breaks json

Enabling the persistent volume causes:

Unable to parse the string as a JSON value

Changing the value to a string in the JSON causes:

Options JSON failed validation

DCOS version 1.11.x

Snippet of the JSON affected:

  "storage": {
    "persistence": {
      "host_volume_size": 4000,
      "enable": true
    }
  },
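
To narrow down errors like these, it can help to confirm that the fragment itself is valid JSON and to see how the two spellings of `enable` differ in type. This is a minimal sketch using only Python's standard `json` module. That the package schema expects a boolean for `enable` is an assumption, and the sketch does not reproduce the DC/OS validator.

```python
import json

# The snippet from the issue, wrapped in an outer object and without the
# trailing comma so that it is a complete JSON document on its own.
fragment = """
{
  "storage": {
    "persistence": {
      "host_volume_size": 4000,
      "enable": true
    }
  }
}
"""

options = json.loads(fragment)
enable = options["storage"]["persistence"]["enable"]
print(type(enable).__name__)  # bool

# Quoting the value turns it into a JSON string, which a schema that
# declares a boolean (assumed here) would reject.
stringified = json.loads(fragment.replace("true", '"true"'))
print(type(stringified["storage"]["persistence"]["enable"]).__name__)  # str
```

If the fragment parses here, the "Unable to parse" error is more likely to come from how the options are passed to or rendered by the installer than from this section alone.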

Full JSON:

{
  "service": {
    "name": "/jupyterlab-notebook",
    "cmd": "/usr/local/bin/start.sh ${CONDA_DIR}/bin/jupyter lab --notebook-dir=${MESOS_SANDBOX}",
    "cpus": 2,
    "force_pull": false,
    "mem": 8192,
    "user": "nobody",
    "gpu_support": {
      "enabled": false,
      "gpus": 0
    }
  },
  "oidc": {
    "enable_oidc": false,
    "oidc_discovery_uri": "https://keycloak.example.com/auth/realms/notebook/.well-known/openid-configuration",
    "oidc_redirect_uri": "/oidc-redirect-callback",
    "oidc_client_id": "notebook",
    "oidc_client_secret": "b874f6e9-8f3f-41a6-a206-53e928d24fb1",
    "oidc_tls_verify": "no",
    "enable_windows": false,
    "oidc_use_email": false,
    "oidc_email": "[email protected]",
    "oidc_upn": "user007",
    "oidc_logout_path": "/logmeout",
    "oidc_post_logout_redirect_uri": "https://<VHOST>/<optional PATH_PREFIX>/<Service Name>",
    "oidc_use_spartan_resolver": true
  },
  "s3": {
    "aws_region": "us-east-1",
    "s3_endpoint": "s3.us-east-1.amazonaws.com",
    "s3_https": 1,
    "s3_ssl": 1
  },
  "spark": {
    "enable_spark_monitor": true,
    "spark_master_url": "mesos://zk://zk-1.zk:2181,zk-2.zk:2181,zk-3.zk:2181,zk-4.zk:2181,zk-5.zk:2181/mesos",
    "spark_driver_cores": 2,
    "spark_driver_memory": "6g",
    "spark_driver_java_options": "\"-server -XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/mnt/mesos/sandbox\"",
    "spark_history_fs_logdirectory": "hdfs://hdfs/history",
    "spark_conf_spark_scheduler": "spark.scheduler.minRegisteredResourcesRatio=1.0",
    "spark_conf_cores_max": "spark.cores.max=5",
    "spark_conf_executor_cores": "spark.executor.cores=1",
    "spark_conf_executor_memory": "spark.executor.memory=6g",
    "spark_conf_executor_java_options": "spark.executor.extraJavaOptions=\"-server -XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/mnt/mesos/sandbox\"",
    "spark_conf_eventlog_enabled": "spark.eventLog.enabled=false",
    "spark_conf_eventlog_dir": "spark.eventLog.dir=hdfs://hdfs/",
    "spark_conf_hadoop_fs_s3a_aws_credentials_provider": "spark.hadoop.fs.s3a.aws.credentials.provider=com.amazonaws.auth.InstanceProfileCredentialsProvider",
    "spark_conf_jars_packages": "spark.jars.packages=org.apache.spark:spark-streaming-kafka-0-10_2.11:2.2.1,org.apache.kafka:kafka_2.11:0.10.2.1",
    "spark_conf_mesos_executor_docker_image": "spark.mesos.executor.docker.image=dcoslabs/dcos-spark:1.11.4-2.2.1",
    "spark_conf_mesos_executor_home": "spark.mesos.executor.home=/opt/spark",
    "spark_conf_mesos_containerizer": "spark.mesos.containerizer=mesos",
    "spark_conf_mesos_driver_labels": "spark.mesos.driver.labels=DCOS_SPACE:",
    "spark_conf_mesos_task_labels": "spark.mesos.task.labels=DCOS_SPACE:",
    "spark_conf_executor_krb5_config": "spark.executorEnv.KRB5_CONFIG=/mnt/mesos/sandbox/krb5.conf",
    "spark_conf_executor_java_home": "spark.executorEnv.JAVA_HOME=/opt/jdk",
    "spark_conf_executor_hadoop_hdfs_home": "spark.executorEnv.HADOOP_HDFS_HOME=/opt/hadoop",
    "spark_conf_executor_hadoop_opts": "spark.executorEnv.HADOOP_OPTS=\"-Djava.library.path=/opt/hadoop/lib/native -Djava.security.krb5.conf=/mnt/mesos/sandbox/krb5.conf\"",
    "spark_conf_mesos_executor_docker_forcepullimage": "spark.mesos.executor.docker.forcePullImage=true",
    "spark_user": "nobody"
  },
  "storage": {
    "persistence": {
      "host_volume_size": 4000,
      "enable": true
    }
  },
  "networking": {
    "cni_support": {
      "enabled": true
    },
    "external_access": {
      "enabled": true,
      "external_public_agent_hostname": "somethinghere.com"
    }
  },
  "environment": {
    "secrets": false,
    "service_credential": "jupyterlab-notebook/serviceCredential",
    "conda_envs_path": "/mnt/mesos/sandbox/conda/envs:/opt/conda/envs",
    "conda_pkgs_dir": "/mnt/mesos/sandbox/conda/pkgs:/opt/conda/pkgs",
    "dcos_dir": "/mnt/mesos/sandbox/.dcos",
    "hadoop_conf_dir": "/mnt/mesos/sandbox",
    "home": "/mnt/mesos/sandbox",
    "java_opts": "\"-server -XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/mnt/mesos/sandbox\"",
    "jupyter_conf_urls": "",
    "jupyter_config_dir": "/mnt/mesos/sandbox/.jupyter",
    "jupyter_password": "",
    "jupyter_runtime_dir": "/mnt/mesos/sandbox/.local/share/jupyter/runtime",
    "nginx_log_level": "warn",
    "start_dask_distributed": false,
    "start_ray_head_node": false,
    "start_spark_history_server": false,
    "start_tensorboard": false,
    "user": "nobody",
    "tensorboard_logdir": "hdfs://hdfs/",
    "term": "xterm-256color"
  }
}
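
Most values in the `spark` section above are complete `key=value` Spark properties stored as JSON strings. As a rough illustration of that convention, the sketch below splits such strings on the first `=` only, since some values contain `=` themselves. The `parse_spark_conf` helper is hypothetical, and how the service's start scripts actually consume these values is not shown in the source.

```python
def parse_spark_conf(section):
    """Map Spark property names to values for every spark_conf_* entry."""
    props = {}
    for name, value in section.items():
        if not name.startswith("spark_conf_"):
            continue
        if not isinstance(value, str) or "=" not in value:
            continue
        # partition() splits on the first "=", leaving later ones in the value.
        key, _, val = value.partition("=")
        props[key] = val
    return props


# A few entries copied from the options above.
sample = {
    "spark_conf_cores_max": "spark.cores.max=5",
    "spark_conf_executor_memory": "spark.executor.memory=6g",
    "spark_conf_executor_java_home": "spark.executorEnv.JAVA_HOME=/opt/jdk",
    "spark_user": "nobody",
}

for key, val in sorted(parse_spark_conf(sample).items()):
    print(key, "->", val)
```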

Please provide instruction on how image can be customised

Please provide instructions on how to customise the image to include additional C libraries and/or Python dependencies.

My initial thought was to use GRANT_SUDO and install what I need from a terminal, just to test the installation of the C libraries and Python dependencies. But it looks like GRANT_SUDO is disabled?

My second thought was to add the additional dependencies to the Dockerfile, rebuild the image, and tag it for my own use. Before adding anything, I tried to build the image as it is, but the build fails with:

Traceback (most recent call last):
  File "/opt/conda/bin/conda", line 7, in <module>
    from conda.cli import main
ModuleNotFoundError: No module named 'conda'

On investigation, it looks like it fails at the line ${CONDA_DIR}/bin/conda env update --json -q -f "${CONDA_DIR}/${CONDA_ENV_YML}"

Am I missing anything? How can I customise the image?
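
The traceback in this issue indicates that the interpreter behind /opt/conda/bin/conda cannot import the `conda` package. One way to confirm that from inside a build step or a running container is to ask the interpreter directly. This is a minimal sketch using only the standard library. The `module_available` helper is hypothetical, and it diagnoses the symptom rather than fixing the Dockerfile.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the current interpreter can locate the top-level module `name`."""
    return importlib.util.find_spec(name) is not None


# Show which interpreter is running, then whether it can see the conda package.
print("interpreter:", sys.executable)
print("conda importable:", module_available("conda"))
```

Run with the Python interpreter under ${CONDA_DIR}/bin (assuming one is present there), a `False` result would be consistent with the `ModuleNotFoundError` above, for example if an earlier step replaced the base environment's Python.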
