You are viewing the RapidMiner Deployment documentation for version 9.8.

Docker images for JupyterHub

The documentation below describes our integrated JupyterHub instance, a component consisting of the following Docker images:

  • JupyterHub DB
  • JupyterHub Backend
  • JupyterHub Notebook

You can read a description for each container below.

These containers are only functional when deployed together; they will not work as intended individually. Check our deployment templates to see how they should be deployed.

JupyterHub DB

This container runs a PostgreSQL database that serves as the configuration store for the JupyterHub backend. It is a standard PostgreSQL Docker image.

Configuration

  • Volumes
    • jupyterhub-postgresql-vol: Docker volume that persists the database data; mounted inside the container at /var/lib/postgresql/data
  • Ports: none.
  • Environment variables:
    • POSTGRES_DB, POSTGRES_USER, POSTGRES_PASSWORD: name of the database in which the JupyterHub data is stored, and the credentials used to access it. The same values must be provided to the JupyterHub Backend container.
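
Because the JupyterHub DB and JupyterHub Backend containers must receive identical POSTGRES_DB, POSTGRES_USER, and POSTGRES_PASSWORD values, a small check before deployment can catch a mismatch early. The following is a minimal sketch in Python, not part of the RapidMiner tooling: the function name find_mismatched_db_settings and all example values are hypothetical and chosen only for illustration.

```python
# Hypothetical helper for illustration; not part of the RapidMiner deployment tooling.
SHARED_DB_KEYS = ("POSTGRES_DB", "POSTGRES_USER", "POSTGRES_PASSWORD")


def find_mismatched_db_settings(db_env, backend_env):
    """Return the shared keys that are missing or differ between the two containers."""
    mismatched = []
    for key in SHARED_DB_KEYS:
        db_value = db_env.get(key)
        # A key that is missing on either side is treated as a mismatch.
        if db_value is None or db_value != backend_env.get(key):
            mismatched.append(key)
    return mismatched


# Placeholder values, assumed for this example only.
db_env = {"POSTGRES_DB": "jupyterhub", "POSTGRES_USER": "jhub", "POSTGRES_PASSWORD": "change-me"}
backend_env = {"POSTGRES_DB": "jupyterhub", "POSTGRES_USER": "jhub", "POSTGRES_PASSWORD": "other"}

print(find_mismatched_db_settings(db_env, backend_env))  # prints ['POSTGRES_PASSWORD']
```

In practice the two dictionaries would be filled from wherever the deployment defines its environment variables; how that is done depends on the deployment template in use.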

JupyterHub Backend

This container runs the JupyterHub backend, the core of the JupyterHub service. It provides authentication for the notebooks, handles user management, and manages the lifecycle of the containers that run the notebook environment of each authenticated user.

For available versions, please see the tags on Docker Hub.

Configuration

  • Volumes
    • /var/run/docker.sock: volume mount for the Docker socket used for the platform deployment.
  • Ports: none.
  • Environment variables:
    • JHUB_HOSTNAME: internal hostname of the backend service, needed for communication between the components of the deployment.
    • SERVER_BASE_URL: internal hostname and port of the RapidMiner Server instance present in the deployment.
    • POSTGRES_HOST: internal hostname of the database service used for JupyterHub.
    • POSTGRES_DB, POSTGRES_USER, POSTGRES_PASSWORD: database name and credentials for the JupyterHub DB. These must match the values configured for the JupyterHub DB container.
    • DOCKER_NOTEBOOK_IMAGE: name and tag of the Docker image which will be spawned for each JupyterHub user after login.
    • JUPYTERHUB_CRYPT_KEY: secret key used to encrypt user data in the JupyterHub DB.
    • DOCKER_NOTEBOOK_CPU_LIMIT, DOCKER_NOTEBOOK_MEM_LIMIT: CPU and memory limits for each user's notebook container. The CPU limit is expressed as a percentage, where 100 corresponds to one CPU core. The memory limit is a number followed by a unit, e.g. 2g for 2 gigabytes of memory.
    • JUPYTER_STACK_NAME: name of the Jupyter stack, default value is default. Should not be altered in typical deployment scenarios.
    • SSO_PUBLIC_URL, SSO_IDP_REALM, SSO_CLIENT_ID, SSO_CLIENT_SECRET: RapidMiner Identity and Security configuration. Filled automatically by the init service.
    • PUBLIC_URL: the public URL of the deployment.
    • JUPYTER_URL_SUFFIX: the URL suffix where JupyterHub will be served. The RapidMiner Proxy will redirect requests arriving to this suffix to the JupyterHub backend service.
    • JHUB_DEBUG, JHUB_TOKEN_DEBUG, JHUB_PROXY_DEBUG, JHUB_DB_DEBUG, JHUB_SPAWNER_DEBUG: set to True to enable debug-level logging for the respective component. Default is False. Turn them off when they are not needed, since debug logging reduces performance.
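
To make the units of DOCKER_NOTEBOOK_CPU_LIMIT and DOCKER_NOTEBOOK_MEM_LIMIT concrete, the sketch below converts both kinds of value into cores and bytes. It is an illustration only: the helper names are hypothetical, and it assumes that the memory suffixes are binary multiples (1g = 1024**3 bytes), as with Docker's own memory options. The backend may accept a different set of units.

```python
# Hypothetical helpers for illustration; not part of the RapidMiner tooling.
# Assumption: unit suffixes are binary multiples, so 1g = 1024**3 bytes.
_UNIT_FACTORS = {"b": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def cpu_limit_to_cores(cpu_limit):
    """Convert a percentage value (100 = one CPU core) to a number of cores."""
    return float(cpu_limit) / 100.0


def mem_limit_to_bytes(mem_limit):
    """Convert a value such as '2g' or '512m' to bytes; a bare number is read as bytes."""
    text = mem_limit.strip().lower()
    if not text:
        raise ValueError("empty memory limit")
    unit = text[-1]
    if unit.isdigit():
        return int(text)
    if unit not in _UNIT_FACTORS:
        raise ValueError(f"unknown memory unit: {unit!r}")
    return int(float(text[:-1]) * _UNIT_FACTORS[unit])


print(cpu_limit_to_cores("150"))  # 1.5, i.e. one and a half cores
print(mem_limit_to_bytes("2g"))   # 2147483648
```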

JupyterHub Notebook

This container is instantiated for each user when they log in to JupyterHub. It serves the JupyterLab user interface and the default Python environment.

For available versions, please see the tags on Docker Hub.

Configuration

  • Volumes: once a user logs in to JupyterHub, a new Docker volume is created for them to persist their work and optional custom Python environments. These volumes are named according to this pattern: jupyterhub-user-<username>-<JUPYTER_STACK_NAME>.
  • Ports: none.
  • Environment variables: none.
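
The volume naming pattern above can be sketched as a small Python function, which may help when looking for a particular user's volume. The function name user_volume_name and the username alice are hypothetical. The sketch assumes that the username is inserted verbatim; the actual deployment may escape special characters in usernames.

```python
# Hypothetical helper for illustration; not part of the RapidMiner tooling.
def user_volume_name(username, stack_name="default"):
    """Build a volume name following jupyterhub-user-<username>-<JUPYTER_STACK_NAME>.

    Assumption: the username is used as-is, without any escaping.
    The stack name defaults to "default", the documented JUPYTER_STACK_NAME default.
    """
    return f"jupyterhub-user-{username}-{stack_name}"


print(user_volume_name("alice"))  # jupyterhub-user-alice-default
```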