Categories

Versions

You are viewing the RapidMiner Deployment documentation for version 9.7 - Check here for latest version

Hadoop connectivity template

This template, while very similar to the basic production template, becomes relevant when the goal is to deploy RapidMiner processes that leverage big data from a Hadoop cluster by using RapidMiner Radoop. We offer the Radoop Proxy component to make network configuration easier in cases where the Hadoop cluster is behind a firewall.

Use it to deploy RapidMiner AI Hub on a single host, with the following components:

For a detailed description of every Docker image, see the image reference.

System requirements

Minimum recommended hardware configuration

The amount of memory needed depends heavily on the amount of data that will be processed by RapidMiner AI Hub. If most or all of the data is going to be processed in the Hadoop environment using Radoop, then 16GB is enough for the Server. If non-Radoop processes are going to be run in Server, then the recommendation is to increase the memory size to 32GB or more depending on the size of user data.

Each virtual or physical machine should at least have:

  • Quad core
  • 16GB RAM
  • >20GB free disk space

If you are using Docker Desktop for Windows (or Mac), please make sure that you have allocated enough memory. The default setting in Docker Desktop is too low for RapidMiner AI Hub.

Instructions

You will need the following two files, included in the ZIP file in step (1).

Proceed as follows:

  1. Download the template files

  2. Follow the general deployment instructions.

  3. Remember to set the variables PUBLIC_URL and SSO_PUBLIC_URL in the .env file.

The environment file (.env)

# ############################################
#
# Global parameters
#
# ############################################

# Public URL of the deployment that will be used for external access
PUBLIC_URL=https://platform.rapidminer.com

# Public URL of the SSO endpoint that will be used for external access. In most cases it should be the same as the PUBLIC_URL
SSO_PUBLIC_URL=https://platform.rapidminer.com

# Enable/disable the service build into the RapidMiner cloud images, that updates the PUBLIC_URL and SSO_PUBLIC_URL variables to the new dynamic cloud hostname/IP address

# Enable/disable the Legacy BASIC authentication support for REST endpoints, like webservices.
LEGACY_REST_BASIC_AUTH_ENABLED=false

# Timezone setting
TZ=UTC

# ############################################
#
# Deployment parameters
#
# ############################################

REGISTRY=rapidminer/
INIT_VERSION=9.7

# ############################################
#
# KeyCloak (SSO)
#
# ############################################

KEYCLOAK_VERSION=9.7
KC_DB=kcdb
KC_USER=kcdbuser
KC_PASSWORD=kcdbpass
KEYCLOAK_USER=admin
KEYCLOAK_PASSWORD=changeit
SSO_IDP_REALM=master
SSO_SSL_REQUIRED=external
USER_MIGRATION_ENABLED=True
USER_MIGRATION_DRY_RUN=False

# ############################################
#
# Rapidminer server
#
# ############################################

SERVER_VERSION=9.7
SERVER_DBHOST=rm-postgresql-svc
SERVER_DBSCHEMA=rapidminer-server-db
SERVER_DBUSER=rmserver-db-user
SERVER_DBPASS=w61J784XSb24K4LRV97MbE16i8xa9O
SERVER_MAX_MEMORY=2048M
RMSERVER_SSO_CLIENT_ID=urn:rapidminer:server
RMSERVER_SSO_CLIENT_SECRET=
RAPIDMINER_SERVER_HOST=rm-server-svc
RAPIDMINER_SERVER_PORT=8080
RAPIDMINER_SERVER_URL=http://rm-server-svc:8080
AUTH_SECRET=TTY5MjUxbzRBN2ZIWThpNGVKNGo4V2xqOHk0dTNV
BROKER_ACTIVEMQ_USERNAME=amq-user
BROKER_ACTIVEMQ_PASSWORD=M69251o4A7fHY8i4eJ4j8Wlj8y4u3U

# ############################################
#
# Job Agent
#
# ############################################

JOBAGENT_QUEUE_ACTIVEMQ_URI=failover:(tcp://rm-server-svc:5672)
JOBAGENT_CONTAINER_COUNT=2
JOB_QUEUE=DEFAULT
JOBAGENT_CONTAINER_MEMORYLIMIT=2048
RAPIDMINER_JOBAGENT_OPTS="-Djobagent.python.registryBaseUrl=http://platform-admin-webui-svc:82/"
RAPIDMINER_SERVER_PROTOCOL=http

# ############################################
#
# Proxy
#
# ############################################

PROXY_VERSION=9.7
JUPYTER_BACKEND=http://rm-jupyterhub-svc:8000
JUPYTER_URL_SUFFIX=/jupyter
GRAFANA_BACKEND=http://rm-grafana-svc:3000
GRAFANA_URL_SUFFIX=/grafana
PA_BACKEND=http://platform-admin-webui-svc:82/
PA_URL_SUFFIX=/platform-admin
RTS_WEBUI_BACKEND=http://platform-admin-webui-svc:82/
RTS_WEBUI_URL_SUFFIX=/rts-admin
RTS_SCORING_BACKEND=http://rts-agent-svc:8090/
RTS_SCORING_URL_SUFFIX=/rts
KEYCLOAK_BACKEND=http://rm-keycloak-svc:8080
LANDING_BACKEND=http://landing-page
TOKEN_BACKEND=http://rm-token-tool-svc
TOKEN_URL_SUFFIX=/get-token

HTTPS_CRT_PATH=/etc/nginx/ssl/certificate.crt
HTTPS_KEY_PATH=/etc/nginx/ssl/private.key
HTTPS_DH_PATH=/etc/nginx/ssl/dhparam.pem

# ############################################
#
# Radoop Proxy
#
# ############################################

RADOOP_PROXY_VERSION=1.2.1
# Authentication: 'server|jwt|superuser'
RADOOP_PROXY_AUTHENTICATION=superuser
RADOOP_PROXY_SUPERUSERNAME=proxyadmin
RADOOP_PROXY_SUPERUSERPASSWORD=changeit
RADOOP_PROXY_PORT=1081
RADOOP_PROXY_WORKERSPOOLSIZE=100
RADOOP_PROXY_SSL="off"

# ############################################
#
# Platform admin
#
# ############################################

PA_VERSION=9.7
PA_SSO_CLIENT_ID=urn:rapidminer:platform-admin
PA_SSO_CLIENT_SECRET=
PA_DISABLE_PYTHON=false
PA_DISABLE_RTS=false


# ############################################
#
# Landing page
#
# ############################################

RM_LANDING_VERSION=9.7
LANDING_SSO_CLIENT_ID=urn:rapidminer:landing-page
LANDING_SSO_CLIENT_SECRET=

# ############################################
#
# Token Tool
#
# ############################################

TOKEN_SSO_CLIENT_ID=urn:rapidminer:token-tool
TOKEN_SSO_CLIENT_SECRET=

The docker-compose definition (docker-compose.yml)

version: '3'
services:
  rm-proxy-svc:
    image: ${REGISTRY}rapidminer-proxy:${PROXY_VERSION}
    hostname: rm-proxy-svc
    restart: always
    environment:
      - KEYCLOAK_BACKEND=${KEYCLOAK_BACKEND}
      - RMSERVER_BACKEND=${RAPIDMINER_SERVER_URL}
      - JUPYTER_BACKEND=${JUPYTER_BACKEND}
      - JUPYTER_URL_SUFFIX=${JUPYTER_URL_SUFFIX}
      - GRAFANA_BACKEND=${GRAFANA_BACKEND}
      - GRAFANA_URL_SUFFIX=${GRAFANA_URL_SUFFIX}
      - PA_BACKEND=${PA_BACKEND}
      - PA_URL_SUFFIX=${PA_URL_SUFFIX}
      - TOKEN_BACKEND=${TOKEN_BACKEND}
      - TOKEN_URL_SUFFIX=${TOKEN_URL_SUFFIX}
      - RTS_WEBUI_BACKEND=${RTS_WEBUI_BACKEND}
      - RTS_WEBUI_URL_SUFFIX=${RTS_WEBUI_URL_SUFFIX}
      - RTS_SCORING_BACKEND=${RTS_SCORING_BACKEND}
      - RTS_SCORING_URL_SUFFIX=${RTS_SCORING_URL_SUFFIX}
      - SSO_PUBLIC_URL=${SSO_PUBLIC_URL}
      - SSO_IDP_REALM=${SSO_IDP_REALM}
      - RTS_BASIC_AUTH=true
      - LANDING_BACKEND=${LANDING_BACKEND}
      - HTTPS_CRT_PATH=${HTTPS_CRT_PATH}
      - HTTPS_KEY_PATH=${HTTPS_KEY_PATH}
      - HTTPS_DH_PATH=${HTTPS_DH_PATH}
    ports:
      - 80:80
      - 443:443
    networks:
      rm-platform-int-net:
        aliases:
          - rm-proxy-svc
    volumes:
      - ./ssl:/etc/nginx/ssl
      - platform-admin-uploaded-vol:/rapidminer/platform-admin/uploaded/
  rm-keycloak-db-svc:
    image: postgres:9.6
    restart: always
    hostname: rm-keycloak-db-svc
    environment:
      - POSTGRES_DB=${KC_DB}
      - POSTGRES_USER=${KC_USER}
      - POSTGRES_PASSWORD=${KC_PASSWORD}
    volumes:
      - keycloak-postgresql-vol:/var/lib/postgresql/data
    networks:
      rm-idp-db-net:
        aliases:
          - rm-keycloak-db-svc
  rm-keycloak-svc:
    image: ${REGISTRY}rapidminer-keycloak:${KEYCLOAK_VERSION}
    restart: always
    hostname: rm-keycloak-svc
    environment:
      - DB_VENDOR=POSTGRES
      - DB_ADDR=rm-keycloak-db-svc
      - DB_DATABASE=${KC_DB}
      - DB_USER=${KC_USER}
      - DB_SCHEMA=public
      - DB_PASSWORD=${KC_PASSWORD}
      - KEYCLOAK_USER=${KEYCLOAK_USER}
      - KEYCLOAK_PASSWORD=${KEYCLOAK_PASSWORD}
      - PROXY_ADDRESS_FORWARDING=true
    depends_on:
      - rm-keycloak-db-svc
      - rm-proxy-svc
    networks:
      rm-platform-int-net:
        aliases:
          - rm-keycloak-svc
      rm-idp-db-net:
        aliases:
          - rm-keycloak-svc
  rm-init-svc:
    image: ${REGISTRY}rapidminer-deployment-init:${INIT_VERSION}
    restart: 'no'
    hostname: rm-keycloak-init-svc
    depends_on:
      - rm-keycloak-svc
      - rm-postgresql-svc
    environment:
      - LEGACY_REST_BASIC_AUTH_ENABLED=${LEGACY_REST_BASIC_AUTH_ENABLED}
      - PUBLIC_URL=${PUBLIC_URL}
      - SSO_PUBLIC_URL=${SSO_PUBLIC_URL}
    volumes:
      - ./.env:/.env
      - ./docker-compose.yml:/docker-compose.yml:ro
      - keycloak-admin-cli-vol:/root/.keycloak/
      - deployed-services-vol:/rapidminer/deployed-services/
    networks:
      rm-platform-int-net:
        aliases:
          - rm-init-svc
      rm-server-db-net:
        aliases:
          - rm-init-svc
  rm-postgresql-svc:
    image: postgres:9.6
    hostname: rm-postgresql-svc
    restart: always
    environment:
      - POSTGRES_DB=${SERVER_DBSCHEMA}
      - POSTGRES_USER=${SERVER_DBUSER}
      - POSTGRES_PASSWORD=${SERVER_DBPASS}
    volumes:
      - rm-postgresql-vol:/var/lib/postgresql/data
    networks:
      rm-server-db-net:
        aliases:
          - rm-postgresql-svc
  rm-server-svc:
    image: ${REGISTRY}rapidminer-server:${SERVER_VERSION}
    hostname: rm-server-svc
    restart: always
    environment:
      - PA_BASE_URL=${PA_BACKEND}
      - PA_SYNC_DEBUG=False
      - DBHOST=${SERVER_DBHOST}
      - DBSCHEMA=${SERVER_DBSCHEMA}
      - DBUSER=${SERVER_DBUSER}
      - DBPASS=${SERVER_DBPASS}
      - SSO_PUBLIC_URL=${SSO_PUBLIC_URL}
      - SSO_IDP_REALM=${SSO_IDP_REALM}
      - SSO_CLIENT_ID=${RMSERVER_SSO_CLIENT_ID}
      - SSO_CLIENT_SECRET=${RMSERVER_SSO_CLIENT_SECRET}
      - SSO_SSL_REQUIRED=${SSO_SSL_REQUIRED}
      - LEGACY_REST_BASIC_AUTH_ENABLED=${LEGACY_REST_BASIC_AUTH_ENABLED}
      - SERVER_MAX_MEMORY=${SERVER_MAX_MEMORY}
      - BROKER_ACTIVEMQ_USERNAME=${BROKER_ACTIVEMQ_USERNAME}
      - BROKER_ACTIVEMQ_PASSWORD=${BROKER_ACTIVEMQ_PASSWORD}
      - JOBSERVICE_AUTH_SECRET=${AUTH_SECRET}
      - JUPYTER_URL_SUFFIX=${JUPYTER_URL_SUFFIX}
      - GRAFANA_URL_SUFFIX=${GRAFANA_URL_SUFFIX}
      - TZ=${TZ}
    volumes:
      - rm-server-bootstrap-vol:/bootstrap.d
      - rm-server-home-vol:/persistent-rapidminer-home
    depends_on:
      - rm-postgresql-svc
    networks:
      rm-platform-int-net:
        aliases:
          - rm-server-svc
      rm-server-db-net:
        aliases:
          - rm-server-svc
  rm-server-job-agent-svc:
    image: ${REGISTRY}rapidminer-execution-jobagent:${SERVER_VERSION}
    hostname: rm-server-job-agent-svc
    restart: always
    environment:
      - RAPIDMINER_SERVER_HOST=${RAPIDMINER_SERVER_HOST}
      - RAPIDMINER_SERVER_PROTOCOL=${RAPIDMINER_SERVER_PROTOCOL}
      - RAPIDMINER_SERVER_PORT=${RAPIDMINER_SERVER_PORT}
      - JOBAGENT_QUEUE_ACTIVEMQ_URI=${JOBAGENT_QUEUE_ACTIVEMQ_URI}
      - JOBAGENT_QUEUE_ACTIVEMQ_USERNAME=${BROKER_ACTIVEMQ_USERNAME}
      - JOBAGENT_QUEUE_ACTIVEMQ_PASSWORD=${BROKER_ACTIVEMQ_PASSWORD}
      - JOBAGENT_AUTH_SECRET=${AUTH_SECRET}
      - JOBAGENT_CONTAINER_COUNT=${JOBAGENT_CONTAINER_COUNT}
      - JOB_QUEUE=${JOB_QUEUE}
      - JOBAGENT_CONTAINER_MEMORYLIMIT=${JOBAGENT_CONTAINER_MEMORYLIMIT}
      - RAPIDMINER_JOBAGENT_OPTS=${RAPIDMINER_JOBAGENT_OPTS}
      - TZ=${TZ}
    volumes:
      - rm-server-bootstrap-ja-vol:/bootstrap.d
    depends_on:
      - rm-server-svc
    networks:
      rm-platform-int-net:
        aliases:
          - rm-server-job-agent-svc
  platform-admin-webui-svc:
    image: ${REGISTRY}rapidminer-platform-admin-webui:${PA_VERSION}
    hostname: platform-admin-webui-svc
    restart: always
    environment:
      - PA_URL_SUFFIX=${PA_URL_SUFFIX}
      - RTS_SCORING_URL_SUFFIX=${RTS_SCORING_URL_SUFFIX}
      - RTS_SCORING_BACKEND=${RTS_SCORING_BACKEND}
      - SSO_PUBLIC_URL=${SSO_PUBLIC_URL}
      - SSO_IDP_REALM=${SSO_IDP_REALM}
      - SSO_CLIENT_ID=${PA_SSO_CLIENT_ID}
      - SSO_CLIENT_SECRET=${PA_SSO_CLIENT_SECRET}
      - PA_DISABLE_PYTHON=${PA_DISABLE_PYTHON}
      - PA_DISABLE_RTS=${PA_DISABLE_RTS}
      - DEBUG=false
    volumes:
      - platform-admin-uploaded-vol:/var/www/html/uploaded/
    networks:
      jupyterhub-user-net:
        aliases:
          - platform-admin-webui-svc
      rm-platform-int-net:
        aliases:
          - platform-admin-webui-svc
  rm-radoop-proxy-svc:
    image: ${REGISTRY}radoop-proxy:${RADOOP_PROXY_VERSION}
    hostname: rm-radoop-proxy-svc
    restart: always
    environment:
      - AUTHENTICATION=${RADOOP_PROXY_AUTHENTICATION}
      - SUPERUSERNAME=${RADOOP_PROXY_SUPERUSERNAME}
      - SUPERUSERPASSWORD=${RADOOP_PROXY_SUPERUSERPASSWORD}
      - PORT=${RADOOP_PROXY_PORT}
      - WORKERSPOOLSIZE=${RADOOP_PROXY_WORKERSPOOLSIZE}
      - SSL=${RADOOP_PROXY_SSL}
      - SERVERHOST=${RAPIDMINER_SERVER_HOST}
      - SERVERPORT=${RAPIDMINER_SERVER_PORT}
    ports:
      - ${RADOOP_PROXY_PORT}:${RADOOP_PROXY_PORT}
  landing-page:
    image: ${REGISTRY}rapidminer-deployment-landing-page:${RM_LANDING_VERSION}
    restart: always
    hostname: landing-page
    environment:
      - SSO_PUBLIC_URL=${SSO_PUBLIC_URL}
      - SSO_IDP_REALM=${SSO_IDP_REALM}
      - SSO_CLIENT_ID=${LANDING_SSO_CLIENT_ID}
      - SSO_CLIENT_SECRET=${LANDING_SSO_CLIENT_SECRET}
      - DEBUG=false
    volumes:
      - rm-landing-page-vol:/var/www/html/uploaded/
      - deployed-services-vol:/rapidminer/deployed-services/
    networks:
      rm-platform-int-net:
        aliases:
          - landing-page
  rm-token-tool-svc:
    image: ${REGISTRY}rapidminer-deployment-landing-page:${RM_LANDING_VERSION}
    restart: always
    hostname: rm-token-tool
    environment:
      - PUBLIC_URL=${PUBLIC_URL}
      - SSO_PUBLIC_URL=${SSO_PUBLIC_URL}
      - SSO_IDP_REALM=${SSO_IDP_REALM}
      - SSO_CLIENT_ID=${TOKEN_SSO_CLIENT_ID}
      - SSO_CLIENT_SECRET=${TOKEN_SSO_CLIENT_SECRET}
      - DEBUG=false
      - SSO_CUSTOM_SCOPE=openid info offline_access
      - CUSTOM_URL_SUFFIX=${TOKEN_URL_SUFFIX}
      - CUSTOM_CONTENT=get-token
    volumes:
      - rm-token-tool-vol:/var/www/html/uploaded/
    networks:
      rm-platform-int-net:
        aliases:
          - rm-token-tool

volumes:
  rm-postgresql-vol:
  rm-server-bootstrap-vol:
  rm-server-home-vol:
  rm-server-bootstrap-ja-vol:
  platform-admin-uploaded-vol:
  keycloak-postgresql-vol:
  keycloak-admin-cli-vol:
  rm-landing-page-vol:
  rm-token-tool-vol:
  deployed-services-vol:

networks:
  rm-platform-int-net:
  rm-idp-db-net:
  rm-server-db-net: