RapidMiner Deployment documentation, version 9.7
Hadoop connectivity template
This template is very similar to the basic production template. Use it when you want to deploy RapidMiner processes that work with big data on a Hadoop cluster through RapidMiner Radoop. It includes the Radoop Proxy component, which simplifies network configuration when the Hadoop cluster is behind a firewall.
Use it to deploy RapidMiner AI Hub on a single host, with the following components:
- 1 RapidMiner AI Hub instance
- 3 RapidMiner Job Agents
- Postgres database
- Platform Admin
- Radoop Proxy
- 1 KeyCloak instance
For a detailed description of every Docker image, see the image reference.
System requirements
Minimum recommended hardware configuration
The amount of memory needed depends heavily on the amount of data that RapidMiner AI Hub will process. If most or all of the data is processed in the Hadoop environment using Radoop, then 16GB is enough for the Server. If non-Radoop processes will also run on the Server, increase the memory to 32GB or more, depending on the size of the user data.
Each virtual or physical machine should at least have:
- Quad core
- 16GB RAM
- >20GB free disk space
If you are using Docker Desktop for Windows (or Mac), please make sure that you have allocated enough memory. The default setting in Docker Desktop is too low for RapidMiner AI Hub.
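The minimums listed above can be turned into a quick pre-flight check. The following is a minimal sketch, not part of the RapidMiner tooling: the function name `check_host` and the constants are hypothetical, and the RAM value is passed in by hand because the Python standard library has no portable way to read it.

```python
import os
import shutil

# Minimums taken from the list above: quad core, 16GB RAM, more than 20GB free disk.
MIN_CORES = 4
MIN_RAM_GB = 16
MIN_FREE_DISK_GB = 20


def check_host(cores, ram_gb, free_disk_gb):
    """Return a list of problems; an empty list means the host meets the minimums."""
    problems = []
    if cores < MIN_CORES:
        problems.append(f"need at least {MIN_CORES} cores, found {cores}")
    if ram_gb < MIN_RAM_GB:
        problems.append(f"need at least {MIN_RAM_GB}GB RAM, found {ram_gb}GB")
    # The requirement is strictly more than 20GB of free disk space.
    if free_disk_gb <= MIN_FREE_DISK_GB:
        problems.append(
            f"need more than {MIN_FREE_DISK_GB}GB free disk, found {free_disk_gb:.1f}GB"
        )
    return problems


if __name__ == "__main__":
    cores = os.cpu_count() or 0
    free_gb = shutil.disk_usage("/").free / 1024**3
    # ram_gb is an assumed value here; replace it with the real figure for your host.
    for problem in check_host(cores, ram_gb=16, free_disk_gb=free_gb):
        print(problem)
```

On Docker Desktop, the figures to check are the ones allocated to the Docker VM, not those of the physical machine.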
Instructions
You will need the following two files, included in the ZIP file in step (1).
Proceed as follows:
Remember to set the variables PUBLIC_URL and SSO_PUBLIC_URL in the .env file.
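Before starting the stack, it can help to confirm that both variables are actually set to usable URLs. The sketch below is an illustration only: `parse_env` and `missing_or_invalid_urls` are hypothetical helper names, and the parser handles only plain KEY=VALUE lines and `#` comments, which is a simplification of what Docker Compose accepts.

```python
from urllib.parse import urlparse

# The two variables the text above asks you to set.
REQUIRED_URL_VARS = ("PUBLIC_URL", "SSO_PUBLIC_URL")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_or_invalid_urls(env):
    """Return the names of required variables that are unset or not http(s) URLs."""
    bad = []
    for name in REQUIRED_URL_VARS:
        parsed = urlparse(env.get(name, ""))
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            bad.append(name)
    return bad


sample = """
# Public URL of the deployment
PUBLIC_URL=https://platform.rapidminer.com
SSO_PUBLIC_URL=
"""
print(missing_or_invalid_urls(parse_env(sample)))
```

To check a real file, read it with `open(".env").read()` and pass the text to `parse_env`.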
The environment file (.env)
# ############################################
#
# Global parameters
#
# ############################################

# Public URL of the deployment that will be used for external access
PUBLIC_URL=https://platform.rapidminer.com

# Public URL of the SSO endpoint that will be used for external access. In most cases it should be the same as the PUBLIC_URL
SSO_PUBLIC_URL=https://platform.rapidminer.com

# Enable/disable the service build into the RapidMiner cloud images, that updates the PUBLIC_URL and SSO_PUBLIC_URL variables to the new dynamic cloud hostname/IP address

# Enable/disable the Legacy BASIC authentication support for REST endpoints, like webservices.
LEGACY_REST_BASIC_AUTH_ENABLED=false

# Timezone setting
TZ=UTC

# ############################################
#
# Deployment parameters
#
# ############################################
REGISTRY=rapidminer/
INIT_VERSION=9.7

# ############################################
#
# KeyCloak (SSO)
#
# ############################################
KEYCLOAK_VERSION=9.7
KC_DB=kcdb
KC_USER=kcdbuser
KC_PASSWORD=kcdbpass
KEYCLOAK_USER=admin
KEYCLOAK_PASSWORD=changeit
SSO_IDP_REALM=master
SSO_SSL_REQUIRED=external
USER_MIGRATION_ENABLED=True
USER_MIGRATION_DRY_RUN=False

# ############################################
#
# Rapidminer server
#
# ############################################
SERVER_VERSION=9.7
SERVER_DBHOST=rm-postgresql-svc
SERVER_DBSCHEMA=rapidminer-server-db
SERVER_DBUSER=rmserver-db-user
SERVER_DBPASS=w61J784XSb24K4LRV97MbE16i8xa9O
SERVER_MAX_MEMORY=2048M
RMSERVER_SSO_CLIENT_ID=urn:rapidminer:server
RMSERVER_SSO_CLIENT_SECRET=
RAPIDMINER_SERVER_HOST=rm-server-svc
RAPIDMINER_SERVER_PORT=8080
RAPIDMINER_SERVER_URL=http://rm-server-svc:8080
AUTH_SECRET=TTY5MjUxbzRBN2ZIWThpNGVKNGo4V2xqOHk0dTNV
BROKER_ACTIVEMQ_USERNAME=amq-user
BROKER_ACTIVEMQ_PASSWORD=M69251o4A7fHY8i4eJ4j8Wlj8y4u3U

# ############################################
#
# Job Agent
#
# ############################################
JOBAGENT_QUEUE_ACTIVEMQ_URI=failover:(tcp://rm-server-svc:5672)
JOBAGENT_CONTAINER_COUNT=2
JOB_QUEUE=DEFAULT
JOBAGENT_CONTAINER_MEMORYLIMIT=2048
RAPIDMINER_JOBAGENT_OPTS="-Djobagent.python.registryBaseUrl=http://platform-admin-webui-svc:82/"
RAPIDMINER_SERVER_PROTOCOL=http

# ############################################
#
# Proxy
#
# ############################################
PROXY_VERSION=9.7
JUPYTER_BACKEND=http://rm-jupyterhub-svc:8000
JUPYTER_URL_SUFFIX=/jupyter
GRAFANA_BACKEND=http://rm-grafana-svc:3000
GRAFANA_URL_SUFFIX=/grafana
PA_BACKEND=http://platform-admin-webui-svc:82/
PA_URL_SUFFIX=/platform-admin
RTS_WEBUI_BACKEND=http://platform-admin-webui-svc:82/
RTS_WEBUI_URL_SUFFIX=/rts-admin
RTS_SCORING_BACKEND=http://rts-agent-svc:8090/
RTS_SCORING_URL_SUFFIX=/rts
KEYCLOAK_BACKEND=http://rm-keycloak-svc:8080
LANDING_BACKEND=http://landing-page
TOKEN_BACKEND=http://rm-token-tool-svc
TOKEN_URL_SUFFIX=/get-token
HTTPS_CRT_PATH=/etc/nginx/ssl/certificate.crt
HTTPS_KEY_PATH=/etc/nginx/ssl/private.key
HTTPS_DH_PATH=/etc/nginx/ssl/dhparam.pem

# ############################################
#
# Radoop Proxy
#
# ############################################
RADOOP_PROXY_VERSION=1.2.1
# Authentication: 'server|jwt|superuser'
RADOOP_PROXY_AUTHENTICATION=superuser
RADOOP_PROXY_SUPERUSERNAME=proxyadmin
RADOOP_PROXY_SUPERUSERPASSWORD=changeit
RADOOP_PROXY_PORT=1081
RADOOP_PROXY_WORKERSPOOLSIZE=100
RADOOP_PROXY_SSL="off"

# ############################################
#
# Platform admin
#
# ############################################
PA_VERSION=9.7
PA_SSO_CLIENT_ID=urn:rapidminer:platform-admin
PA_SSO_CLIENT_SECRET=
PA_DISABLE_PYTHON=false
PA_DISABLE_RTS=false

# ############################################
#
# Landing page
#
# ############################################
RM_LANDING_VERSION=9.7
LANDING_SSO_CLIENT_ID=urn:rapidminer:landing-page
LANDING_SSO_CLIENT_SECRET=

# ############################################
#
# Token Tool
#
# ############################################
TOKEN_SSO_CLIENT_ID=urn:rapidminer:token-tool
TOKEN_SSO_CLIENT_SECRET=
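Two passwords in the environment file above ship with the value `changeit`. Treating that value as a placeholder to replace before deployment is an assumption drawn from the name, not something the template states. Under that assumption, a short sketch such as the following could list the affected variables; `placeholder_settings` is a hypothetical helper, not part of any RapidMiner tool.

```python
# Assumed placeholder value, as it appears in the template's .env file.
PLACEHOLDER = "changeit"


def placeholder_settings(env_text):
    """Return the names of variables whose value is still the placeholder."""
    flagged = []
    for raw in env_text.splitlines():
        line = raw.strip()
        # Skip comments and anything that is not a KEY=VALUE line.
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if value.strip().strip('"') == PLACEHOLDER:
            flagged.append(key.strip())
    return flagged


sample = """
KEYCLOAK_USER=admin
KEYCLOAK_PASSWORD=changeit
RADOOP_PROXY_SUPERUSERNAME=proxyadmin
RADOOP_PROXY_SUPERUSERPASSWORD=changeit
"""
print(placeholder_settings(sample))
# prints ['KEYCLOAK_PASSWORD', 'RADOOP_PROXY_SUPERUSERPASSWORD']
```

The empty `*_SSO_CLIENT_SECRET` entries are deliberately not flagged, because the template does not say whether they are meant to be filled in by hand.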
The docker-compose definition (docker-compose.yml)
version: '3'
services:
  rm-proxy-svc:
    image: ${REGISTRY}rapidminer-proxy:${PROXY_VERSION}
    hostname: rm-proxy-svc
    restart: always
    environment:
      - KEYCLOAK_BACKEND=${KEYCLOAK_BACKEND}
      - RMSERVER_BACKEND=${RAPIDMINER_SERVER_URL}
      - JUPYTER_BACKEND=${JUPYTER_BACKEND}
      - JUPYTER_URL_SUFFIX=${JUPYTER_URL_SUFFIX}
      - GRAFANA_BACKEND=${GRAFANA_BACKEND}
      - GRAFANA_URL_SUFFIX=${GRAFANA_URL_SUFFIX}
      - PA_BACKEND=${PA_BACKEND}
      - PA_URL_SUFFIX=${PA_URL_SUFFIX}
      - TOKEN_BACKEND=${TOKEN_BACKEND}
      - TOKEN_URL_SUFFIX=${TOKEN_URL_SUFFIX}
      - RTS_WEBUI_BACKEND=${RTS_WEBUI_BACKEND}
      - RTS_WEBUI_URL_SUFFIX=${RTS_WEBUI_URL_SUFFIX}
      - RTS_SCORING_BACKEND=${RTS_SCORING_BACKEND}
      - RTS_SCORING_URL_SUFFIX=${RTS_SCORING_URL_SUFFIX}
      - SSO_PUBLIC_URL=${SSO_PUBLIC_URL}
      - SSO_IDP_REALM=${SSO_IDP_REALM}
      - RTS_BASIC_AUTH=true
      - LANDING_BACKEND=${LANDING_BACKEND}
      - HTTPS_CRT_PATH=${HTTPS_CRT_PATH}
      - HTTPS_KEY_PATH=${HTTPS_KEY_PATH}
      - HTTPS_DH_PATH=${HTTPS_DH_PATH}
    ports:
      - 80:80
      - 443:443
    networks:
      rm-platform-int-net:
        aliases:
          - rm-proxy-svc
    volumes:
      - ./ssl:/etc/nginx/ssl
      - platform-admin-uploaded-vol:/rapidminer/platform-admin/uploaded/
  rm-keycloak-db-svc:
    image: postgres:9.6
    restart: always
    hostname: rm-keycloak-db-svc
    environment:
      - POSTGRES_DB=${KC_DB}
      - POSTGRES_USER=${KC_USER}
      - POSTGRES_PASSWORD=${KC_PASSWORD}
    volumes:
      - keycloak-postgresql-vol:/var/lib/postgresql/data
    networks:
      rm-idp-db-net:
        aliases:
          - rm-keycloak-db-svc
  rm-keycloak-svc:
    image: ${REGISTRY}rapidminer-keycloak:${KEYCLOAK_VERSION}
    restart: always
    hostname: rm-keycloak-svc
    environment:
      - DB_VENDOR=POSTGRES
      - DB_ADDR=rm-keycloak-db-svc
      - DB_DATABASE=${KC_DB}
      - DB_USER=${KC_USER}
      - DB_SCHEMA=public
      - DB_PASSWORD=${KC_PASSWORD}
      - KEYCLOAK_USER=${KEYCLOAK_USER}
      - KEYCLOAK_PASSWORD=${KEYCLOAK_PASSWORD}
      - PROXY_ADDRESS_FORWARDING=true
    depends_on:
      - rm-keycloak-db-svc
      - rm-proxy-svc
    networks:
      rm-platform-int-net:
        aliases:
          - rm-keycloak-svc
      rm-idp-db-net:
        aliases:
          - rm-keycloak-svc
  rm-init-svc:
    image: ${REGISTRY}rapidminer-deployment-init:${INIT_VERSION}
    restart: 'no'
    hostname: rm-keycloak-init-svc
    depends_on:
      - rm-keycloak-svc
      - rm-postgresql-svc
    environment:
      - LEGACY_REST_BASIC_AUTH_ENABLED=${LEGACY_REST_BASIC_AUTH_ENABLED}
      - PUBLIC_URL=${PUBLIC_URL}
      - SSO_PUBLIC_URL=${SSO_PUBLIC_URL}
    volumes:
      - ./.env:/.env
      - ./docker-compose.yml:/docker-compose.yml:ro
      - keycloak-admin-cli-vol:/root/.keycloak/
      - deployed-services-vol:/rapidminer/deployed-services/
    networks:
      rm-platform-int-net:
        aliases:
          - rm-init-svc
      rm-server-db-net:
        aliases:
          - rm-init-svc
  rm-postgresql-svc:
    image: postgres:9.6
    hostname: rm-postgresql-svc
    restart: always
    environment:
      - POSTGRES_DB=${SERVER_DBSCHEMA}
      - POSTGRES_USER=${SERVER_DBUSER}
      - POSTGRES_PASSWORD=${SERVER_DBPASS}
    volumes:
      - rm-postgresql-vol:/var/lib/postgresql/data
    networks:
      rm-server-db-net:
        aliases:
          - rm-postgresql-svc
  rm-server-svc:
    image: ${REGISTRY}rapidminer-server:${SERVER_VERSION}
    hostname: rm-server-svc
    restart: always
    environment:
      - PA_BASE_URL=${PA_BACKEND}
      - PA_SYNC_DEBUG=False
      - DBHOST=${SERVER_DBHOST}
      - DBSCHEMA=${SERVER_DBSCHEMA}
      - DBUSER=${SERVER_DBUSER}
      - DBPASS=${SERVER_DBPASS}
      - SSO_PUBLIC_URL=${SSO_PUBLIC_URL}
      - SSO_IDP_REALM=${SSO_IDP_REALM}
      - SSO_CLIENT_ID=${RMSERVER_SSO_CLIENT_ID}
      - SSO_CLIENT_SECRET=${RMSERVER_SSO_CLIENT_SECRET}
      - SSO_SSL_REQUIRED=${SSO_SSL_REQUIRED}
      - LEGACY_REST_BASIC_AUTH_ENABLED=${LEGACY_REST_BASIC_AUTH_ENABLED}
      - SERVER_MAX_MEMORY=${SERVER_MAX_MEMORY}
      - BROKER_ACTIVEMQ_USERNAME=${BROKER_ACTIVEMQ_USERNAME}
      - BROKER_ACTIVEMQ_PASSWORD=${BROKER_ACTIVEMQ_PASSWORD}
      - JOBSERVICE_AUTH_SECRET=${AUTH_SECRET}
      - JUPYTER_URL_SUFFIX=${JUPYTER_URL_SUFFIX}
      - GRAFANA_URL_SUFFIX=${GRAFANA_URL_SUFFIX}
      - TZ=${TZ}
    volumes:
      - rm-server-bootstrap-vol:/bootstrap.d
      - rm-server-home-vol:/persistent-rapidminer-home
    depends_on:
      - rm-postgresql-svc
    networks:
      rm-platform-int-net:
        aliases:
          - rm-server-svc
      rm-server-db-net:
        aliases:
          - rm-server-svc
  rm-server-job-agent-svc:
    image: ${REGISTRY}rapidminer-execution-jobagent:${SERVER_VERSION}
    hostname: rm-server-job-agent-svc
    restart: always
    environment:
      - RAPIDMINER_SERVER_HOST=${RAPIDMINER_SERVER_HOST}
      - RAPIDMINER_SERVER_PROTOCOL=${RAPIDMINER_SERVER_PROTOCOL}
      - RAPIDMINER_SERVER_PORT=${RAPIDMINER_SERVER_PORT}
      - JOBAGENT_QUEUE_ACTIVEMQ_URI=${JOBAGENT_QUEUE_ACTIVEMQ_URI}
      - JOBAGENT_QUEUE_ACTIVEMQ_USERNAME=${BROKER_ACTIVEMQ_USERNAME}
      - JOBAGENT_QUEUE_ACTIVEMQ_PASSWORD=${BROKER_ACTIVEMQ_PASSWORD}
      - JOBAGENT_AUTH_SECRET=${AUTH_SECRET}
      - JOBAGENT_CONTAINER_COUNT=${JOBAGENT_CONTAINER_COUNT}
      - JOB_QUEUE=${JOB_QUEUE}
      - JOBAGENT_CONTAINER_MEMORYLIMIT=${JOBAGENT_CONTAINER_MEMORYLIMIT}
      - RAPIDMINER_JOBAGENT_OPTS=${RAPIDMINER_JOBAGENT_OPTS}
      - TZ=${TZ}
    volumes:
      - rm-server-bootstrap-ja-vol:/bootstrap.d
    depends_on:
      - rm-server-svc
    networks:
      rm-platform-int-net:
        aliases:
          - rm-server-job-agent-svc
  platform-admin-webui-svc:
    image: ${REGISTRY}rapidminer-platform-admin-webui:${PA_VERSION}
    hostname: platform-admin-webui-svc
    restart: always
    environment:
      - PA_URL_SUFFIX=${PA_URL_SUFFIX}
      - RTS_SCORING_URL_SUFFIX=${RTS_SCORING_URL_SUFFIX}
      - RTS_SCORING_BACKEND=${RTS_SCORING_BACKEND}
      - SSO_PUBLIC_URL=${SSO_PUBLIC_URL}
      - SSO_IDP_REALM=${SSO_IDP_REALM}
      - SSO_CLIENT_ID=${PA_SSO_CLIENT_ID}
      - SSO_CLIENT_SECRET=${PA_SSO_CLIENT_SECRET}
      - PA_DISABLE_PYTHON=${PA_DISABLE_PYTHON}
      - PA_DISABLE_RTS=${PA_DISABLE_RTS}
      - DEBUG=false
    volumes:
      - platform-admin-uploaded-vol:/var/www/html/uploaded/
    networks:
      jupyterhub-user-net:
        aliases:
          - platform-admin-webui-svc
      rm-platform-int-net:
        aliases:
          - platform-admin-webui-svc
  rm-radoop-proxy-svc:
    image: ${REGISTRY}radoop-proxy:${RADOOP_PROXY_VERSION}
    hostname: rm-radoop-proxy-svc
    restart: always
    environment:
      - AUTHENTICATION=${RADOOP_PROXY_AUTHENTICATION}
      - SUPERUSERNAME=${RADOOP_PROXY_SUPERUSERNAME}
      - SUPERUSERPASSWORD=${RADOOP_PROXY_SUPERUSERPASSWORD}
      - PORT=${RADOOP_PROXY_PORT}
      - WORKERSPOOLSIZE=${RADOOP_PROXY_WORKERSPOOLSIZE}
      - SSL=${RADOOP_PROXY_SSL}
      - SERVERHOST=${RAPIDMINER_SERVER_HOST}
      - SERVERPORT=${RAPIDMINER_SERVER_PORT}
    ports:
      - ${RADOOP_PROXY_PORT}:${RADOOP_PROXY_PORT}
  landing-page:
    image: ${REGISTRY}rapidminer-deployment-landing-page:${RM_LANDING_VERSION}
    restart: always
    hostname: landing-page
    environment:
      - SSO_PUBLIC_URL=${SSO_PUBLIC_URL}
      - SSO_IDP_REALM=${SSO_IDP_REALM}
      - SSO_CLIENT_ID=${LANDING_SSO_CLIENT_ID}
      - SSO_CLIENT_SECRET=${LANDING_SSO_CLIENT_SECRET}
      - DEBUG=false
    volumes:
      - rm-landing-page-vol:/var/www/html/uploaded/
      - deployed-services-vol:/rapidminer/deployed-services/
    networks:
      rm-platform-int-net:
        aliases:
          - landing-page
  rm-token-tool-svc:
    image: ${REGISTRY}rapidminer-deployment-landing-page:${RM_LANDING_VERSION}
    restart: always
    hostname: rm-token-tool
    environment:
      - PUBLIC_URL=${PUBLIC_URL}
      - SSO_PUBLIC_URL=${SSO_PUBLIC_URL}
      - SSO_IDP_REALM=${SSO_IDP_REALM}
      - SSO_CLIENT_ID=${TOKEN_SSO_CLIENT_ID}
      - SSO_CLIENT_SECRET=${TOKEN_SSO_CLIENT_SECRET}
      - DEBUG=false
      - SSO_CUSTOM_SCOPE=openid info offline_access
      - CUSTOM_URL_SUFFIX=${TOKEN_URL_SUFFIX}
      - CUSTOM_CONTENT=get-token
    volumes:
      - rm-token-tool-vol:/var/www/html/uploaded/
    networks:
      rm-platform-int-net:
        aliases:
          - rm-token-tool
volumes:
  rm-postgresql-vol:
  rm-server-bootstrap-vol:
  rm-server-home-vol:
  rm-server-bootstrap-ja-vol:
  platform-admin-uploaded-vol:
  keycloak-postgresql-vol:
  keycloak-admin-cli-vol:
  rm-landing-page-vol:
  rm-token-tool-vol:
  deployed-services-vol:
networks:
  rm-platform-int-net:
  rm-idp-db-net:
  rm-server-db-net:
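The docker-compose definition refers to the environment file through `${NAME}` placeholders, which Docker Compose fills in from `.env` when it reads the file. The sketch below only approximates that substitution with Python's `string.Template` to show how a few values from this template resolve. It is not how Compose is implemented, it does not cover Compose features such as default values, and the `resolve` helper is a hypothetical name.

```python
from string import Template

# A few values copied from the .env file of this template.
env = {
    "REGISTRY": "rapidminer/",
    "PROXY_VERSION": "9.7",
    "RADOOP_PROXY_VERSION": "1.2.1",
    "RADOOP_PROXY_PORT": "1081",
}


def resolve(compose_value, variables):
    """Replace ${NAME} placeholders; raises KeyError if a variable is missing."""
    return Template(compose_value).substitute(variables)


# Image reference of the proxy service.
print(resolve("${REGISTRY}rapidminer-proxy:${PROXY_VERSION}", env))
# prints rapidminer/rapidminer-proxy:9.7

# Image reference and port mapping of the Radoop Proxy service.
print(resolve("${REGISTRY}radoop-proxy:${RADOOP_PROXY_VERSION}", env))
# prints rapidminer/radoop-proxy:1.2.1
print(resolve("${RADOOP_PROXY_PORT}:${RADOOP_PROXY_PORT}", env))
# prints 1081:1081
```

Because `REGISTRY` ends with a slash and the image names follow it directly, changing `REGISTRY` in `.env` is enough to pull every RapidMiner image from a different registry prefix.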