You are viewing the RapidMiner Deployment documentation for version 9.6.

Docker image for RapidMiner Reverse Proxy

The documentation below describes RapidMiner Reverse Proxy, which is a generic reverse proxy for the following components:

  • RapidMiner Server
  • Jupyterhub
  • Dashboards
  • Real Time Scoring WebUI
  • Python Environment Manager

RapidMiner Reverse Proxy

This container provides the RapidMiner Reverse Proxy component.

With this proxy:

  • you only have to open a single HTTPS port (HTTP also works, but is not recommended) and access the different services under different suffixes (e.g. https://domain.name/grafana); the root suffix is used for RapidMiner Server
  • suffix names can be customized
  • backends can be configured; if you do not have a particular service, you can set its backend to an empty value (in this case, the RapidMiner Server backend will be used). If a backend is not accessible, you will receive a "502 Bad Gateway" error page
  • you provide your HTTPS certificate, key, and dhparam file; these files must be accessible inside the container (e.g. bind mounted, or stored in a persistent volume), and the environment variables below only specify the paths to these files
  • you get improved security settings

For available versions, please see the tags on Docker Hub.

Configuration

The proxy can be configured with the following environment variables:

  • RMSERVER_BACKEND: sets the URL of the RapidMiner Server backend on the internal network. Default value: http://rm-server-svc:8080
  • JUPYTER_BACKEND: sets the URL of the Jupyterhub backend on the internal network. Default value: http://rm-jupyterhub-svc:8000
  • JUPYTER_URL_SUFFIX: sets the suffix under which the Jupyterhub service will be accessible to users (e.g. https://domain.name/jupyter). Default value: /jupyter
  • GRAFANA_BACKEND: sets the URL of the Dashboards backend on the internal network. Default value: http://rm-grafana-svc:3000
  • GRAFANA_URL_SUFFIX: sets the suffix under which the Dashboards service will be accessible to users (e.g. https://domain.name/grafana). Default value: /grafana
  • RTS_WEBUI_BACKEND: sets the URL of the RapidMiner Real-Time Scoring WebUI backend on the internal network. Default value: http://rts-webui-svc:80/
  • RTS_WEBUI_SUFFIX: sets the suffix under which the RapidMiner Real-Time Scoring WebUI service will be accessible to users (e.g. https://domain.name/rts-admin). Default value: /rts-admin
  • RTS_SCORING_BACKEND: sets the URL of the RapidMiner Real-Time Scoring Agent backend on the internal network. Default value: http://rts-agent-svc:8090/
  • RTS_SCORING_SUFFIX: sets the suffix under which the RapidMiner Real-Time Scoring Agent service will be accessible to users (e.g. https://domain.name/rts). Default value: /rts
  • PEM_BACKEND: sets the URL of the RapidMiner Python Environment Manager backend on the internal network. Default value: http://pem-webui-svc:80/
  • PEM_URL_SUFFIX: sets the suffix under which the RapidMiner Python Environment Manager service will be accessible to users (e.g. https://domain.name/python-admin). Default value: /python-admin
  • HTTPS_CRT_PATH: sets the path to the SSL certificate inside the container (full path with filename, e.g. /etc/nginx/ssl/domain.name.crt). Default value: ""
  • HTTPS_KEY_PATH: sets the path to the SSL key inside the container (full path with filename, e.g. /etc/nginx/ssl/domain.name.key). Default value: ""
  • HTTPS_DH_PATH: sets the path to the dhparam file inside the container (full path with filename, e.g. /etc/nginx/ssl/dhparam.pem). Default value: ""
  • DEBUG_CONF_INIT: switches the proxy into a more verbose mode. Default value: false
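
As an illustration, the variables above could be set in a Docker Compose service definition along the following lines. This is a minimal sketch, not an official deployment file: the service name rm-proxy-svc, the image placeholder, and the customized /python-envs suffix are assumptions made for the example, while the variable names and default values are the ones listed above.

```yaml
services:
  rm-proxy-svc:                        # hypothetical service name
    image: "<proxy-image>:<tag>"       # placeholder: use the image and tag published on Docker Hub
    environment:
      # Backends on the internal network (documented default values)
      RMSERVER_BACKEND: http://rm-server-svc:8080
      JUPYTER_BACKEND: http://rm-jupyterhub-svc:8000
      JUPYTER_URL_SUFFIX: /jupyter
      # No Dashboards service in this example deployment: an empty backend
      # makes the proxy use the RapidMiner Server backend instead
      GRAFANA_BACKEND: ""
      # Customized suffix (hypothetical value) instead of the default /python-admin
      PEM_BACKEND: http://pem-webui-svc:80/
      PEM_URL_SUFFIX: /python-envs
      # Boolean-like values are quoted so Compose passes them as strings
      DEBUG_CONF_INIT: "false"
```

Variables that are left out keep their default values.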

HTTPS

If HTTPS_CRT_PATH, HTTPS_KEY_PATH, and HTTPS_DH_PATH are all set to valid paths, all HTTP traffic will be redirected to HTTPS; otherwise the proxy will serve HTTP only.

To store these files persistently, a bind mount or a volume can be used. We suggest mounting this storage in read-only mode, for example:

volumes:
- ./ssl:/etc/nginx/ssl:ro

With the bind mount above in place, copy your files into the ssl folder on the host machine, then set the HTTPS_CRT_PATH, HTTPS_KEY_PATH, and HTTPS_DH_PATH variables to point to your files inside the /etc/nginx/ssl folder (e.g. /etc/nginx/ssl/domain.name.crt, /etc/nginx/ssl/domain.name.key, /etc/nginx/ssl/dhparam.pem).
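
Putting the bind mount and the three path variables together, an HTTPS-enabled service definition might look like the sketch below. The service name rm-proxy-svc and the image placeholder are assumptions for illustration; the file names follow the examples given above. Published ports are deliberately left out, because the ports the container listens on are not stated in this document.

```yaml
services:
  rm-proxy-svc:                        # hypothetical service name
    image: "<proxy-image>:<tag>"       # placeholder: use the image and tag published on Docker Hub
    environment:
      # All three paths must be valid inside the container,
      # otherwise the proxy serves HTTP only
      HTTPS_CRT_PATH: /etc/nginx/ssl/domain.name.crt
      HTTPS_KEY_PATH: /etc/nginx/ssl/domain.name.key
      HTTPS_DH_PATH: /etc/nginx/ssl/dhparam.pem
    volumes:
      # Host folder ./ssl holds the certificate, key, and dhparam file; mounted read-only
      - ./ssl:/etc/nginx/ssl:ro
```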

The DH parameter file can be generated with the following command:

openssl dhparam -out ./ssl/dhparam.pem 4096

We suggest using DH parameters of at least 4096 bits.

This step can take some time to finish, depending on the computer running the command.

Data persistence

Some services use Basic Auth for access control; their credentials are stored encrypted in .htpasswd files. The proxy needs read access to these files to enforce the access control, which can be achieved with volume mounts:

volumes:
- pem-uploaded-vol:/rapidminer/pem/uploaded/:ro
- rts-uploaded-vol:/rapidminer/rts/uploaded/:ro

If you don't have RapidMiner Python Environment Manager or RapidMiner Real-Time Scoring, you can omit the corresponding volume.

The complete volume definition looks like this:

volumes:
- ./ssl:/etc/nginx/ssl:ro
- pem-uploaded-vol:/rapidminer/pem/uploaded/:ro
- rts-uploaded-vol:/rapidminer/rts/uploaded/:ro
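
In a Docker Compose file, named volumes such as pem-uploaded-vol and rts-uploaded-vol also have to be declared under the top-level volumes key, whereas the ./ssl bind mount needs no declaration. The following sketch shows where each part could go; the service name rm-proxy-svc and the image placeholder are assumptions for illustration, and the named volumes are presumably the same ones mounted by the Python Environment Manager and Real-Time Scoring services that write the .htpasswd files.

```yaml
services:
  rm-proxy-svc:                        # hypothetical service name
    image: "<proxy-image>:<tag>"       # placeholder: use the image and tag published on Docker Hub
    volumes:
      - ./ssl:/etc/nginx/ssl:ro                          # bind mount, read-only
      - pem-uploaded-vol:/rapidminer/pem/uploaded/:ro    # named volume, read-only
      - rts-uploaded-vol:/rapidminer/rts/uploaded/:ro    # named volume, read-only

# Top-level declaration of the named volumes used above
volumes:
  pem-uploaded-vol:
  rts-uploaded-vol:
```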

Access and error logs

The proxy container forwards all logs to the container log, which can be handled as in any container-based deployment; log handling is out of scope for this document.