Docker image for RapidMiner Server

The documentation below describes the following Docker image:

  • RapidMiner Server 9.3 (rapidminer/rapidminer-server:9.3.0)

For Docker images, see Docker Hub.

Description

This is a fully functional RapidMiner Server image.

For available versions, please see the tags on Docker Hub.

Database connection

The image provides two different database connection methods:

  • For quick deployments, there is an embedded database, but it is not recommended for production
  • For production, use an external database to ensure data persistence

Environment variables

The following parameters are available:

  • EMBEDDED_DATABASE: set this to "1" to start the embedded PostgreSQL database server in the container. Data persistence is not supported for the embedded database server.
  • BUNDLED_JOB_AGENT: set this to "1" in order to start the bundled Job Agent
  • DBHOST, DBUSER, DBPASS, DBSCHEMA: set these variables to configure RapidMiner Server to use an external PostgreSQL database. If the provided database is empty (it contains no tables), it will be initialized as a new RapidMiner Server database.
  • BROKER_ACTIVEMQ_USERNAME, BROKER_ACTIVEMQ_PASSWORD and JOBSERVICE_AUTH_SECRET: set these variables to define the required authentication secret and ActiveMQ credentials that should be used in the Job Agents
  • JOBAGENT_QUEUE_ACTIVEMQ_USERNAME, JOBAGENT_QUEUE_ACTIVEMQ_PASSWORD, JOBAGENT_AUTH_SECRET: set these variables to configure the bundled Job Agents' connection credentials
  • INTERACTIVE_MODE: setting this variable to "1" will start an interactive bash shell without starting the RapidMiner Server process. The server can be configured, plugins can be installed and afterwards the RapidMiner Server process can be started manually.
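
As an alternative to repeating -e flags, the variables above can be kept in a file and passed to docker run with its --env-file option. The following is a minimal sketch, not part of the image: the function name write_db_env_file and the file name rapidminer.env are made up for illustration.

```shell
#!/bin/sh
# Sketch: write the external-database variables to an env file.
# Arguments: $1 = target file, $2 = DBHOST, $3 = DBSCHEMA, $4 = DBUSER, $5 = DBPASS
# The function body runs in a subshell so that umask does not leak to the caller.
write_db_env_file() (
    umask 077   # a newly created file is readable by the owner only
    printf 'DBHOST=%s\nDBSCHEMA=%s\nDBUSER=%s\nDBPASS=%s\n' \
        "$2" "$3" "$4" "$5" > "$1"
)

# Possible usage (replace the bracketed terms with your own values):
#   write_db_env_file rapidminer.env '<ip.address.of.pgsql>' '<db-name>' '<user-name>' '<password>'
#   docker run --env-file rapidminer.env -p 8080:8080 rapidminer/rapidminer-server:9.3.0
```

Docker reads each line of the env file as VAR=VALUE and takes the value literally, so the values must not be wrapped in quotes.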

Data persistence

The RapidMiner home directory stores all the data and configuration connected with the RapidMiner Server image.

To make this data persistent, make sure to start the container with a volume mounted on the mount point /persistent-rapidminer-home, as indicated by the -v option to docker run in the examples below, or the volumes option in docker-compose.yml.

  • If the mounted volume is empty, then a default configuration and data content will be propagated to it for use by RapidMiner Server.
  • If the volume contains data from any previous executions, then the server will be started with that data.

This volume will contain all the configuration files, extensions, licenses, logs and repository data. After the first execution (with a mounted empty volume), the following data can be edited:

  • Extensions can be installed by adding them to the folder <volume>/resources/extensions
  • Licenses can be installed in <volume>/resources/licenses
  • The configuration can be tuned via files stored in <volume>/configuration

Good to know

  • RapidMiner Server requires at least 8GB of memory. On Windows hosts, please make sure that the Docker Engine is configured to run with enough memory.
  • RapidMiner Server listens on port 8080, as shown in the examples below. You can connect to it at http://localhost:8080 or through any other network interface of the host.
  • The default login credentials are admin/changeit.
  • A bundled Job Agent is included in the image for testing purposes, but data persistence is not supported for it. For production use, it is highly recommended to define queues and have external Job Agent containers connect to them. The values of the required JOBAGENT_AUTH_SECRET, JOBAGENT_QUEUE_ACTIVEMQ_USERNAME and JOBAGENT_QUEUE_ACTIVEMQ_PASSWORD parameters are printed to the console during RapidMiner Server container startup.
  • JOBAGENT_AUTH_SECRET and JOBSERVICE_AUTH_SECRET values are Base64 encoded strings.
  • The examples use the folders /PATH/TO/PGSQL/HOME for persistent PostgreSQL data and /PATH/TO/RAPIDMINER/HOME for persistent rapidminer-home data. The same folder (/PATH/TO/RAPIDMINER/HOME) is used to mount licenses for the Job Agent.
  • The licenses mount point should be a standard RapidMiner licenses folder, containing the license files in subfolders named rapidminer-server, rapidminer-studio, radoop.
  • To mount volumes on a Windows system you should pay attention to the Windows-specific Docker volume mount settings:
    • Make sure the drive is shared in the Docker settings
    • If using docker-compose, consider setting the environment variable "COMPOSE_CONVERT_WINDOWS_PATHS=1"
    • Make sure that Docker can read and write to the mounted files and folders
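
Since JOBAGENT_AUTH_SECRET and JOBSERVICE_AUTH_SECRET are Base64 encoded strings, a value can be produced on the command line. The sketch below assumes that the base64 utility (for example from GNU coreutils) is available. The function names are hypothetical, and the 32-byte length of the random secret is an arbitrary choice, since no required length is stated here.

```shell
#!/bin/sh
# Sketch: produce Base64 strings usable as JOBSERVICE_AUTH_SECRET / JOBAGENT_AUTH_SECRET.

# Encode a passphrase of your choice. printf '%s' avoids the trailing
# newline that echo would add to the encoded data.
encode_secret() {
    printf '%s' "$1" | base64 | tr -d '\n'
}

# Generate a random secret from 32 bytes of /dev/urandom (length is an assumption).
random_secret() {
    head -c 32 /dev/urandom | base64 | tr -d '\n'
}

# Possible usage:
#   encode_secret 'some-auth-secret'
#   random_secret
```

For the passphrase some-auth-secret, encode_secret prints c29tZS1hdXRoLXNlY3JldA==. The placeholder value used in the docker-compose example below is the same passphrase encoded together with a trailing newline.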

Examples

In the following scripts, every term in <angle brackets> must be replaced with a value that you define and that is specific to your configuration.

Startup examples

  1. Start the container using the embedded database, the bundled Job Agent, and an in-container rapidminer-home, without any data persistence:

     docker run \
            -e EMBEDDED_DATABASE=1 \
            -e BUNDLED_JOB_AGENT=1 \
            -p 8080:8080 \
            rapidminer/rapidminer-server:9.3.0
    
  2. Start the container using an external database and an external rapidminer-home directory. Use any existing PostgreSQL server or start a new one based on the PostgreSQL Docker image. An example startup command is provided here. For details, please check the postgres page on Docker Hub.

     docker run \
            -d \
            -v </PATH/TO/PGSQL/HOME>:/var/lib/postgresql/data \
            -e POSTGRES_DB=<db-name> \
            -e POSTGRES_USER=<user-name> \
            -e POSTGRES_PASSWORD=<password> \
            postgres:9.6
    

    Once a database is up and running (either an external server or a Docker container), RapidMiner Server can be started:

     docker run \
            -e DBHOST=<ip.address.of.pgsql> \
            -e DBSCHEMA=<db-name> \
            -e DBUSER=<user-name> \
            -e DBPASS=<password> \
            -v </PATH/TO/RAPIDMINER/HOME>:/persistent-rapidminer-home \
            -p 8080:8080 \
            rapidminer/rapidminer-server:9.3.0
    

    The provided database connection parameters will be stored in standalone.xml in the RapidMiner Home folder. If a persistent RapidMiner Home volume is used, then the database-related startup parameters are not needed after the first execution.

  3. Start the container using an interactive shell and the embedded database, for debugging purposes:

     docker run \
            -i -t \
            -e EMBEDDED_DATABASE=1 \
            -e INTERACTIVE_MODE=1 \
            -p 8080:8080 \
            rapidminer/rapidminer-server:9.3.0
    

    You can start or stop the RapidMiner Server process manually using the following command:

     /etc/init.d/rapidminer-server {start|stop}
    

    You can start or stop the RapidMiner Job Agent process manually using the following command:

     /etc/init.d/rapidminer-job-agent {start|stop}
    

    Detailed logs are placed in /rapidminer-home/log and /opt/rapidminer-server/job-agent/home/log.
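
Because the database parameters are stored in the persistent RapidMiner Home after the first execution, a wrapper script can pass them only when the volume is still empty. The sketch below is a dry run that only prints the docker run command. The function names are hypothetical, and it assumes that an empty or missing host folder means the first execution has not happened yet.

```shell
#!/bin/sh
# Sketch: pass the DB* variables only on the first execution.

# Returns 0 (true) when the directory is missing or has no entries.
is_empty_dir() {
    [ ! -d "$1" ] || [ -z "$(ls -A "$1")" ]
}

# Prints the docker run command that would be used; nothing is started.
# Argument: $1 = host folder to mount at /persistent-rapidminer-home
# Reads DBHOST, DBSCHEMA, DBUSER and DBPASS from the calling shell.
server_run_command() {
    rm_home=$1
    set -- docker run
    if is_empty_dir "$rm_home"; then
        # First execution: the database parameters are still required.
        set -- "$@" -e "DBHOST=${DBHOST}" -e "DBSCHEMA=${DBSCHEMA}" \
                    -e "DBUSER=${DBUSER}" -e "DBPASS=${DBPASS}"
    fi
    set -- "$@" -v "${rm_home}:/persistent-rapidminer-home" -p 8080:8080 \
                rapidminer/rapidminer-server:9.3.0
    printf '%s\n' "$*"
}

# Possible usage:
#   DBHOST=10.0.0.5 DBSCHEMA=rmdb DBUSER=rmuser DBPASS=secret
#   server_run_command /PATH/TO/RAPIDMINER/HOME
```

Replacing the final printf with "$@" would execute the command instead of printing it. Note that the printed command contains the database password in clear text.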

Example configuration for docker-compose

version: '3'
services:
  database:
    image: postgres:9.6
    environment:
      - POSTGRES_DB=<db-name>
      - POSTGRES_USER=<user-name>
      - POSTGRES_PASSWORD=<password>
    volumes:
      - </PATH/TO/PGSQL/HOME>:/var/lib/postgresql/data
  rapidminer-server:
    image: rapidminer/rapidminer-server:9.3.0
    environment:
      - DBHOST=database
      - DBSCHEMA=<db-name>
      - DBUSER=<user-name>
      - DBPASS=<password>
      - BROKER_ACTIVEMQ_USERNAME=<some-amq-username>
      - BROKER_ACTIVEMQ_PASSWORD=<some-secure-amq-password>
      - JOBSERVICE_AUTH_SECRET=<c29tZS1hdXRoLXNlY3JldAo=>
    volumes:
      - </PATH/TO/RAPIDMINER/HOME>:/persistent-rapidminer-home
    ports:
      - 8080:8080
    depends_on:
      - database
    links:
      - database
  job-agent:
    image: rapidminer/rapidminer-execution-jobagent:9.3.0
    environment:
      - RAPIDMINER_SERVER_HOST=rapidminer-server
      - RAPIDMINER_SERVER_PORT=8080
      - RAPIDMINER_SERVER_PROTOCOL=http
      - JOBAGENT_QUEUE_ACTIVEMQ_URI=failover:(tcp://rapidminer-server:5672)
      - JOBAGENT_QUEUE_ACTIVEMQ_USERNAME=<some-amq-username>
      - JOBAGENT_QUEUE_ACTIVEMQ_PASSWORD=<some-secure-amq-password>
      - JOBAGENT_AUTH_SECRET=<c29tZS1hdXRoLXNlY3JldAo=>
      - JOBAGENT_CONTAINER_COUNT=1
      - JOBAGENT_CONTAINER_MEMORYLIMIT=4096M
      - JOB_QUEUE=DEFAULT
    links:
      - rapidminer-server
    depends_on:
      - rapidminer-server
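
The bracketed terms in the configuration above have to be replaced before the file can be used. One way to do this is with sed, as in the following sketch. The function name render_compose, the template file name and all variable names (DB_NAME, DB_USER and so on) are hypothetical and chosen only for this example.

```shell
#!/bin/sh
# Sketch: fill the <bracketed> placeholders of a compose template.
# Argument: $1 = template file containing the configuration above
# Reads the replacement values from shell variables set by the caller.
# Limitation: values must not contain the characters |, & or \,
# because they have a special meaning in the sed replacement text.
render_compose() {
    sed \
        -e "s|<db-name>|${DB_NAME}|g" \
        -e "s|<user-name>|${DB_USER}|g" \
        -e "s|<password>|${DB_PASSWORD}|g" \
        -e "s|<some-amq-username>|${AMQ_USER}|g" \
        -e "s|<some-secure-amq-password>|${AMQ_PASSWORD}|g" \
        -e "s|<c29tZS1hdXRoLXNlY3JldAo=>|${AUTH_SECRET}|g" \
        -e "s|</PATH/TO/PGSQL/HOME>|${PGSQL_HOME}|g" \
        -e "s|</PATH/TO/RAPIDMINER/HOME>|${RM_HOME_DIR}|g" \
        "$1"
}

# Possible usage:
#   render_compose docker-compose.template.yml > docker-compose.yml
#   docker-compose up -d
```

The same value is substituted for JOBSERVICE_AUTH_SECRET and JOBAGENT_AUTH_SECRET, and the same credentials for the BROKER_ACTIVEMQ_* and JOBAGENT_QUEUE_ACTIVEMQ_* variables, mirroring the placeholders in the configuration above.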