You are viewing the RapidMiner Server documentation for version 9.5.

Architecture overview

This page describes the components in the RapidMiner Server High Availability setup.

RapidMiner Server nodes

For high availability, you need at least two RapidMiner Server nodes. Each RapidMiner Server node is completely independent, so no cross-node communication is needed.

Job Agent nodes

Job Agent nodes run the processes that have been submitted to the RapidMiner Server cluster. For high availability, you need to make sure that at least two Job Agents point to each queue, so that if one Job Agent breaks down, the jobs on the queue can be picked up by another Job Agent.
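The two-agents-per-queue rule can be checked mechanically against an inventory of your deployment. The following is a minimal sketch, not a RapidMiner tool: the agent names, queue names, and the `under_covered_queues` helper are hypothetical, and it assumes each Job Agent points to exactly one queue.

```python
from collections import Counter


def under_covered_queues(agent_queues, queues=(), minimum=2):
    """Return the queues served by fewer than `minimum` Job Agents, sorted by name.

    agent_queues: mapping of Job Agent name -> queue it points to (hypothetical inventory).
    queues: optional list of all known queues, so queues with no agent at all are reported too.
    """
    counts = Counter({queue: 0 for queue in queues})
    counts.update(agent_queues.values())
    return sorted(queue for queue, n in counts.items() if n < minimum)


# Hypothetical inventory: two agents on DEFAULT, only one on "gpu".
agents = {
    "agent-a": "DEFAULT",
    "agent-b": "DEFAULT",
    "agent-c": "gpu",
}

print(under_covered_queues(agents))
```

In this example only the `gpu` queue is reported, because a failure of `agent-c` would leave its jobs with no Job Agent to pick them up.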

Load balancer

The load balancer acts as a link between users and the RapidMiner Server cluster. All requests to RapidMiner Server go through the load balancer, which decides which RapidMiner Server node responds to each request. Users can contact the cluster through RapidMiner Studio, the RapidMiner Server web UI, BI applications, etc. The load balancer is completely transparent to the user and to client applications. It must be configured with sticky sessions; any third-party load balancer that supports sticky sessions is compatible.
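To illustrate what sticky sessions mean here, the sketch below models the routing decision in a few lines of Python. It is a toy model under stated assumptions, not the behavior of any particular load balancer: the `StickyBalancer` class, the node names, and the round-robin choice for new sessions are all hypothetical. What happens to session state when a node fails depends on the load balancer and is not modeled.

```python
import itertools


class StickyBalancer:
    """Toy model of sticky-session routing; not a real load balancer."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.healthy = set(self.nodes)      # nodes currently able to serve requests
        self._sessions = {}                 # session id -> pinned node
        self._rr = itertools.cycle(self.nodes)

    def _next_healthy(self):
        # Round-robin over all nodes, skipping the unhealthy ones.
        for _ in range(len(self.nodes)):
            node = next(self._rr)
            if node in self.healthy:
                return node
        raise RuntimeError("no healthy RapidMiner Server node available")

    def route(self, session_id):
        node = self._sessions.get(session_id)
        if node not in self.healthy:
            # New session, or the pinned node is down: pin to another node.
            node = self._next_healthy()
            self._sessions[session_id] = node
        return node


lb = StickyBalancer(["server-1", "server-2"])
first = lb.route("alice")
assert lb.route("alice") == first   # later requests stay on the same node

lb.healthy.discard(first)           # simulate a node failure
print(lb.route("alice"))            # the session is re-pinned to the remaining node
```

Because the RapidMiner Server nodes do not communicate with each other, keeping every request of a session on the same node is what allows the cluster to appear as a single server to clients.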

Shared database

A shared database is required because it stores Repository metadata, user management settings, and other configuration. The database should be configured for high availability to prevent it from becoming a single point of failure.

Shared file system

A high-performance shared file system (such as NFS) is required because the Repository and additional configuration are stored in the RapidMiner Server home directory. It, too, should be configured for high availability.

External ActiveMQ broker

The ActiveMQ broker connects RapidMiner Server with external Job Agents. To set up the broker for high availability, see the Installation guide.