This page describes the components in the RapidMiner Server High Availability setup.
RapidMiner Server nodes
For high availability, you need at least two RapidMiner Server nodes. Each node of RapidMiner Server is completely independent. There is no need for cross-node communication.
Job Agent nodes
Job Agent nodes run the processes that have been submitted to the RapidMiner Server cluster. For high availability, you need to make sure that at least two Job Agents point to each queue, so that if one Job Agent breaks down, the jobs on the queue can be picked up by another Job Agent.
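The two-agents-per-queue rule can be checked against an inventory of your deployment. The following is a minimal sketch, not part of RapidMiner Server: the function `underserved_queues`, the agent names, and the queue names are all hypothetical, and it assumes you maintain your own mapping of each Job Agent to the queue it points to.

```python
from collections import Counter

# Minimum number of Job Agents per queue for high availability (from the rule above).
MIN_AGENTS_PER_QUEUE = 2


def underserved_queues(agent_to_queue, queues, minimum=MIN_AGENTS_PER_QUEUE):
    """Return the queues served by fewer than `minimum` Job Agents.

    agent_to_queue: dict mapping a Job Agent name to the queue it points to
                    (hypothetical inventory, maintained by you).
    queues:         iterable of every queue that must stay available.
    """
    counts = Counter(agent_to_queue.values())
    # Counter returns 0 for queues that no agent points to.
    return sorted(q for q in queues if counts[q] < minimum)


# Illustrative inventory: names are made up for this example.
inventory = {
    "agent-1": "DEFAULT",
    "agent-2": "DEFAULT",
    "agent-3": "gpu",
}
at_risk = underserved_queues(inventory, ["DEFAULT", "gpu", "nightly"])
print(at_risk)
```

Any queue reported here would stop processing jobs if its only Job Agent failed, or has no Job Agent at all.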
Load balancer
The load balancer acts as a link between users and the RapidMiner Server cluster. All requests to RapidMiner Server go through the load balancer, which decides which RapidMiner Server node responds to each request. Users can contact the cluster through RapidMiner Studio, the RapidMiner Server web UI, BI applications, and other clients. The load balancer is completely transparent to users and client applications. It must be configured with sticky sessions; any third-party load balancer that supports sticky sessions is compatible.
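To illustrate what sticky sessions mean in practice, here is a small conceptual sketch. It is not how any particular load balancer is implemented; the class `StickyBalancer`, the node names, and the session identifiers are hypothetical. The idea shown is that the first request of a session is assigned to a node, and every later request of that session goes to the same node as long as it is available.

```python
class StickyBalancer:
    """Toy model of sticky-session routing (illustration only)."""

    def __init__(self, nodes):
        self.nodes = list(nodes)   # available RapidMiner Server nodes
        self.sessions = {}         # session id -> assigned node
        self._next = 0             # round-robin counter for new sessions

    def route(self, session_id):
        """Return the node that should handle a request of this session."""
        if not self.nodes:
            raise RuntimeError("no server nodes available")
        node = self.sessions.get(session_id)
        if node is None or node not in self.nodes:
            # New session, or its node is gone: pick a node round-robin
            # and remember the choice so later requests stick to it.
            node = self.nodes[self._next % len(self.nodes)]
            self._next += 1
            self.sessions[session_id] = node
        return node

    def remove_node(self, node):
        """Simulate a node failure; its sessions are reassigned on next use."""
        self.nodes.remove(node)


balancer = StickyBalancer(["server-a", "server-b"])
first = balancer.route("alice")
second = balancer.route("alice")    # same session, same node
other = balancer.route("bob")       # new session, next node in rotation

balancer.remove_node(first)         # the node serving "alice" fails
after_failure = balancer.route("alice")
```

Real load balancers typically achieve the same effect with a session cookie or a source-address hash, so consult your load balancer's documentation for the actual setting.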
Shared database
A shared database is required because it stores Repository metadata, user management settings, and other configuration. Configure the database for high availability so that it does not become a single point of failure.
Shared file system
A high-performance shared file system (such as NFS) is required because the Repository and additional configuration are stored in the RapidMiner Server home directory. It, too, should be configured for high availability.