You are viewing the RapidMiner Server documentation for version 9.5.
Kubernetes
Our Docker Images are ready to deploy to any Kubernetes Cluster. Here we provide example deployment configurations and tutorials, but the final deployment depends on your requirements.
The following guide requires a running Kubernetes cluster. We tested our example configuration with these Kubernetes services:
Deployment architecture and definition
In our example, we deploy a PostgreSQL database server, RapidMiner Server, and some Job Agents on Kubernetes.
To deploy RapidMiner Server on Kubernetes, you need to define the services, volumes, and pods.
Volumes
Our example configuration uses two persistent volumes:
- A volume for the PostgreSQL database data storage
- A volume for the RapidMiner Home of the RapidMiner Server
To define the volumes, you can apply the following Kubernetes Object Configuration YAML file.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pgvolume-claim
  labels:
    app: database
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rmsvolume-claim
  labels:
    app: rapidminer-server
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
Services
To deploy the example configuration, we specify three Kubernetes Service Endpoints:
- The ActiveMQ service endpoint is an internal endpoint that is used by the Job Agents (port: 5672).
- The database service endpoint is an internal endpoint that RapidMiner Server uses to connect to the database (port: 5432).
- The RapidMiner Server service endpoint represents the public web interface of the RapidMiner Server (port: 8080).
Note: the public endpoint definition may differ between Kubernetes clusters. Public cloud providers support the LoadBalancer type, but the MicroK8S implementation requires an Ingress to be set up to enable public access.
To define the service endpoints, you can apply the following Kubernetes Object Configuration YAML file:
kind: Service
apiVersion: v1
metadata:
  name: rapidminer-server-amq-svc
  labels:
    app: rapidminer-server-amq-svc
    role: server
spec:
  ports:
    - port: 5672
      targetPort: amq
  selector:
    app: rapidminer-server
    role: server
---
kind: Service
apiVersion: v1
metadata:
  name: postgres-svc
  labels:
    app: database
spec:
  ports:
    - port: 5432
      targetPort: postgresport
  selector:
    app: database
---
kind: Service
apiVersion: v1
metadata:
  name: rapidminer-server-svc
  labels:
    app: rapidminer-server-svc
    role: server
spec:
  ports:
    - port: 8080
      targetPort: rmswebui
  selector:
    app: rapidminer-server
    role: server
  type: LoadBalancer
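The note above mentions that some clusters, such as MicroK8S, need an Ingress instead of a LoadBalancer service. The following is a minimal sketch of such an Ingress, not part of the tested example. It assumes that an ingress controller is already enabled on the cluster and that the cluster serves the networking.k8s.io/v1 Ingress API (Kubernetes 1.19 or later). The object name rapidminer-server-ingress is hypothetical. Older clusters expose Ingress under a beta API version with a different backend syntax (serviceName and servicePort), so the fragment would need adjusting there.

```yaml
# Hypothetical Ingress that routes all HTTP traffic to the RapidMiner Server web UI.
# Assumes an ingress controller is running and networking.k8s.io/v1 is available.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: rapidminer-server-ingress   # hypothetical name
  labels:
    app: rapidminer-server
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: rapidminer-server-svc   # the public service defined above
                port:
                  number: 8080
```

With an Ingress in front, the rapidminer-server-svc service would typically not need the LoadBalancer type. Depending on the controller, an ingressClassName field may also be required.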
PODs / Containers
Our example configuration defines the following three workloads (two pods and one deployment):
- The Database pod contains the PostgreSQL container. The pgvolume-claim is used as its persistent volume. We also define a subPath to ensure an empty mount point for the postgres container.
kind: Pod
apiVersion: v1
metadata:
  name: database
  labels:
    app: database
spec:
  containers:
    - name: database
      image: postgres:9.6
      ports:
        - name: postgresport
          containerPort: 5432
      env:
        - name: POSTGRES_DB
          value: rmsdb
        - name: POSTGRES_USER
          value: rmsdbuser
        - name: POSTGRES_PASSWORD
          value: rmsdbpassword
      volumeMounts:
        - name: pgvolume
          mountPath: /var/lib/postgresql/data
          subPath: postgres
  volumes:
    - name: pgvolume
      persistentVolumeClaim:
        claimName: pgvolume-claim
- The RapidMiner Server container is defined with the following configuration. The environment variables are defined based on our Docker Image documentation. The rmsvolume-claim provides the persistent RapidMiner Home Folder. We also define a subPath on the volume to ensure an empty mount point on the first startup, so that the RapidMiner Server container can initialize the RapidMiner Home Folder.
kind: Pod
apiVersion: v1
metadata:
  name: rapidminer-server
  labels:
    app: rapidminer-server
    role: server
spec:
  containers:
    - name: rapidminer-server
      image: rapidminer/rapidminer-server:latest
      ports:
        - name: rmswebui
          containerPort: 8080
        - name: amq
          containerPort: 5672
      env:
        - name: JOBSERVICE_QUEUE_ACTIVEMQ_USERNAME
          value: amq-user
        - name: JOBSERVICE_QUEUE_ACTIVEMQ_PASSWORD
          value: amq-pass
        - name: JOBSERVICE_AUTH_SECRET
          value: c29tZS1hdXRoLXNlY3JldAo=
        - name: DBHOST
          value: postgres-svc
        - name: DBSCHEMA
          value: rmsdb
        - name: DBUSER
          value: rmsdbuser
        - name: DBPASS
          value: rmsdbpassword
      volumeMounts:
        - name: rmsvolume
          mountPath: /persistent-rapidminer-home
          subPath: rapidminer-home
  volumes:
    - name: rmsvolume
      persistentVolumeClaim:
        claimName: rmsvolume-claim
- The Job Agent containers are deployed using a Kubernetes Deployment object, which provides replication and starts three instances in our example.
kind: Deployment
apiVersion: apps/v1
metadata:
  name: job-agent
  labels:
    app: job-agent
    role: execution
spec:
  replicas: 3
  selector:
    matchLabels:
      app: job-agent
  template:
    metadata:
      labels:
        app: job-agent
        role: execution
    spec:
      containers:
        - name: job-agent
          image: rapidminer/rapidminer-execution-jobagent:latest
          env:
            - name: RAPIDMINER_SERVER_HOST
              value: rapidminer-server-svc
            - name: RAPIDMINER_SERVER_PORT
              value: '8080'
            - name: JOBAGENT_QUEUE_ACTIVEMQ_URI
              value: failover:(tcp://rapidminer-server-amq-svc:5672)
            - name: JOBAGENT_QUEUE_ACTIVEMQ_USERNAME
              value: amq-user
            - name: JOBAGENT_QUEUE_ACTIVEMQ_PASSWORD
              value: amq-pass
            - name: JOBAGENT_AUTH_SECRET
              value: c29tZS1hdXRoLXNlY3JldAo=
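The RapidMiner Server pod and the Job Agent deployment above repeat the same ActiveMQ credentials and auth secret as literal values. As a minimal sketch, assuming you would rather keep these values in one place, they could be stored in a Kubernetes Secret and referenced with secretKeyRef. The Secret name rapidminer-credentials and its key names are hypothetical and are not part of the RapidMiner example; the values are the ones used above.

```yaml
# Hypothetical Secret; the object name and key names are illustrative only.
apiVersion: v1
kind: Secret
metadata:
  name: rapidminer-credentials
type: Opaque
stringData:   # stringData takes plain text; the API server stores it base64-encoded
  amq-username: amq-user
  amq-password: amq-pass
  auth-secret: c29tZS1hdXRoLXNlY3JldAo=
---
# The Job Agent Deployment from the example, with the three shared values
# read from the Secret instead of being written inline.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: job-agent
  labels:
    app: job-agent
    role: execution
spec:
  replicas: 3
  selector:
    matchLabels:
      app: job-agent
  template:
    metadata:
      labels:
        app: job-agent
        role: execution
    spec:
      containers:
        - name: job-agent
          image: rapidminer/rapidminer-execution-jobagent:latest
          env:
            - name: RAPIDMINER_SERVER_HOST
              value: rapidminer-server-svc
            - name: RAPIDMINER_SERVER_PORT
              value: '8080'
            - name: JOBAGENT_QUEUE_ACTIVEMQ_URI
              value: failover:(tcp://rapidminer-server-amq-svc:5672)
            - name: JOBAGENT_QUEUE_ACTIVEMQ_USERNAME
              valueFrom:
                secretKeyRef:
                  name: rapidminer-credentials
                  key: amq-username
            - name: JOBAGENT_QUEUE_ACTIVEMQ_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: rapidminer-credentials
                  key: amq-password
            - name: JOBAGENT_AUTH_SECRET
              valueFrom:
                secretKeyRef:
                  name: rapidminer-credentials
                  key: auth-secret
```

The RapidMiner Server pod could read the same keys for its JOBSERVICE_QUEUE_ACTIVEMQ_USERNAME, JOBSERVICE_QUEUE_ACTIVEMQ_PASSWORD and JOBSERVICE_AUTH_SECRET variables. The Secret needs to exist in the cluster before the containers that reference it can start.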
Deployment process
Based on the object definitions shown above, you can deploy RapidMiner Server on a Kubernetes cluster together with its database and Job Agent dependencies:
- Make sure that the connection to your Kubernetes Cluster is working
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:11:31Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:02:58Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
- Create and check the volumes
$ kubectl apply -f volumes.yaml
persistentvolumeclaim/pgvolume-claim created
persistentvolumeclaim/rmsvolume-claim created
$ kubectl get pvc
$ kubectl get pv
- Create and check services
$ kubectl apply -f services.yaml
service/rapidminer-server-amq-svc created
service/postgres-svc created
service/rapidminer-server-svc created
$ kubectl get svc
NAME                        TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
postgres-svc                ClusterIP      10.152.183.3     <none>        5432/TCP         72s
rapidminer-server-amq-svc   ClusterIP      10.152.183.128   <none>        5672/TCP         72s
rapidminer-server-svc       LoadBalancer   10.152.183.252   ******        8080:30661/TCP   72s
- Deploy the database, RapidMiner Server, and Job Agents
$ kubectl apply -f database.yaml
pod/database created
$ kubectl apply -f rapidminer-server.yaml
pod/rapidminer-server created
$ kubectl apply -f job-agent.yaml
deployment.apps/job-agent created
- Check the running PODs
$ kubectl get pod
NAME                             READY   STATUS    RESTARTS   AGE
pod/database                     1/1     Running   0          41m
pod/job-agent-556b49567b-5cm8n   1/1     Running   0          44s
pod/job-agent-556b49567b-6585h   1/1     Running   0          44s
pod/job-agent-556b49567b-zk44g   1/1     Running   0          44s
pod/rapidminer-server            1/1     Running   0          40m