You are viewing the RapidMiner Deployment documentation for version 9.7.

Basic production template

The template defined below is meant for a typical production environment.

Use it to deploy RapidMiner AI Hub on Kubernetes, with the following components:

  • 1 RapidMiner AI Hub instance
  • 3 RapidMiner Job Agents
  • Postgres database
  • Platform Admin
  • 1 JupyterHub instance
  • 1 Dashboards instance
  • 1 KeyCloak instance

For a detailed description of every Docker image, see the image reference.

System requirements

Minimum recommended hardware configuration

The amount of memory needed depends heavily on the amount of data that will be processed by RapidMiner AI Hub. By themselves, the RapidMiner services can run with as little as 8GB. However, in production environments, we recommend 32GB or more depending on user data, in order to provide users with enough capacity to analyze data from realistic use cases.

Each virtual or physical machine should have at least:

  • Quad core
  • 32GB RAM
  • >20GB free disk space

Instructions

The provided Docker images are ready to deploy to any Kubernetes cluster.

Review the configuration below and adjust it to your environment and requirements.

The following guide requires a running Kubernetes cluster.

RapidMiner Platform is supported on the following Kubernetes services:

Volumes

Volumes provide the persistent block storage (for example, Elastic Block Storage) that the RapidMiner Platform components (PostgreSQL database, Python Environment Manager, RapidMiner Server, Real-Time Scoring) use to keep their data across the container life cycle.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rm-postgresql-pvc
  labels:
    app: rm-postgresql-svc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pem-uploaded-pvc 
  labels:
    app: pem-uploaded-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rm-server-home-pvc
  labels:
    app: rm-server-svc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rapidminer-uploaded-pvc
  labels:
    app: rapidminer-uploaded-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100M
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rts-uploaded-pvc
  labels:
    app: rts-webui
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rts-licenses-pvc
  labels:
    app: rapidminer-rts
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100M
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rts-deployments-pvc
  labels:
    app: rapidminer-rts
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rm-grafana-home-pvc 
  labels:
    app: rm-grafana-home-pvc 
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500M
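
None of the claims above sets storageClassName, so on most clusters they are bound through the default StorageClass, if one is configured. Where a specific class is wanted, the claim can name it explicitly. The following is a minimal sketch for one of the claims; the class name fast-ssd is a hypothetical placeholder and is not part of the template.

```yaml
# Sketch only: "fast-ssd" is a hypothetical StorageClass name.
# List the classes available in your cluster with: kubectl get storageclass
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rm-server-home-pvc
  labels:
    app: rm-server-svc
spec:
  storageClassName: fast-ssd   # assumption: replace with a class that exists in your cluster
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

The remaining claims could be adjusted in the same way. The requested sizes are unchanged from the template.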

Services

Services are essential parts of the RapidMiner Platform: containers and pods use them to reach each other.

kind: Service
apiVersion: v1
metadata:
  name: rapidminer-server-amq-svc
  labels:
    app: rapidminer-server-amq-svc
    role: server
spec:
  ports:
  - port: 5672
    targetPort: amq
  selector:
    app: rm-server-svc 
    role: server
---
kind: Service
apiVersion: v1
metadata:
  name: rm-proxy-svc
  labels:
    app: rm-proxy-svc
    role: proxy
spec:
  ports:
  - name: proxyhttp
    protocol: TCP
    port: 80
    targetPort: proxyhttp
  - name: proxyhttps
    protocol: TCP
    port: 443
    targetPort: proxyhttps
  selector:
    app: rm-proxy-svc
    role: proxy
  type: LoadBalancer
---
kind: Service
apiVersion: v1
metadata:
  name: postgres-svc
  labels:
    app: rm-postgresql-svc
spec:
  ports:
  - port: 5432
    targetPort: postgresport
  selector:
    app: rm-postgresql-svc 
---
kind: Service
apiVersion: v1
metadata:
  name: rm-server-svc
  labels:
    app: rm-server-svc
    role: server
spec:
  ports:
  - port: 8080
    targetPort: rmswebui
  selector:
    app: rm-server-svc 
    role: server
---
kind: Service
apiVersion: v1
metadata:
  name: pem-webui-svc
  labels:
    app: pem-webui-cron
    role: pem
spec:
  ports:
  - name: pem-webuiport
    port: 82
    protocol: TCP
    targetPort: pem-webuiport
  selector:
    app: rm-proxy-svc
    role: proxy
---
kind: Service
apiVersion: v1
metadata:
  name: rm-grafana-svc
  labels:
    app: rm-grafana-svc
    role: grafana 
spec:
  ports:
  - name: grafanaport
    port: 3000
    protocol: TCP
    targetPort: grafanaport
  selector:
    app: rm-grafana-svc
    role: grafana
---
kind: Service
apiVersion: v1
metadata:
  name: rts-webui-svc
  labels:
    app: rm-proxy-svc
    role: proxy
spec:
  ports:
  - name: rts-webuiport
    port: 81
    protocol: TCP
    targetPort: rts-webuiport
  selector:
    app: rm-proxy-svc
    role: proxy
---
kind: Service
apiVersion: v1
metadata:
  name: real-time-scoring-agent
  labels:
    app: real-time-scoring-agent
    role: real-time-scoring
spec:
  ports:
  - name: rts-scoreport
    port: 8090
    protocol: TCP
    targetPort: rts-scoreport
  selector:
    app: real-time-scoring-agent
    role: real-time-scoring
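
Of the Services above, only rm-proxy-svc is of type LoadBalancer, which makes it the single external entry point; the others use the default ClusterIP type and are reachable only from inside the cluster. On a cluster without a load-balancer integration, a NodePort Service is one possible alternative. The sketch below assumes such an environment and is not part of the template.

```yaml
# Sketch only: NodePort variant of rm-proxy-svc for clusters without a load balancer.
kind: Service
apiVersion: v1
metadata:
  name: rm-proxy-svc
  labels:
    app: rm-proxy-svc
    role: proxy
spec:
  type: NodePort               # assumption: replaces "type: LoadBalancer" from the template
  ports:
  # nodePort is omitted on purpose, so Kubernetes assigns free ports from its NodePort range
  - name: proxyhttp
    protocol: TCP
    port: 80
    targetPort: proxyhttp
  - name: proxyhttps
    protocol: TCP
    port: 443
    targetPort: proxyhttps
  selector:
    app: rm-proxy-svc
    role: proxy
```

The assigned node ports can then be looked up with kubectl get service rm-proxy-svc.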

Database

The PostgreSQL database is used by RapidMiner Server.

kind: Pod
apiVersion: v1
metadata:
  name: rm-postgresql-svc
  labels:
    app: rm-postgresql-svc
spec:
  containers:
  - name: rm-postgresql-svc
    image: postgres:9.6
    ports:
    - name: postgresport
      containerPort: 5432
    env:
    - name: POSTGRES_DB
      value: rmsdb
    - name: POSTGRES_USER
      value: rmsdbuser
    - name: POSTGRES_PASSWORD
      value: rmsdbpassword
    volumeMounts:
    - name: pgvolume
      mountPath: /var/lib/postgresql/data
      subPath: postgres
  volumes:
  - name: pgvolume
    persistentVolumeClaim:
      claimName: rm-postgresql-pvc
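
The manifest above stores the database password as a plain-text value. One common alternative is to keep it in a Kubernetes Secret and reference it from the container. The following is a minimal sketch; the Secret name rm-postgresql-secret is a hypothetical choice and is not part of the template.

```yaml
# Sketch only: "rm-postgresql-secret" is a hypothetical name.
apiVersion: v1
kind: Secret
metadata:
  name: rm-postgresql-secret
type: Opaque
stringData:
  POSTGRES_PASSWORD: rmsdbpassword   # replace with your own password
---
# Fragment: this entry would replace the plain-text POSTGRES_PASSWORD
# entry in the env list of the rm-postgresql-svc container.
env:
- name: POSTGRES_PASSWORD
  valueFrom:
    secretKeyRef:
      name: rm-postgresql-secret
      key: POSTGRES_PASSWORD
```

The same pattern could be applied to DBPASS in the RapidMiner Server pod, which must carry the same password.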

RapidMiner Server

The main component of the RapidMiner Platform.

kind: Pod
apiVersion: v1
metadata:
  name: rm-server-svc 
  labels:
    app: rm-server-svc
    role: server
spec:
  hostname: rm-server-svc
  containers:
  - name: rapidminer-server
    image: rapidminer/rapidminer-server:9.6.0
    ports:
    - name: rmswebui
      containerPort: 8080
    - name: amq
      containerPort: 5672
    env:
    - name: JOBSERVICE_QUEUE_ACTIVEMQ_USERNAME
      value: amq-user
    - name: JOBSERVICE_QUEUE_ACTIVEMQ_PASSWORD
      value: amq-pass
    - name: JOBSERVICE_AUTH_SECRET
      value: c29tZS1hdXRoLXNlY3JldAo=
    - name: DBHOST
      value: postgres-svc
    - name: DBSCHEMA
      value: rmsdb
    - name: DBUSER
      value: rmsdbuser
    - name: DBPASS
      value: rmsdbpassword
    - name: JUPYTER_URL_SUFFIX
      value: /jupyterhub 
    - name: GRAFANA_URL_SUFFIX
      value: /grafana
    volumeMounts:
    - name: rm-server-home-pvc
      mountPath: /persistent-rapidminer-home
      subPath: rapidminer-home
  volumes:
  - name: rm-server-home-pvc
    persistentVolumeClaim:
      claimName: rm-server-home-pvc

Job-Agent

The workers that perform the computation tasks.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: rm-server-job-agent-svc
  labels:
    app: rm-server-job-agent-svc
    role: execution
spec:
  replicas: 3
  selector:
    matchLabels:
      app: rm-server-job-agent-svc
  template:
    metadata:
      labels:
        app: rm-server-job-agent-svc
        role: execution
    spec:
      containers:
      - name: rm-server-job-agent-svc
        image: rapidminer/rapidminer-execution-jobagent:9.6.0
        env:
        - name: RAPIDMINER_SERVER_HOST
          value: rm-server-svc
        - name: RAPIDMINER_SERVER_PORT
          value: '8080'
        - name: JOBAGENT_QUEUE_ACTIVEMQ_URI
          value: failover:(tcp://rapidminer-server-amq-svc:5672)
        - name: JOBAGENT_QUEUE_ACTIVEMQ_USERNAME
          value: amq-user
        - name: JOBAGENT_QUEUE_ACTIVEMQ_PASSWORD
          value: amq-pass
        - name: JOBAGENT_AUTH_SECRET
          value: c29tZS1hdXRoLXNlY3JldAo=
        - name: RAPIDMINER_JOBAGENT_OPTS
          value: "-Djobagent.python.registryBaseUrl=http://pem-webui-svc:82/"
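
The Job Agent container above declares no resource requests or limits, although memory demand depends heavily on the data being processed (see System requirements). A resources stanza like the one used for the Real-Time Scoring containers could be added to it. The values below are illustrative assumptions only and should be sized to your workload.

```yaml
# Fragment for the rm-server-job-agent-svc container.
# All values are illustrative assumptions, not recommendations from the template.
resources:
  requests:
    memory: "4G"
    cpu: "1"
  limits:
    memory: "8G"
    cpu: "2"
```

With replicas: 3, the cluster must be able to schedule three times the requested amounts.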

RapidMiner Proxy & Python Environment Manager

The proxy component handles all incoming HTTP(S) traffic for the platform. The Python Environment Manager (PEM) controls the Python packages for the Job Agents. Real-Time Scoring (RTS) is designed for fast scoring use cases via web services. These three pieces must run in a single Kubernetes pod, because the proxy must read the certificates that are generated by the pem-cron and rts-cron containers.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: rm-proxy-svc
  labels:
    app: rm-proxy-svc
    role: proxy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rm-proxy-svc
  template:
    metadata:
      labels:
        app: rm-proxy-svc
        role: proxy
    spec:
      containers:
      - name: rm-proxy-svc
        image: rapidminer/rapidminer-proxy:9.6.0
        imagePullPolicy: Always
        env:
        - name: RMSERVER_BACKEND
          value: "http://rm-server-svc:8080"
        - name: GRAFANA_BACKEND
          value: "http://rm-grafana-svc:3000"
        - name: GRAFANA_URL_SUFFIX
          value: "/grafana"
        - name: PEM_BACKEND
          value: "http://pem-webui-svc:82/"
        - name: PEM_URL_SUFFIX
          value: "/pem"
        - name: RTS_WEBUI_BACKEND
          value: "http://rts-webui-svc:81/"
        - name: RTS_WEBUI_URL_SUFFIX
          value: "/rts-admin"
        - name: RTS_SCORING_BACKEND
          value: "http://real-time-scoring-agent:8090/"
        - name: RTS_SCORING_URL_SUFFIX
          value: "/rts"
        - name: HTTPS_CRT_PATH
          value: "/rapidminer/uploaded/certs/validated_cert.crt"
        - name: HTTPS_KEY_PATH
          value: "/rapidminer/uploaded/certs/validated_cert.key"
        - name: HTTPS_DH_PATH
          value: "/rapidminer/uploaded/certs/dhparam.pem"
        - name: DEBUG_CONF_INIT
          value: "true"
        ports:
        - name: proxyhttp
          containerPort: 80
        - name: proxyhttps
          containerPort: 443
        volumeMounts:
          - name: pem-uploaded-pvc
            mountPath: /rapidminer/pem/uploaded/
          - name: rts-uploaded-pvc
            mountPath: /rapidminer/rts/uploaded/
      - name: pem-webui
        image: rapidminer/python-environment-manager-webui:9.6.0
        imagePullPolicy: Always
        ports:
        - name: pem-webuiport
          containerPort: 82
        volumeMounts:
          - name: pem-uploaded-pvc
            mountPath: /var/www/html/uploaded
      - name: pem-cron
        image: rapidminer/python-environment-manager-cron:9.6.0
        imagePullPolicy: Always
        volumeMounts:
          - name: pem-uploaded-pvc
            mountPath: /rapidminer/uploaded
      - name: rts-cron
        image: rapidminer/rapidminer-real-time-scoring-cron:9.6.0
        resources:
          requests:
            memory: "100M"
            cpu: "0.5"
          limits:
            memory: "200M"
            cpu: "0.5"
        volumeMounts:
        - name: rts-uploaded-pvc
          mountPath: /rapidminer/uploaded/
        - name: rts-licenses-pvc
          mountPath: /rapidminer/rts_home/licenses/
      - name: real-time-scoring-webui
        image: rapidminer/rapidminer-real-time-scoring-webui:9.6.0
        ports:
        - name: rts-webuiport
          containerPort: 81
        resources:
          requests:
            memory: "200M"
            cpu: "0.5"
          limits:
            memory: "500M"
            cpu: "0.5"
        volumeMounts:
        - name: rts-uploaded-pvc
          mountPath: /var/www/html/uploaded
        - name: rts-licenses-pvc
          mountPath: 
      volumes:
      - name: pem-uploaded-pvc
        persistentVolumeClaim:
          claimName: pem-uploaded-pvc
      - name: rts-uploaded-pvc
        persistentVolumeClaim:
          claimName: rts-uploaded-pvc
      - name: rts-licenses-pvc
        persistentVolumeClaim:
          claimName: rts-licenses-pvc
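
The reason for placing these containers in one pod is the shared volume: containers in the same pod can mount the same volume at different paths, whereas a ReadWriteOnce claim can only be mounted read-write from a single node. The pattern can be reduced to the following sketch. All names and images here are hypothetical, and an emptyDir volume is used only to keep the example self-contained; the template itself uses persistent volume claims.

```yaml
# Minimal sketch of the shared-volume pattern; names and images are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: shared-volume-demo
spec:
  containers:
  - name: writer                 # stands in for pem-cron / rts-cron
    image: busybox:1.36
    command: ["sh", "-c", "while true; do date > /out/stamp.txt; sleep 60; done"]
    volumeMounts:
    - name: shared
      mountPath: /out
  - name: reader                 # stands in for the proxy
    image: busybox:1.36
    command: ["sh", "-c", "while true; do cat /in/stamp.txt 2>/dev/null; sleep 60; done"]
    volumeMounts:
    - name: shared
      mountPath: /in             # same volume, different path
      readOnly: true
  volumes:
  - name: shared
    emptyDir: {}                 # assumption: the template uses PVCs instead
```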

Dashboards

Dashboards for monitoring and metric analytics.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: rm-grafana-svc
  labels:
    app: rm-grafana-svc
    role: grafana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rm-grafana-svc
  template:
    metadata:
      labels:
        app: rm-grafana-svc
        role: grafana
    spec:
      containers:
      - name: rm-grafana-proxy-svc
        image: rapidminer/rapidminer-grafana-proxy:9.6.0
        imagePullPolicy: Always
        env:
        - name: RAPIDMINER_URL
          value: http://rm-server-svc:8080
        ports:
        - name: grafanaport
          containerPort: 3000
      - name: rm-grafana-svc
        image: rapidminer/rapidminer-grafana:9.6.0
        imagePullPolicy: Always
        env:
        - name: GF_SERVER_ROOT_URL
          value: '%(protocol)s://%(domain)s:%(http_port)s/grafana/'
        - name: GF_SECURITY_ADMIN_PASSWORD
          value: grafanaadminpass
        volumeMounts:
        - name: rm-grafana-home-pvc
          mountPath: /var/lib/grafana
      volumes:
      - name: rm-grafana-home-pvc
        persistentVolumeClaim:
          claimName: rm-grafana-home-pvc

Real-Time Scoring

This is an add-on product to RapidMiner Server designed for fast scoring use cases via web services.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: real-time-scoring-agent
  labels:
    app: real-time-scoring-agent
    role: real-time-scoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: real-time-scoring-agent
  template:
    metadata:
      labels:
        app: real-time-scoring-agent
        role: real-time-scoring
    spec:
      containers:
      - name: real-time-scoring-agent
        image: rapidminer/rapidminer-execution-scoring:latest
        ports:
        - name: rts-scoreport
          containerPort: 8090
        env:
        - name: WAIT_FOR_LICENSES
          value: "1"
        resources:
          requests:
            memory: "2G"
            cpu: "1"
          limits:
            memory: "32G"
            cpu: "1"
        volumeMounts:
        - name: rts-deployments-pvc
          mountPath: /rapidminer-scoring-agent/home/deployments
        - name: rts-licenses-pvc
          mountPath: /rapidminer-scoring-agent/home/resources/licenses/rapidminer-scoring-agent/
      volumes:
      - name: rts-deployments-pvc
        persistentVolumeClaim:
          claimName: rts-deployments-pvc
      - name: rts-licenses-pvc
        persistentVolumeClaim:
          claimName: rts-licenses-pvc
#      nodeSelector:
#        node-label-name: label-value-of-worker-node-where-rts-may-started
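
The commented-out lines at the end of the manifest hint at pinning the scoring agent to particular worker nodes. As a sketch of how that might look once enabled, assuming a hypothetical node label rapidminer-rts=true that is not part of the template:

```yaml
# Fragment of the real-time-scoring-agent Deployment.
# The label key and value are hypothetical; a node would first be labeled with:
#   kubectl label nodes <node-name> rapidminer-rts=true
spec:
  template:
    spec:
      nodeSelector:
        rapidminer-rts: "true"   # value must be quoted, label values are strings
```

Without a matching labeled node, the pod stays in the Pending state, so the label should be applied before the Deployment is updated.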