You are viewing the RapidMiner Radoop documentation for version 8.0.
RapidMiner Radoop Compatibility
Supported Hadoop distributions
RapidMiner Radoop works with most popular Hadoop distributions. Refer to the provider's documentation for information on configuring the Hadoop cluster. The supported distributions are:
- Amazon Elastic MapReduce (EMR) 4.4+
- Apache Hadoop 2.2+
- Microsoft Azure HDInsight 3.5
- Cloudera Hadoop CDH5.x
- Hortonworks HDP 2.x
- IBM Open Platform 4.1+
- Open Data Platform 0.9+
Supported data warehouse systems (DWS)
RapidMiner Radoop supports the following data warehouse infrastructures:
- Apache HiveServer2 0.13+
- Cloudera Impala 1.2.3 and later (see Impala limitations on the Installing Radoop on Studio page)
Supported Spark versions
RapidMiner Radoop supports the following Spark versions:
- Apache Spark 1.2.x, 1.3.x and 1.4.x
  - Supports the Decision Tree, Linear Regression and Logistic Regression operators only.
- Apache Spark 1.5.x, 1.6.x, 2.0.x (except 2.0.1), 2.1.x, 2.2.x
  - Supports all Spark operators, including Spark Script (which requires Python and/or R on the cluster nodes), Single Process Pushdown and SparkRM.
  - Spark 2.0.1 is not supported.
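The version rules above can be expressed as a small lookup, which may be handy when validating a cluster before configuring a connection. This is only a sketch: `spark_support` and the tier names `"full"`, `"limited"` and `"unsupported"` are hypothetical and not part of RapidMiner Radoop, and a version given without a patch number (for example `2.0`) is assumed to be a supported patch release.

```python
import re

# Hypothetical helper, not part of RapidMiner Radoop. The sets below simply
# transcribe the supported Spark versions listed in this section.
LIMITED = {(1, 2), (1, 3), (1, 4)}                  # three operators only
FULL = {(1, 5), (1, 6), (2, 0), (2, 1), (2, 2)}     # all Spark operators
UNSUPPORTED_EXACT = {(2, 0, 1)}                     # explicitly excluded release


def spark_support(version: str) -> str:
    """Return 'full', 'limited' or 'unsupported' for a Spark version string."""
    match = re.match(r"^(\d+)\.(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized Spark version: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    patch = int(match.group(3)) if match.group(3) is not None else None

    # The excluded patch release takes precedence over its 2.0.x family.
    if patch is not None and (major, minor, patch) in UNSUPPORTED_EXACT:
        return "unsupported"
    if (major, minor) in FULL:
        return "full"
    if (major, minor) in LIMITED:
        return "limited"
    return "unsupported"
```

For example, `spark_support("1.3.1")` falls into the limited tier, while `spark_support("2.0.1")` is rejected even though other 2.0.x releases are accepted.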
Supported Java versions
RapidMiner Radoop requires Java 8 to be installed on the Hadoop cluster. The cluster nodes should have at least 8 GB of RAM.
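A node could be checked against these two requirements along the following lines. The sketch assumes you have already collected each node's Java version string and total memory by some other means. The names `java_major` and `node_ok` are hypothetical, "8 GB" is read as 8 GiB, and "requires Java 8" is interpreted as exactly major version 8, which may be stricter than intended.

```python
import re

REQUIRED_JAVA_MAJOR = 8           # from the requirement stated above
MIN_RAM_BYTES = 8 * 1024 ** 3     # assumption: "8 GB" interpreted as 8 GiB


def java_major(version: str) -> int:
    """Extract the major version from strings like '1.8.0_181' or '11.0.2'."""
    match = re.match(r"^(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized Java version: {version!r}")
    first = int(match.group(1))
    # Java 8 and earlier report the legacy '1.x' scheme, so '1.8.0_181' is Java 8.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def node_ok(java_version: str, ram_bytes: int) -> bool:
    """Hypothetical check: the node runs Java 8 and has at least 8 GiB of RAM."""
    return (java_major(java_version) == REQUIRED_JAVA_MAJOR
            and ram_bytes >= MIN_RAM_BYTES)
```

Note that operating systems often report slightly less usable memory than the nominal installed amount, so a strict byte comparison like this one may need some tolerance in practice.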
RapidMiner extension compatibility
RapidMiner Radoop is not compatible with the Parallel Processing Extension, which must be disabled when using Radoop. To disable it, select the Extensions > Manage Extensions... menu item and uncheck the box for Parallel Processing Extension.