RapidMiner Radoop Compatibility
Supported Hadoop distributions
RapidMiner Radoop works with most popular Hadoop distributions. Refer to your distribution provider's documentation for information on configuring the Hadoop cluster. The supported distributions are:
- Amazon Elastic MapReduce (EMR) 4.4+
- Apache Hadoop 2.2+
- Azure HDInsight 3.6
- Cloudera Hadoop CDH5.x, 6.x
- Hortonworks HDP 2.x, 3.x
- IBM Open Platform 4.1+
- MapR 5.x, 6.x
- Open Data Platform 0.9+
Supported data warehouse systems (DWS)
RapidMiner Radoop supports the following data warehouse infrastructures:
- Apache HiveServer2 0.13+
- Cloudera Impala 1.2.3 and later (see Impala limitations on the Installing Radoop on Studio page)
Supported Spark versions
RapidMiner Radoop supports the following Spark versions:
- Apache Spark 1.2.x, 1.3.x and 1.4.x
  - Supports only the Decision Tree, Linear Regression and Logistic Regression operators.
- Apache Spark 1.5.x, 1.6.x, 2.0.x (except 2.0.1), 2.1.x, 2.2.x, 2.3.x
  - Supports all Spark operators, including Spark Script (Python and/or R is required on the cluster nodes), Single Process Pushdown and SparkRM.
  - The Spark 2.0.1 release is not supported.
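The Spark rules above combine two support tiers with a single excluded release, which is easy to misread. The following Python sketch encodes the list as a lookup. It is only an illustration: the function name `spark_support_level` and the tier labels are hypothetical and are not part of Radoop. Versions that this page does not mention are reported as "unlisted" rather than assumed to fail.

```python
import re

# Tier labels are illustrative names, not Radoop terminology.
LIMITED = "limited"          # Decision Tree, Linear Regression, Logistic Regression
FULL = "full"                # all Spark operators
UNSUPPORTED = "unsupported"  # explicitly excluded on this page
UNLISTED = "unlisted"        # not mentioned on this page

# (major, minor) release lines transcribed from the list above.
LIMITED_LINES = {(1, 2), (1, 3), (1, 4)}
FULL_LINES = {(1, 5), (1, 6), (2, 0), (2, 1), (2, 2), (2, 3)}
# Exact releases that the page excludes.
EXCLUDED_RELEASES = {(2, 0, 1)}


def spark_support_level(version: str) -> str:
    """Classify a plain Spark version string such as '2.1.0' or '1.6'."""
    match = re.fullmatch(r"(\d+)\.(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        # Vendor suffixes and other formats are out of scope for this sketch.
        raise ValueError(f"unrecognized Spark version: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    patch = int(match.group(3)) if match.group(3) is not None else None
    if patch is not None and (major, minor, patch) in EXCLUDED_RELEASES:
        return UNSUPPORTED
    if (major, minor) in LIMITED_LINES:
        return LIMITED
    if (major, minor) in FULL_LINES:
        return FULL
    return UNLISTED
```

For example, `spark_support_level("2.0.2")` returns `"full"`, while `spark_support_level("2.0.1")` returns `"unsupported"`.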
Supported Java versions
RapidMiner Radoop requires Java 8 to be installed on the Hadoop cluster. Each cluster node should have at least 8 GB of RAM.
RapidMiner extension compatibility
RapidMiner Radoop is not compatible with the Parallel Processing Extension, which must be disabled while you use Radoop. To disable it, select the Extensions > Manage Extensions... menu item and uncheck the box for Parallel Processing Extension.