RapidMiner Radoop Compatibility
Supported Hadoop distributions
RapidMiner Radoop works with most popular Hadoop distributions. Refer to the provider's documentation for information on configuring the Hadoop cluster. The supported distributions are:
- Amazon Elastic MapReduce (EMR) 5.x
- Azure HDInsight 3.6, 4.0
- Cloudera's Distribution of Hadoop (CDH) 5.x, 6.x
- Cloudera Data Platform Private Cloud Base (CDP) 7.x
- Hortonworks Data Platform (HDP) 2.x, 3.x
For CDH and HDP distributions, we support only those minor versions that Cloudera also supports.
Deprecated Hadoop distributions
Although earlier versions of RapidMiner Radoop work with the Hadoop distributions and versions listed below, we have marked them as deprecated. Support for these is discontinued as of RapidMiner Radoop 9.6.0.
- Amazon Elastic MapReduce (EMR) 4.x
- Apache Hadoop 2.2+
- IBM Open Platform 4.1+
- MapR 5.x, 6.x
- Open Data Platform 0.9+
For Amazon Elastic MapReduce, Radoop connections now distinguish between 4.x (deprecated) and 5.x versions. If you already have a Radoop connection set up to an Amazon Elastic MapReduce cluster, update Radoop to the latest version first, then update the connection with the corresponding EMR version.
Supported data warehouse systems (DWS)
RapidMiner Radoop supports the following data warehouse infrastructures:
- Apache HiveServer2 0.13+
- Cloudera Impala 1.2.3 and later (see Impala limitations on the Installing Radoop on Studio page)
Supported Spark versions
RapidMiner Radoop supports the following Spark versions:
- Apache Spark 1.6.x
- Apache Spark 2.0.x (except 2.0.1), 2.1.x, 2.2.x, 2.3.x, 2.4.x (only Scala 2.11 distribution is supported)
Spark 2.0.1 is the only release in the 2.0.x line that is not supported.
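The version rules above can be expressed as a small check, which may be handy when validating a cluster inventory before creating a connection. This is a minimal sketch, not part of RapidMiner Radoop: the function name `is_supported_spark` and the constants are hypothetical, and it assumes plain `major.minor.patch` version strings. It does not verify the Scala 2.11 requirement for Spark 2.x.

```python
import re

# Supported Spark lines as listed in this section (illustrative constants,
# not taken from Radoop itself).
SUPPORTED_MINOR_LINES = {(1, 6), (2, 0), (2, 1), (2, 2), (2, 3), (2, 4)}
# Individual releases that are excluded even though their line is supported.
EXCLUDED_RELEASES = {(2, 0, 1)}


def is_supported_spark(version: str) -> bool:
    """Return True if `version` (e.g. "2.3.1") is in a supported Spark line."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        return False  # not a plain major.minor.patch string
    major, minor, patch = (int(part) for part in match.groups())
    if (major, minor, patch) in EXCLUDED_RELEASES:
        return False
    return (major, minor) in SUPPORTED_MINOR_LINES
```

For example, `is_supported_spark("2.0.2")` is accepted while `is_supported_spark("2.0.1")` and the deprecated `is_supported_spark("1.5.2")` are rejected.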
Deprecated Spark versions
Although earlier versions of RapidMiner Radoop work with the Spark versions listed below, we have marked them as deprecated. Support is discontinued as of RapidMiner Radoop 9.6.0.
- Apache Spark 1.5.x
Please contact support if you rely on this specific Spark version.
Supported Java versions
On the Hadoop cluster, RapidMiner Radoop requires Oracle JDK 8 or OpenJDK 8 to be installed. The cluster nodes should have at least 8 GB of RAM. The machine running the extension itself (within either RapidMiner Studio or RapidMiner Server) also requires Oracle Java 8 or OpenJDK 8.
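One way to confirm the Java requirement on a node is to inspect the output of `java -version`. The sketch below is an illustration only, assuming the usual output format in which Java 8 and earlier report a version such as `"1.8.0_292"` and Java 9 and later report the major version first. The function names are hypothetical, and the check treats only Java 8 as acceptable, following the requirement as stated here.

```python
import re
from typing import Optional


def java_major_version(version_output: str) -> Optional[int]:
    """Extract the Java major version from `java -version` style output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None  # no recognizable version string
    first = int(match.group(1))
    # Java 8 and earlier use the "1.<major>" scheme; Java 9+ lead with <major>.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def meets_radoop_java_requirement(version_output: str) -> bool:
    """Return True only for Java 8, as required in this section."""
    return java_major_version(version_output) == 8
```

Note that `java -version` writes to standard error, so capture that stream when collecting the text to pass in.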
RapidMiner extension compatibility
RapidMiner Radoop is not compatible with the Parallel Processing Extension, which must be disabled when using Radoop. To disable it, select the Extensions > Manage Extensions... menu item and uncheck the box for Parallel Processing Extension.