You are viewing the RapidMiner Radoop documentation for version 2024.0.
Radoop Compatibility
Supported Hadoop distributions
Radoop works with most popular Hadoop distributions. Refer to the provider's documentation for information on configuring the Hadoop cluster. The supported distributions are:
- Amazon Elastic MapReduce (EMR) 6.x
- Azure HDInsight 4.0, 5.0
- Cloudera Data Platform Private Cloud Base (CDP) 7.x
For Cloudera distributions, we only support the minor versions that are also supported by Cloudera. On HDInsight and Amazon EMR, operators related to model scoring are not available, because these distributions do not run Hive on Java 11.
Supported data warehouse systems (DWS)
Radoop supports the following data warehouse infrastructure:
- Hive 3.x (to score models, Hive must run on a Java 11 JVM so that it can load the Radoop UDFs)
Supported Spark versions
Radoop supports the following Spark versions:
- Apache Spark 3.x (only the Scala 2.12 distribution is supported, running on a Java 11 JVM)
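The Spark requirement above can be expressed as a simple version check. The following is a minimal sketch, not part of Radoop: the class name `SparkCompatibilityCheck` and its methods are hypothetical, and the version strings are assumed to be ones you have read from the cluster yourself (for example, from the banner printed by `spark-submit --version`).

```java
// Hypothetical helper, not part of Radoop: encodes the rule
// "Apache Spark 3.x built for Scala 2.12" from the list above.
final class SparkCompatibilityCheck {

    // True when the Spark version is 3.x and the Scala version is 2.12.x.
    static boolean isSupported(String sparkVersion, String scalaVersion) {
        return hasPrefix(sparkVersion, "3") && hasPrefix(scalaVersion, "2.12");
    }

    // True when the version equals the prefix or continues with a dot
    // right after it, so "3.3.2" matches "3" but "30.1" does not.
    static boolean hasPrefix(String version, String prefix) {
        if (version == null) {
            return false;
        }
        String v = version.trim();
        return v.equals(prefix) || v.startsWith(prefix + ".");
    }

    public static void main(String[] args) {
        // Sample values for illustration only.
        System.out.println(isSupported("3.3.2", "2.12.15"));
        System.out.println(isSupported("3.3.2", "2.13.8"));
        System.out.println(isSupported("2.4.8", "2.12.15"));
    }
}
```

With the sample values, only the first combination (Spark 3.3.2 with Scala 2.12.15) passes the check. The check does not verify that Spark runs on a Java 11 JVM; that has to be confirmed separately on the cluster.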
Supported Java versions
On the Hadoop cluster, Radoop requires Oracle JDK 11 or OpenJDK 11 to be installed. The cluster nodes should have at least 32 GB of RAM. On the machine running the extension itself (either within Altair AI Studio or Altair AI Hub), Radoop requires Oracle Java 11 or OpenJDK Java 11.
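To confirm which Java version a given machine actually uses, a small standalone program can report the running JVM's major version. This is a minimal sketch under stated assumptions: the class name `JavaVersionCheck` is hypothetical, and it treats only major version 11 as acceptable, which is a strict reading of the requirement above. It relies on `Runtime.version().feature()`, which is available from Java 10 onward.

```java
// Hypothetical helper, not part of Radoop: reports whether the JVM that
// runs this program has the major version named in the requirement above.
final class JavaVersionCheck {

    // Assumption: only Java 11 is treated as supported (strict reading).
    static final int REQUIRED_FEATURE_VERSION = 11;

    static boolean isSupported(int featureVersion) {
        return featureVersion == REQUIRED_FEATURE_VERSION;
    }

    public static void main(String[] args) {
        // Major ("feature") version of the running JVM, e.g. 11 or 17.
        int feature = Runtime.version().feature();
        String vendor = System.getProperty("java.vendor");

        System.out.println("Java feature version: " + feature);
        System.out.println("Java vendor: " + vendor);
        System.out.println(isSupported(feature)
                ? "This JVM matches the Java 11 requirement."
                : "This JVM does not match the Java 11 requirement.");
    }
}
```

Run it with the same `java` executable that starts Altair AI Studio or Altair AI Hub, or that the cluster services use, because a machine can have several Java installations side by side.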
Altair RapidMiner extension compatibility
Radoop is not compatible with the Parallel Processing Extension. This extension must be disabled when using Radoop. Please select the Extensions > Manage Extensions... menu item and uncheck the box for Parallel Processing Extension.