RapidMiner Radoop documentation, version 10.1
Cloudera Hadoop CDH 5.x/6.x
Creating a Radoop connection
It is highly recommended to use the New Connection / Import from Cluster Manager option, which creates the connection directly from the configuration retrieved from Cloudera Manager. If you do not have a Cloudera Manager account with access to the configuration, an administrator should be able to obtain the files for you using Download Client Configuration. With the client configuration files at hand, choose New Connection / Import Hadoop Configuration Files to create the connection from those files.
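Before importing client configuration files, it can help to check what they contain. The following is a minimal sketch, not part of Radoop: it assumes the files follow the standard Hadoop *-site.xml layout (a configuration root element containing property elements, each with a name and a value). The function name read_hadoop_properties and the sample host name are hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text):
    """Return a dict of property name -> value from a Hadoop *-site.xml document.

    Hypothetical helper for illustration; not a Radoop or Cloudera API.
    """
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            # Missing or empty <value> elements are stored as an empty string.
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


# Made-up sample in the core-site.xml layout; values are not from a real cluster.
SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
</configuration>
"""

props = read_hadoop_properties(SAMPLE_CORE_SITE)
print("Default file system:", props["fs.defaultFS"])
print("Authentication:", props["hadoop.security.authentication"])
```

To inspect a real file, read its contents and pass the text to the same function. A value of kerberos for hadoop.security.authentication suggests that security is enabled on the cluster.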
If security is enabled on the cluster, make sure you check the Configuring Apache Sentry authorization section of the Hadoop Security chapter.
Configuring Spark
If you are using Spark 1.6, you may need to select Spark 1.6 (CDH) for more recent CDH 5.x releases and Spark 1.6 for older CDH 5.x releases. Select either one, then run the Spark job test (enable only this test in Full Test... / Customize...), which automatically detects the proper version for you. Choose the setting that this test recommends.
Using any other Spark version should be straightforward.