You are viewing the RapidMiner Radoop documentation for version 9.7.

Cloudera Hadoop CDH 5.x/6.x

Creating a Radoop connection

It is highly recommended to use the New Connection / Import from Cluster Manager option to create the connection directly from the configuration retrieved from Cloudera Manager. If you do not have a Cloudera Manager account with access to the configuration, an administrator should be able to download the client configuration for you (Download Client Configuration). With the client configuration files, choose New Connection / Import Hadoop Configuration Files to create the connection from those files.
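Before importing the client configuration files, it can help to check what they contain. The following is a minimal Python sketch, not part of Radoop, that reads the standard Hadoop `<configuration>`/`<property>` XML layout used by files such as core-site.xml. The function name `read_hadoop_properties`, the sample content, and the host name are hypothetical and serve only as illustration; real values come from the files downloaded from your cluster.

```python
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text):
    """Parse the text of a Hadoop *-site.xml file into a dict of name -> value."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


# Hypothetical sample; in practice, read the text from the downloaded core-site.xml.
SAMPLE_CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
</configuration>
"""

props = read_hadoop_properties(SAMPLE_CORE_SITE)
print(props["fs.defaultFS"])
# "simple" is Hadoop's default when the property is not set.
print(props.get("hadoop.security.authentication", "simple"))
```

If the authentication value is `kerberos`, security is enabled on the cluster and the security-related settings of the connection need attention as well.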

If security is enabled on the cluster, make sure you check the Configuring Apache Sentry authorization section of the Hadoop Security chapter.

Configuring Spark

If you are using Spark 1.6, you may need to select Spark 1.6 (CDH) for more recent CDH 5.x releases and Spark 1.6 for older CDH 5.x releases. If you are unsure which one applies, select either of them and run the Spark job test (enable only this test in Full Test... / Customize...), which automatically detects the proper version. Then choose the setting that the test recommends.

Using any other Spark version should be straightforward.