What's new in RapidMiner Radoop 9.5

This page describes the new features of RapidMiner Radoop 9.5.

Radoop Proxy connection to Hadoop 3 based clusters

We have enhanced Radoop Proxy to work seamlessly with clusters based on Hadoop 3 (such as Cloudera CDH 6.x or HDP 3.x).

This means that if your organization runs a cluster on a Hadoop 3 based distribution, network administrators need to open only a few ports on the company firewall for data scientists to use RapidMiner Radoop with that firewalled cluster.

Revamped general and connection-level settings

To make Radoop more user-friendly, we moved most of the settings from the RapidMiner Studio Preferences to Radoop connections. When you work with multiple Hadoop clusters, you can now configure each connection separately, without the settings of one connection interfering with those of another.

Revamped Radoop general settings

For example, on a production cluster that holds much more data, you might want a different timeout value for your Hive commands than on the dev/test cluster. In Radoop 9.5 this is easy, because we moved the Hive command timeout setting from the Studio Preferences to a connection-level setting.

New location for Hive command timeout in the above example

Don't worry: all existing connections and settings are preserved when you update to this version of Radoop.

Median and mode in Aggregate (Radoop) operator

To make it even easier to work with big data, we are continuously working on closing the gap between operators built into RapidMiner Studio and the ones optimized for Hadoop.

This time, we added median and mode as two new aggregation functions. Behind the scenes, these aggregations use optimized Hive queries to compute aggregates on large datasets quickly.

Median and mode
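To illustrate what the two new aggregations compute for each group, here is a minimal Python sketch. It is not how Radoop implements them (Radoop runs Hive queries on the cluster); the function name `aggregate_median_mode`, the sample columns `region` and `amount`, and the tie-breaking rule for the mode (smallest value wins) are assumptions made for this example only. How Radoop treats ties or an even number of values is not stated here.

```python
from collections import defaultdict
from statistics import median, multimode  # multimode requires Python 3.8+


def aggregate_median_mode(rows, group_key, value_key):
    """Group rows by group_key and return the median and mode of value_key.

    Hypothetical helper for illustration; it only mirrors the idea of a
    grouped aggregation, not Radoop's actual behavior.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])

    return {
        key: {
            # For an even number of values, statistics.median averages
            # the two middle values.
            "median": median(values),
            # Assumption: if several values are equally frequent,
            # report the smallest one.
            "mode": min(multimode(values)),
        }
        for key, values in groups.items()
    }


# Made-up sample data
rows = [
    {"region": "north", "amount": 10},
    {"region": "north", "amount": 20},
    {"region": "north", "amount": 20},
    {"region": "south", "amount": 5},
    {"region": "south", "amount": 7},
    {"region": "south", "amount": 9},
    {"region": "south", "amount": 9},
]

result = aggregate_median_mode(rows, "region", "amount")
for region, stats in result.items():
    print(region, stats)
```

In the sample data, the `north` group has the values 10, 20, 20, so both its median and its mode are 20. The `south` group has four values, so its median is the average of the two middle values (7 and 9).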

OpenJDK support

To support you and your company in adopting OpenJDK, RapidMiner Radoop now supports OpenJDK Java 8.

Enhancements and bug fixes