What's new in RapidMiner Radoop 9.5
This page describes the new features of RapidMiner Radoop 9.5.
Radoop Proxy connection to Hadoop 3 based clusters
We have enhanced Radoop Proxy to work seamlessly with clusters based on Hadoop 3 (such as Cloudera CDH 6.x or HDP 3.x).
This means that if your organization runs a cluster on a Hadoop 3 based distribution, network administrators need to open only a few ports on the company firewall for data scientists to use RapidMiner Radoop with that firewalled cluster.
Revamped general and connection-level settings
To make Radoop more user-friendly, we moved most of the settings from the RapidMiner Studio Preferences to the individual Radoop connections. When you work with multiple Hadoop clusters, you can now configure each connection separately, without the settings of one connection interfering with those of another.
For example, on a production cluster with much more data, you might want a different timeout value for your Hive commands than on the dev/test cluster. In Radoop 9.5 this is easy, because the Hive command timeout moved from the Studio Preferences to a connection-level setting.
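The idea of connection-level settings can be illustrated with a small conceptual sketch. It is not Radoop's data model or API: the class `ConnectionSettings`, the field `hive_command_timeout_s`, the connection names, and the timeout values are all hypothetical and only show how each connection can carry its own value instead of sharing one global preference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionSettings:
    """Hypothetical per-connection settings; not Radoop's actual data model."""
    name: str
    hive_command_timeout_s: int  # assumed unit: seconds


# Each connection keeps its own timeout (illustrative values only).
connections = {
    "production": ConnectionSettings("production", hive_command_timeout_s=600),
    "dev-test": ConnectionSettings("dev-test", hive_command_timeout_s=60),
}


def timeout_for(connection_name: str) -> int:
    """Look up the timeout of one connection without touching any other."""
    return connections[connection_name].hive_command_timeout_s
```

Because the value is stored with the connection, changing the timeout for one cluster leaves the other untouched.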
Don't worry, all existing connections and settings will be preserved during an update to this version of Radoop.
Median and mode in Aggregate (Radoop) operator
To make it even easier to work with big data, we are continuously working on closing the gap between operators built into RapidMiner Studio and the ones optimized for Hadoop.
This time, we added median and mode as two new aggregation functions. Behind the scenes, these aggregations use optimized Hive queries to produce aggregates on large datasets quickly.
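To make the semantics of the two aggregations concrete, here is a minimal sketch of a grouped median and mode on a tiny in-memory dataset. It uses only Python's standard library and does not reflect how Radoop generates its Hive queries. The sample rows, the group keys, and the function name `aggregate_median_mode` are hypothetical.

```python
from collections import defaultdict
from statistics import median, mode

# Hypothetical sample rows: (group key, value). Not taken from Radoop.
rows = [
    ("north", 10), ("north", 20), ("north", 20), ("north", 40),
    ("south", 5), ("south", 7), ("south", 7),
]


def aggregate_median_mode(rows):
    """Group rows by key, then compute the median and mode of each group."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return {
        key: {"median": median(values), "mode": mode(values)}
        for key, values in groups.items()
    }


result = aggregate_median_mode(rows)
```

The median is the middle value of the sorted group (or the mean of the two middle values for an even count), and the mode is the most frequent value. Both need the whole distribution of each group, unlike sum or count, which can be combined from partial results.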
To help you and your company adopt OpenJDK, RapidMiner Radoop now supports OpenJDK Java 8.