You are viewing the RapidMiner Radoop documentation for version 7.6.

What’s New in RapidMiner Radoop 2.2?

This page describes the enhancements delivered in RapidMiner Radoop 2.2.

Machine Learning on Spark

Starting with version 2.2, RapidMiner Radoop supports Apache Spark. This release adds two popular machine learning algorithms from MLlib, Apache Spark’s machine learning library: logistic regression and decision trees. Both model types can now be trained natively on Hadoop, using the full distributed computing power of a Hadoop cluster running Spark.

Scoring ANY RapidMiner Model

Until now, RapidMiner Radoop supported distributed scoring only for the most common RapidMiner models. RapidMiner Radoop 2.2 extends this capability fundamentally: any model created in RapidMiner can now be scored on a Hadoop cluster in a distributed fashion using the Apply Model operator of RapidMiner Radoop. All classification, regression, and clustering models provided by RapidMiner are supported.
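
Distributed scoring works because applying a model is a row-by-row operation: each partition of the data can be scored independently and the results concatenated. The following is a minimal conceptual sketch in plain Python, not Radoop's implementation. The names threshold_model, partition, and score_distributed are hypothetical, and a thread pool stands in for the cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

Row = Sequence[float]
Model = Callable[[Row], int]


def threshold_model(row: Row) -> int:
    """Hypothetical stand-in for a trained model: predicts 1 if the feature sum exceeds 1.0."""
    return 1 if sum(row) > 1.0 else 0


def partition(rows: List[Row], n_parts: int) -> List[List[Row]]:
    """Split rows into at most n_parts contiguous chunks, preserving order."""
    size = max(1, -(-len(rows) // n_parts))  # ceiling division, at least 1
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def score_partition(model: Model, part: List[Row]) -> List[int]:
    """Score one partition; needs no data from any other partition."""
    return [model(row) for row in part]


def score_distributed(model: Model, rows: List[Row], n_parts: int = 3) -> List[int]:
    """Score every partition in parallel and concatenate the predictions in order."""
    parts = partition(rows, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        # pool.map returns results in the same order as the input partitions
        results = list(pool.map(lambda p: score_partition(model, p), parts))
    return [pred for part in results for pred in part]


rows = [[0.2, 0.3], [0.9, 0.4], [1.5, 0.0], [0.1, 0.1], [0.6, 0.6]]
predictions = score_distributed(threshold_model, rows)
```

Because no partition depends on another, the partitioned result matches scoring all rows in a single pass, which is what lets the work spread across cluster nodes.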

Kerberos Authentication

Version 2.2 adds support for Kerberos authentication: RapidMiner Radoop can now connect to and work on Hadoop clusters that are secured with Kerberos.

Other Changes and Bugfixes

  • New helper operator for modeling: Remap Binominals.
  • New aggregation functions and default aggregations for the Aggregate operator.
  • Improvements to the Retrieve operator UI.
  • Binominal type added to the Type Conversion operator.
  • BUGFIX: Set Role metadata fixed when multiple labels are present.
  • BUGFIX: Pig integration test permission problem fixed.
  • BUGFIX: Integer overflow prevented when sampling from billions of records.
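
The release notes do not describe the exact defect behind the sampling overflow, so the following is only an illustration of the general class of bug, under stated assumptions. It simulates signed 32-bit arithmetic in Python (whose own integers do not overflow) to show how a sample-size calculation can wrap around once the record count passes roughly 2.1 billion. The function names and the percentage-based formula are hypothetical.

```python
def wrap_int32(value: int) -> int:
    """Wrap a Python int into the signed 32-bit range (two's complement)."""
    return (value + 2**31) % 2**32 - 2**31


def sample_size_int32(total_rows: int, percent: int) -> int:
    """Hypothetical buggy variant: every intermediate value is a 32-bit int."""
    product = wrap_int32(wrap_int32(total_rows) * percent)
    return int(product / 100)  # truncate toward zero, as integer division does in Java


def sample_size_wide(total_rows: int, percent: int) -> int:
    """Fixed variant: the arithmetic uses integers wide enough for the product."""
    return total_rows * percent // 100


total = 3_000_000_000  # three billion records, above the 32-bit maximum of 2_147_483_647
wrong = sample_size_int32(total, 10)
right = sample_size_wide(total, 10)
```

In this sketch the 32-bit variant yields a negative sample size for three billion records, while the wide variant returns the expected 10 percent. In a JVM-based tool, the usual remedy for this kind of bug is to carry row counts in a 64-bit type.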