What’s New in RapidMiner Radoop 7.2?

This page describes the new features of RapidMiner Radoop 7.2 as well as its enhancements and bug fixes.

Update / migration

Please note that RapidMiner Radoop 7.2 is not backwards compatible and requires RapidMiner Studio 7.2 and/or RapidMiner Server 7.2. The update is available through the RapidMiner Marketplace.

Free version available

Previously, RapidMiner Radoop was available only as a paid version. Starting with RapidMiner Radoop 7.2, most functionality is part of the Free version. Please visit the RapidMiner website for details.

Enhancements and bug fixes

The following improvements are part of RapidMiner Radoop 7.2.

Enhancements

  • Added support for HDFS encryption
  • Added support for EMR 4.x (default ports are adapted as well)
  • Added support for Hive database selection in the Retrieve from Hive, Append into Hive, and Store in Hive operators
  • Added support for Hive database selection in the Drop Hive Table, Rename Hive Table, and Copy Hive Table operators
  • Dropped support for old HiveServer1
  • Dropped support for Hive versions below 0.13
  • Added connection test for Java 8 requirement on the cluster nodes
  • Added progress display and progress logging for Radoop Read CSV operator
  • Added a warning for Hadoop properties required by Impala connections; the entry keys can be added automatically
  • Table names inside the Radoop Nest that contain special characters are now rejected and not converted automatically
  • Temporary Hive objects are now dropped with the purge option (skipping the Trash)
  • Added workarounds for several Hive and Impala bugs related to the ALTER TABLE command
  • The property radoop.emr.modify-staging-dir is no longer required for any EMR connection
  • Added more logging about the client side Radoop connection properties
  • Spark is now enabled by default for new connections
  • Advanced connection dialog can now be opened via a double-click on the connection entry
  • Submitted jobs on Hive-on-Spark now have proper names (usually the name of the operator running)
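
Two of the enhancements above (rejecting table names with special characters, and dropping temporary Hive objects with the purge option) can be illustrated with a minimal sketch. This is not Radoop's implementation: the function names are hypothetical, and the rule that only ASCII letters, digits, and underscores are valid is an assumption, since these notes do not define "special characters". The only external fact relied on is Hive's `DROP TABLE ... PURGE` clause, which bypasses the Trash.

```python
import re

# Assumption: "special characters" means anything other than ASCII letters,
# digits, and underscores. The release notes do not state the exact rule.
_VALID_NAME = re.compile(r"[A-Za-z0-9_]+")


def validate_table_name(name: str) -> str:
    """Return the name unchanged if valid; raise instead of converting it."""
    if not _VALID_NAME.fullmatch(name):
        raise ValueError(f"invalid table name: {name!r}")
    return name


def drop_temp_table_sql(database: str, table: str) -> str:
    """Build a HiveQL statement that drops a table and skips the Trash."""
    db = validate_table_name(database)
    tbl = validate_table_name(table)
    # PURGE tells Hive to delete the data directly rather than moving it to Trash.
    return f"DROP TABLE IF EXISTS {db}.{tbl} PURGE"
```

Raising an error rather than rewriting the name mirrors the behavior described above: an invalid name surfaces to the user instead of being silently changed into a different table name.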

Bug fixes

  • BUGFIX: Both the keytab file path and the local Spark temp directory path may now contain space characters
  • BUGFIX: Create permanent UDFs test now reports if functions are missing on the cluster
  • BUGFIX: Multiple Studio or Server instances on the same machine no longer remove each other's local temporary files on Linux and Mac OS (which caused ClassNotFoundException or NoClassDefFoundError)
  • BUGFIX: Single Process Pushdown no longer fails when an input object on one of its input ports is too large
  • BUGFIX: Single Process Pushdown now uploads an extension if its IOObject is used in the process
  • BUGFIX: Added a workaround for a Sentry bug (SENTRY-1001)
  • BUGFIX: Fixed a rare connection error that occurred when switching between different secure connections
  • BUGFIX: Stop Test now cancels the Hive Connection test and other smaller tests instead of waiting for a timeout
  • BUGFIX: Fixed Job Kill test for Hive-on-Spark (container does not keep running)
  • BUGFIX: Generate Attributes UI no longer adds invalid square brackets around attribute names
  • BUGFIX: Fixed an edge case where switching between different secure connections could lead to a Kerberos authentication error
  • BUGFIX: Fixed a bug where Single Process Pushdown could fail with a table not found error after its port was connected and then disconnected
  • BUGFIX: Fixed a rare error when opening Hadoop Data View (ClassCastException)