What's New in RapidMiner Studio 9.6.0?

Released: Feb 26th, 2020

The following describes the new features, enhancements, and bug fixes in RapidMiner Studio 9.6.0:

New Features

  • Added buttons for copying/pasting the active process to the process toolbar.
  • Equalize Time Series:
    • Added two new operators (Equalize Numerical Indices and Equalize Time Stamps) that equalize input time series. The output time series has new equidistant index values. The operators offer several ways to configure the number of examples, the start value, the stop value, and the step size of the new index values. The corresponding values of the output time series are computed by using a Replace Missing Values (Series) operation.
    • Equalize Numerical Indices: Equalizes numerical indices into equidistant numerical indices with a numerical step size.
    • Equalize Time Stamps: Equalizes date-time indices into equidistant date-time indices, using either an exact duration (with millisecond precision) or a period (a multiple of days, weeks, months, or years) as the step size.
  • Peak Transformations:
    • Added two new operators (Z-Score Peak Transformation and Highest Peak Transformation) that perform peak detection and transformation on time series. They detect peaks in a time series and add an indicator peak series (with the values -1, 0, 1 as peak flag values) and a peaked series (original values where a peak was detected, missing values in non-peak areas).
    • Z-Score Peak Transformation: Performs the peak detection by calculating the local mean and standard deviation and identifying values as peaks when they deviate strongly from this local mean.
    • Highest Peak Transformation: Performs the peak detection by dividing the time series into different areas and checking whether local minima and maxima are valid peaks or only noise effects.
  • Peak Feature Extraction:
    • New operator Extract Peaks, which performs a peak detection (by utilizing one of the new Peak Transformation operators) and extracts features describing the peaks.
  • Added optional custom endpoint parameter to Amazon S3 connections. This enables you to use an S3 API compatible storage service other than Amazon S3.
  • Deployments / Model Ops:
    • All custom prediction models are now supported in Model Ops, i.e. models created with the Design view, in addition to Auto Model models
    • Grouped models are now supported as well, which allows combinations of preprocessing models with a prediction model
    • Model Simulator in Deployments now uses raw data columns as input and performs data prep on the fly
    • Added a setting that controls whether scores are explained (scoring is about 100x faster without explanations). New deployments have this disabled by default; existing deployments keep it enabled
    • The overview table now shows whether scores are explained
    • Model Ops initialization now happens in the background and no longer blocks the UI start of RapidMiner if a remote location is not available
    • Some speed improvements for Model Ops (fewer objects are loaded from repositories, which makes things a bit faster for remote deployments)
  • Model Simulator operator now also supports grouped models
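
The equalization described for Equalize Numerical Indices above can be pictured as resampling a series onto an equidistant index. The following is a minimal sketch in plain Python, not the operator's implementation: the function names `equidistant_indices` and `equalize` are hypothetical, a positive step size is assumed, and linear interpolation is assumed as a stand-in for the Replace Missing Values (Series) step.

```python
from bisect import bisect_left


def equidistant_indices(start, stop, step):
    """Return start, start + step, ... up to and including stop (step > 0 assumed)."""
    count = int((stop - start) / step + 1e-9) + 1
    return [start + i * step for i in range(count)]


def equalize(indices, values, start, stop, step):
    """Resample (indices, values) onto an equidistant index.

    Assumes `indices` is sorted ascending. New index values that fall
    between two original points are filled by linear interpolation
    (an assumed replacement strategy); index values outside the
    original range stay missing (None).
    """
    new_indices = equidistant_indices(start, stop, step)
    new_values = []
    for x in new_indices:
        pos = bisect_left(indices, x)
        if pos < len(indices) and indices[pos] == x:
            new_values.append(values[pos])      # exact match: keep the original value
        elif 0 < pos < len(indices):
            x0, x1 = indices[pos - 1], indices[pos]
            y0, y1 = values[pos - 1], values[pos]
            new_values.append(y0 + (y1 - y0) * (x - x0) / (x1 - x0))
        else:
            new_values.append(None)             # outside the original index range
    return new_indices, new_values


# Irregular indices 0, 1, 4 are mapped onto the equidistant indices 0..4.
new_idx, new_vals = equalize([0, 1, 4], [0.0, 10.0, 40.0], start=0, stop=4, step=1)
```

The start value, stop value, and step size here mirror the configuration options named in the operator description; how the real operators derive the number of examples from them is not covered by this sketch.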
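
The idea behind Z-Score Peak Transformation (comparing each value with a local mean and standard deviation) can be sketched as follows. This is an illustration under stated assumptions, not the operator's algorithm: the function name `z_score_peaks`, the trailing window of preceding values, and the default `window` and `threshold` parameters are all hypothetical. The two outputs correspond to the indicator peak series (-1, 0, 1) and the peaked series described above.

```python
from statistics import mean, pstdev


def z_score_peaks(series, window=5, threshold=3.0):
    """Flag values that deviate strongly from the local mean.

    Returns (indicator, peaked): indicator holds 1 for a positive peak,
    -1 for a negative peak and 0 otherwise; peaked holds the original
    value at peaks and None (missing) elsewhere. The trailing window and
    the default parameters are illustrative assumptions.
    """
    indicator, peaked = [], []
    for i, value in enumerate(series):
        history = series[max(0, i - window):i]   # values preceding position i
        flag = 0
        if len(history) == window:               # skip positions without a full window
            local_mean = mean(history)
            local_std = pstdev(history)
            deviation = value - local_mean
            if abs(deviation) > threshold * local_std:
                flag = 1 if deviation > 0 else -1
        indicator.append(flag)
        peaked.append(value if flag != 0 else None)
    return indicator, peaked


series = [1.0, 1.2, 0.8, 1.0, 1.0, 9.0, 1.0, 1.1, 0.9, 1.0, 1.0, -7.0]
indicator, peaked = z_score_peaks(series)
```

One known limitation of this simple variant: a detected peak stays in the trailing window for the next few positions and inflates the local standard deviation, so peaks that follow each other closely may be missed.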

Enhancements

  • Connections to external data sources like Cassandra or MongoDB are now properly re-used (within reason) and closed when a process is finished. This should lead to fewer connections to an external data source when using loop constructs, as well as properly closed connections after a process is finished.
  • Windows and OS X builds now ship with OpenJDK (version 8u232)
  • Added new timezone parameter to JDBC connections. Note: date handling in databases (and generally) is a tricky subject, and there are quite a few ways to make mistakes. Some databases/JDBC drivers also don't implement date handling properly. Last but not least, keep in mind that a date_time/date is a fixed point in time, but when it is displayed in a more human-readable format than "milliseconds since 01-01-1970 UTC", the display string converts that instant to your display timezone. So even if, for example, a date is the 13th of Jan in UTC, you may see the 14th of Jan when viewing it in Australia, due to the display timezone offset. The actual point in time (milliseconds since 01-01-1970 UTC), however, would be identical. See documentation for further information.
  • When parsing a string to time with Nominal to Date, the associated timestamp now represents that time on the 1st of January, 1970 instead of 1st of February 1970
  • Added Default User-Agent setting to Preferences / System
  • Updated MariaDB JDBC driver
  • You can now see which Java version is being used when looking at the "About" dialog
  • Improved meta data warning in case the time series attribute selection of time series operators is empty
  • Added option to autodetect S3 region in Amazon S3 connections
  • Improved Google Cloud Services connection UI
  • File chooser icons on OS X now also support HiDPI
  • When removing a repository, the repository.xml file now gets updated immediately
  • Visualizations: The tick interval input field now allows much larger values for datetime axes, as it uses milliseconds as the unit to split the chunks
  • Updated the Step by Step In-Product Tutorial content
  • Added more search tags to various performance and aggregation operators
  • Improved error message when download/deserialization of data from a remote repository fails
  • Improved error message when the SSL certificate is invalid while attempting to connect to a RM Server repository.
  • Improved logging when trying to connect to a RM Server and unusual exceptions occur, e.g. more details about why the SSL connection failed, what the network problem is, etc.
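
The point made in the timezone note above, that a date_time is one fixed instant whose displayed calendar day depends on the display timezone, can be illustrated with a short sketch. The concrete instant and the fixed UTC+10 offset (standing in for an Australian display timezone without daylight saving) are assumptions chosen for illustration; nothing here reflects how the JDBC timezone parameter is implemented.

```python
from datetime import datetime, timedelta, timezone

# One fixed instant, stored as milliseconds since 1970-01-01 UTC.
instant = datetime(2020, 1, 13, 20, 0, tzinfo=timezone.utc)
epoch_millis = int(instant.timestamp() * 1000)

# Hypothetical display timezone: a fixed UTC+10 offset (no daylight saving).
display_zone = timezone(timedelta(hours=10))
local_view = instant.astimezone(display_zone)

# The display strings differ, including the calendar day ...
utc_text = instant.strftime("%Y-%m-%d %H:%M")       # 2020-01-13 20:00
local_text = local_view.strftime("%Y-%m-%d %H:%M")  # 2020-01-14 06:00

# ... but the underlying instant is identical.
same_instant = int(local_view.timestamp() * 1000) == epoch_millis
```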

Bugfixes

  • Fixed an issue that could cause Studio to hang at the splash screen indefinitely during startup.
  • Fixed an issue where storing datasets in a database using the automatically created primary key was not possible.
  • Declare Missing Value no longer crashes if the expression mode is selected and the expression itself returns a missing value. Instead, it will evaluate to false and thus NOT set a missing value for that row.
  • Fixed models and other IOObjects coming from extensions not being identified correctly in Server repositories.
  • Fixed Auto Model not being able to use results of a Join operator in some cases.
  • Fixed broken properties when storing data tables in rare cases.
  • It is no longer possible to create RapidMiner Server repositories with an invalid name.
  • Filter Examples now correctly resolves all macros in parameters, including in custom filter attribute names.
  • Fixed an error that could sometimes prevent result tables from being moved to Auto Model via the button in the Results tab.
  • Fixed an issue that caused Visualizations to not appear on certain Linux systems.
  • Fixed file chooser icons on OS X.
  • Fixed a bug for scoring in Deployments: columns with incompatible types are now actually dropped (this was documented but did not happen before).
  • Auto Model will now be restored if the user cancels a deployment by closing the deployment dialog
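
The new Declare Missing Value behavior in expression mode (a missing expression result counts as false, so the row is left untouched) can be sketched as below. The helper names `is_missing` and `declare_missing` and the dictionary-based rows are hypothetical; this mirrors only the rule stated in the bugfix, not the operator's code.

```python
import math


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def declare_missing(rows, attribute, expression):
    """Set row[attribute] to None wherever expression(row) is true.

    A missing expression result is evaluated as false, so the value of
    that row is NOT replaced by a missing value.
    """
    result = []
    for row in rows:
        outcome = expression(row)
        matched = False if is_missing(outcome) else bool(outcome)
        new_row = dict(row)                  # leave the input rows unmodified
        if matched:
            new_row[attribute] = None
        result.append(new_row)
    return result


rows = [{"a": 5, "b": 1}, {"a": None, "b": 2}, {"a": -3, "b": 3}]
# The expression itself returns a missing value when "a" is missing.
negative_a = lambda row: None if row["a"] is None else row["a"] < 0
cleaned = declare_missing(rows, "b", negative_a)
```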

Other

  • It is no longer possible to create legacy connections and other connections which have been replaced with the new repository connection objects in RapidMiner 9.3. Existing connections can still be edited and used, but this functionality will be removed eventually as well. Make sure to migrate existing legacy connections to repository connection objects! See documentation for reference.

Development

  • Added caching for connections based on ConnectionAdapterHandler to reduce the connection count and make it possible to clean connections up once they are no longer needed (e.g. when the process is finished).
  • GlobalSearch is no longer available in headless mode (aka command line, job container execution, etc.)
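
The connection caching mentioned above (reuse a connection while a process runs, then clean up once it is finished) follows a common pattern that can be sketched as below. `ConnectionCache`, `FakeConnection`, and the key `"example-connection"` are hypothetical names for illustration only; the actual ConnectionAdapterHandler API is not shown here.

```python
class ConnectionCache:
    """Reuse one connection per key and close all of them at the end."""

    def __init__(self, factory):
        self._factory = factory        # callable that opens a new connection for a key
        self._connections = {}

    def get(self, key):
        # Open a connection only the first time a key is requested.
        if key not in self._connections:
            self._connections[key] = self._factory(key)
        return self._connections[key]

    def close_all(self):
        for connection in self._connections.values():
            connection.close()
        self._connections.clear()


class FakeConnection:
    """Stand-in for a real data source connection, used only for illustration."""

    opened = 0                         # counts how many connections were opened

    def __init__(self, key):
        FakeConnection.opened += 1
        self.key = key
        self.closed = False

    def close(self):
        self.closed = True


cache = ConnectionCache(FakeConnection)
try:
    for _ in range(100):               # e.g. iterations of a loop construct
        connection = cache.get("example-connection")
finally:
    cache.close_all()                  # clean up once the "process" is finished
```

With this pattern, the loop opens a single connection instead of one per iteration, and the `finally` block ensures the connection is closed even if an iteration raises an error.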