What's new in RapidMiner Studio 9.1

Automatic Feature Selection and Engineering

Improve your predictive models through automated selection and generation of optimal feature sets. Benefit from the capability in Auto Model or by using the new Automatic Feature Engineering operator in a RapidMiner process.

Improved integration of Turbo Prep and Auto Model into process design

Access Turbo Prep and Auto Model directly from the process canvas by right-clicking an output port, or by using the buttons in the results view. Once you have finished transforming data in Turbo Prep, you can generate a subprocess that adds the steps you performed back into the process you were working on. This way you can move quickly back and forth, benefiting from the intuitive, interactive interface of Turbo Prep while keeping the full flexibility of process design.

Further improvements in Turbo Prep and Auto Model

  • Turbo Prep
    • Use bar charts in PIVOT to explore your data.
    • Filter or sort columns in PIVOT tables.
    • Additional educational videos added.
  • Auto Model
    • Use Support Vector Machine for prediction tasks.
    • Extract features from date columns.
    • Export result to Repository, Qlik, Excel, CSV or continue directly in Turbo Prep.

New Pivot operator and percentile aggregation

  • Easily aggregate and transform your data with a single, lightning-fast Pivot operator. The old Pivot operator is deprecated in favor of the new one.
  • Calculate a percentile for a numerical column using the enhanced Aggregate operator. Just pass the desired percentile as the parameter X of the "percentile(X)" function.
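
To illustrate what a percentile aggregation computes, here is a minimal Python sketch. It is not the Aggregate operator's implementation: the release notes do not say which interpolation method the operator uses, so the linear interpolation below is an assumption, and the function name and sample data are hypothetical.

```python
def percentile(values, p):
    """Return the p-th percentile (0 <= p <= 100) of a list of numbers.

    Assumption: linear interpolation between the two closest ranks.
    The Aggregate operator may use a different definition.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    ordered = sorted(values)
    # Fractional position of the percentile in the sorted list.
    rank = (len(ordered) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Hypothetical numerical column.
prices = [12.0, 15.0, 11.0, 20.0, 18.0]
median_price = percentile(prices, 50)  # middle value of the sorted column
p90_price = percentile(prices, 90)     # interpolated between 18.0 and 20.0
print(median_price, p90_price)
```

In this sketch the argument p plays the same role as X in "percentile(X)".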

Time Series

Tackle the complexity of time series data with the new time series capabilities: Understand trends and seasonality using the new time series decomposition operators. Forecast with the Holt-Winters method. Process nominal time series data with the Windowing, Process Windows, and Replace Missing Values (Series) operators.
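
As a rough illustration of the windowing idea applied to nominal data, the following Python sketch turns a series of labels into fixed-size windows, each paired with the value that follows it. It shows the general technique only. The function name, its parameters, and the sample series are hypothetical and do not reflect the Windowing operator's actual parameters.

```python
def sliding_windows(series, window_size, horizon=1):
    """Split a series into (window, label) pairs.

    Each window holds `window_size` consecutive values; the label is the
    value `horizon` steps after the end of the window.
    """
    if window_size < 1 or horizon < 1:
        raise ValueError("window_size and horizon must be at least 1")
    examples = []
    last_start = len(series) - window_size - horizon
    for start in range(last_start + 1):
        window = list(series[start:start + window_size])
        label = series[start + window_size + horizon - 1]
        examples.append((window, label))
    return examples


# Hypothetical nominal time series, e.g. machine states over time.
states = ["idle", "run", "run", "idle", "fault", "idle"]
for window, label in sliding_windows(states, window_size=3):
    print(window, "->", label)
```

Each pair can then serve as one training example, with the window values as features and the label as the value to predict.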

Use internally signed certificates

Connect to an HTTPS server using a custom certificate by copying the certificate to the .RapidMiner/cacerts folder. Studio adds it to the Trust Store on the next start, and it remains usable until it is removed.
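
The copy step can also be scripted. The Python sketch below assumes that the .RapidMiner folder is located in the user's home directory, which the text above does not state explicitly. The function name and the certificate file name are hypothetical.

```python
import shutil
from pathlib import Path


def install_certificate(cert_file, rapidminer_home=None):
    """Copy a certificate file into the cacerts folder and return its new path.

    Assumption: the .RapidMiner folder lives in the user's home directory
    unless another location is passed in `rapidminer_home`.
    """
    home = Path(rapidminer_home) if rapidminer_home else Path.home() / ".RapidMiner"
    target_dir = home / "cacerts"
    # Create the folder if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(cert_file, target_dir))


# Example call (hypothetical file name), left commented out so that running
# this sketch does not touch your home directory:
# install_certificate("internal-ca.pem")
```

Studio only picks up the certificate on its next start, so restart it after copying the file.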

In-Database Processing

Save time on data prep for large data sets with the new In-Database Processing extension. Visually define data prep or ETL workflows in RapidMiner Studio and execute them directly in the database. Reduce data transfer by loading only the data you need after preparation.

How it works

With the new In-Database Processing extension, you can design a subprocess with new but familiar preprocessing operators. The computation of these operators is pushed down into the database: they are automatically translated into SQL code, which is then submitted to the database. You can process the result with other operators just like in a normal RapidMiner process.
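
The push-down idea can be sketched in a few lines of Python. This is not how the extension generates SQL. The to_sql function, the step names, and the orders table are all hypothetical. The sketch uses SQLite only because it ships with Python. SQLite is not one of the databases this page lists as supported.

```python
import sqlite3


def to_sql(table, steps):
    """Translate a list of (operator, argument) steps into one nested SQL query.

    Hypothetical mini-translator: each step wraps the previous query in a
    subquery, so the whole chain is executed by the database.
    """
    query = f"SELECT * FROM {table}"
    for index, (operator, argument) in enumerate(steps):
        source = f"({query}) AS step_{index}"
        if operator == "filter":
            query = f"SELECT * FROM {source} WHERE {argument}"
        elif operator == "select":
            query = f"SELECT {', '.join(argument)} FROM {source}"
        elif operator == "aggregate":
            group_column, expression = argument
            query = (f"SELECT {group_column}, {expression} FROM {source} "
                     f"GROUP BY {group_column}")
        else:
            raise ValueError(f"unknown operator: {operator}")
    return query


# In-memory database with a small, made-up table.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE orders (region TEXT, amount REAL, status TEXT)")
connection.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("north", 100.0, "paid"), ("north", 50.0, "open"),
     ("south", 70.0, "paid"), ("south", 30.0, "paid")],
)

steps = [
    ("filter", "status = 'paid'"),
    ("aggregate", ("region", "SUM(amount) AS total")),
]
sql = to_sql("orders", steps)
# Only the aggregated rows leave the database, not the full table.
rows = connection.execute(sql + " ORDER BY region").fetchall()
print(sql)
print(rows)
```

The point of the design is that filtering and aggregation happen inside the database, so the client receives two summary rows instead of every order.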

The main goal of this extension is to let you limit the data that you read from a database into the memory of RapidMiner Studio or Server. This is especially important when you are using cloud engines like Google BigQuery, where you pay for the amount of data you retrieve. Another goal is to leverage your database's computing power, which is particularly valuable with distributed, scalable database or cloud engines. All this is done without the need to write SQL code.

This first version of the extension supports Google BigQuery (via OAuth 2), PostgreSQL, MySQL and H2. Further database and cloud engine support is planned for the future.

Enhancements and bug fixes

The following pages describe the enhancements and bug fixes in RapidMiner Studio 9.1 releases: