RapidMiner Studio documentation, version 9.9

# Default Forecast (Time Series)

## Synopsis

This operator trains a Default Forecast model on time series data.

## Description

The Default Forecast model performs the forecast by predicting the same *forecast value* for the whole horizon window.
It can be used to test a forecasting method (e.g. ARIMA, Holt-Winters, Windowing) against a baseline forecasting performance, provided by this Default Forecast model.
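
As an illustration of the idea, the following minimal Python sketch repeats a single forecast value over the whole horizon. It is not the operator's implementation; the function name `default_forecast` and the example values are hypothetical, and the last observed value is used as the forecast value.

```python
def default_forecast(series, horizon):
    """Repeat one forecast value (here: the last observed value) for the whole horizon."""
    if not series:
        raise ValueError("series must contain at least one value")
    forecast_value = series[-1]  # assumption: the 'last value' method
    return [forecast_value] * horizon


# Made-up example values, for illustration only.
history = [10.0, 12.0, 11.0, 13.0]
baseline = default_forecast(history, horizon=3)
print(baseline)  # [13.0, 13.0, 13.0]
```

Because every step of the horizon receives the same value, such a forecast is only useful as a reference point that a more elaborate model should beat.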

The method to determine the *forecast value* can be selected by the parameter *method*.
Either the *last value* of the time series is used, or the *mean in window*, *median in window* or *mode in window*.
For the latter three methods, the window always consists of the last *n* values of the time series, with *n* specified by the user (see parameter *window size*).
By default, invalid values (missing, positive and negative infinity for numerical time series, and empty strings for nominal time series) are included in the calculation of *mean*, *median*, or *mode*.
If the parameter *ignore invalid values* is selected, the invalid values are ignored in the calculation.
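
The window logic can be sketched in Python as follows. This is a hedged illustration, not the operator's code: the helper names are hypothetical, missing values are modelled as `None` or NaN, and returning NaN for a mean over a window that contains invalid values is this sketch's way of representing a missing result.

```python
import math
import statistics


def is_invalid(value):
    # Assumption: missing is represented as None or NaN; infinities are also invalid.
    return value is None or math.isnan(value) or math.isinf(value)


def last_window(series, window_size):
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    return series[-window_size:]  # the last n values of the series


def window_mean(series, window_size, ignore_invalid=False):
    window = last_window(series, window_size)
    if ignore_invalid:
        window = [v for v in window if not is_invalid(v)]
    elif any(is_invalid(v) for v in window):
        return math.nan  # invalid values are included, so the mean is missing
    return statistics.mean(window) if window else math.nan


def window_median_ignoring_invalid(series, window_size):
    # Only the ignore-invalid case is sketched for the median.
    window = [v for v in last_window(series, window_size) if not is_invalid(v)]
    return statistics.median(window) if window else math.nan


# Made-up example values, for illustration only.
series = [2.0, 4.0, float("inf"), 6.0]
print(window_mean(series, 3))                       # nan
print(window_mean(series, 3, ignore_invalid=True))  # 5.0
print(window_median_ignoring_invalid(series, 3))    # 5.0
```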

The *mode* is derived as the most frequent value in the window.
If more than one value has the highest frequency, the value that occurs first in the window is used.
The *methods* *mean in window* and *median in window* can only be used for numerical series, while the others work also for nominal and time series with date time values.
For more details about the calculation of *mean*, *median*, or *mode* see the operator help of the operators *Extract Aggregate* and *Extract Mode*.
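
The tie-breaking rule for the mode can be sketched as below. This is an illustrative Python version under the assumption that "occurs first" refers to the position of a value's first occurrence in the window; the function name `mode_in_window` is hypothetical.

```python
from collections import Counter


def mode_in_window(series, window_size):
    """Most frequent value among the last window_size values; ties go to the earliest value."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    window = series[-window_size:]
    if not window:
        raise ValueError("series must contain at least one value")
    counts = Counter(window)
    highest = max(counts.values())
    # Walk the window in order so the first value with the highest count wins.
    for value in window:
        if counts[value] == highest:
            return value


# Made-up nominal example values, for illustration only.
labels = ["b", "a", "a", "b", "c"]
print(mode_in_window(labels, 5))  # b  ("a" and "b" both occur twice, "b" comes first)
print(mode_in_window(labels, 3))  # a  (window is ["a", "b", "c"], all occur once)
```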

This operator works on all time series (numerical, nominal and time series with date time values) for the *methods* *last value* and *mode in window*. The *methods* *mean in window* and *median in window* only work on numerical time series.

## Differentiation

This operator is similar to other modeling operators, but is specifically designed to work on time series data. One implication is that the forecast model should be applied to the same data it was trained on.

### Apply Forecast

This operator receives a trained Forecast Model (e.g. the Default Forecast model) and creates the forecast for the time series it was trained on.

### ARIMA

This operator trains an ARIMA model (Autoregressive Integrated Moving Average) on time series data to perform a forecast.

### Function and Seasonal Component Forecast

This operator trains a Function and Seasonal Forecast model (combining fitted function and seasonal component values) on time series data to perform a forecast.

### Holt-Winters

This operator trains a Holt-Winters model (triple exponential smoothing) on time series data to perform a forecast.

## Input

- example set (Data Table)
The ExampleSet which contains the time series data as an attribute.

## Output

- forecast model (IOObject)
The Default Forecast model calculated from the specified time series attribute. It also contains the original time series values.

- original (Data Table)
The ExampleSet that was given as input is passed through without changes.

## Parameters

- time_series_attribute
The time series attribute for which the Default Forecast model should be built. The required attribute can be selected from this option. The attribute name can be selected from the drop-down box of the parameter if the meta data is known.

- has_indices
This parameter indicates if there is an index attribute associated with the time series. If this parameter is set to true, the index attribute has to be selected.

- indices_attribute
If the parameter *has indices* is set to true, this parameter defines the associated index attribute. It can be either a date, date_time or numeric value type attribute. The attribute name can be selected from the drop-down box of the parameter if the meta data is known.

- method
This parameter specifies the method to calculate the *forecast value* of the Default Forecast model.
  - last value: The last value of the time series is used.
  - mean in window: Mean of the values in the windowed time series. If invalid values aren't ignored, the mean is missing if any time series value is missing, positive infinity or negative infinity.
  - median in window: Median of the values in the windowed time series. If invalid values aren't ignored, these values are listed in the same way as finite values for the determination of the median.
  - mode in window: Mode (most frequent value) of the values in the windowed time series. If more than one value has the highest frequency, the value that occurs first in the window is used. If invalid values aren't ignored and an invalid value is the most frequent in a time series, the computed mode is this value.

- window_size
This parameter specifies how many values are used in the calculation of *mean*, *median* or *mode* of the window. The window always consists of the last *n* values in the time series, where *n* is specified by this parameter.

- ignore_invalid_values
If this parameter is set to true, invalid values (missing for all time series, positive infinity and negative infinity for numeric time series, and empty strings for nominal time series) are ignored in the calculation of the *mean*, *median* or *mode* in the window.

## Tutorial Processes

### Default Forecast on Lake Huron Data Set

This tutorial process shows the basic usage of the Default Forecast operator, by training two Default Forecast models (method = last value and method = mean in window) on the Lake Huron data.

### Compare Performance of Forecast Models on Milk Production Data

This tutorial process shows how to compare the performance of an ARIMA forecast and a Holt-Winters forecast to the performance of a Default Forecast using the median of the last 3 values (method = median in window).
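
The kind of comparison this tutorial performs can be sketched outside RapidMiner as follows. The sketch is only an illustration: all numbers are made up (they are not the Milk Production data), `candidate_forecast` stands in for the output of some other forecasting model, and mean absolute error is chosen here as the performance measure by assumption.

```python
import statistics


def mean_absolute_error(actual, predicted):
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up values, for illustration only.
training = [100.0, 104.0, 102.0, 108.0, 106.0]
actual_future = [107.0, 109.0, 111.0]
candidate_forecast = [107.5, 108.5, 110.0]  # hypothetical output of another model

# Baseline: median of the last 3 training values, repeated over the horizon.
baseline_value = statistics.median(training[-3:])
baseline_forecast = [baseline_value] * len(actual_future)

baseline_mae = mean_absolute_error(actual_future, baseline_forecast)
candidate_mae = mean_absolute_error(actual_future, candidate_forecast)

print(f"baseline MAE:  {baseline_mae:.2f}")
print(f"candidate MAE: {candidate_mae:.2f}")
```

A forecasting method is only worth its added complexity if its error is clearly lower than that of the baseline.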