You are viewing the RapidMiner Studio documentation for version 10.0.
Default Forecast (Time Series)
Synopsis
This operator trains a Default Forecast model on time series data.
Description
The Default Forecast model performs the forecast by predicting the same forecast value for the whole horizon window. It can be used to test a forecasting method (e.g. ARIMA, Holt-Winters, Windowing) against a baseline forecasting performance, provided by this Default Forecast model.
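The baseline idea can be illustrated outside of RapidMiner with a minimal Python sketch. It is not the operator's implementation: the helper names `constant_forecast` and `mean_absolute_error`, the sample values, and the choice of mean absolute error as the performance measure are all assumptions made for illustration.

```python
from statistics import mean


def constant_forecast(value, horizon):
    """Repeat one forecast value for every step of the horizon window."""
    return [value] * horizon


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical data: a short history and the values that actually followed.
history = [10.0, 12.0, 11.0, 13.0]
actual_future = [14.0, 15.0, 13.0]

# Baseline: the last observed value, repeated over the whole horizon.
baseline = constant_forecast(history[-1], horizon=len(actual_future))

# Hypothetical output of some other forecasting method under test.
candidate = [13.5, 14.5, 13.5]

baseline_mae = mean_absolute_error(actual_future, baseline)
candidate_mae = mean_absolute_error(actual_future, candidate)
```

A forecasting method is only worth its added complexity if its error is clearly lower than the error of such a constant baseline.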
The method used to determine the forecast value is selected with the parameter method. Either the last value of the time series is used, or the mean in window, median in window or mode in window. For the latter three methods, the window always consists of the last n values of the time series, with n specified by the user (see parameter window size). By default, invalid values (missing values, positive and negative infinity for numerical time series, and empty strings for nominal time series) are included in the calculation of the mean, median or mode. If the parameter ignore invalid values is selected, the invalid values are ignored in the calculation.
The mode is derived as the most frequent value in the window. If more than one value has the highest frequency, the value that occurs first in the window is used. The methods mean in window and median in window can only be used for numerical time series, while the others also work for nominal time series and time series with date time values. For more details about the calculation of the mean, median or mode, see the operator help of the operators Extract Aggregate and Extract Mode.
This operator works on all time series (numerical, nominal and time series with date time values) for the methods last value and mode in window. The methods mean in window and median in window only work on numerical time series.
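The four methods described above can be sketched in a few lines of Python. This is an illustrative sketch, not RapidMiner's code: the function names `first_mode` and `default_forecast_value` are hypothetical, the series is assumed to be a plain list, and all values are assumed to be valid.

```python
from statistics import mean, median


def first_mode(values):
    """Return the most frequent value; ties go to the value occurring first."""
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    best = max(counts.values())
    # Dicts preserve insertion order, so the first key with the top count
    # is the value that appeared first in the window.
    return next(v for v, c in counts.items() if c == best)


def default_forecast_value(series, method="last value", window_size=3):
    """Compute the single value that is predicted for the whole horizon."""
    if not series:
        raise ValueError("series must not be empty")
    if method == "last value":
        return series[-1]
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    window = series[-window_size:]  # the last n values of the series
    if method == "mean in window":
        return mean(window)
    if method == "median in window":
        return median(window)
    if method == "mode in window":
        return first_mode(window)
    raise ValueError(f"unknown method: {method}")


numbers = [3, 1, 4, 1, 5, 9, 2, 6]
labels = ["b", "a", "a", "b"]

last = default_forecast_value(numbers, "last value")
avg = default_forecast_value(numbers, "mean in window", window_size=4)
med = default_forecast_value(numbers, "median in window", window_size=3)
mode = default_forecast_value(labels, "mode in window", window_size=4)
```

In the nominal example, "a" and "b" both occur twice in the window, so the mode is "b" because it occurs first.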
Differentiation
This operator is similar to other modeling operators, but is specifically designed to work on time series data. One implication of this is that the forecast model should be applied to the same data it was trained on.
Apply Forecast
This operator receives a trained Forecast Model (e.g. the Default Forecast model) and creates the forecast for the time series it was trained on.
ARIMA
This operator trains an ARIMA model (Autoregressive Integrated Moving Average) on time series data to perform a forecast.
Function and Seasonal Component Forecast
This operator trains a Function and Seasonal Forecast model (combining fitted function and seasonal component values) on time series data to perform a forecast.
Holt-Winters
This operator trains a Holt-Winters model (triple exponential smoothing) on time series data to perform a forecast.
Input
- example set (Data Table)
The ExampleSet which contains the time series data as an attribute.
Output
- forecast model (IOObject)
The Default Forecast model calculated from the specified time series attribute. It also contains the original time series values.
- original (Data Table)
The ExampleSet that was given as input is passed through without changes.
Parameters
- time_series_attribute
The time series attribute for which the Default Forecast model should be built. The required attribute can be selected from this option. The attribute name can be selected from the drop down box of the parameter if the meta data is known.
Range:
- has_indices
This parameter indicates if there is an index attribute associated with the time series. If this parameter is set to true, the index attribute has to be selected.
Range:
- indices_attribute
If the parameter has indices is set to true, this parameter defines the associated index attribute. It can be either a date, date_time or numeric value type attribute. The attribute name can be selected from the drop down box of the parameter if the meta data is known.
Range:
- sort_time_series
If this parameter is selected, the input time series is sorted according to the selected indices attribute before the time series operation is applied. If it is not selected and the input time series is not sorted, a corresponding User Error is thrown.
Keep in mind that the index values still need to be unique. If the values are not unique, a corresponding User Error is thrown.
The data set provided at the original output port will be the sorted input time series.
Range:
- method
This parameter specifies the method to calculate the forecast value of the Default Forecast model.
- last value: The last value of the time series is used.
- mean in window: Mean of the values in the windowed time series. If invalid values aren't ignored, the mean is missing if any time series value is missing, positive or negative infinity.
- median in window: Median of the values in the windowed time series. If invalid values aren't ignored, these values are listed in the same way as finite values for the determination of the median.
- mode in window: Mode (most frequent value) of the values in the windowed time series. If more than one value has the highest frequency, the value that occurs first in the window is used. If invalid values aren't ignored and an invalid value is the most frequent in a time series, the computed mode is this value.
- window_size
This parameter specifies how many values are used in the calculation of the mean, median or mode of the window. The window always consists of the last n values in the time series, where n is specified by this parameter.
Range:
- ignore_invalid_values
If this parameter is set to true, invalid values (missing values for all time series, positive and negative infinity for numeric time series, and empty strings for nominal time series) are ignored in the calculation of the mean, median or mode in the window.
Range:
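The effect of ignore_invalid_values on the mean in window method can be sketched as follows. This is an illustration under stated assumptions, not the operator's implementation: the helper names `is_invalid` and `mean_in_window` are hypothetical, missing values are represented by `None` or NaN, the window is assumed to be taken first and filtered afterwards, and returning a missing value for a window without any valid value is also an assumption.

```python
import math


def is_invalid(value):
    """Missing values, infinities and empty strings count as invalid."""
    if value is None:  # assumption: None stands for a missing value
        return True
    if isinstance(value, str):
        return value == ""  # empty string in a nominal series
    if isinstance(value, float):
        return math.isnan(value) or math.isinf(value)
    return False


def mean_in_window(series, window_size, ignore_invalid_values=False):
    """Mean of the last window_size values, honoring the invalid-value flag."""
    window = series[-window_size:]
    if ignore_invalid_values:
        # Drop invalid values inside the window (assumed order: window, then filter).
        window = [v for v in window if not is_invalid(v)]
    elif any(is_invalid(v) for v in window):
        # As described for mean in window: the result is missing.
        return math.nan
    if not window:
        return math.nan  # assumption: no valid value left in the window
    return sum(window) / len(window)


series = [2.0, 4.0, math.nan, 6.0, math.inf, 8.0]

strict = mean_in_window(series, window_size=4)
lenient = mean_in_window(series, window_size=4, ignore_invalid_values=True)
```

With a window of the last four values, the default setting yields a missing value, while ignoring invalid values yields the mean of the two remaining valid values, 6.0 and 8.0.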
Tutorial Processes
Default Forecast on Lake Huron Data Set
This tutorial process shows the basic usage of the Default Forecast operator, by training two Default Forecast models (method = last value and method = mean in window) on the Lake Huron data.
Compare Performance of Forecast Models on Milk Production Data
This tutorial process shows how to compare the performance of an ARIMA forecast and a Holt-Winters forecast to the performance of a Default Forecast using method = median of the last 3 values.