Forward Selection (RapidMiner Studio Core)

Synopsis

This operator selects the most relevant attributes of the given ExampleSet through a highly efficient implementation of the forward selection scheme.

Description

The Forward Selection operator is a nested operator, i.e. it has a subprocess. This subprocess must always return a performance vector. For more information about subprocesses, see the Subprocess operator.

The Forward Selection operator starts with an empty selection of attributes and, in each round, tentatively adds each unused attribute of the given ExampleSet in turn. For each added attribute, the performance is estimated using the inner operators, e.g. a cross-validation. Only the attribute giving the highest increase in performance is added to the selection. Then a new round is started with the modified selection. This implementation avoids any additional memory consumption beyond the memory originally used for storing the data and the memory that might be needed for applying the inner operators.

The stopping behavior parameter specifies when the iteration should be aborted. There are three different options:

  • without increase: The iteration runs as long as there is any increase in performance.
  • without increase of at least: The iteration runs as long as the increase is at least as high as specified, either relative or absolute. The minimal relative increase parameter is used for specifying the minimal relative increase if the use relative increase parameter is set to true. Otherwise, the minimal absolute increase parameter is used for specifying the minimal absolute increase.
  • without significant increase: The iteration stops as soon as the increase is not significant to the level specified by the alpha parameter.

The speculative rounds parameter defines how many additional rounds are performed in a row after the stopping criterion is fulfilled for the first time. If the performance increases again during the speculative rounds, the selection continues. Otherwise, all attributes selected during the speculative rounds are removed, as if no speculative rounds had been executed. This can help to avoid getting stuck in local optima.
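The greedy loop and the effect of speculative rounds described above can be sketched in a few lines of Python. This is only an illustration of the scheme, not RapidMiner's implementation: the function forward_selection, its arguments, and the toy_performance function are hypothetical names, the evaluate callable stands in for the subprocess that returns a performance value, and only the 'without increase' stopping behavior is modeled.

```python
from typing import Callable, FrozenSet, List, Optional, Sequence, Tuple


def forward_selection(
    attributes: Sequence[str],
    evaluate: Callable[[FrozenSet[str]], float],
    speculative_rounds: int = 0,
    max_attributes: Optional[int] = None,
) -> Tuple[List[str], float]:
    """Greedy forward selection (illustrative sketch, not RapidMiner code).

    `evaluate` plays the role of the subprocess: it maps an attribute
    subset to a performance value where higher is better.
    """
    limit = len(attributes)
    if max_attributes is not None:
        limit = min(limit, max_attributes)

    selected: List[str] = []         # current, possibly speculative, selection
    accepted: List[str] = []         # last selection that improved performance
    accepted_score = float("-inf")   # assumption: the first round always adds
    failed_rounds = 0

    while len(selected) < limit:
        # Tentatively add each unused attribute and estimate the performance.
        scored = [
            (evaluate(frozenset(selected + [a])), a)
            for a in attributes
            if a not in selected
        ]
        round_score, round_best = max(scored, key=lambda pair: pair[0])
        selected.append(round_best)

        if round_score > accepted_score:
            # Performance increased: keep everything selected so far.
            accepted, accepted_score = list(selected), round_score
            failed_rounds = 0
        else:
            # Stopping criterion fulfilled: continue only speculatively.
            failed_rounds += 1
            if failed_rounds > speculative_rounds:
                break

    # Attributes added in unsuccessful speculative rounds are discarded.
    return accepted, accepted_score


def toy_performance(subset: FrozenSet[str]) -> float:
    """Hypothetical performance: 'c' and 'd' only help when used together."""
    score = 5 * ("a" in subset) + 3 * ("b" in subset)
    score += 4 * ("c" in subset and "d" in subset)
    return float(score - len(subset))  # small cost per selected attribute
```

With toy_performance, adding 'c' alone lowers the performance, so a run without speculative rounds stops at the selection a, b. With one speculative round the loop also tries the next round, finds that adding 'd' raises the performance above the previous best, and keeps all four attributes.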

Feature selection, i.e. the search for the most relevant features for classification or regression problems, is one of the main data mining tasks. A wide range of search methods has been integrated into RapidMiner, including evolutionary algorithms. All search methods need a performance measure that indicates how well a search point (a feature subset) is likely to perform on the given data set.

Differentiation

Backward Elimination

The Backward Elimination operator starts with the full set of attributes and, in each round, tentatively removes each remaining attribute of the given ExampleSet in turn. For each removed attribute, the performance is estimated using the inner operators, e.g. a cross-validation. Only the attribute whose removal gives the least decrease in performance is finally removed from the selection. Then a new round is started with the modified selection.
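For comparison, the elimination scheme can be sketched in the same way. Again this is a hypothetical illustration rather than the operator's actual code: backward_elimination, evaluate, and toy_performance are assumed names, and the stopping rule used here (stop as soon as every possible removal lowers the performance) is an assumption, since the text above does not specify one.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple


def backward_elimination(
    attributes: Sequence[str],
    evaluate: Callable[[FrozenSet[str]], float],
) -> Tuple[List[str], float]:
    """Greedy backward elimination (illustrative sketch, not RapidMiner code)."""
    selected = list(attributes)
    best_score = evaluate(frozenset(selected))

    while len(selected) > 1:
        # Tentatively remove each remaining attribute and estimate performance.
        scored = [
            (evaluate(frozenset(a for a in selected if a != removed)), removed)
            for removed in selected
        ]
        round_score, to_remove = max(scored, key=lambda pair: pair[0])

        # Assumed stopping rule: stop once every removal hurts performance.
        if round_score < best_score:
            break
        selected.remove(to_remove)
        best_score = round_score

    return selected, best_score


def toy_performance(subset: FrozenSet[str]) -> float:
    """Hypothetical performance: only 'a' and 'b' carry information."""
    score = 5 * ("a" in subset) + 3 * ("b" in subset)
    return float(score - len(subset))  # small cost per remaining attribute
```

In this toy setting the uninformative attributes 'c' and 'd' are removed first, and the loop stops when removing either 'a' or 'b' would lower the performance.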

Input

  • example set (IOObject)

    This input port expects an ExampleSet. This ExampleSet is available at the first port of the nested chain (inside the subprocess) for processing in the subprocess.

Output

  • example set (IOObject)

    The feature selection algorithm is applied on the input ExampleSet. The resultant ExampleSet with reduced attributes is delivered through this port.

  • attribute weights (Attribute Weights)

    The attribute weights are delivered through this port.

  • performance (Performance Vector)

    This port delivers the Performance Vector for the selected attributes. A Performance Vector is a list of performance criteria values.

Parameters

  • maximal_number_of_attributes: This parameter specifies the maximal number of attributes to be selected through Forward Selection. Range: integer
  • speculative_rounds: This parameter specifies the number of consecutive times the stopping criterion may be ignored before the selection is actually stopped. A number higher than one might help to avoid getting stuck in local optima. Range: integer
  • stopping_behavior: The stopping behavior parameter specifies when the iteration should be aborted. There are three different options:
    • without_increase: The iteration runs as long as there is any increase in performance.
    • without_increase_of_at_least: The iteration runs as long as the increase is at least as high as specified, either relative or absolute. The minimal relative increase parameter is used for specifying the minimal relative increase if the use relative increase parameter is set to true. Otherwise, the minimal absolute increase parameter is used for specifying the minimal absolute increase.
    • without_significant_increase: The iteration stops as soon as the increase is not significant to the level specified by the alpha parameter.
    Range: selection
  • use_relative_increase: This parameter is only available when the stopping behavior parameter is set to 'without increase of at least'. If the use relative increase parameter is set to true, the minimal relative increase parameter is used; otherwise, the minimal absolute increase parameter is used. Range: boolean
  • minimal_absolute_increase: This parameter is only available when the stopping behavior parameter is set to 'without increase of at least' and the use relative increase parameter is set to false. If the absolute performance increase compared to the last step drops below this threshold, the selection is stopped. Range: real
  • minimal_relative_increase: This parameter is only available when the stopping behavior parameter is set to 'without increase of at least' and the use relative increase parameter is set to true. If the relative performance increase compared to the last step drops below this threshold, the selection is stopped. Range: real
  • alpha: This parameter is only available when the stopping behavior parameter is set to 'without significant increase'. This parameter specifies the probability threshold that determines whether differences are considered significant. Range: real
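
How the stopping parameters interact can be illustrated with a small Python helper. This is a sketch under assumptions, not the operator's code: the function should_stop is hypothetical, the default values are placeholders rather than RapidMiner defaults, and the relative increase is assumed to be the increase divided by the absolute value of the previous performance. The 'without significant increase' option is left out because the description above does not say which significance test is applied.

```python
def should_stop(
    previous: float,
    current: float,
    stopping_behavior: str = "without_increase",
    use_relative_increase: bool = True,
    minimal_absolute_increase: float = 0.0,
    minimal_relative_increase: float = 0.0,
) -> bool:
    """Return True if the stopping criterion is fulfilled (illustrative sketch)."""
    increase = current - previous

    if stopping_behavior == "without_increase":
        # Continue only while there is any increase in performance.
        return increase <= 0

    if stopping_behavior == "without_increase_of_at_least":
        if not use_relative_increase:
            return increase < minimal_absolute_increase
        if previous == 0:
            # Relative increase is undefined here; assumption: fall back
            # to requiring any increase at all.
            return increase <= 0
        # Assumed definition of the relative increase.
        return increase / abs(previous) < minimal_relative_increase

    # 'without_significant_increase' is intentionally not modeled.
    raise ValueError(f"unsupported stopping behavior: {stopping_behavior}")
```

For example, going from a performance of 0.80 to 0.82 is a relative increase of 2.5 percent, so with minimal_relative_increase set to 0.05 the helper reports that the selection should stop, while the plain 'without increase' behavior would continue.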

Tutorial Processes

Feature reduction of the Polynomial data set through Forward Selection

The 'Polynomial' data set is loaded using the Retrieve operator. A breakpoint is inserted here so that you can have a look at the ExampleSet. You can see that the ExampleSet has 5 regular attributes other than the label attribute. The Forward Selection operator is applied to the ExampleSet. It is a nested operator, i.e. it has a subprocess, and this subprocess must deliver a performance vector, which is used by the underlying feature reduction algorithm. Have a look at the subprocess of this operator. It uses the X-Validation operator, which is itself a nested operator. Have a look at the subprocesses of the X-Validation operator. The K-NN operator is used in the 'Training' subprocess to train a model. The trained model is applied using the Apply Model operator in the 'Testing' subprocess. The performance is measured by the Performance operator, and the resulting performance vector is used by the underlying algorithm. Run the process and switch to the Results Workspace. You can see that the ExampleSet, which had 5 attributes, has now been reduced to 3 attributes.