Backward Elimination (RapidMiner Studio Core)

Synopsis

This operator selects the most relevant attributes of the given ExampleSet through an efficient implementation of the backward elimination scheme.

Description

The Backward Elimination operator is a nested operator, i.e. it has a subprocess. The subprocess of the Backward Elimination operator must always return a performance vector. For more information about subprocesses, see the Subprocess operator.

The Backward Elimination operator starts with the full set of attributes and, in each round, tentatively removes each remaining attribute of the given ExampleSet in turn. For each removed attribute, the performance is estimated using the inner operators, e.g. a cross-validation. Only the attribute whose removal gives the least decrease in performance is finally removed from the selection. Then a new round is started with the modified selection. This implementation avoids any additional memory consumption beyond the memory originally used for storing the data and the memory that might be needed for applying the inner operators. The stopping behavior parameter specifies when the iteration should be aborted. There are three different options:

  • with decrease: The iteration runs as long as there is any increase in performance.
  • with decrease of more than: The iteration runs as long as the decrease is less than the specified threshold, either relative or absolute. The maximal relative decrease parameter is used for specifying the maximal relative decrease if the use relative decrease parameter is set to true. Otherwise, the maximal absolute decrease parameter is used for specifying the maximal absolute decrease.
  • with significant decrease: The iteration stops as soon as the decrease is significant at the level specified by the alpha parameter.
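
The round structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the operator's actual implementation: the function names, the `evaluate` callback (standing in for the subprocess that returns a performance value) and the toy performance function are all hypothetical. Only the 'with decrease' stopping behavior is shown, and treating a tie as a stop is an assumption taken from the wording "as long as there is any increase".

```python
from typing import Callable, FrozenSet, Iterable, Tuple

Subset = FrozenSet[str]


def backward_elimination(
    attributes: Iterable[str],
    evaluate: Callable[[Subset], float],
) -> Tuple[Subset, float]:
    """Greedy backward elimination, stopping once no removal improves performance."""
    selected = frozenset(attributes)
    best_score = evaluate(selected)
    while len(selected) > 1:
        # Tentatively remove each remaining attribute and estimate performance.
        trials = [(evaluate(selected - {a}), a) for a in sorted(selected)]
        # Keep only the removal that hurts least (or helps most).
        round_score, removed = max(trials, key=lambda t: t[0])
        # Assumption: a round without an increase ends the search.
        if round_score <= best_score:
            break
        selected = selected - {removed}
        best_score = round_score
    return selected, best_score


def toy_performance(subset: Subset) -> float:
    """Hypothetical stand-in for the subprocess: a1 and a2 help, others add noise."""
    useful = {"a1", "a2"}
    return len(subset & useful) - 0.1 * len(subset - useful)


selected, score = backward_elimination(["a1", "a2", "a3", "a4", "a5"], toy_performance)
print(sorted(selected), score)
```

With this toy performance function the three noise attributes are removed one per round, and the search stops when removing either of the two useful attributes would lower the score.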

The speculative rounds parameter defines how many rounds will be performed in a row after the stopping criterion is fulfilled for the first time. If the performance increases again during the speculative rounds, the elimination is continued. Otherwise, all additionally eliminated attributes are restored, as if no speculative rounds had been executed. This might help to avoid getting stuck in local optima.
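
A minimal sketch of how speculative rounds could work is given below. It is an illustration under assumptions, not the operator's code: `eliminate_with_speculation`, `evaluate` and `interacting_performance` are hypothetical names, and the stopping criterion is simplified to "the best removal of the round did not increase the best performance seen so far". The best selection is remembered so that speculative removals can be undone.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

Subset = FrozenSet[str]


def eliminate_with_speculation(
    attributes: Iterable[str],
    evaluate: Callable[[Subset], float],
    speculative_rounds: int = 0,
) -> Tuple[Subset, float]:
    """Backward elimination that tolerates a few consecutive non-improving rounds."""
    selected = frozenset(attributes)
    best_selected, best_score = selected, evaluate(selected)
    failed_rounds = 0  # consecutive rounds in which the stopping criterion held
    while len(selected) > 1:
        trials = [(evaluate(selected - {a}), a) for a in sorted(selected)]
        score, removed = max(trials, key=lambda t: t[0])
        selected = selected - {removed}
        if score > best_score:
            # Performance increased again: accept and reset the counter.
            best_selected, best_score = selected, score
            failed_rounds = 0
        else:
            failed_rounds += 1
            if failed_rounds > speculative_rounds:
                break
    # Returning the remembered best selection restores speculative removals.
    return best_selected, best_score


def interacting_performance(subset: Subset) -> float:
    """Hypothetical score: keeping exactly one of p1/p2 is worse than both or none."""
    base = 1.0 if "u" in subset else 0.0
    pair_effect = {2: 0.0, 1: -0.2, 0: 0.1}[len(subset & {"p1", "p2"})]
    return base + pair_effect


print(eliminate_with_speculation(["u", "p1", "p2"], interacting_performance, 0))
print(eliminate_with_speculation(["u", "p1", "p2"], interacting_performance, 1))
```

In this toy setting, removing only one of `p1` and `p2` lowers the score, so the search without speculative rounds keeps all three attributes. With one speculative round it also removes the second one and ends with `u` alone, which scores higher.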

Feature selection, i.e. the search for the most relevant features for classification or regression problems, is one of the main data mining tasks. A wide range of search methods has been integrated into RapidMiner, including evolutionary algorithms. All search methods need a performance measurement that indicates how well a search point (a feature subset) will probably perform on the given data set.

Differentiation

Forward Selection

The Forward Selection operator starts with an empty selection of attributes and, in each round, tentatively adds each unused attribute of the given ExampleSet in turn. For each added attribute, the performance is estimated using the inner operators, e.g. a cross-validation. Only the attribute giving the highest increase in performance is added to the selection. Then a new round is started with the modified selection.
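
For comparison, the forward direction can be sketched the same way. Again this is a hedged illustration and not the Forward Selection operator's implementation: `forward_selection`, `evaluate` and `toy_score` are hypothetical names, and stopping as soon as no addition increases the performance is an assumption, since the stopping rule is not described here.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

Subset = FrozenSet[str]


def forward_selection(
    attributes: Iterable[str],
    evaluate: Callable[[Subset], float],
) -> Tuple[Subset, float]:
    """Greedy forward selection starting from an empty attribute set."""
    remaining = set(attributes)
    selected: Subset = frozenset()
    best_score = float("-inf")
    while remaining:
        # Tentatively add each unused attribute and estimate performance.
        trials = [(evaluate(selected | {a}), a) for a in sorted(remaining)]
        round_score, added = max(trials, key=lambda t: t[0])
        # Assumption: stop when the best addition brings no increase.
        if round_score <= best_score:
            break
        selected = selected | {added}
        remaining.discard(added)
        best_score = round_score
    return selected, best_score


def toy_score(subset: Subset) -> float:
    """Hypothetical stand-in for the subprocess: a1 and a2 help, others add noise."""
    useful = {"a1", "a2"}
    return len(subset & useful) - 0.1 * len(subset - useful)


print(forward_selection(["a1", "a2", "a3", "a4", "a5"], toy_score))
```

The only structural difference from backward elimination is the starting point and the direction of each trial: attributes are added to an empty set instead of removed from the full set.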

Input

  • example set (IOObject)

    This input port expects an ExampleSet. This ExampleSet is available at the first port of the nested chain (inside the subprocess) for processing in the subprocess.

Output

  • example set (IOObject)

    The feature selection algorithm is applied on the input ExampleSet. The resultant ExampleSet with reduced attributes is delivered through this port.

  • attribute weights (Attribute Weights)

    The attribute weights are delivered through this port.

  • performance (Performance Vector)

    This port delivers the Performance Vector for the selected attributes. A Performance Vector is a list of performance criteria values.

Parameters

  • maximal_number_of_eliminations: This parameter specifies the maximal number of backward eliminations. Range: integer
  • speculative_rounds: This parameter specifies the number of times the stopping criterion might be consecutively ignored before the elimination is actually stopped. A number higher than one might help to avoid getting stuck in local optima. Range: integer
  • stopping_behavior: The stopping behavior parameter specifies when the iteration should be aborted. There are three different options:
    • with_decrease: The iteration runs as long as there is any increase in performance.
    • with_decrease_of_more_than: The iteration runs as long as the decrease is less than the specified threshold, either relative or absolute. The maximal relative decrease parameter is used for specifying the maximal relative decrease if the use relative decrease parameter is set to true. Otherwise, the maximal absolute decrease parameter is used for specifying the maximal absolute decrease.
    • with_significant_decrease: The iteration stops as soon as the decrease is significant at the level specified by the alpha parameter.
    Range: selection
  • use_relative_decrease: This parameter is only available when the stopping behavior parameter is set to 'with decrease of more than'. If the use relative decrease parameter is set to true, the maximal relative decrease parameter is used; otherwise, the maximal absolute decrease parameter is used. Range: boolean
  • maximal_absolute_decrease: This parameter is only available when the stopping behavior parameter is set to 'with decrease of more than' and the use relative decrease parameter is set to false. If the absolute performance decrease compared to the last step exceeds this threshold, the elimination will be stopped. Range: real
  • maximal_relative_decrease: This parameter is only available when the stopping behavior parameter is set to 'with decrease of more than' and the use relative decrease parameter is set to true. If the relative performance decrease compared to the last step exceeds this threshold, the elimination will be stopped. Range: real
  • alpha: This parameter is only available when the stopping behavior parameter is set to 'with significant decrease'. This parameter specifies the probability threshold that determines whether differences are considered significant. Range: real
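
The interplay of use_relative_decrease, maximal_absolute_decrease and maximal_relative_decrease can be illustrated with a small check. This is a sketch under assumptions: the function name is hypothetical, the argument names merely mirror the parameters above, and the relative decrease is assumed to be the absolute decrease divided by the magnitude of the previous performance, which this page does not spell out.

```python
def exceeds_max_decrease(
    previous: float,
    current: float,
    use_relative_decrease: bool,
    maximal_absolute_decrease: float = 0.0,
    maximal_relative_decrease: float = 0.0,
) -> bool:
    """Return True if the drop from the last step exceeds the chosen threshold."""
    decrease = previous - current
    if use_relative_decrease:
        if previous == 0:
            # Assumption: with a zero baseline, any drop counts as exceeding.
            return decrease > 0
        return decrease / abs(previous) > maximal_relative_decrease
    return decrease > maximal_absolute_decrease


# Absolute threshold 0.1: a drop from 0.90 to 0.85 is tolerated.
print(exceeds_max_decrease(0.90, 0.85, False, maximal_absolute_decrease=0.1))
# Relative threshold 0.1: a drop from 0.8 to 0.6 is a 25% decrease and stops.
print(exceeds_max_decrease(0.8, 0.6, True, maximal_relative_decrease=0.1))
```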

Tutorial Processes

Feature reduction of the Polynomial data set

The 'Polynomial' data set is loaded using the Retrieve operator. A breakpoint is inserted here so that you can have a look at the ExampleSet. You can see that the ExampleSet has 5 regular attributes other than the label attribute.

The Backward Elimination operator is applied on the ExampleSet. It is a nested operator, i.e. it has a subprocess. The subprocess must deliver a performance vector, which is used by the underlying feature reduction algorithm. Have a look at the subprocess of this operator. The X-Validation operator is used there, which is itself a nested operator. Have a look at the subprocesses of the X-Validation operator. The K-NN operator is used in the 'Training' subprocess to train a model. The trained model is applied using the Apply Model operator in the 'Testing' subprocess. The performance is measured through the Performance operator, and the resultant performance vector is used by the underlying algorithm.

Run the process and switch to the Results Workspace. You can see that the ExampleSet that had 5 attributes has now been reduced to 3 attributes.