Weight by Component Model (RapidMiner Studio Core)

Synopsis

This operator creates attribute weights for an ExampleSet from a component produced by operators such as PCA, GHA or ICA. If the model given to this operator is a PCA model, this operator behaves exactly like the Weight by PCA operator.

Description

The Weight by Component Model operator is always placed after an operator such as PCA, GHA or ICA. The ExampleSet and Preprocessing model generated by that operator are connected to the ExampleSet and Model ports of the Weight by Component Model operator. The Weight by Component Model operator then generates attribute weights for the original ExampleSet using a component created by the preceding operator (i.e. PCA, GHA, ICA etc.). The component is selected by the component number parameter. If the normalize weights parameter is not set to true, the exact values of the selected component are used as attribute weights. The normalize weights parameter is usually set to true to spread the weights between 0 and 1.

The attribute weights reflect the relevance of the attributes with respect to the class attribute. The higher the weight of an attribute, the more relevant it is considered.
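The derivation described above can be sketched outside RapidMiner. The following Python snippet is a minimal illustration, not RapidMiner's implementation: the function name component_weights is hypothetical, PCA is computed with a plain eigendecomposition of the covariance matrix, and normalization is assumed to be min-max scaling of the absolute component values (the documentation only states that normalized weights lie between 0 and 1).

```python
import numpy as np


def component_weights(X, component_number=1, normalize=True):
    """Return one weight per attribute (column of X) from a PCA component.

    Hypothetical helper for illustration only; it is not part of RapidMiner.
    component_number is 1-based, mirroring the operator's parameter.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)

    # Eigenvectors of the covariance matrix are the principal components.
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)

    # eigh returns eigenvalues in ascending order; PC1 has the largest one.
    order = np.argsort(eigvals)[::-1]
    component = eigvecs[:, order[component_number - 1]]

    if not normalize:
        # Exact values of the selected component are used as weights.
        return component

    # Assumption: min-max scaling of absolute values into [0, 1].
    abs_w = np.abs(component)
    span = abs_w.max() - abs_w.min()
    if span == 0:
        return np.ones_like(abs_w)
    return (abs_w - abs_w.min()) / span
```

With normalize set to False the returned vector is the selected eigenvector itself, so its sign is arbitrary and may differ between PCA implementations.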

Input

  • example set (IOObject)

    This input port expects an ExampleSet. It is the output of the PCA operator in the attached Example Process.

  • model (Model)

    This input port expects a model. Usually the Preprocessing model generated by operators like PCA, GHA or ICA is provided here.

Output

  • weights (Average Vector)

    This port delivers the weights of the attributes with respect to the label attribute. The attributes with higher weight are considered more relevant.

  • example set (IOObject)

    The ExampleSet that was given as input is passed unchanged to the output through this port. This is usually used to reuse the same ExampleSet in further operators or to view the ExampleSet in the Results Workspace.

  • model (Model)

    The model that was given as input is passed unchanged to the output through this port.

Parameters

  • normalize_weights

    This parameter indicates if the calculated weights should be normalized or not. If set to true, all weights are normalized to the range from 0 to 1. Range: boolean

  • sort_weights

    This parameter indicates if the attributes should be sorted according to their weights in the results. If this parameter is set to true, the order of the sorting is specified using the sort direction parameter. Range: boolean

  • sort_direction

    This parameter is only available when the sort weights parameter is set to true. This parameter specifies the sorting order of the attributes according to their weights. Range: selection

  • component_number

    This parameter specifies the number of the component that should be used as attribute weights. Range: integer
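
The effect of the sort weights and sort direction parameters can be illustrated with a small Python sketch. The function name, the attribute names and the direction values "ascending" and "descending" are assumptions made for this example; the documentation above only states that the sort direction is a selection.

```python
def sort_attribute_weights(weights, sort_direction="ascending"):
    """Return (attribute, weight) pairs ordered by weight.

    Hypothetical helper for illustration only; the direction names
    are assumed, not taken from the operator documentation.
    """
    reverse = sort_direction == "descending"
    return sorted(weights.items(), key=lambda item: item[1], reverse=reverse)


# Made-up attribute names and weights, purely for demonstration.
example = {"attribute_1": 0.8, "attribute_2": 0.1, "attribute_3": 0.5}
ascending = sort_attribute_weights(example, "ascending")
descending = sort_attribute_weights(example, "descending")
```

Here ascending lists attribute_2, attribute_3, attribute_1 in that order, and descending lists them in reverse.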

Tutorial Processes

Calculating the attribute weights of the Sonar data set by PCA

The 'Sonar' data set is loaded using the Retrieve operator. The PCA operator is applied on it. The dimensionality reduction parameter is set to 'none'. A breakpoint is inserted here so that you can have a look at the components created by the PCA operator. Have a look at the EigenVectors generated by the PCA operator, especially 'PC1', because it will be used as the attribute weights by the Weight by Component Model operator.

The Weight by Component Model operator is applied next. The ExampleSet and Model ports of the PCA operator are connected to the corresponding ports of the Weight by Component Model operator. The normalize weights and sort weights parameters are set to false, so the weights will be exactly the same as the values of the selected component. The component number parameter is set to 1, so 'PC1' will be used as attribute weights. The weights can be seen in the Results Workspace. You can see that these weights are exactly the same as the values of 'PC1'.

In the second operator chain, the Weight by PCA operator is applied on the 'Sonar' data set. The parameters of the Weight by PCA operator are set exactly the same as the parameters of the Weight by Component Model operator. As can be seen in the Results Workspace, exactly the same weights are generated here.