Weight by Value Average (RapidMiner Studio Core)

Synopsis

This operator uses a corpus of examples to characterize a single class by setting feature weights.

Description

This operator uses a corpus of examples to characterize a single class by setting feature weights. Features that are characteristic of the class receive higher weights than less characteristic ones. The weight of a feature is the average value of that feature over all examples of the target class.

This operator assumes that each feature value expresses the importance of that feature for an example (e.g., a TF-IDF score). It is therefore mainly used on textual data with TF-IDF weighting schemes. To extract such feature values from text collections you can use the Text plugin.
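The averaging described above can be sketched in a few lines of Python. This is a minimal illustration and not the operator's actual implementation: the function name weight_by_value_average, the dictionary-based row format, and the sample TF-IDF values are all assumptions made for this example, and treating a missing feature as 0.0 is likewise an assumed convention.

```python
from typing import Dict, Sequence


def weight_by_value_average(
    rows: Sequence[Dict[str, float]],
    labels: Sequence[str],
    target_class: str,
) -> Dict[str, float]:
    """Average each feature's value over the examples labeled target_class.

    Hypothetical helper for illustration only; it is not part of RapidMiner.
    """
    if len(rows) != len(labels):
        raise ValueError("rows and labels must have the same length")

    # Keep only the examples that belong to the class being characterized.
    selected = [row for row, label in zip(rows, labels) if label == target_class]
    if not selected:
        raise ValueError(f"no examples with class {target_class!r}")

    # Collect feature names from all rows so every feature gets a weight.
    features = sorted({name for row in rows for name in row})

    # Assumption: a feature missing from a row counts as 0.0.
    return {
        name: sum(row.get(name, 0.0) for row in selected) / len(selected)
        for name in features
    }


# Made-up TF-IDF-like values for three documents.
rows = [
    {"goal": 0.8, "election": 0.0, "team": 0.6},
    {"goal": 0.4, "election": 0.0, "team": 0.2},
    {"goal": 0.0, "election": 0.9, "team": 0.1},
]
labels = ["sports", "sports", "politics"]

weights = weight_by_value_average(rows, labels, "sports")
for name, weight in weights.items():
    print(f"{name}: {weight:.2f}")
```

In this sketch only the two "sports" rows contribute, so "goal" and "team" receive nonzero weights while "election" receives zero.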

Input

  • example set (IOObject)

    This input port expects an ExampleSet.

Output

  • weights (Average Vector)

    This port delivers the weights of the attributes with respect to the label attribute. Attributes with higher weights are considered more relevant.

  • example set (IOObject)

    The ExampleSet that was given as input is passed through this port unchanged. This is usually used to reuse the same ExampleSet in further operators or to view the ExampleSet in the Results Workspace.

Parameters

  • normalize_weights This parameter indicates whether the calculated weights should be normalized. Range: boolean
  • sort_weights This parameter indicates whether the attributes should be sorted according to their weights in the results. If this parameter is set to true, the sorting order is specified using the sort_direction parameter. Range: boolean
  • sort_direction This parameter is only available when the sort_weights parameter is set to true. It specifies the sorting order of the attributes according to their weights. Range: selection
  • class_to_characterize The target class for which to find characteristic feature weights. Range: string
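
The effect of the normalization and sorting parameters can be pictured with the following Python sketch. It is an illustration under stated assumptions, not the operator's code: the helper names normalize_weights and sort_weights and the ascending flag are hypothetical, and min-max scaling to the range 0 to 1 is assumed here as the normalization scheme because the text above does not specify one.

```python
from typing import Dict, List, Tuple


def normalize_weights(weights: Dict[str, float]) -> Dict[str, float]:
    """Min-max scale weights to [0, 1] (assumed normalization scheme)."""
    low = min(weights.values())
    high = max(weights.values())
    if high == low:
        # Arbitrary choice for this sketch: identical weights all map to 1.0.
        return {name: 1.0 for name in weights}
    return {name: (value - low) / (high - low) for name, value in weights.items()}


def sort_weights(
    weights: Dict[str, float], ascending: bool = True
) -> List[Tuple[str, float]]:
    """Return (attribute, weight) pairs ordered by weight."""
    return sorted(weights.items(), key=lambda item: item[1], reverse=not ascending)


# Made-up average values for a single target class.
raw = {"goal": 0.6, "election": 0.0, "team": 0.4}

normalized = normalize_weights(raw)
ranking = sort_weights(normalized, ascending=False)
for name, weight in ranking:
    print(f"{name}: {weight:.3f}")
```

With descending order, the most characteristic attribute of the target class appears first in the ranking.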