Hierarchical Classification (RapidMiner Studio Core)

Synopsis

Builds a hierarchical classification model based on the specified class taxonomy.

Description

This meta learner builds a hierarchical classification model based on a class taxonomy. The class taxonomy has to be specified in the hierarchy parameter list. Each list entry represents an edge in the class hierarchy, i.e. a parent-child class relationship. You need to specify exactly one root node and assign every other node to exactly one parent.
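
The following Python sketch illustrates the general idea of such a taxonomy-based meta learner. It is not RapidMiner's implementation, and all class, method, and variable names are made up for illustration: one classifier (here a scikit-learn decision tree) is trained per internal node of the taxonomy and decides only between that node's children.

    # Minimal sketch of the idea behind hierarchical classification; this is
    # not RapidMiner's implementation, and all names are illustrative only.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    class HierarchicalClassifier:
        def __init__(self, taxonomy, root):
            self.taxonomy = taxonomy      # dict: parent class -> list of child classes
            self.root = root              # the single root node of the taxonomy
            self.node_models = {}         # one classifier per internal node

        def _leaves_below(self, node):
            # Leaf classes reachable from a node of the taxonomy.
            children = self.taxonomy.get(node, [])
            if not children:
                return {node}
            return set().union(*(self._leaves_below(c) for c in children))

        def fit(self, X, y):
            # For every internal node, train a classifier on the examples whose
            # leaf class lies below that node; the target is the child subtree.
            for node, children in self.taxonomy.items():
                leaf_to_child = {leaf: child for child in children
                                 for leaf in self._leaves_below(child)}
                mask = np.array([label in leaf_to_child for label in y])
                targets = np.array([leaf_to_child[label] for label in y[mask]])
                self.node_models[node] = DecisionTreeClassifier().fit(X[mask], targets)
            return self

        def predict(self, X):
            # Route every example from the root down to a leaf class.
            predictions = []
            for x in X:
                node = self.root
                while node in self.node_models:
                    node = self.node_models[node].predict(x.reshape(1, -1))[0]
                predictions.append(node)
            return np.array(predictions)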

Input

  • training set (Data Table)

    This input port expects an ExampleSet holding the training data.

Output

  • model (MetaCost Model)

This output port delivers the meta model, which can then be applied to unseen data sets to predict the label attribute.

  • example set (Data Table)

The ExampleSet that was given as input is passed through this port to the output without any changes.

Parameters

  • hierarchy: This parameter is used for specifying the class hierarchy as a list of parent-child edges. See the tutorial process and the sketch after this list for further explanation. Range: string
  • use_local_random_seed: This parameter indicates whether a local random seed should be used for randomization. Using the same value of the local random seed produces the same sample; changing the value changes how the examples are randomized, so the sample will contain a different set of values. Range: boolean
  • local_random_seed: This parameter specifies the local random seed. It is only available if the use local random seed parameter is set to true. Range: integer
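
Purely as an illustration of how such a parent-child edge list can be interpreted (the function name and details are assumptions, not RapidMiner internals), the following sketch turns the edge list into the tree structure used by the sketch above and checks that there is exactly one root:

    # Illustrative only: turning a parent-child edge list (as entered in the
    # hierarchy parameter) into a tree. Names are assumptions, not RapidMiner internals.
    def build_taxonomy(edges):
        """edges: list of (parent, child) pairs -> (root, parent->children dict)."""
        children, has_parent = {}, set()
        for parent, child in edges:
            children.setdefault(parent, []).append(child)
            has_parent.add(child)
        # The root is the one node that never occurs as a child.
        roots = [node for node in children if node not in has_parent]
        if len(roots) != 1:
            raise ValueError("the class taxonomy must have exactly one root node")
        return roots[0], children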

Tutorial Processes

Using the Hierarchical Classification operator to classify flowers

In this example we take the Iris data set and first create a fourth class, "iris-random".

We assume that there are two classes of iris flowers: blue and purple. The blue one has two subtypes, Iris-virginica and iris-random; the purple one also has two subtypes, Iris-versicolor and Iris-setosa.

We define this hierarchy in the Hierarchical Classification operator and use a Decision Tree inside. This means we first learn a Decision Tree that decides between blue and purple. Then we learn two further Decision Trees: one that decides between Iris-virginica and iris-random among the examples predicted as blue, and one that decides between Iris-versicolor and Iris-setosa among the examples predicted as purple.

The result is a single model which can be applied with an Apply Model operator. From the outside it behaves like one model that predicts the four classes directly.
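
For readers who want to experiment outside RapidMiner, here is a rough Python analogue of this tutorial process. It reuses the illustrative HierarchicalClassifier and build_taxonomy sketches from above, approximates the sample data with scikit-learn's copy of the Iris data set, and assumes a root node named "Flower"; none of this reflects the actual RapidMiner process.

    # Rough Python analogue of the tutorial process, reusing the illustrative
    # HierarchicalClassifier and build_taxonomy sketches defined above.
    import numpy as np
    from sklearn.datasets import load_iris

    iris = load_iris()
    X = iris.data
    labels = np.array(["Iris-" + name for name in iris.target_names[iris.target]])

    # Create the artificial fourth class by randomly relabeling some examples.
    rng = np.random.default_rng(1992)          # plays the role of the local random seed
    labels[rng.random(len(labels)) < 0.25] = "iris-random"

    # The flower taxonomy described above; the root name "Flower" is assumed.
    root, taxonomy = build_taxonomy([
        ("Flower", "Blue"), ("Flower", "Purple"),
        ("Blue", "Iris-virginica"), ("Blue", "iris-random"),
        ("Purple", "Iris-versicolor"), ("Purple", "Iris-setosa"),
    ])

    model = HierarchicalClassifier(taxonomy, root).fit(X, labels)
    print(model.predict(X[:5]))                # leaf classes, e.g. "Iris-setosa"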