RapidMiner Studio documentation, version 10.0

Multi Label Modeling (Time Series)

Synopsis

This operator trains a Multi Label Model on an input ExampleSet.

Description

A Multi Label Model is a meta model able to predict multiple label attributes at once. For each label attribute, the Multi Label Model contains a Prediction Model. The label attributes can be selected by the attribute filter.

The inner subprocess of the operator is executed once for each selected label attribute. The input training data, with the current label attribute set to the Label role, is provided at the inner training set port. It can be used to train a Prediction Model for the corresponding label attribute. The Prediction Model has to be provided to the model port of the inner subprocess. The Prediction Models are collected and provided as the Multi Label Model at the model output port of the operator. Additional objects can be passed in and out of the subprocess through the input and output port extenders.

Any attribute, independent of type or special role, can be selected as a label attribute. It only has to be ensured that the Prediction Model trained in the inner subprocess is capable of handling all selected label attributes. To ensure defined behavior, the input ExampleSet is not allowed to have a normal 'Label' attribute. If the input data has such an attribute, its role has to be changed to another role with the Set Role operator. Note that the roles of all other attributes are not changed. Hence, if a regular attribute is selected as a label attribute, it is still a regular attribute in the training iterations of the other label attributes and is used as an input attribute for their Prediction Models. The name and type of the current label attribute in the subprocess can be added as macros if the parameter add macros is selected.
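The per-label iteration described above can be sketched in Python. This is only a conceptual illustration, not the operator's implementation: the functions train_multi_label and train_most_frequent, the macro keys label_name and label_type, and the toy rows (including the column 'x') are all hypothetical. The stand-in trainer simply predicts the most frequent value of the current label.

```python
from collections import Counter


def train_most_frequent(features, targets, macros):
    """Stand-in for the inner subprocess: always predict the most frequent value."""
    constant = Counter(targets).most_common(1)[0][0]
    return lambda feature_row: constant


def train_multi_label(rows, label_names, train_one):
    """Run train_one once per label attribute and collect the models by label name."""
    models = {}
    for label in label_names:
        targets = [row[label] for row in rows]
        # Only the current label is removed from the inputs. Every other column,
        # including the other selected labels, remains an input attribute.
        features = [{k: v for k, v in row.items() if k != label} for row in rows]
        # Hypothetical macro keys holding the current label's name and type.
        macros = {"label_name": label, "label_type": type(targets[0]).__name__}
        models[label] = train_one(features, targets, macros)
    return models


# Invented toy data; the values are not taken from the Titanic data set.
rows = [
    {"Survived": "yes", "Age": 30.0, "x": 1.0},
    {"Survived": "no", "Age": 30.0, "x": 2.0},
    {"Survived": "no", "Age": 45.0, "x": 3.0},
]
multi_label_model = train_multi_label(rows, ["Survived", "Age"], train_most_frequent)
predictions = {name: model(rows[0]) for name, model in multi_label_model.items()}
```

Note that when the model for 'Survived' is trained, 'Age' is part of the inputs, and vice versa, which mirrors the statement that the roles of the other attributes are not changed.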

When the Multi Label Model is applied to an ExampleSet (using the Apply Model operator), the Prediction Models of the meta model are used to create a 'Prediction' attribute for every label attribute. The name of each 'Prediction' attribute is set to prediction(<name of label attribute>), and its special role is set to prediction_<name of label attribute>. If a Prediction Model also creates 'Confidence' attributes (which is the case when the label is nominal), they are named confidence(<name of label attribute> = <value>), and their roles are set to confidence_<name of label attribute>_<value>.
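The naming scheme above can be written out as a small Python sketch. The helper names prediction_attribute and confidence_attributes are hypothetical and exist only to make the patterns concrete; the name and role patterns themselves are the ones stated in the text.

```python
def prediction_attribute(label_name):
    """Name and special role of the 'Prediction' attribute for one label."""
    return {
        "name": f"prediction({label_name})",
        "role": f"prediction_{label_name}",
    }


def confidence_attributes(label_name, values):
    """Names and roles of the 'Confidence' attributes for a nominal label."""
    return [
        {
            "name": f"confidence({label_name} = {value})",
            "role": f"confidence_{label_name}_{value}",
        }
        for value in values
    ]


survived_prediction = prediction_attribute("Survived")
survived_confidences = confidence_attributes("Survived", ["Yes", "No"])
```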

The performance of the multiple predictions can be evaluated by the operator Multi Label Performance.

Input

  • training set (Data Table)

    The input ExampleSet on which the Multi Label Model is built.

  • input (IOObject)

    This port is a port extender, which means if a port is connected a new input port is created. Any IOObject can be connected to the port and is passed to the corresponding inner input port for each iteration.

Output

  • model (Model)

    The Multi Label Model, containing Prediction Models for each of the selected label attributes.

  • output (IOObject)

    This port is a port extender, which means that a new output port is created whenever a port is connected. The port collects every result that is provided by the inner process and returns a collection of the results of all iterations.

Parameters

  • attribute_filter_type

    This parameter selects the filter type, that is, the method used to select the label attributes for which Prediction Models are trained. Note that the filter is applied to all attributes independent of their special role (the parameter include special attributes is set to true for this attribute selection and cannot be changed). The different filter types are:

    • all: This option selects all attributes of the ExampleSet to be label attributes. This is the default option.
    • single: This option allows the selection of a single label attribute. The required attribute is selected by the attribute parameter.
    • subset: This option allows the selection of multiple label attributes through a list (see parameter attributes). If the meta data of the ExampleSet is known, all attributes are present in the list and the required ones can easily be selected.
    • regular_expression: This option allows you to specify a regular expression for the label attribute selection. The regular expression filter is configured by the parameters regular expression, use except expression and except expression.
    • value_type: This option allows selection of all the attributes of a particular type to be label attributes. It should be noted that types are hierarchical. For example real and integer types both belong to the numeric type. The value type filter is configured by the parameters value type, use value type exception, except value type.
    • block_type: This option allows the selection of all the attributes of a particular block type to be label attributes. It should be noted that block types may be hierarchical. For example value_series_start and value_series_end block types both belong to the value_series block type. The block type filter is configured by the parameters block type, use block type exception, except block type.
    • no_missing_values: This option selects all attributes of the ExampleSet as label attributes which do not contain a missing value in any example. Attributes that have even a single missing value are not selected.
    • numeric_value_filter: All numeric attributes whose examples all match a given numeric condition are selected as label attributes. The condition is specified by the numeric condition parameter.
  • attribute

    The required attribute can be selected from this option. The attribute name can be selected from the drop down box of the parameter if the meta data is known.

  • attributes

    The required attributes can be selected from this option. This opens a new window with two lists. All attributes are present in the left list. They can be shifted to the right list, which is the list of selected label attributes.

  • regular_expression

    Attributes whose names match this expression will be selected. The expression can be specified through the edit and preview regular expression menu. This menu gives a good idea of regular expressions and it also allows you to try different expressions and preview the results simultaneously.

  • use_except_expression

    If enabled, an exception to the first regular expression can be specified. This exception is specified by the except regular expression parameter.

  • except_regular_expression

    This option allows you to specify a regular expression. Attributes matching this expression will not be selected even if they match the first expression (expression that was specified in regular expression parameter).

  • value_type

    This option selects the value type of the attributes to be used as label attributes.

  • use_value_type_exception

    If enabled, an exception to the selected type can be specified. This exception is specified by the except value type parameter.

  • except_value_type

    Attributes matching this type are not selected, even if they match the type previously selected by the value type parameter.

  • block_type

    This option selects the block type of the attributes to be used as label attributes.

  • use_block_type_exception

    If enabled, an exception to the selected block type can be specified. This exception is specified by the except block type parameter.

  • except_block_type

    Attributes matching this block type are not selected, even if they match the block type previously selected by the block type parameter.

  • numeric_condition

    The numeric condition used by the numeric_value_filter filter type. A numeric attribute is selected if all examples match the specified condition for this attribute. For example, the numeric condition '> 6' keeps all numeric attributes whose value is greater than 6 in every example. Conditions can be combined: '> 6 && < 11' or '<= 5 || < 0'. However, && and || cannot be used together in one numeric condition; conditions like '(> 0 && < 2) || (>10 && < 12)' are not allowed because they use both && and ||.

  • invert_selection

    If this parameter is set to true, the selection is reversed. In that case, all attributes not matching the specified condition are selected as label attributes.

  • add_macros

    If selected, macros containing the name and the value type of the current label attribute are added in each iteration of the subprocess.

  • current_label_name_macro

    If add macros is true, this parameter defines the name of the macro holding the current label attribute name.

  • current_label_type_macro

    If add macros is true, this parameter defines the name of the macro holding the current label attribute type.

  • enable_parallel_execution

    This parameter enables the parallel execution of the inner processes. Please disable the parallel execution if you run into memory problems.

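The regular_expression and numeric_value_filter options described above can be illustrated with a short Python sketch. It rests on several assumptions that the text does not confirm: that an attribute name has to match the expression as a whole, that the supported comparisons are >, >=, < and <=, and that Python's regular expression syntax is close enough to the one used by the operator. The function names select_by_regex, parse_condition and select_by_numeric_condition are hypothetical.

```python
import operator
import re

_OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt, "<": operator.lt}
_TERM = re.compile(r"\s*(>=|<=|>|<)\s*(-?\d+(?:\.\d+)?)\s*")


def select_by_regex(names, expression, except_expression=None, invert=False):
    """Select names matching expression, minus those matching except_expression."""
    selected = []
    for name in names:
        match = re.fullmatch(expression, name) is not None
        if match and except_expression is not None:
            match = re.fullmatch(except_expression, name) is None
        if match != invert:  # invert=True keeps the non-matching names instead
            selected.append(name)
    return selected


def parse_condition(condition):
    """Turn a condition such as '> 6 && < 11' into a predicate on one value."""
    if "&&" in condition and "||" in condition:
        raise ValueError("&& and || cannot be used together in one condition")
    if "||" in condition:
        parts, combine = condition.split("||"), any
    else:
        parts, combine = condition.split("&&"), all
    terms = []
    for part in parts:
        term = _TERM.fullmatch(part)
        if term is None:
            raise ValueError(f"cannot parse term: {part!r}")
        terms.append((_OPS[term.group(1)], float(term.group(2))))
    return lambda value: combine(op(value, bound) for op, bound in terms)


def select_by_numeric_condition(columns, condition):
    """Select the columns in which every value satisfies the condition."""
    check = parse_condition(condition)
    return [name for name, values in columns.items() if all(check(v) for v in values)]


names = ["att1", "att2", "label_a", "label_b"]
by_regex = select_by_regex(names, "label_.*", except_expression="label_b")
columns = {"a": [7, 8, 10], "b": [7, 12], "c": [1, 2]}
in_range = select_by_numeric_condition(columns, "> 6 && < 11")
```

Here only column a is kept by '> 6 && < 11', because b contains 12 and c contains values below 6. A condition that mixes && and || is rejected before any term is parsed.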

Tutorial Processes

Multi Label Modeling on Titanic data set

This tutorial process shows the basic usage of the Multi Label Modeling operator by training a Multi Label Model for the attributes 'Survived', 'Port of Embarkation' and 'Age' of the Titanic samples data set.

Training different models for different types of label attributes

In this tutorial process, the Multi Label Modeling operator is used to train a Multi Label Model for the attributes 'Survived', 'Port of Embarkation' and 'Age' of the Titanic samples data set, as in the first tutorial process. Instead of using one kind of model for all label attributes, however, different models are used for different types of label attributes: Decision Trees are built for the nominal label attributes, and a Generalized Linear Model is built for the non-nominal ones. See the comments in the process for more details.
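The idea of choosing the learner from the type of the current label can be sketched in Python as follows. This is not how the tutorial process is built. The trivial majority and mean predictors below merely stand in for the Decision Tree and the Generalized Linear Model, and the function names and the assumption that the label type is reported as one of nominal, binominal or polynominal for nominal labels are hypothetical.

```python
from collections import Counter
from statistics import mean

# Assumed type names for nominal labels; anything else is treated as non-nominal.
NOMINAL_TYPES = {"nominal", "binominal", "polynominal"}


def train_majority(targets):
    """Stand-in for the Decision Tree: predict the most frequent class."""
    constant = Counter(targets).most_common(1)[0][0]
    return lambda feature_row: constant


def train_mean(targets):
    """Stand-in for the Generalized Linear Model: predict the average value."""
    constant = mean(targets)
    return lambda feature_row: constant


def train_for_label(targets, label_type):
    """Pick the learner from the label type, then train it on the targets."""
    trainer = train_majority if label_type in NOMINAL_TYPES else train_mean
    return trainer(targets)


# Invented toy targets, not taken from the Titanic data set.
port_model = train_for_label(["S", "C", "S"], "polynominal")
age_model = train_for_label([20.0, 40.0], "real")
```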