Generate Multi-Label Data (RapidMiner Studio Core)

Synopsis

This operator generates a multi-label ExampleSet based on numerical attributes. The user can specify the number of examples and the lower and upper bounds of the attributes.

Description

The Generate Multi-Label Data operator generates an ExampleSet with 5 numerical attributes and 3 label attributes. If the regression parameter is set to false, each label takes one of two possible values, positive or negative. Otherwise, the labels have real values. The number of examples to be generated is specified by the number examples parameter. The lower and upper bounds of the numerical attribute values are specified by the attributes lower bound and attributes upper bound parameters. This operator is used for generating a random ExampleSet for testing purposes.
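The shape of the generated data can be illustrated with a minimal Python sketch. This is not the operator's implementation: the function name `generate_multi_label_data`, its argument names, and especially the three label functions are assumptions made for illustration, since this page does not state how the labels are derived from the attributes.

```python
import random


def generate_multi_label_data(number_examples=100, regression=False,
                              lower_bound=-10.0, upper_bound=10.0, seed=None):
    """Hypothetical stand-in: 5 numerical attributes and 3 labels per example."""
    rng = random.Random(seed)
    examples = []
    for _ in range(number_examples):
        # Five numerical attributes drawn uniformly between the two bounds.
        attributes = [rng.uniform(lower_bound, upper_bound) for _ in range(5)]
        # ASSUMPTION: these label functions are invented for the sketch;
        # the functions used by the real operator are not documented here.
        raw_labels = [
            attributes[0] + attributes[1],
            attributes[2] - attributes[3],
            attributes[4],
        ]
        if regression:
            labels = raw_labels  # real-valued labels
        else:
            # Two possible values per label: positive or negative.
            labels = ["positive" if v > 0 else "negative" for v in raw_labels]
        examples.append((attributes, labels))
    return examples
```

Calling `generate_multi_label_data(regression=True)` in this sketch returns real-valued labels, while the default call returns the two-valued nominal labels described above.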

Output

  • output (Data Table)

    The Generate Multi-Label Data operator generates a multi-label ExampleSet based on numerical attributes, which is delivered through this port. The meta data is delivered along with the data. This output is the same as the output of the Retrieve operator.

Parameters

  • number_examples: This parameter specifies the number of examples to be generated. Range: integer
  • regression: This parameter specifies whether multiple labels for regression tasks should be generated. If this parameter is set to false, each label takes one of two possible values, positive or negative. Otherwise, the labels have real values. Range: boolean
  • attributes_lower_bound: This parameter specifies the minimum possible value for the attributes to be generated, that is, the lower bound of the range of possible values of the regular attributes. Range: real
  • attributes_upper_bound: This parameter specifies the maximum possible value for the attributes to be generated, that is, the upper bound of the range of possible values of the regular attributes. Range: real
  • use_local_random_seed: This parameter indicates whether a local random seed should be used for randomization. Using the same value of local random seed will produce the same ExampleSet. Changing the value of local random seed changes the way examples are randomized, so the ExampleSet will have a different set of values. Range: boolean
  • local_random_seed: This parameter specifies the local random seed. This parameter is only available if the use local random seed parameter is set to true. Range: integer

Tutorial Processes

Introduction to the Generate Multi-Label Data operator

The Generate Multi-Label Data operator is applied to generate an ExampleSet. The number examples parameter is set to 100, so the ExampleSet will have 100 examples. The attributes lower bound and attributes upper bound parameters are set to -10 and 10 respectively, so the values of the regular attributes will lie within this range. The regression parameter is set to false, so the ExampleSet will have nominal labels. You can verify this by viewing the results in the Results Workspace. The use local random seed parameter is set to false in this example process. Set the use local random seed parameter to true and run the process with different values of local random seed. You will see that changing the value of local random seed changes the randomization.
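The effect of the local random seed in this tutorial can be mimicked outside RapidMiner with a small Python sketch. It only illustrates seeded pseudo-random generation in general. The helper `sample_attributes` and its arguments are hypothetical, and Python's random number generator is not the one RapidMiner uses, so the actual values will differ from those in the tutorial process.

```python
import random


def sample_attributes(number_examples, lower_bound, upper_bound,
                      local_random_seed=None):
    """Hypothetical helper: rows of 5 uniform values between the bounds."""
    # In this sketch, a seed of None lets Python pick its own seed.
    rng = random.Random(local_random_seed)
    return [[rng.uniform(lower_bound, upper_bound) for _ in range(5)]
            for _ in range(number_examples)]


# Same settings as the tutorial: 100 examples, attribute values in [-10, 10].
run_a = sample_attributes(100, -10, 10, local_random_seed=1)
run_b = sample_attributes(100, -10, 10, local_random_seed=1)
run_c = sample_attributes(100, -10, 10, local_random_seed=2)

same_seed_matches = run_a == run_b        # identical seed, identical data
different_seed_differs = run_a != run_c   # different seed, different data
```

Fixing the seed is what makes a generated test set repeatable across runs, which is useful when comparing processes on the same synthetic data.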