You are viewing the RapidMiner Studio documentation for version 9.8.

Select by Random (RapidMiner Studio Core)

Synopsis

This operator selects a random subset of attributes of the given ExampleSet.

Description

The Select by Random operator selects attributes at random from the input ExampleSet. If the use fixed number of attributes parameter is set to true, exactly as many attributes are selected as the number of attributes parameter specifies. Otherwise, the number of selected attributes is itself chosen at random. The selection can be varied by changing the seed value in the corresponding parameters. This operator is useful in combination with the Loop Parameters operator, or as a baseline in significance tests that compare feature selection techniques.
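The selection behavior described above can be sketched in Python. This is an illustrative approximation, not RapidMiner's implementation: the function name select_by_random and its arguments are hypothetical, and the assumption that the subset size is drawn uniformly when no fixed number is requested is ours, not something the documentation states.

```python
import random
from typing import List, Optional, Sequence


def select_by_random(
    attributes: Sequence[str],
    use_fixed_number_of_attributes: bool = False,
    number_of_attributes: int = 1,
    seed: Optional[int] = None,
) -> List[str]:
    """Return a random subset of attribute names, kept in their original order."""
    rng = random.Random(seed)  # seed=None lets Python seed the generator itself
    total = len(attributes)
    if total == 0:
        return []
    if use_fixed_number_of_attributes:
        if not 1 <= number_of_attributes <= total:
            raise ValueError("number_of_attributes must be between 1 and len(attributes)")
        count = number_of_attributes
    else:
        # Assumption: the subset size is drawn uniformly from 1..total.
        count = rng.randint(1, total)
    chosen = set(rng.sample(range(total), count))
    return [name for i, name in enumerate(attributes) if i in chosen]


columns = ["a1", "a2", "a3", "a4", "a5", "a6"]
fixed = select_by_random(columns, use_fixed_number_of_attributes=True,
                         number_of_attributes=3, seed=42)
varied = select_by_random(columns, seed=42)
```

Here fixed always contains exactly three attribute names, while the length of varied depends on the seed.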

Input

  • example set (Data Table)

    This input port expects an ExampleSet. In the attached Example Process it is the output of the Retrieve operator, but the output of other operators can also be used as input. Meta data must be attached to the data, because the attributes are specified in the meta data. The Retrieve operator provides meta data along with the data.

Output

  • example set (Data Table)

    The ExampleSet containing only the randomly selected attributes is delivered through this port.

  • original (Data Table)

    The ExampleSet that was given as input is passed through this port unchanged. This is usually used to reuse the same ExampleSet in further operators or to view it in the Results Workspace.

Parameters

  • use_fixed_number_of_attributes

    This parameter specifies whether a fixed number of attributes should be selected. Range: boolean

  • number_of_attributes

    This parameter specifies the number of attributes to select at random. It is only available when the use fixed number of attributes parameter is set to true. Range: integer

  • use_local_random_seed

    This parameter indicates whether a local random seed should be used for randomization. For the same input, using the same local random seed value produces the same ExampleSet. Changing the value changes the randomization, so the resulting ExampleSet will generally contain a different set of attributes. Range: boolean

  • local_random_seed

    This parameter specifies the local random seed. It is only available when the use local random seed parameter is set to true. Range: integer
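The effect of the two seed parameters can be illustrated with a small Python sketch. The helper names make_rng and pick are hypothetical, and the unseeded generator used when use_local_random_seed is false is only a stand-in for whatever generator the process would otherwise use.

```python
import random

attributes = [f"att{i}" for i in range(1, 21)]


def make_rng(use_local_random_seed: bool, local_random_seed: int = 42) -> random.Random:
    """Build a generator; the seed only matters when the local seed is enabled."""
    if use_local_random_seed:
        return random.Random(local_random_seed)
    return random.Random()  # stand-in: not tied to any fixed seed


def pick(rng: random.Random, k: int = 5):
    """Draw k attribute names and return them in their original order."""
    return sorted(rng.sample(attributes, k), key=attributes.index)


first = pick(make_rng(True, local_random_seed=42))
second = pick(make_rng(True, local_random_seed=42))
other = pick(make_rng(True, local_random_seed=7))
```

With the same seed, first and second are identical. A different seed, as for other, will in general give a different subset.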

Tutorial Processes

Selecting random attributes from Sonar data set

The 'Sonar' data set is loaded using the Retrieve operator. A breakpoint is inserted here so that you can inspect the ExampleSet, which has 60 attributes. The Select by Random operator is then applied to this ExampleSet. The use fixed number of attributes parameter is set to true and the number of attributes parameter is set to 10, so 10 attributes are selected at random from the 'Sonar' data set. The resulting ExampleSet can be seen in the Results Workspace.
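For readers who want to see the same idea outside RapidMiner Studio, the tutorial can be approximated in plain Python. The table below is a made-up stand-in for the 'Sonar' data: the attribute names, the three rows of random values, and the seed are all assumptions for illustration only.

```python
import random

# Hypothetical stand-in for the 'Sonar' data: 60 attributes, 3 rows of made-up values.
names = [f"attribute_{i}" for i in range(1, 61)]
data_rng = random.Random(0)
original = [{name: data_rng.random() for name in names} for _ in range(3)]

# Fixed-size random selection of 10 attributes (seed chosen arbitrarily).
select_rng = random.Random(42)
selected_names = sorted(select_rng.sample(names, 10), key=names.index)

# "example set" output: the same rows, restricted to the selected attributes.
example_set = [{name: row[name] for name in selected_names} for row in original]

print(len(original[0]), "->", len(example_set[0]))  # prints: 60 -> 10
```

The original table is left untouched, mirroring the operator's original output port, while example_set corresponds to the reduced ExampleSet.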