RapidMiner Studio documentation, version 10.0

Loop Values (Concurrency)

Synopsis

This operator iterates over its subprocess for all the possible values of the selected attribute. The subprocess can access the attribute value of the current iteration by a macro.

Description

The Loop Values operator has a parameter named attribute that selects an attribute of the input ExampleSet. The operator then applies its subprocess once for each possible value of the selected attribute, i.e. the subprocess executes n times, where n is the number of possible values of that attribute. In every iteration, the current attribute value can be accessed through the macro named in the iteration macro parameter. You need a basic understanding of macros to apply this operator; please study the documentation of the Extract Macro operator, which is also used in the attached Example Process. For more information about subprocesses, please study the Subprocess operator.

It is important to note that the subprocess of the Loop Values operator executes for all possible values of the selected attribute. Suppose the selected attribute has three possible values and the ExampleSet has 100 examples. The Loop Values operator will iterate only three times (not 100 times); once for each possible value of the selected attribute. This operator is usually applied on nominal attributes.
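
The iteration count described above can be illustrated outside RapidMiner with a minimal Python sketch. The helper `loop_values` and the generated data are hypothetical and only mirror the idea; the sketch also assumes that every possible value of the attribute actually occurs in the data, which the operator itself may not require.

```python
from typing import Callable, Dict, List

Example = Dict[str, object]


def loop_values(
    examples: List[Example],
    attribute: str,
    subprocess: Callable[[List[Example], str], object],
) -> List[object]:
    """Run `subprocess` once per distinct value of `attribute` (hypothetical helper)."""
    # Collect the distinct values in order of first appearance.
    distinct = list(dict.fromkeys(str(ex[attribute]) for ex in examples))
    return [subprocess(examples, value) for value in distinct]


# 100 hypothetical examples whose nominal attribute takes only three values.
examples = [{"att1": f"range{i % 3 + 1}", "att2": float(i)} for i in range(100)]

# The "subprocess" here simply reports the value it was called with.
visited = loop_values(examples, "att1", lambda data, value: value)
print(len(examples), len(visited))
```

The subprocess is called three times, once per distinct value, regardless of how many examples carry each value.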

Input

  • example set (Data Table)

    This input port expects an ExampleSet. It is the output of the Subprocess operator in the attached Example Process. The output of other operators can also be used as input.

Output

  • output (IOObject)

    The Loop Values operator can have multiple outputs. When one output is connected, another output port becomes available, ready to deliver another output (if any). The order of outputs is preserved: the object delivered at the first output port of the subprocess is delivered at the first output port of the outer process. Connect all outputs in the correct order, and make sure that the right number of ports is connected at every level of the chain.

Parameters

  • attribute This parameter selects the attribute whose values are iterated over. If the meta data is known, the attribute name can be chosen from the drop-down box of the attribute parameter. Range: string
  • iteration_macro This parameter specifies the name of the macro which holds the current value of the selected attribute in each iteration. Range: string
  • reuse_results This parameter specifies whether the results of each iteration are reused as the input of the next iteration. If set to true, the output of each iteration is used as input for the next iteration. Because each iteration then depends on the previous one, the loop runs in a single thread and cannot make use of additional CPU cores. If set to false, the input of each iteration is the original input of the loop. Range: boolean
  • enable_parallel_execution This parameter enables the parallel execution of the subprocess. Please disable the parallel execution if you run into memory problems. Range: boolean
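
The difference between the two settings of reuse_results can be sketched in Python as follows. This is only an analogy under stated assumptions: `run_loop` and `append_value` are hypothetical names, and the sketch returns the output of every iteration for illustration, without modeling what the operator actually delivers at its output ports.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def run_loop(
    original: T,
    values: List[str],
    subprocess: Callable[[T, str], T],
    reuse_results: bool,
) -> List[T]:
    """Return the output of each iteration (hypothetical helper)."""
    outputs: List[T] = []
    current = original
    for value in values:
        # With reuse_results, an iteration needs the previous output,
        # so the iterations cannot run independently of each other.
        result = subprocess(current if reuse_results else original, value)
        outputs.append(result)
        if reuse_results:
            current = result
    return outputs


def append_value(data: List[str], value: str) -> List[str]:
    """Toy subprocess: return a new list with the loop value appended."""
    return data + [value]


independent = run_loop([], ["a", "b", "c"], append_value, reuse_results=False)
chained = run_loop([], ["a", "b", "c"], append_value, reuse_results=True)
print(independent)
print(chained)
```

With reuse_results set to false, every iteration starts from the same original input, which is what makes the iterations independent and therefore candidates for parallel execution.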

Tutorial Processes

The use of the Loop Values operator in complex preprocessing

This Tutorial Process will cover a number of concepts of macros including redefining macros, the macro of the Loop Values operator and the use of the Extract Macro operator. This process starts with a subprocess which is used to generate data. What is happening inside this subprocess is not relevant to the use of macros, so it is not discussed here. A breakpoint is inserted after this subprocess so that you can view the ExampleSet. You can see that the ExampleSet has 12 examples and 2 attributes: 'att1' and 'att2'. 'att1' is nominal and has 3 possible values: 'range1', 'range2' and 'range3'. 'att2' has real values.

The Loop Values operator is applied to the ExampleSet. The attribute parameter is set to 'att1', so the operator iterates over the values of att1 and applies the inner operators to the given ExampleSet. The iteration macro parameter is set to 'loop_value', so the current value can be accessed by specifying %{loop_value} in parameter values. As att1 has 3 possible values, Loop Values iterates 3 times, once for each possible value of att1.
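
To make the %{loop_value} notation concrete, here is a small Python sketch of macro substitution in a parameter string. The function `expand_macros` is hypothetical and is not RapidMiner's implementation; leaving unknown macros untouched is an assumption of this sketch, not documented behavior.

```python
import re
from typing import Dict

# Matches %{name} and captures the macro name.
MACRO_PATTERN = re.compile(r"%\{([^}]+)\}")


def expand_macros(text: str, macros: Dict[str, str]) -> str:
    """Replace each %{name} with its value; unknown macros stay as is (assumption)."""
    return MACRO_PATTERN.sub(lambda m: macros.get(m.group(1), m.group(0)), text)


# One expansion per iteration over the three possible values of att1.
expanded = [
    expand_macros("att1 = %{loop_value}", {"loop_value": value})
    for value in ["range1", "range2", "range3"]
]
for line in expanded:
    print(line)
```

Each iteration sees the same parameter string, but the macro resolves to a different value each time.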

Here is an explanation of what happens inside the Loop Values operator. It is provided with an ExampleSet as input, and the Filter Examples operator is applied to it. The condition class parameter is set to 'attribute value filter' and the parameter string is set to 'att1 = %{loop_value}'. Note the use of the loop_value macro here: only those examples are selected where the value of att1 is equal to the value of the loop_value macro. A breakpoint is inserted here so that you can view the selected examples.

Then the Aggregation operator is applied to the selected examples. It is configured to take the average of the att2 values of the selected examples. This average is stored in a new ExampleSet in the attribute named 'average(att2)'. A breakpoint is inserted here so that you can see the average. The Extract Macro operator is applied to this new ExampleSet to store the average in a macro named 'current_average'.

The originally selected examples are passed to the Generate Attributes operator, which generates a new attribute named 'att2_abs_avg' defined by the expression 'abs(att2 - %{current_average})'. Note the use of the current_average macro here: the absolute difference between each att2 value and this average is stored in 'att2_abs_avg'. The resulting ExampleSet is delivered at the output of the Loop Values operator. A breakpoint is inserted here so that you can see the ExampleSet with the 'att2_abs_avg' attribute. This output is fed to the Append operator in the main process, which merges the results of all the iterations into a single ExampleSet that is visible at the end of the process in the Results Workspace.
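
The steps inside the loop (filter, average, absolute deviation, append) can be sketched in plain Python. The data below is a hypothetical stand-in for the generated ExampleSet, and `process_value` is an illustrative name; the sketch mirrors the logic of the tutorial process, not the operators' implementation.

```python
from statistics import mean
from typing import Dict, List, Union

Example = Dict[str, Union[str, float]]

# Hypothetical stand-in for the generated ExampleSet (the tutorial data has 12 examples).
example_set: List[Example] = [
    {"att1": "range1", "att2": -2.0},
    {"att1": "range1", "att2": 0.0},
    {"att1": "range2", "att2": 1.0},
    {"att1": "range2", "att2": 4.0},
    {"att1": "range3", "att2": 3.0},
    {"att1": "range3", "att2": 3.0},
]


def process_value(examples: List[Example], loop_value: str) -> List[Example]:
    """One loop iteration: filter, average, then add the absolute deviation."""
    # Filter Examples: keep rows where att1 = %{loop_value}.
    selected = [ex for ex in examples if ex["att1"] == loop_value]
    # Aggregation plus Extract Macro: average(att2) stored as current_average.
    current_average = mean(float(ex["att2"]) for ex in selected)
    # Generate Attributes: att2_abs_avg = abs(att2 - %{current_average}).
    return [
        {**ex, "att2_abs_avg": abs(float(ex["att2"]) - current_average)}
        for ex in selected
    ]


# Append: merge the results of all iterations into one list.
appended: List[Example] = []
for loop_value in dict.fromkeys(str(ex["att1"]) for ex in example_set):
    appended.extend(process_value(example_set, loop_value))

for row in appended:
    print(row)
```

Every iteration works on the original example set, which corresponds to running the loop with reuse_results set to false.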

Here is what you see when you run the process:

  • The ExampleSet generated by the first Subprocess operator. The process then enters the Loop Values operator and iterates 3 times.
  • Iteration 1: the ExampleSet where the 'att1' value is equal to the current value of the loop_value macro, i.e. 'range1'; then the average of the 'att2' values for the selected examples, which is -1.161; then the ExampleSet with the 'att2_abs_avg' attribute for iteration 1.
  • Iteration 2: the ExampleSet where the 'att1' value is equal to the current value of the loop_value macro, i.e. 'range2'; then the average of the 'att2' values for the selected examples, which is -1.656; then the ExampleSet with the 'att2_abs_avg' attribute for iteration 2.
  • Iteration 3: the ExampleSet where the 'att1' value is equal to the current value of the loop_value macro, i.e. 'range3'; then the average of the 'att2' values for the selected examples, which is 1.340; then the ExampleSet with the 'att2_abs_avg' attribute for iteration 3.

The process then leaves the Loop Values operator, and the Append operator merges the final ExampleSets of all three iterations into a single ExampleSet that you can see in the Results Workspace.