Loop and Average (RapidMiner Studio Core)

Synopsis

This operator iterates over its subprocess the specified number of times and delivers the average of the inner results.

Description

The Loop and Average operator is a nested operator, i.e. it has a subprocess. The subprocess executes n times, where n is the value of the iterations parameter. The subprocess must return a performance vector in every iteration. These performance vectors are averaged and returned as the result of this operator. For more information about subprocesses, see the Subprocess operator.
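The behaviour described above can be sketched in a few lines of Python. This is only an illustration, not RapidMiner code: the function name `loop_and_average`, the `subprocess` callable, and the representation of a performance vector as a dictionary of criterion names to values are all assumptions made for the example.

```python
from statistics import mean
from typing import Callable, Dict, List


def loop_and_average(subprocess: Callable[[int], Dict[str, float]],
                     iterations: int) -> Dict[str, float]:
    """Run `subprocess` `iterations` times and average each criterion.

    Hypothetical stand-in for the operator: `subprocess` receives the
    iteration index and returns one performance vector as a dict.
    """
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    # Collect one performance vector per iteration.
    vectors: List[Dict[str, float]] = [subprocess(i) for i in range(iterations)]
    # Average every criterion of the first vector across all iterations.
    return {name: mean(v[name] for v in vectors) for name in vectors[0]}


# Usage with made-up per-iteration results (assumed values, not real output).
fixed = [{"accuracy": 0.5}, {"accuracy": 0.7}, {"accuracy": 0.9}]
averaged = loop_and_average(lambda i: fixed[i], iterations=3)
print(averaged)
```

The sketch assumes every iteration reports the same criteria; a real process would also need to decide how to handle a missing criterion.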

Differentiation

Loop and Deliver Best

This operator iterates over its subprocess the specified number of times and delivers the results of the iteration that has the best performance.

Input

  • in (IOObject)

    This operator can have multiple inputs. When one input is connected, another in port becomes available to accept a further input (if any). The order of inputs remains the same. The object supplied at the first in port of this operator is available at the first in port of the nested chain (inside the subprocess). Connect all inputs in the correct order, and make sure that the right number of ports is connected at the subprocess level.

Output

  • averagable (Average Vector)

    This operator can have multiple averagable output ports. When one output is connected, another averagable output port becomes available to deliver a further output (if any). The order of outputs remains the same. The Average Vector delivered at the first averagable port of the subprocess is delivered at the first averagable output port of the outer process. Connect all outputs in the correct order, and make sure that the right number of ports is connected at all levels of the chain.

Parameters

  • iterations: This parameter specifies the number of iterations of the subprocess of this operator. Range: integer
  • average_performances_only: This parameter indicates if only performance vectors or all types of averagable result vectors should be averaged. Range: boolean

Tutorial Processes

Taking average of performance vectors

The 'Golf' data set is loaded using the Retrieve operator. The Loop and Average operator is applied to it. The iterations parameter is set to 3; thus the subprocess of the Loop and Average operator will be executed three times. A performance vector is generated in every iteration, and the average of these performance vectors is delivered as the result of this operator.

Have a look at the subprocess of the Loop and Average operator. The Split Validation operator is used for training and testing a Naive Bayes model. A breakpoint is inserted after the Split Validation operator so that the performance vector can be seen in each iteration. Run the process. You will see the performance vector of the first iteration. It has 25% accuracy. Continue the process; you will see the performance vectors of the second and third iterations (with 75% and 100% accuracy respectively). The Loop and Average operator takes the average of these three results and delivers it through its output port. The average of these three results is 66.67% (i.e. (25% + 75% + 100%) / 3). The resulting average vector can be seen in the Results Workspace.
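The arithmetic of this tutorial can be checked with a short Python snippet. It only reproduces the averaging step using the three accuracies quoted above; the variable names are illustrative and nothing here calls RapidMiner.

```python
# Accuracies (in percent) reported for the three iterations in the tutorial.
accuracies = [25.0, 75.0, 100.0]

# Plain arithmetic mean, as delivered by the Loop and Average operator.
average = sum(accuracies) / len(accuracies)

print(f"{average:.2f}%")  # prints "66.67%"
```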