Loop Clusters (RapidMiner Studio Core)

Synopsis

This operator iterates over its subprocess once for each cluster in the input ExampleSet. In each iteration, the subprocess receives the examples belonging to that iteration's cluster.

Description

The Loop Clusters operator is a nested operator, i.e. it has a subprocess. The subprocess executes n times, where n is the number of clusters in the given ExampleSet. The ExampleSet must have a cluster attribute. Many clustering operators in RapidMiner generate a cluster attribute, e.g. the K-Means operator. In each iteration the subprocess receives the examples of exactly one cluster: one cluster in the first iteration, the next cluster in the next iteration, and so on. Please study the attached Example Process for a better understanding.
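The looping behavior described above can be sketched outside RapidMiner in plain Python. This is only an illustration of the semantics, not the operator's implementation: the function name loop_clusters, the attribute name "cluster", the representation of examples as dictionaries, and the iteration order (order of first appearance) are all assumptions made for this sketch.

```python
from collections import defaultdict


def loop_clusters(example_set, subprocess, cluster_attribute="cluster"):
    """Run `subprocess` once per distinct cluster value (hypothetical helper).

    `example_set` is a list of dicts; each dict must contain `cluster_attribute`.
    Returns a dict mapping each cluster value to the subprocess result.
    """
    groups = defaultdict(list)
    for example in example_set:
        # Mirrors the requirement that the ExampleSet has a cluster attribute.
        if cluster_attribute not in example:
            raise ValueError("every example needs a cluster attribute")
        groups[example[cluster_attribute]].append(example)

    results = {}
    # One iteration per cluster; the subprocess sees only that cluster's examples.
    # Iteration order here is order of first appearance (an assumption).
    for cluster_value, examples in groups.items():
        results[cluster_value] = subprocess(examples)
    return results


# Made-up data for illustration only.
examples = [
    {"att1": 0.1, "att2": 0.7, "cluster": "cluster_0"},
    {"att1": 0.9, "att2": 0.2, "cluster": "cluster_1"},
    {"att1": 0.2, "att2": 0.6, "cluster": "cluster_0"},
]

# The "subprocess" here simply counts the examples it receives.
counts = loop_clusters(examples, len)
```

With two distinct cluster values in the data, the subprocess runs twice, which matches n iterations for n clusters.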

Input

  • example set (IOObject)

    This input port expects an ExampleSet. The ExampleSet must have an attribute with the cluster role. In the attached Example Process, it is the output of the K-Means operator.

  • in (IOObject)

    This operator can have multiple in input ports. When one input is connected, another in input port becomes available, ready to accept another input (if any). The order of inputs remains the same: the object delivered at the first in port of the operator is available at the first in port of the subprocess. Don't forget to connect all inputs in the correct order. Make sure that you have connected the right number of ports at all levels of the chain.

Output

  • out (IOObject)

    This operator can have multiple out output ports. When one output is connected, another out output port becomes available, ready to deliver another output (if any). The order of outputs remains the same: the object delivered at the first out port of the subprocess is delivered at the first out output port of the outer process. Don't forget to connect all outputs in the correct order. Make sure that you have connected the right number of ports at all levels of the chain.

Tutorial Processes

Introduction to the Loop Clusters operator

The 'Ripley-Set' data set is loaded using the Retrieve operator. The K-Means operator is applied to it to generate a cluster attribute. A breakpoint is inserted here so that you can have a look at the clustered ExampleSet. You can see that there is an attribute with the cluster role. It has two possible values, cluster_0 and cluster_1, which means that there are two clusters in the ExampleSet. The Loop Clusters operator is applied next. Its subprocess executes twice, once for each cluster. Have a look at the subprocess of the Loop Clusters operator: it contains the Log operator. A breakpoint is inserted before the Log operator so that you can see the examples of each iteration. In the first iteration you will see that all examples belong to cluster_1, while in the second iteration all examples belong to cluster_0.