Loop Parameters (RapidMiner Studio Core)

Synopsis

This operator executes its subprocess once for each defined parameter combination. The combinations are configured through the wizard opened from the edit parameter settings parameter.

Description

The Loop Parameters operator is a nested operator: it contains a subprocess and executes it for all combinations of the selected parameter values. This is useful for plotting or logging purposes, and sometimes simply for configuring the parameters of the inner operators as a sort of meta step (e.g. learning curve generation). Any results of the subprocess are delivered through the result ports.

The entire configuration of this operator is done through the edit parameter settings parameter. This parameter is described in full in the Parameters section.

Please note that this operator has two modes, synchronized and non-synchronized, selected by the synchronize parameter. In non-synchronized mode, all parameter combinations are generated and the subprocess is executed for each combination. In synchronized mode, no combinations are generated. Instead, the values are paired by position: the first value of every parameter is used in the first iteration, the second value of every parameter in the second iteration, and so on. When iterating over a single parameter there is no difference between the two modes. Please note that in synchronized mode the number of possible values must be the same for all parameters.
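The difference between the two modes can be illustrated outside RapidMiner. The following is a minimal Python sketch, not the operator's implementation: it assumes that synchronized mode pairs the values by position, and the function name parameter_settings and the example values are hypothetical. The order in which RapidMiner visits the combinations is not stated in this text, so the ordering shown here is only illustrative.

```python
from itertools import product


def parameter_settings(values_by_parameter, synchronize=False):
    """Return the list of parameter settings a loop would visit.

    values_by_parameter maps a parameter name to its list of candidate values.
    """
    names = list(values_by_parameter)
    value_lists = [values_by_parameter[name] for name in names]
    if synchronize:
        # Synchronized mode: every parameter needs the same number of values.
        if len({len(values) for values in value_lists}) > 1:
            raise ValueError("synchronized mode needs equally long value lists")
        tuples = zip(*value_lists)      # i-th value of each parameter together
    else:
        tuples = product(*value_lists)  # every combination of values
    return [dict(zip(names, combo)) for combo in tuples]


# Hypothetical example values, chosen only for illustration.
grid = {"C": [0.1, 1.0, 10.0], "gamma": [0.01, 0.1, 1.0]}
all_combinations = parameter_settings(grid, synchronize=False)  # 3 x 3 settings
paired = parameter_settings(grid, synchronize=True)             # 3 settings
```

With a single parameter in the dictionary, both calls return the same list, which matches the remark that the modes do not differ in that case.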

If the synchronize parameter is set to false, selecting a large number of parameters and/or a large number of steps (or possible values of parameters) results in a huge number of combinations. For example, if you select 3 parameters with 25 possible values each, the total number of combinations is 15625 (i.e. 25 x 25 x 25). The subprocess is executed for all possible combinations, and running it for such a huge number of iterations takes a lot of time. So always carefully limit the parameters and their steps.
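The growth in the number of iterations is plain multiplication: 25 x 25 x 25 evaluates to 15625, and a fourth parameter with 25 values raises it to 390625. A small Python sketch for estimating the count before running a process (the function name combination_count is hypothetical):

```python
from math import prod


def combination_count(values_per_parameter):
    """Number of iterations in non-synchronized mode.

    values_per_parameter lists how many candidate values each parameter has.
    """
    return prod(values_per_parameter)


three_parameters = combination_count([25, 25, 25])      # 25 ** 3
four_parameters = combination_count([25, 25, 25, 25])   # 25 ** 4
```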

Differentiation

Optimize Parameters (Grid)

The Optimize Parameters (Grid) operator executes the subprocess for all combinations of selected values of the parameters and then delivers the optimal parameter values. The Loop Parameters operator, in contrast to the optimization operators, simply iterates through all parameter combinations. This might be especially useful for plotting purposes.

Input

  • input (IOObject)

    This operator can have multiple inputs. When one input is connected, another input port becomes available which is ready to accept another input (if any). The order of inputs remains the same. The object supplied at the first input port of this operator is available at the first input port of the nested chain (inside the subprocess). Do not forget to connect all inputs in the correct order. Make sure that you have connected the right number of ports at the subprocess level.

Output

  • result (IOObject)

    Any results of the subprocess are delivered through the result ports. This operator can have multiple outputs. When one result port is connected, another result port becomes available which is ready to deliver another output (if any). The order of outputs remains the same. The object delivered at the first result port of the subprocess is delivered at the first result port of the operator. Do not forget to connect all outputs in the correct order. Make sure that you have connected the right number of ports.

Parameters

  • edit_parameter_settings: The parameters to iterate over and their possible values are selected through the edit parameter settings menu. This menu has an Operators window which lists all the operators in the subprocess of this operator. When you click on an operator in the Operators window, all parameters of that operator are listed in the Parameters window. You can move a parameter to the Selected Parameters window using the arrow buttons of the menu. Select only those parameters over which you want to iterate the subprocess. The values of every selected parameter must be specified. When you click on a selected parameter (a parameter in the Selected Parameters window), the Grid/Range and Value List options are enabled. These options allow you to specify the values of the selected parameter. The Min and Max fields specify the lower and upper bounds of the range respectively. As not all values within this range can be checked, the steps field allows you to specify the number of values to be checked from the range. Finally, the scale option allows you to select the pattern of these values. Alternatively, you can specify the values in the form of a list.
    Range: menu
  • error_handling: This parameter allows you to select the method for handling errors occurring during the execution of the inner process. It has the following options:
    • fail_on_error: In case an error occurs, the execution of the process will fail with an error message.
    • ignore_error: In case an error occurs, the error will be ignored and the execution of the process will continue with the next iteration.
    Range: selection
  • synchronize: This parameter switches between the two modes of the operator, synchronized and non-synchronized. If it is set to false, all parameter combinations are generated and the inner operators are applied for each combination. If it is set to true, no combinations are generated. Instead, the values are paired by position: the first value of every parameter is used in the first iteration, the second value of every parameter in the second iteration, and so on. When iterating over a single parameter there is no difference between the two modes. Please note that in synchronized mode the number of possible values must be the same for all parameters.
    Range: boolean
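The two error_handling options listed above can be pictured as a choice inside the loop between re-raising an error and skipping to the next iteration. The following Python sketch only illustrates that behavior and is not RapidMiner code. The names run_loop and fragile_subprocess are hypothetical, and the failing subprocess is a stand-in for any inner operator that raises an error.

```python
def run_loop(settings, subprocess, error_handling="fail_on_error"):
    """Run subprocess once per setting and collect the results."""
    results = []
    for setting in settings:
        try:
            results.append(subprocess(setting))
        except Exception:
            if error_handling == "ignore_error":
                continue  # skip this iteration, go on with the next one
            raise         # fail_on_error: the whole run stops here
    return results


def fragile_subprocess(setting):
    """Hypothetical stand-in for an inner process that can fail."""
    if setting["C"] <= 0:
        raise ValueError("C must be positive")
    return setting["C"] * 2


settings = [{"C": 1.0}, {"C": -1.0}, {"C": 3.0}]
kept = run_loop(settings, fragile_subprocess, error_handling="ignore_error")
```

With ignore_error the second setting is skipped and the loop still returns results for the first and third, while fail_on_error would stop at the second setting with the error.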

Tutorial Processes

Iterating through the parameters of the SVM operator

The 'Weighting' data set is loaded using the Retrieve operator. The Loop Parameters operator is applied on it. Have a look at the edit parameter settings parameter of the Loop Parameters operator. You can see in the Selected Parameters window that the C and gamma parameters of the SVM operator are selected. Click on the SVM.C parameter in the Selected Parameters window; you will see that the range of the C parameter is set from 0.001 to 100000, with 11 values selected (in 10 steps) on a logarithmic scale. Now click on the SVM.gamma parameter in the Selected Parameters window; you will see that the range of the gamma parameter is set from 0.001 to 1.5, again with 11 values selected (in 10 steps) on a logarithmic scale. Each of the 2 parameters has 11 possible values, thus there are 121 (i.e. 11 x 11) combinations. Because the synchronize parameter is set to false, the subprocess is executed for all combinations of these values, i.e. it iterates 121 times. In every iteration, the value of the C and/or gamma parameter of the SVM (LibSVM) operator is changed. The value of the C parameter is 0.001 in the first iteration and is increased logarithmically until it reaches 100000 in the last iteration. Similarly, the value of the gamma parameter is 0.001 in the first iteration and is increased logarithmically until it reaches 1.5 in the last iteration.
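The count of 121 iterations follows from 10 steps yielding 11 values per parameter, endpoints included. A minimal Python sketch of such a value grid is given below. The function names are hypothetical, and the geometric spacing is an assumption made for illustration: this text does not give the exact formula behind the logarithmic scale, so the intermediate values shown in RapidMiner Studio may differ. Only the endpoints and the number of values are taken from the tutorial.

```python
def linear_grid(minimum, maximum, steps):
    """steps + 1 equally spaced values from minimum to maximum."""
    width = (maximum - minimum) / steps
    return [minimum + i * width for i in range(steps + 1)]


def geometric_grid(minimum, maximum, steps):
    """steps + 1 values with a constant ratio between neighbours.

    Assumed stand-in for a logarithmic scale; requires minimum > 0.
    """
    ratio = maximum / minimum
    return [minimum * ratio ** (i / steps) for i in range(steps + 1)]


c_values = geometric_grid(0.001, 100000, 10)      # 11 candidate values for C
gamma_values = geometric_grid(0.001, 1.5, 10)     # 11 candidate values for gamma
iterations = len(c_values) * len(gamma_values)    # non-synchronized mode
```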

Have a look at the subprocess of the Loop Parameters operator. First the data is split into two equal partitions using the Split Data operator. The SVM (LibSVM) operator is applied on one partition. The resultant classification model is applied using two Apply Model operators on both the partitions. The statistical performance of the SVM model on both testing and training partitions is measured using the Performance (Classification) operators. At the end the Log operator is used to store the required results.

The log parameter of the Log operator stores five things. The iterations of the Loop Parameters operator are counted by apply-count of the SVM operator. This is stored in a column named 'Count'. The value of the classification error parameter of the Performance (Classification) operator that was applied on the Training partition is stored in a column named 'Training Error'. The value of the classification error parameter of the Performance (Classification) operator that was applied on the Testing partition is stored in a column named 'Testing Error'. The value of the C parameter of the SVM (LibSVM) operator is stored in a column named 'SVM C'. The value of the gamma parameter of the SVM (LibSVM) operator is stored in a column named 'SVM gamma'. Also note that the stored information will be written into a file as specified in the filename parameter.

Run the process and turn to the Results Workspace. Now have a look at the values saved by the Log operator. Switch to Table View to see the stored values in tabular form.