Multiply (RapidMiner Studio Core)

Synopsis

This operator copies its input object to all connected output ports. It does not modify the input object.

Description

The Multiply operator copies the object at its input port to each connected output port. Every additional connected port produces one more copy. The input object is copied by reference; the underlying data of an ExampleSet is therefore not copied (unless the Materialize Data operator is used). Because copy-by-reference is usually cheaper than copy-by-value, copying objects with this operator is inexpensive. When an ExampleSet is copied, only the references to its attributes are copied. It is important to note that when attributes are changed or added in one copy of the ExampleSet, the change has no effect on the other copies. However, if data is modified in one copy, it is also modified in the other copies generated by the Multiply operator.
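The difference between attribute changes and data changes can be illustrated with a small toy model. The following Python sketch is not RapidMiner code and does not use any RapidMiner API; the class ExampleSetView and its methods multiply, select and replace_missing are hypothetical names chosen only for illustration. It assumes that each copy holds its own list of attribute names while all copies point to the same underlying data table.

```python
class ExampleSetView:
    """Hypothetical stand-in for an ExampleSet: attribute names over a shared table."""

    def __init__(self, attributes, table):
        self.attributes = list(attributes)  # each view owns its attribute list
        self.table = table                  # reference to the shared data columns

    def multiply(self, n):
        # Every copy gets a fresh attribute list but the same table object.
        return [ExampleSetView(self.attributes, self.table) for _ in range(n)]

    def select(self, names):
        # Attribute change: only this view's attribute list is touched.
        self.attributes = [a for a in self.attributes if a in names]

    def replace_missing(self, name):
        # Data change: values are written in place into the shared column.
        column = self.table[name]
        known = [v for v in column if v is not None]
        mean = sum(known) / len(known)
        for i, value in enumerate(column):
            if value is None:
                column[i] = mean


# Made-up values, used only to demonstrate the behavior.
table = {"duration": [1.0, None, 3.0], "wage": [2.0, 4.0, None]}
source = ExampleSetView(["duration", "wage"], table)
first, second = source.multiply(2)

first.replace_missing("duration")  # data change, visible through every copy
second.select({"duration"})        # attribute change, local to the second copy

print(first.attributes)            # the first copy keeps both attributes
print(second.attributes)           # the second copy keeps only 'duration'
print(second.table["duration"])    # the missing value is filled in here as well
```

In this sketch the in-place write in replace_missing is what makes the data change visible through all copies, whereas select only rebinds the attribute list of a single copy.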

Input

  • input (IOObject)

    This port accepts various kinds of objects as input, e.g. an ExampleSet or even a model.

Output

  • output (IOObject)

    There can be many output ports. Whenever an output port is connected, another output port is created for further connections. All ports deliver unchanged copies of the input object.

Tutorial Processes

Multiplying data sets

In this Example Process the Retrieve operator is used to load the 'Labor-Negotiations' data set. A breakpoint is inserted after this operator so that the data can be viewed before applying the Multiply operator. You can see that this data set has many missing values. Press the green-colored Run button to continue the process.

Four copies of the data set are generated using the Multiply operator. The Replace Missing Values operator is applied to the first copy. The Select Attributes operator is applied to the second copy. The Generate ID operator is applied to the third copy, and the fourth copy is connected directly to the results port.

The Replace Missing Values operator replaces each missing value with the average value of the corresponding attribute. As this is a change in data rather than a change in attributes, it is visible in all the copies.

The Select Attributes operator selects one attribute, i.e. the duration attribute. Please note that special attributes are also selected even if they are not explicitly specified. Thus the special attribute of the Labor-Negotiations data set (i.e. the Class attribute) is automatically selected. Results from this operator show only two attributes: the Duration attribute and a special attribute (the Class attribute). As this is a change in attributes rather than a change in data, it applies only to this copy and the other copies are not affected. Similarly, the Generate ID operator adds a new attribute (the id attribute) to the data set. As this is also a change in attributes rather than a change in data, it is relevant only for this copy and the other copies are not affected.

The last copy generated by the Multiply operator is connected directly to the results port without applying any operator. Nevertheless, this copy is not the same as the input ExampleSet: it has no missing values. This is because the Replace Missing Values operator made a change in data, and changes in data are reflected in all copies.

Multiplying models

In this Example Process the Retrieve operator is used to load the Golf data set. The k-NN operator is applied to it to learn a classification model. This model is given as input to the Multiply operator. Two copies are generated by the Multiply operator. One copy of the model is applied to the Golf-Testset data set and the other copy is applied to the Golf data set. This simple Example Process was added to show that the Multiply operator can multiply different kinds of objects, e.g. models.