Generate Function Set (RapidMiner Studio Core)
Synopsis
This is an attribute generation operator which generates new attributes by applying a set of selected functions on all attributes.
Description
This operator applies a set of selected functions to all attributes of the input ExampleSet to generate new attributes. The available functions are summation, difference, multiplication, division, reciprocal, square root, power, sine, cosine, tangent, arc tangent, exponential, logarithm, absolute value, minimum, maximum, ceiling, floor and round. Functions with two arguments are applied to all possible pairs of attributes. For example, consider an ExampleSet with three numerical attributes A, B and C. If the summation function is applied to this ExampleSet, three new attributes are generated with the values A+B, A+C and B+C. Non-commutative functions, for which the order of the arguments matters, are applied to all possible permutations. If this operator does not meet your requirements, try the Generate Attributes operator, which is a more powerful attribute generation operator.
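The pairing rule can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not the operator's implementation: the function name `generate_pairwise` and the generated attribute names such as `A+B` are hypothetical. The sketch also assumes that non-commutative functions are applied to every ordered pair, including an attribute paired with itself, which is how the 4 x 4 count in the tutorial process reads.

```python
from itertools import combinations, product

def generate_pairwise(example, func, symbol, commutative):
    """Apply a two-argument function to attribute pairs of one example.

    Commutative functions use each unordered pair once (A+B, not B+A).
    Non-commutative functions use every ordered pair; including self-pairs
    is an assumption based on the 4 x 4 count in the tutorial process.
    """
    names = list(example)
    if commutative:
        pairs = combinations(names, 2)
    else:
        pairs = product(names, repeat=2)
    # Attribute names like "A+B" are hypothetical, chosen for readability.
    return {f"{a}{symbol}{b}": func(example[a], example[b]) for a, b in pairs}

example = {"A": 1.0, "B": 2.0, "C": 4.0}
sums = generate_pairwise(example, lambda x, y: x + y, "+", commutative=True)
diffs = generate_pairwise(example, lambda x, y: x - y, "-", commutative=False)

print(sums)        # {'A+B': 3.0, 'A+C': 5.0, 'B+C': 6.0}
print(len(diffs))  # 9
```

With three attributes the summation yields exactly the three attributes named in the description, while the difference, under the stated assumption, yields nine.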
Input
- example set input (Data Table)
This input port expects an ExampleSet. It is the output of the Retrieve operator in the attached Example Process. The output of other operators can also be used as input.
Output
- example set output (Data Table)
New attributes are created by application of the selected functions and the resultant ExampleSet is delivered through this port.
- original (Data Table)
The ExampleSet that was given as input is passed through unchanged to this port. This is usually used to reuse the same ExampleSet in further operators or to view the ExampleSet in the Results Workspace.
Parameters
- keep_all
This parameter indicates if the original attributes should be kept. Range: boolean
- use_plus
This parameter indicates if the summation function should be applied for generation of new attributes. Range: boolean
- use_diff
This parameter indicates if the difference function should be applied for generation of new attributes. Range: boolean
- use_mult
This parameter indicates if the multiplication function should be applied for generation of new attributes. Range: boolean
- use_div
This parameter indicates if the division function should be applied for generation of new attributes. Range: boolean
- use_reciprocals
This parameter indicates if the reciprocal function should be applied for generation of new attributes. Range: boolean
- use_square_roots
This parameter indicates if the square root function should be applied for generation of new attributes. Range: boolean
- use_power_functions
This parameter indicates if the power function should be applied for generation of new attributes. Range: boolean
- use_sin
This parameter indicates if the sine function should be applied for generation of new attributes. Range: boolean
- use_cos
This parameter indicates if the cosine function should be applied for generation of new attributes. Range: boolean
- use_tan
This parameter indicates if the tangent function should be applied for generation of new attributes. Range: boolean
- use_atan
This parameter indicates if the arc tangent function should be applied for generation of new attributes. Range: boolean
- use_exp
This parameter indicates if the exponential function should be applied for generation of new attributes. Range: boolean
- use_log
This parameter indicates if the logarithmic function should be applied for generation of new attributes. Range: boolean
- use_absolute_values
This parameter indicates if the absolute value function should be applied for generation of new attributes. Range: boolean
- use_min
This parameter indicates if the minimum function should be applied for generation of new attributes. Range: boolean
- use_max
This parameter indicates if the maximum function should be applied for generation of new attributes. Range: boolean
- use_ceil
This parameter indicates if the ceiling function should be applied for generation of new attributes. Range: boolean
- use_floor
This parameter indicates if the floor function should be applied for generation of new attributes. Range: boolean
- use_rounded
This parameter indicates if the round function should be applied for generation of new attributes. Range: boolean
Tutorial Processes
Using the power function for attribute generation
The 'Iris' data set is loaded using the Retrieve operator. A breakpoint is inserted here so that you can have a look at the ExampleSet. You can see that the ExampleSet has 4 real attributes. The Generate Function Set operator is applied on this ExampleSet to generate new attributes; only the power function is used. Power is not a commutative function, e.g. 2 raised to the power 3 is not the same as 3 raised to the power 2. Non-commutative functions are applied to all possible permutations. As there are 4 original attributes, there are 16 (i.e. 4 x 4) possible permutations of base and exponent, which counts each attribute paired with itself. Thus 16 new attributes are created by this operator. The resultant ExampleSet can be seen in the Results Workspace. As the keep all parameter was set to true, the original attributes of the ExampleSet are not discarded.
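The attribute count in this tutorial can be reproduced with a small Python sketch. It is an illustration under stated assumptions rather than the operator's code: the function `generate_power_attributes`, the attribute names `a1` to `a4`, the generated names such as `pow(a1,a2)` and the sample values are all hypothetical, and the `keep_all` argument only mimics the effect of the keep all parameter described above.

```python
from itertools import product

def generate_power_attributes(example, keep_all=True):
    """Raise every attribute to the power of every attribute (4 x 4 = 16)."""
    names = list(example)
    generated = {
        # Hypothetical naming scheme for the generated attributes.
        f"pow({base},{exp})": example[base] ** example[exp]
        for base, exp in product(names, repeat=2)
    }
    # keep_all=True keeps the original attributes next to the new ones.
    return {**example, **generated} if keep_all else generated

# One example with four real attributes; the values are illustrative only.
example = {"a1": 5.1, "a2": 3.5, "a3": 1.4, "a4": 0.2}

only_new = generate_power_attributes(example, keep_all=False)
with_originals = generate_power_attributes(example, keep_all=True)

print(len(only_new))        # 16
print(len(with_originals))  # 20
```

With keep all enabled, the result holds the 4 original attributes plus the 16 generated ones.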