# Performance (Regression) (AI Studio Core)

## Synopsis

This operator is used for statistical performance evaluation of regression tasks and delivers a list of performance criteria values of the regression task.

## Description

Use this operator for performance evaluation of regression tasks only. Other performance evaluation operators are available for other task types, e.g. the Performance (Binominal Classification) operator and the Performance (Classification) operator. The Performance operator, by contrast, automatically determines the learning task type and calculates the most common criteria for that type. You can use the Performance (User-Based) operator if you want to write your own performance measure.

Regression is a statistical technique for numerical prediction. It attempts to determine the strength of the relationship between one dependent variable (i.e. the label attribute) and a series of other changing variables known as independent variables (the regular attributes). Just as classification is used for predicting categorical labels, regression is used for predicting a continuous value. For example, we may wish to predict the salary of university graduates with 5 years of work experience, or the potential sales of a new product given its price. Regression is often used to determine how much specific factors, such as the price of a commodity, interest rates, or particular industries or sectors, influence the price movement of an asset.

To evaluate the statistical performance of a regression model, the data set must be labeled, i.e. it must have an attribute with the *label* role and an attribute with the *prediction* role. The *label* attribute stores the actual observed values, whereas the *prediction* attribute stores the values of the *label* predicted by the regression model under discussion.

## Input

- labeled data
This input port expects a labeled ExampleSet. The Apply Model operator is a good example of an operator that provides labeled data. Make sure that the ExampleSet has the *label* and *prediction* attributes. See the Set Role operator for more details regarding the *label* and *prediction* roles of attributes.
- performance
This input is optional. It requires a Performance Vector.

## Output

- performance
This port delivers a Performance Vector (we call it *output-performance-vector* for now). The Performance Vector is a list of performance criteria values. It is calculated on the basis of the *label* and *prediction* attributes of the input ExampleSet. The *output-performance-vector* contains the performance criteria calculated by this Performance operator (we call it *calculated-performance-vector* here). If a Performance Vector was also fed at the *performance* input port (we call it *input-performance-vector* here), the criteria of the *input-performance-vector* are also added to the *output-performance-vector*. If the *input-performance-vector* and the *calculated-performance-vector* both have the same criteria but with different values, the values of the *calculated-performance-vector* are delivered through the output port. This concept can be easily understood by studying the Example Process of the Performance (Classification) operator.
- example set (Data table)
The ExampleSet that was given as input is passed through to the output unchanged. This is usually used to reuse the same ExampleSet in further operators or to view the ExampleSet in the Results Workspace.
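The merging rule for the *performance* output port can be illustrated with a minimal Python sketch. It is not the operator's implementation: performance vectors are modeled as plain dictionaries, and the function name and all criterion values are made up for illustration.

```python
def merge_performance_vectors(input_vector, calculated_vector):
    """Combine two performance vectors (modeled here as dicts).

    Criteria from the input vector are kept, but whenever both vectors
    contain the same criterion, the freshly calculated value wins.
    """
    merged = dict(input_vector)        # start with the input-performance-vector
    merged.update(calculated_vector)   # calculated values override duplicates
    return merged


# Hypothetical values, for illustration only.
input_vector = {"absolute_error": 0.9, "correlation": 0.7}
calculated_vector = {"absolute_error": 1.0, "squared_error": 1.5}

output_vector = merge_performance_vectors(input_vector, calculated_vector)
print(output_vector)
```

In this sketch *correlation* survives from the input vector, *squared_error* is added by the calculation, and the calculated *absolute_error* replaces the incoming one.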

## Parameters

- main criterion
The main criterion is used for comparisons and needs to be specified only for processes where performance vectors are compared, e.g. attribute selection or other meta optimization process setups. If no *main criterion* is selected, the first criterion in the resulting performance vector is assumed to be the *main criterion*.
- root mean squared error
The averaged root-mean-squared error.
- absolute error
The average absolute deviation of the prediction from the actual value. The values of the *label* attribute are the actual values.
- relative error
The average relative error is the average of the absolute deviation of the prediction from the actual value divided by the actual value. The values of the *label* attribute are the actual values.
- relative error lenient
The average lenient relative error is the average of the absolute deviation of the prediction from the actual value divided by the maximum of the actual value and the prediction. The values of the *label* attribute are the actual values.
- relative error strict
The average strict relative error is the average of the absolute deviation of the prediction from the actual value divided by the minimum of the actual value and the prediction. The values of the *label* attribute are the actual values.
- normalized absolute error
The absolute error divided by the error that would have been made if the average had been predicted.
- root relative squared error
The averaged root-relative-squared error.
- squared error
The averaged squared error.
- correlation
Returns the correlation coefficient between the *label* and *prediction* attributes.
- squared correlation
Returns the squared correlation coefficient between the *label* and *prediction* attributes.
- prediction average
Returns the average of all the predictions. All the predicted values are added and the sum is divided by the total number of predictions.
- spearman rho
The rank correlation between the actual and predicted *labels*, using Spearman's rho. Spearman's rho is a measure of the monotonic relationship between two variables, i.e. the linear correlation between their ranks. The two variables in this case are the *label* and the *prediction* attribute.
- kendall tau
The rank correlation between the actual and predicted *labels*, using Kendall's tau-b. Kendall's tau is a measure of rank correlation, and so measures the strength of the relationship between two variables. The two variables in this case are the *label* and the *prediction* attribute.
- skip undefined labels
If set to true, examples with undefined *labels* are skipped.
- comparator class
This is an expert parameter. The fully qualified *classname* of the *PerformanceComparator* implementation is specified here.
- use example weights
This parameter allows example *weights* to be used for statistical performance calculations if possible. To take example *weights* into account, the ExampleSet must have an attribute with the *weight* role; the parameter has no effect otherwise. Several operators are available that assign *weights*, e.g. the Generate Weights operator. See the Set Role operator for more information regarding the *weight* role.
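To make the error criteria above concrete, the following Python sketch computes several of them from paired label and prediction values. It is an unweighted illustration under stated assumptions, not the operator's code: the function name is hypothetical, the relative errors divide by absolute values, the root relative squared error uses the common textbook definition (squared error relative to always predicting the label mean), and labels and predictions are assumed to be non-zero and non-constant.

```python
import math


def regression_criteria(labels, predictions):
    """Unweighted sketch of several regression criteria (hypothetical helper)."""
    if not labels or len(labels) != len(predictions):
        raise ValueError("labels and predictions must be non-empty and equally long")
    n = len(labels)
    pairs = list(zip(labels, predictions))
    abs_dev = [abs(p - y) for y, p in pairs]
    sq_dev = [(p - y) ** 2 for y, p in pairs]

    mean_label = sum(labels) / n
    mean_pred = sum(predictions) / n

    # Errors of the baseline that always predicts the mean of the labels.
    baseline_abs = sum(abs(y - mean_label) for y in labels)
    baseline_sq = sum((y - mean_label) ** 2 for y in labels)

    # Pearson correlation between labels and predictions.
    cov = sum((y - mean_label) * (p - mean_pred) for y, p in pairs)
    var_pred = sum((p - mean_pred) ** 2 for p in predictions)
    correlation = cov / math.sqrt(baseline_sq * var_pred)

    squared_error = sum(sq_dev) / n
    return {
        "root_mean_squared_error": math.sqrt(squared_error),
        "absolute_error": sum(abs_dev) / n,
        # Assumption: denominators use absolute values.
        "relative_error": sum(d / abs(y) for d, (y, _) in zip(abs_dev, pairs)) / n,
        "relative_error_lenient": sum(
            d / max(abs(y), abs(p)) for d, (y, p) in zip(abs_dev, pairs)) / n,
        "relative_error_strict": sum(
            d / min(abs(y), abs(p)) for d, (y, p) in zip(abs_dev, pairs)) / n,
        "normalized_absolute_error": sum(abs_dev) / baseline_abs,
        # Assumption: common definition, relative to predicting the label mean.
        "root_relative_squared_error": math.sqrt(sum(sq_dev) / baseline_sq),
        "squared_error": squared_error,
        "correlation": correlation,
        "squared_correlation": correlation ** 2,
    }


# Made-up values, for illustration only.
labels = [2.0, 4.0, 6.0, 8.0]
predictions = [3.0, 4.0, 5.0, 10.0]
criteria = regression_criteria(labels, predictions)
for name, value in criteria.items():
    print(f"{name}: {value:.4f}")
```

For these four examples the absolute deviations are 1, 0, 1 and 2, so the absolute error is 1.0 and the squared error is 1.5. Rank correlations and example weights are left out to keep the sketch short.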

## Tutorial Processes

### Applying the Performance (Regression) operator on the Polynomial data set

The 'Polynomial' data set is loaded using the Retrieve operator. The Filter Example Range operator is applied on it. The *first example* parameter of the Filter Example Range operator is set to 1 and the *last example* parameter is set to 100. Thus the first 100 examples of the 'Polynomial' data set are selected. The Linear Regression operator is applied on them with default values of all parameters. The regression model generated by the Linear Regression operator is applied on the last 100 examples of the 'Polynomial' data set using the Apply Model operator. Labeled data from the Apply Model operator is provided to the Performance (Regression) operator.

The *absolute error* and *prediction average* parameters are set to true. Thus the Performance Vector generated by the Performance (Regression) operator has information regarding the *absolute error* and *prediction average* in the labeled data set. The *absolute error* is calculated by adding the absolute differences between the predicted values and the actual values of the *label* attribute, and dividing this sum by the total number of predictions. The *prediction average* is calculated by adding all the actual *label* values and dividing this sum by the total number of examples. You can verify this from the results in the Results Workspace.
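The workflow of this tutorial (train on the first 100 examples, apply the model to the last 100, then evaluate) can be sketched in plain Python. This is only an analogy under loud assumptions: the data is a synthetic stand-in with a single regular attribute rather than the real 'Polynomial' data set, `fit_line` is a hypothetical one-variable least-squares helper rather than the Linear Regression operator, and only the *absolute error* is computed.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic stand-in data (assumption): 200 examples, label = x^2 + 1.
xs = [i / 10 for i in range(200)]
labels = [x ** 2 + 1 for x in xs]

# "Filter Example Range": first 100 examples for training, last 100 for testing.
train_x, train_y = xs[:100], labels[:100]
test_x, test_y = xs[-100:], labels[-100:]

# "Linear Regression" followed by "Apply Model".
slope, intercept = fit_line(train_x, train_y)
predictions = [slope * x + intercept for x in test_x]

# "Performance (Regression)" with absolute error enabled:
# mean of the absolute differences between prediction and label.
absolute_error = sum(abs(p - y) for p, y in zip(predictions, test_y)) / len(predictions)
print(f"absolute error on the last 100 examples: {absolute_error:.3f}")
```

Because the stand-in label is quadratic while the model is linear, the error on the held-out range is clearly non-zero, which is the kind of result the Performance Vector makes visible.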