# Performance (Classification) (AI Studio Core)

## Synopsis

This operator is used for statistical performance evaluation of classification tasks. It delivers a list of performance criteria values for the classification task.

## Description

This operator should be used for performance evaluation of classification tasks only. Several other performance evaluation operators are also available, e.g. the Performance operator, the Performance (Binominal Classification) operator and the Performance (Regression) operator. The Performance (Classification) operator is used with classification tasks only, whereas the Performance operator automatically determines the learning task type and calculates the most common criteria for that type. If you want to write your own performance measure, use the Performance (User-Based) operator.

Classification is a technique used to predict group membership for data instances. For example, you may wish to use classification to predict whether the train on a particular day will be 'on time', 'late' or 'very late'. Predicting whether the number of people attending a particular event will be 'below-average', 'average' or 'above-average' is another example. To evaluate the statistical performance of a classification model, the data set must be labeled, i.e. it must have an attribute with the *label* role and an attribute with the *prediction* role. The *label* attribute stores the actual observed values, whereas the *prediction* attribute stores the values of the *label* predicted by the classification model under discussion.
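For intuition, here is a minimal sketch in plain Python (outside AI Studio) of how accuracy follows from comparing *label* and *prediction* values; the class values come from the train example above, and the predictions are invented for illustration:

```python
# Each example carries an observed label and a model's prediction.
# The delay classes come from the text; the predictions are made up.
labels      = ["on time", "late", "very late", "on time", "late"]
predictions = ["on time", "late", "on time",   "on time", "very late"]

# Accuracy: the fraction of examples whose prediction matches the label.
correct = sum(1 for y, p in zip(labels, predictions) if y == p)
print(f"accuracy = {correct / len(labels):.2%}")  # 3/5 = 60.00%
```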

## Input

- labeled data
This input port expects a labeled ExampleSet. The Apply Model operator is a good example of an operator that provides labeled data. Make sure that the ExampleSet has a *label* attribute and a *prediction* attribute. See the Set Role operator for more details regarding the *label* and *prediction* roles of attributes.
- performance
This input port is optional. It expects a Performance Vector.

## Output

- performance
This port delivers a Performance Vector (called the *output-performance-vector* here). A Performance Vector is a list of performance criteria values. It is calculated on the basis of the *label* and *prediction* attributes of the input ExampleSet. The *output-performance-vector* contains the performance criteria calculated by this operator (called the *calculated-performance-vector* here). If a Performance Vector was also fed into the *performance* input port (called the *input-performance-vector* here), its criteria are added to the *output-performance-vector* as well. If the *input-performance-vector* and the *calculated-performance-vector* both contain the same criterion but with different values, the value from the *calculated-performance-vector* is delivered through the output port. The sketch after this list illustrates this merge rule, and it can also be studied in the attached Example Process.
- example set (Data table)
The ExampleSet that was given as input is passed to the output through this port without any changes. This is usually used to reuse the same ExampleSet in further operators or to view the ExampleSet in the Results Workspace.
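As a rough illustration of the merge rule described above (not the operator's actual data structures), here is a sketch in plain Python; the criterion names and values are taken from the Example Process:

```python
# Criteria from the input performance vector are added to the output,
# but where both vectors define the same criterion, the value calculated
# by this operator wins. These dictionaries are purely illustrative.
input_performance      = {"accuracy": 1.00, "classification_error": 0.00}
calculated_performance = {"accuracy": 0.7143,
                          "weighted_mean_recall": 0.6889}

# Start from the input vector, then overwrite with calculated values.
output_performance = {**input_performance, **calculated_performance}
print(output_performance)
# {'accuracy': 0.7143, 'classification_error': 0.0,
#  'weighted_mean_recall': 0.6889}
```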

## Parameters

- main criterion
The main criterion is used for comparisons and needs to be specified only for processes where performance vectors are compared, e.g. attribute selection or other meta-optimization setups. If no *main criterion* is selected, the first criterion in the resulting performance vector is assumed to be the *main criterion*.
- accuracy
The relative number of correctly classified examples, in other words the percentage of correct predictions (several of these criteria are illustrated in the sketch after this list).
- classification error
The relative number of misclassified examples, in other words the percentage of incorrect predictions.
- kappa
The kappa statistic for the classification. It is generally considered a more robust measure than the simple percentage of correct predictions, since it takes into account correct predictions occurring by chance.
- weighted mean recall
The weighted mean of all per-class recall measurements. It is calculated from the class recalls of the individual classes. Class recalls are shown in the last row of the matrix displayed in the Results Workspace.
- weighted mean precision
The weighted mean of all per-class precision measurements. It is calculated from the class precisions of the individual classes. Class precisions are shown in the last column of the matrix displayed in the Results Workspace.
- spearman rho
The rank correlation between the actual and predicted *labels*, using Spearman's rho. Spearman's rho measures the strength of the monotonic relationship between two variables; the two variables here are the *label* attribute and the *prediction* attribute.
- kendall tau
The rank correlation between the actual and predicted *labels*, using Kendall's tau. Kendall's tau is a measure of correlation, so it measures the strength of the relationship between two variables; the two variables here are the *label* attribute and the *prediction* attribute.
- absolute error
The average absolute deviation of the prediction from the actual value. The values of the *label* attribute are the actual values.
- relative error
The average relative error is the average of the absolute deviation of the prediction from the actual value, divided by the actual value. The values of the *label* attribute are the actual values.
- relative error lenient
The average lenient relative error is the average of the absolute deviation of the prediction from the actual value, divided by the maximum of the actual value and the prediction. The values of the *label* attribute are the actual values.
- relative error strict
The average strict relative error is the average of the absolute deviation of the prediction from the actual value, divided by the minimum of the actual value and the prediction. The values of the *label* attribute are the actual values.
- normalized absolute error
The absolute error divided by the error that would have been made if the average had been predicted.
- root mean squared error
The averaged root-mean-squared error.
- root relative squared error
The averaged root-relative-squared error.
- squared error
The averaged squared error.
- correlation
Returns the correlation coefficient between the *label* and *prediction* attributes.
- squared correlation
Returns the squared correlation coefficient between the *label* and *prediction* attributes.
- cross-entropy
The cross-entropy of a classifier, defined as the sum over the logarithms of the true label's confidences, divided by the number of examples.
- margin
The margin of a classifier, defined as the minimal confidence for the correct label.
- soft margin loss
The average soft margin loss of a classifier, defined as the average of all (1 - confidence) values for the correct *label*.
- logistic loss
The logistic loss of a classifier, defined as the average of ln(1 + exp(-conf(CC))), where conf(CC) is the confidence of the correct class.
- skip undefined labels
If set to true, examples with undefined *labels* are skipped.
- comparator class
This is an expert parameter. The fully qualified *classname* of the *PerformanceComparator* implementation is specified here.
- use example weights
This parameter allows example *weights* to be used for statistical performance calculations if possible. It has no effect if no attribute has the *weight* role; to take example *weights* into account, the ExampleSet must have an attribute with the *weight* role. Several operators are available that assign *weights*, e.g. the Generate Weights operator. See the Set Role operator for more information regarding the *weight* role.
- class weights
This is an expert parameter. It specifies the weights 'w' for all classes. The *Edit List* button opens a new window with two columns: the first column specifies the class name and the second column specifies the *weight* for that class. If the *weight* of a class is not specified, that class is assigned *weight = 1*.
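To make the first few criteria concrete, here is a sketch using scikit-learn on an invented two-class problem. Note one assumption: treating *weighted mean recall* and *weighted mean precision* as the plain average of per-class recalls and precisions (a macro average) is inferred from the averaging shown in the Tutorial Processes below, not from the operator's source.

```python
# A sketch of several criteria from the list above, computed with
# scikit-learn on invented data. Mapping "weighted mean recall/precision"
# to macro-averaged recall/precision is an assumption (see lead-in).
from sklearn.metrics import (accuracy_score, cohen_kappa_score,
                             precision_score, recall_score)

labels      = ["yes", "yes", "no", "no", "yes", "no", "yes"]
predictions = ["yes", "no",  "no", "yes", "yes", "no", "yes"]

acc = accuracy_score(labels, predictions)
print(f"accuracy                = {acc:.2%}")      # 5/7 = 71.43%
print(f"classification error    = {1 - acc:.2%}")  # complement of accuracy
print(f"kappa                   = {cohen_kappa_score(labels, predictions):.4f}")

# Plain average of the per-class recalls and precisions.
wmr = recall_score(labels, predictions, average="macro")
wmp = precision_score(labels, predictions, average="macro")
print(f"weighted mean recall    = {wmr:.2%}")
print(f"weighted mean precision = {wmp:.2%}")
```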

## Tutorial Processes

### Use of performance port in Performance (Classification)

This Example Process is composed of two Subprocess operators and one Performance (Classification) operator. Double-click on the first Subprocess operator and you will see the operators within this subprocess. The first subprocess, 'Subprocess (labeled data provider)', loads the 'Golf' data set using the Retrieve operator and then learns a classification model using the k-NN operator. The learned model is then applied to the 'Golf-Testset' data set using the Apply Model operator, and the Generate Weight operator is used to add an attribute with the *weight* role. Thus, this subprocess provides a labeled ExampleSet with a *weight* attribute. A breakpoint is inserted after this subprocess to show this ExampleSet. The ExampleSet is provided at the *labeled data* input port of the Performance (Classification) operator in the main process.

The second Subprocess operator, 'Subprocess (performance vector provider)', loads the 'Golf' data set using the Retrieve operator and then learns a classification model using the k-NN operator. The learned model is then applied to the same 'Golf' data set using the Apply Model operator, and the Performance (Classification) operator is applied to the labeled data to produce a Performance Vector. A breakpoint is inserted after this subprocess to show this Performance Vector. Note that this model was trained and tested on the same data set (the 'Golf' data set), so its accuracy is 100%. Thus this subprocess provides a Performance Vector with 100% *accuracy* and 0.00% *classification error*. This Performance Vector is connected to the *performance* input port of the Performance (Classification) operator in the main process.

When you run the process, you will first see an ExampleSet, which is the output of the first Subprocess operator. Press the Run button again and you will see a Performance Vector; this is the output of the second Subprocess operator. Press the Run button again and you will see various criteria in the *criterion selector* window in the Results Workspace. These include *classification error*, *accuracy*, *weighted mean recall* and *weighted mean precision*. Now select *accuracy* in the *criterion selector* window: its value is 71.43%. In contrast, the *accuracy* of the input Performance Vector provided by the second subprocess was 100%. The *accuracy* of the final Performance Vector is 71.43% instead of 100% because, when the *input-performance-vector* and the *calculated-performance-vector* both contain the same criterion but with different values, the value from the *calculated-performance-vector* is delivered through the output port. Also note that the *classification error* criterion appears in the criteria list only because of the Performance Vector provided at the *performance* input port. Disable the second Subprocess operator and run the same process again, and you will see that the *classification error* criterion no longer appears. This is because criteria of a Performance Vector fed into the *performance* input port are added to the *output-performance-vector*.

The *accuracy* is calculated as the percentage of correct predictions over the total number of examples. A correct prediction is an example where the value of the *prediction* attribute is equal to the value of the *label* attribute. If you look at the ExampleSet in the Results Workspace, you can see that there are 14 examples in this data set, of which 10 are correct predictions, i.e. their *label* and *prediction* attributes have the same values. This is why the accuracy is 71.43% (10/14 × 100 ≈ 71.43%). Now run the same process again, but this time set the *use example weights* parameter to true and check the results. They have changed, because this time the weight of each example was taken into account: the *accuracy* is now 68.89%. Dividing the total weight of the correct predictions by the total weight of all examples gives the same answer (0.6889/1 × 100 = 68.89%). In this Example Process using weights reduced the accuracy, but this is not always the case.
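The two ways of counting can be sketched in a few lines of plain Python; the weights below are invented (in the tutorial they come from the Generate Weight operator):

```python
# With "use example weights" off, every example counts equally; with it
# on, each example contributes its weight. Data here is illustrative.
correct = [True, True, False, True]    # prediction == label per example?
weights = [0.10, 0.40, 0.30, 0.20]     # example weights, summing to 1

plain_accuracy    = sum(correct) / len(correct)
weighted_accuracy = sum(w for c, w in zip(correct, weights) if c) / sum(weights)

print(f"accuracy (unweighted) = {plain_accuracy:.2%}")     # 3/4 = 75.00%
print(f"accuracy (weighted)   = {weighted_accuracy:.2%}")  # 0.70/1 = 70.00%
```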

The *weighted mean recall* is calculated by taking the average of the recalls of all classes. As you can see in the last row of the resulting matrix in the Results Workspace, the class recall for 'true no' is 60% and the class recall for 'true yes' is 77.78%. Thus the *weighted mean recall* is the average of these class recall values ((77.78% + 60%) / 2 = 68.89%).

The *weighted mean precision* is calculated by taking the average of the precisions of all classes. As you can see in the last column of the resulting matrix in the Results Workspace, the class precision for 'pred. no' is 60% and the class precision for 'pred. yes' is 77.78%. Thus the *weighted mean precision* is the average of these class precision values ((77.78% + 60%) / 2 = 68.89%). These values apply when the *use example weights* parameter is set to false; the sketch below reproduces the averaging.
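The confusion counts below are reconstructed to be consistent with the figures quoted above (14 examples, 10 correct), so the same averages fall out:

```python
# Class recalls: correct predictions per true class.
true_yes_correct, true_yes_total = 7, 9   # class recall 'true yes': 7/9
true_no_correct,  true_no_total  = 3, 5   # class recall 'true no':  3/5
recall_yes = true_yes_correct / true_yes_total   # 77.78%
recall_no  = true_no_correct  / true_no_total    # 60.00%
print(f"weighted mean recall    = {(recall_yes + recall_no) / 2:.2%}")  # 68.89%

# Class precisions: correct predictions per predicted class.
pred_yes_correct, pred_yes_total = 7, 9   # class precision 'pred. yes'
pred_no_correct,  pred_no_total  = 3, 5   # class precision 'pred. no'
precision_yes = pred_yes_correct / pred_yes_total
precision_no  = pred_no_correct  / pred_no_total
print(f"weighted mean precision = {(precision_yes + precision_no) / 2:.2%}")
```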

Note: This Example Process only highlights different perspectives of the Performance (Classification) operator. It may not be very useful in real scenarios.