Performance (Binominal Classification) (RapidMiner Studio Core)

Synopsis

This operator is used for statistical performance evaluation of binominal classification tasks i.e. classification tasks where the label attribute has a binominal type. This operator delivers a list of performance criteria values of the binominal classification task.

Description

This operator should be used specifically for performance evaluation of only binominal classification tasks i.e. classification tasks where the label attribute has a binominal type. Many other performance evaluation operators are also available in RapidMiner e.g. the Performance operator, the Performance (Classification) operator, the Performance (Regression) operator etc. The Performance (Binominal Classification) operator is used with binominal classification tasks only. On the other hand, the Performance operator automatically determines the learning task type and calculates the most common criteria for that type. You can use the Performance (User-Based) operator if you want to write your own performance measure.

Classification is a technique used to predict group membership for data instances. For example, you may wish to use classification to predict whether the train on a particular day will be 'on time', 'late' or 'very late'. Predicting whether the number of people attending a particular event will be 'below-average', 'average' or 'above-average' is another example. For evaluating the statistical performance of a classification model, the data set should be labeled i.e. it should have an attribute with label role and an attribute with prediction role. The label attribute stores the actual observed values whereas the prediction attribute stores the values of the label predicted by the classification model under discussion.

Input

  • labeled data

    This input port expects a labeled ExampleSet. The Apply Model operator is a good example of an operator that provides labeled data. Make sure that the ExampleSet has a label attribute and a prediction attribute. See the Set Role operator for more details regarding label and prediction roles of attributes. Also make sure that the label attribute of the ExampleSet is of binominal type i.e. the label has only two possible values.

  • performance

    This is an optional input port. It expects a Performance Vector.

Output

  • performance

    This port delivers a Performance Vector (we call it output-performance-vector for now). A Performance Vector is a list of performance criteria values. It is calculated on the basis of the label and the prediction attribute of the input ExampleSet. The output-performance-vector contains the performance criteria calculated by this Performance operator (we call it calculated-performance-vector here). If a Performance Vector was also fed into the performance input port (we call it input-performance-vector here), the criteria of the input-performance-vector are also added to the output-performance-vector. If the input-performance-vector and the calculated-performance-vector both have the same criteria but with different values, the values of the calculated-performance-vector are delivered through the output port. This concept can be easily understood by studying the attached Example Process; a small illustrative sketch of the merge behaviour is also given after this section.

  • example set (Data Table)

    The ExampleSet that was given as input is passed without any modifications to the output through this port. This is usually used to reuse the same ExampleSet in further operators or to view the ExampleSet in the Results Workspace.
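The following is a minimal sketch of the merge behaviour described for the performance output port, using plain Python dictionaries in place of Performance Vectors. This is only an illustration of the semantics, not RapidMiner code; the criterion names and values are made up.

    # Illustration of the merge semantics only; not part of the RapidMiner API.
    def merge_performance_vectors(calculated, input_vector=None):
        """Criteria that exist only in the input vector are carried over;
        where both vectors contain the same criterion, the calculated value wins."""
        merged = dict(input_vector or {})   # start from the input-performance-vector
        merged.update(calculated)           # calculated criteria override on conflict
        return merged

    # 'accuracy' appears in both vectors; the calculated value (0.714) is delivered.
    input_vector = {"accuracy": 0.950, "classification_error": 0.050}
    calculated   = {"accuracy": 0.714, "precision": 0.778}
    print(merge_performance_vectors(calculated, input_vector))
    # {'accuracy': 0.714, 'classification_error': 0.05, 'precision': 0.778}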

Parameters

  • main_criterion: The main criterion is used for comparisons and needs to be specified only for processes where performance vectors are compared, e.g. attribute selection or other meta optimization process setups. If no main criterion is selected, the first criterion in the resulting performance vector will be assumed to be the main criterion. Range:
  • accuracy: Relative number of correctly classified examples, or in other words the percentage of correct predictions. Range: boolean
  • classification_error: Relative number of misclassified examples, or in other words the percentage of incorrect predictions. Range: boolean
  • kappa: The kappa statistic for the classification. It is generally considered a more robust measure than the simple percentage of correct predictions since it takes into account correct predictions occurring by chance (a small computational sketch follows this parameter list). Range: boolean
  • AUC (optimistic): AUC is the Area Under the Curve of the Receiver Operating Characteristics (ROC) graph, which is a technique for visualizing, organizing and selecting classifiers based on their performance. Given example weights are also considered. Please note that the second class is considered to be positive. AUC (optimistic) is the extreme case in which all the positives end up at the beginning of the sequence. Range: boolean
  • AUC: AUC is the Area Under the Curve of the Receiver Operating Characteristics (ROC) graph, which is a technique for visualizing, organizing and selecting classifiers based on their performance. Given example weights are also considered. Please note that the second class is considered to be positive. Usually AUC is the average of AUC (pessimistic) and AUC (optimistic); a small computational sketch follows this parameter list. Range: boolean
  • AUC (pessimistic): AUC is the Area Under the Curve of the Receiver Operating Characteristics (ROC) graph, which is a technique for visualizing, organizing and selecting classifiers based on their performance. Given example weights are also considered. Please note that the second class is considered to be positive. AUC (pessimistic) is the extreme case in which all the negatives end up at the beginning of the sequence. Range: boolean
  • precision: Relative number of examples correctly classified as positive among all examples classified as positive, i.e. precision = (Positives Correctly Classified)/(Total Predicted Positives). Note that Total Predicted Positives is the sum of True Positives and False Positives. This is the same as the positive predictive value. Range: boolean
  • recall: This parameter specifies the relative number of examples correctly classified as positive among all positive examples, i.e. recall = (Positives Correctly Classified)/(Total Positives). It is also called hit rate or true positive rate. This is the same as sensitivity. Range: boolean
  • lift: This parameter specifies the lift of the positive class. Range: boolean
  • fallout: The relative number of examples incorrectly classified as positive among all negative examples, i.e. fallout = (Positives Incorrectly Classified)/(Total Negatives). Range: boolean
  • f_measure: This parameter is a combination of the precision and the recall, i.e. f = 2pr/(p+r) where f, r and p are the f-measure, recall and precision respectively. Range: boolean
  • false_positive: This parameter specifies the absolute number of negative examples that were incorrectly classified as positive examples. In other words, if the example is negative and it is classified as positive, it is counted as a false positive. Range: boolean
  • false_negative: This parameter specifies the absolute number of positive examples that were incorrectly classified as negative examples. In other words, if the example is positive and it is classified as negative, it is counted as a false negative. Range: boolean
  • true_positive: This parameter specifies the absolute number of positive examples that were correctly classified as positive examples. In other words, if the example is positive and it is classified as positive, it is counted as a true positive. Range: boolean
  • true_negative: This parameter specifies the absolute number of negative examples that were correctly classified as negative examples. In other words, if the example is negative and it is classified as negative, it is counted as a true negative. Range: boolean
  • sensitivity: This parameter specifies the relative number of examples correctly classified as positive among all positive examples, i.e. sensitivity = (Positives Correctly Classified)/(Total Positives). It is also called hit rate or true positive rate. This is the same as recall. Range: boolean
  • specificity: The relative number of examples correctly classified as negative among all negative examples, i.e. specificity = (Negatives Correctly Classified)/(Total Negatives). Note that Total Negatives is the sum of True Negatives and False Positives. Range: boolean
  • youden: This parameter specifies the sum of sensitivity and specificity minus 1. Range: boolean
  • positive_predictive_value: The relative number of examples correctly classified as positive among all examples classified as positive, i.e. positive predictive value = (Positives Correctly Classified)/(Total Predicted Positives). Note that Total Predicted Positives is the sum of True Positives and False Positives. This is the same as precision. Range: boolean
  • negative_predictive_value: The relative number of examples correctly classified as negative among all examples classified as negative, i.e. negative predictive value = (Negatives Correctly Classified)/(Total Predicted Negatives). Note that Total Predicted Negatives is the sum of True Negatives and False Negatives. Range: boolean
  • psep: This parameter specifies the sum of the positive predictive value and the negative predictive value minus 1, i.e. psep = ppv + npv - 1 where ppv and npv are the positive predictive value and the negative predictive value respectively. Range: boolean
  • skip_undefined_labels: If set to true, examples with undefined labels are skipped. Range: boolean
  • comparator_class: This is an expert parameter. The fully qualified class name of the PerformanceComparator implementation is specified here. Range: string
  • use_example_weights: This parameter allows example weights to be used for statistical performance calculations if possible. It has no effect unless the ExampleSet has an attribute with the weight role. Several operators are available that assign weights, e.g. the Generate Weights operator. See the Set Role operator for more information regarding the weight role. Range: boolean
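For illustration, here is a small Python sketch of the kappa and AUC criteria mentioned above. Python is used purely for illustration; this is not the operator's implementation (for instance, example weights are ignored). It shows a common definition of the kappa statistic computed from the four confusion-matrix counts, and a pairwise view of AUC (optimistic), AUC (pessimistic) and their average, where the two extreme variants differ only in how tied confidences are resolved. The confidence values in the usage example are made up.

    # Illustrative sketch only; not the RapidMiner implementation (example weights are ignored).

    def cohen_kappa(tp, fp, fn, tn):
        """Kappa statistic from binary confusion-matrix counts."""
        total = tp + fp + fn + tn
        p_observed = (tp + tn) / total                        # plain accuracy
        p_expected = ((tp + fn) * (tp + fp)                   # chance agreement on the positive class
                      + (tn + fp) * (tn + fn)) / total ** 2   # chance agreement on the negative class
        return (p_observed - p_expected) / (1 - p_expected)

    def auc_variants(pos_scores, neg_scores):
        """Pairwise (Mann-Whitney) view of the three AUC criteria.

        Every (positive, negative) pair in which the positive example has the higher
        confidence counts as 1. A tied pair counts as 1 for AUC (optimistic) and as 0
        for AUC (pessimistic); the plain AUC is the average of the two.
        """
        optimistic = pessimistic = 0.0
        for p in pos_scores:
            for n in neg_scores:
                if p > n:
                    optimistic += 1
                    pessimistic += 1
                elif p == n:
                    optimistic += 1   # tie resolved in favour of the positive example
        pairs = len(pos_scores) * len(neg_scores)
        return optimistic / pairs, pessimistic / pairs, (optimistic + pessimistic) / (2 * pairs)

    # Kappa for the confusion matrix of the tutorial process (TP = 7, FP = 2, FN = 2, TN = 3):
    print(cohen_kappa(tp=7, fp=2, fn=2, tn=3))                      # ~0.378

    # Hypothetical prediction confidences, just to show the tie handling:
    print(auc_variants(pos_scores=[0.9, 0.4], neg_scores=[0.6, 0.4]))
    # (0.75, 0.5, 0.625)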

Tutorial Processes

Use of parameters of Performance (Binominal Classification)

The focus of this Example Process is to explain different statistical performance criteria of Binominal Classification. To get an understanding of the use of the performance port you can study the Example Process of the Performance operator or the Example Process of the Performance (Classification) operator.

The 'Golf' data set is loaded using the Retrieve operator. The Nominal to Binominal operator is applied on it to convert the label attribute (i.e. Play) from nominal to binominal type. The K-NN operator is applied on it to generate a classification model. This classification model is applied on the 'Golf-Testset' data set using the Apply Model operator. Note that the Nominal to Binominal operator is applied on the 'Golf-Testset' data set as well to convert the label attribute (i.e. Play) from nominal to binominal type to ensure that the training data set ('Golf') and the testing data set ('Golf-Testset') are in the same format. The labels were changed to binominal form because the Performance (Binominal Classification) operator can only handle binominal labels. Run the process and you can see the results in the Results Workspace.

Have a good look at the confusion matrix in the Results Workspace. This matrix will be used to explain all the parameters. The rows 'pred. no' and 'pred. yes' contain the examples that were classified as 'no' and as 'yes' respectively. The columns 'true no' and 'true yes' contain the examples that were actually labeled 'no' and actually labeled 'yes' respectively. Here is some information that can be read directly from the confusion matrix.

True Negatives = the examples that were actually labeled 'no' and were classified as 'no' = 3
False Negatives = the examples that were actually labeled 'yes' and were classified as 'no' = 2
True Positives = the examples that were actually labeled 'yes' and were classified as 'yes' = 7
False Positives = the examples that were actually labeled 'no' and were classified as 'yes' = 2
Total number of examples that were actually labeled 'no' = Total Negatives = 5 (i.e. 3+2)
Total number of examples that were actually labeled 'yes' = Total Positives = 9 (i.e. 7+2)
Total number of examples that were classified as 'no' = Total Predicted Negatives = 5 (i.e. 3+2)
Total number of examples that were classified as 'yes' = Total Predicted Positives = 9 (i.e. 7+2)
Total number of examples = 14 (i.e. 2+3+2+7)
Total number of correct classifications = 10 (i.e. 3+7)
Total number of incorrect classifications = 4 (i.e. 2+2)

Here is how the different statistical performance criteria were calculated. The terms 'Positive' and 'classified as yes' mean the same thing, as do other similar pairs of terms like 'Correctly Classified Positives' and 'True Positives'.

accuracy = (Total Correct Classifications)/(Total Number of Examples) = (10)/(14) = 71.42%
classification error = (Total Incorrect Classifications)/(Total Number of Examples) = (4)/(14) = 28.57%
precision = (True Positives)/(Total Predicted Positives) = (7)/(9) = 77.78%
recall = (True Positives)/(Total Positives) = (7)/(9) = 77.78%
fallout = (False Positives)/(Total Negatives) = (2)/(5) = 40%
f-measure = 2pr/(p+r), where r and p are recall and precision respectively = 77.78%
sensitivity = (True Positives)/(Total Positives) = (7)/(9) = 77.78%
specificity = (True Negatives)/(Total Negatives) = (3)/(5) = 60%
youden = the sum of sensitivity (0.78) and specificity (0.60) minus 1 = 0.378
positive predictive value = (True Positives)/(Total Predicted Positives) = (7)/(9) = 77.78%
negative predictive value = (True Negatives)/(Total Predicted Negatives) = (3)/(5) = 60%
psep = the sum of the positive predictive value (0.78) and the negative predictive value (0.60) minus 1 = 0.378
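As a quick cross-check, the same figures can be reproduced with a few lines of Python (used here purely for illustration; the tutorial process itself requires no code). The four counts are taken directly from the confusion matrix above.

    # Confusion-matrix counts from the tutorial process above.
    tp, fp, fn, tn = 7, 2, 2, 3
    total = tp + fp + fn + tn                                   # 14 examples

    accuracy             = (tp + tn) / total                    # 10/14 ~ 0.714
    classification_error = (fp + fn) / total                    # 4/14  ~ 0.286
    precision            = tp / (tp + fp)                       # 7/9   ~ 0.778 (= positive predictive value)
    recall               = tp / (tp + fn)                       # 7/9   ~ 0.778 (= sensitivity)
    fallout              = fp / (fp + tn)                       # 2/5   = 0.40
    f_measure            = 2 * precision * recall / (precision + recall)   # ~ 0.778
    specificity          = tn / (tn + fp)                       # 3/5   = 0.60
    youden               = recall + specificity - 1             # ~ 0.378
    negative_pred_value  = tn / (tn + fn)                       # 3/5   = 0.60
    psep                 = precision + negative_pred_value - 1  # ~ 0.378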