In machine learning, evaluating a model is as important as building one. Without evaluation, the whole effort of formulating the business problem and building a model for it is wasted. In one of my previous articles, I covered the critical aspects of the confusion matrix. The ROC (Receiver Operating Characteristic) curve is another such tool for judging how suitable a classification model is. It is a curve that helps us see how well we can separate two similar responses, for example, deciding whether an email is spam. These responses change depending on the business problem we are trying to solve, so a proper understanding of what the ROC curve is can make our work much easier at times.

What is a classification model?

Classification is a problem in which a machine learning algorithm learns how to assign a class to given data. The classes are also called targets, labels, or categories. A classification model attempts to draw conclusions from observed values: given one or more inputs, it predicts the value of one or more outcomes.

What is the AUC-ROC curve?

The AUC-ROC curve is a performance measurement for classification problems at various threshold settings. ROC is a probability curve, and AUC represents the degree or measure of separability: it tells how capable the model is of distinguishing between classes. The higher the AUC, the better the model is at predicting 0s as 0s and 1s as 1s. By analogy, the higher the AUC, the better the model is at distinguishing patients with the disease from patients without it.

The ROC curve plots two parameters:

  • True Positive Rate
  • False Positive Rate

The true positive rate (TPR), also known as sensitivity, answers the question: when the actual value is positive, how often is the prediction positive?

i.e. True Positive Rate = TP/(TP+FN)

The false positive rate (FPR) answers the question: when the actual value is negative, how often is the prediction incorrectly positive?

i.e. False Positive Rate = FP/(FP+TN)
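Both rates can be computed directly from confusion-matrix counts. A minimal sketch in Python (the counts below are invented for illustration):

```python
def true_positive_rate(tp, fn):
    # TPR (sensitivity): of all actual positives, how many did we catch?
    return tp / (tp + fn)

def false_positive_rate(fp, tn):
    # FPR: of all actual negatives, how many did we wrongly flag as positive?
    return fp / (fp + tn)

# Hypothetical confusion-matrix counts for illustration
tp, fn, fp, tn = 40, 10, 5, 45
print(true_positive_rate(tp, fn))   # 0.8
print(false_positive_rate(fp, tn))  # 0.1
```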

A ROC curve plots the true positive rate against the false positive rate at various classification thresholds. Lowering the classification threshold classifies more items as positive, thus increasing both false positives and true positives. The following figure shows a typical ROC curve.

TP vs. FP rate at different classification thresholds.
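A curve like this can be traced with scikit-learn's `roc_curve`, which sweeps the thresholds for us. A small sketch with invented labels and scores:

```python
from sklearn.metrics import roc_curve

# Toy data: true labels and model scores, invented for illustration
y_true  = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

fpr, tpr, thresholds = roc_curve(y_true, y_score)
for f, t in zip(fpr, tpr):
    print(f"FPR={f:.2f}  TPR={t:.2f}")
```

Plotting `tpr` against `fpr` gives the ROC curve shown above.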

To compute the points on a ROC curve, we could evaluate a logistic regression model many times with different classification thresholds, but this would be inefficient. Fortunately, there is an efficient, sorting-based algorithm that can provide this information for us, called AUC.
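As a sketch of that idea, AUC can be computed in O(n log n) by sorting the examples once and working with ranks. A pure-Python illustration (not a production implementation; ties between scores are ignored for simplicity):

```python
def auc_from_scores(y_true, y_score):
    """AUC via sorting: equals the fraction of (positive, negative) pairs
    in which the positive example receives the higher score."""
    # Rank every example by score, ascending (1-based ranks).
    order = sorted(range(len(y_score)), key=lambda i: y_score[i])
    ranks = {idx: r + 1 for r, idx in enumerate(order)}
    pos = [i for i, y in enumerate(y_true) if y == 1]
    neg = [i for i, y in enumerate(y_true) if y == 0]
    rank_sum = sum(ranks[i] for i in pos)
    n_pos, n_neg = len(pos), len(neg)
    # Mann-Whitney U statistic, normalised to [0, 1]
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

print(auc_from_scores([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```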

AUC: Area Under the ROC Curve

AUC stands for "Area under the ROC curve." That is, AUC measures the entire two-dimensional area underneath the whole ROC curve.

AUC (Area under the ROC Curve).

AUC provides an aggregate measure of performance across all possible classification thresholds. One way of interpreting AUC is as the probability that the model ranks a random positive example higher than a random negative example. For example, given the following examples, which are arranged from left to right in ascending order of logistic regression predictions:

Predictions ranked in ascending order of logistic regression score.

AUC represents the probability that a random positive (green) example is positioned to the right of a random negative (red) example.
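This probabilistic interpretation can be checked empirically: draw a random positive and a random negative many times and count how often the positive scores higher. A small sketch with invented data whose exact pairwise AUC is 0.75:

```python
import random

def auc_by_sampling(y_true, y_score, trials=100_000, seed=0):
    """Monte-Carlo estimate of P(random positive scores higher than
    a random negative) -- the ranking interpretation of AUC."""
    rng = random.Random(seed)
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(rng.choice(positives) > rng.choice(negatives)
               for _ in range(trials))
    return wins / trials

# Invented labels and scores; the exact AUC of this data is 0.75
estimate = auc_by_sampling([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(round(estimate, 2))
```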

AUC ranges in value from 0 to 1. A model whose predictions are 100% wrong has an AUC of 0.0; one whose predictions are 100% correct has an AUC of 1.0.
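These two extremes are easy to verify with scikit-learn's `roc_auc_score` on a toy dataset (labels and scores invented for illustration):

```python
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1]

# Every positive outranks every negative: perfect ranking
perfect  = roc_auc_score(y_true, [0.1, 0.2, 0.8, 0.9])
# Ranking completely reversed: every prediction is wrong
reversed_ = roc_auc_score(y_true, [0.9, 0.8, 0.2, 0.1])

print(perfect)    # 1.0
print(reversed_)  # 0.0
```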

AUC is desirable for the following two reasons:

  • AUC is scale-invariant. It measures how well predictions are ranked, rather than their absolute values.
  • AUC is classification-threshold-invariant. It estimates the quality of the model’s predictions regardless of what classification threshold is chosen.
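Scale invariance, in particular, means that any order-preserving transformation of the scores leaves AUC unchanged. A quick check with scikit-learn (data invented for illustration):

```python
from sklearn.metrics import roc_auc_score

y_true  = [0, 0, 1, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]

# A strictly increasing transform (dividing by 10) preserves the
# ranking of the scores, so AUC does not change.
squashed = [s / 10 for s in y_score]

print(roc_auc_score(y_true, y_score) == roc_auc_score(y_true, squashed))  # True
```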

However, both these reasons come with caveats, which may restrict the usefulness of AUC in certain use cases:

  • Scale invariance is not always desirable. For example, sometimes we really need well-calibrated probability outputs, and AUC won't tell us about that.
  • Classification-threshold invariance is not always desirable. In cases where there are wide disparities in the cost of false negatives versus false positives, it may be critical to minimize one type of classification error. For example, when doing email spam detection, you likely want to prioritize minimizing false positives (even if that results in a significant increase in false negatives). AUC is not a useful metric for this type of optimization.
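The spam example can be made concrete: moving the threshold trades false positives against false negatives, which is exactly the trade-off that AUC averages away. A toy sketch (scores invented for illustration):

```python
def confusion_counts(y_true, y_score, threshold):
    # Predict spam (1) when the score is at or above the threshold.
    fp = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s < threshold)
    return fp, fn

# Hypothetical spam scores: 1 = spam, 0 = legitimate mail
y_true  = [0, 0, 0, 1, 1, 1]
y_score = [0.2, 0.45, 0.6, 0.5, 0.7, 0.9]

print(confusion_counts(y_true, y_score, 0.4))  # low threshold: more false positives
print(confusion_counts(y_true, y_score, 0.8))  # high threshold: fewer FPs, more FNs
```

For spam filtering you would pick a high threshold to keep false positives near zero, a choice that AUC, being threshold-invariant, cannot reward.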