The confusion matrix is one of the fundamentals of machine learning. In this post, we try to simplify it for you.
Say you have some clinical measurements, such as chest pain, blood circulation, blocked arteries, and weight, and you wish to use a machine-learning algorithm to predict whether or not someone will develop heart disease. To do this, you could use Logistic Regression, K-NN, a Decision Tree, or some other ML algorithm to classify each case as positive or negative.
First up, you divide the data into training and testing sets. Next, you train all the relevant algorithms using training data. Subsequently, you test each method. Finally, you evaluate how each method performed on the testing data.
To evaluate the performance, you find the Classification Accuracy of each model. Classification accuracy is the ratio of correct predictions to total predictions made, i.e.,
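Classification accuracy (CA) = Correct predictions (CP) / Total predictions (TP)
(Here TP means total predictions, not the true positives that appear later in this post.)
It is often presented as a percentage by multiplying the result by 100. Its complement, the error rate, is the share of incorrect predictions:
Error rate = (1 - (CP / TP)) * 100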
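To make the two formulas concrete, here is a minimal Python sketch. The label lists are made up for illustration (1 = heart disease, 0 = no heart disease), and the helper name `classification_accuracy` is our own, not part of any library.

```python
def classification_accuracy(actual, predicted):
    """Return the fraction of predictions that match the actual labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one prediction")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Made-up labels for ten patients: 1 = heart disease, 0 = no heart disease
actual = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]

accuracy = classification_accuracy(actual, predicted)
error_rate = 1 - accuracy

print(f"Accuracy:   {accuracy * 100:.1f}%")
print(f"Error rate: {error_rate * 100:.1f}%")
```

Eight of the ten predictions match, so the accuracy is 80% and the error rate is 20%.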
While classification accuracy is a great place to start, it often runs into problems in practice. The main one is that a single number does not show which kinds of mistakes your classification model makes, and this matters most when one class is much rarer than the other.
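A small sketch shows how accuracy can hide a useless model. The data is invented for illustration: 95 healthy patients and 5 with heart disease, and a hypothetical "model" that always predicts the majority class.

```python
# Made-up, imbalanced data: 95 healthy patients (0) and 5 with heart disease (1)
actual = [0] * 95 + [1] * 5

# A hypothetical "model" that always predicts "no heart disease"
predicted = [0] * 100

correct = sum(1 for a, p in zip(actual, predicted) if a == p)
accuracy = correct / len(actual)

# How many of the real heart disease cases did the model catch?
detected = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)

print(f"Accuracy: {accuracy:.2f}")
print(f"Heart disease cases detected: {detected} of {sum(actual)}")
```

The accuracy comes out at 0.95 even though not a single heart disease case is detected, which is exactly the kind of detail accuracy alone cannot reveal.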
It’s here that the confusion matrix comes into the picture.
So, what is a Confusion Matrix?
A confusion matrix is a table used to measure the performance of a classification model. The general idea is to count the number of times instances of class A are classified as class B.
Let’s understand this with an example of spam email prediction.
The simplest confusion matrix looks like the following table
For Python, the labels are slightly different:
- TP -> True Positive (predicting spam as spam).
- FN -> False Negative (predicting spam as non-spam).
- FP -> False Positive (predicting non-spam as spam).
- TN -> True Negative (predicting non-spam as non-spam).
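The four counts can be tallied with a few lines of plain Python. This is only a sketch: the email labels are made up, and `confusion_counts` is a hypothetical helper written for this post, with "spam" treated as the positive class.

```python
def confusion_counts(actual, predicted, positive="spam"):
    """Count TP, FN, FP and TN, treating `positive` as the positive class."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1  # spam predicted as spam
        elif a == positive:
            fn += 1  # spam predicted as non-spam
        elif p == positive:
            fp += 1  # non-spam predicted as spam
        else:
            tn += 1  # non-spam predicted as non-spam
    return tp, fn, fp, tn


# Made-up labels for eight emails
actual = ["spam", "spam", "non-spam", "non-spam",
          "spam", "non-spam", "non-spam", "spam"]
predicted = ["spam", "non-spam", "non-spam", "spam",
             "spam", "non-spam", "non-spam", "spam"]

tp, fn, fp, tn = confusion_counts(actual, predicted)

# Rows are actual classes, columns are predicted classes
print(f"{'':>18}{'pred: spam':>12}{'pred: non-spam':>16}")
print(f"{'actual: spam':>18}{tp:>12}{fn:>16}")
print(f"{'actual: non-spam':>18}{fp:>12}{tn:>16}")
```

For these eight emails the counts are TP = 3, FN = 1, FP = 1 and TN = 3.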
In the email spam example, FN is the riskier error because the model lets spam through as non-spam, which can expose the user to fraud. In statistical terms, a false negative is a Type II error. When adding a cost function, our main aim here is to reduce this error.
FP is the Type I error: regular emails get marked as spam. To increase accuracy, we need to work on both errors.
The error types will differ from one problem to another. To evaluate the performance, we mainly focus on the parameters explained below.
Parameters to evaluate the performance of a model
For most models, the performance is evaluated on the basis of the following four parameters:
- Sensitivity (True Positive Rate, TPR): When the actual value is positive, how often is the prediction correct?
Sensitivity = TP/(TP+FN)
“FP -> the mail is not spam, but our model detects it as spam”
“FN -> the mail is spam, but our model detects it as non-spam”
- Specificity (True Negative Rate): When the actual value is negative, how often is the prediction correct?
Specificity = TN/(TN+FP)
Alternatively, Specificity = 1 - False positive rate
- False Positive Rate: How often is the positive prediction incorrect, when the actual value is negative?
False positive rate = FP/(TN+FP)
Or, False positive rate = 1 - Specificity
- Precision: When a positive value is predicted, how often is the prediction correct?
Precision = TP/(TP+FP)
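The four parameters above follow directly from the TP, FN, FP and TN counts. Here is a short sketch using the standard definitions; the counts are invented for illustration, `evaluation_rates` is a hypothetical helper name, and the code assumes no denominator is zero.

```python
def evaluation_rates(tp, fn, fp, tn):
    """Compute the four rates from confusion matrix counts.

    Assumes every denominator is non-zero, i.e. the data contains at least
    one actual positive, one actual negative and one positive prediction.
    """
    return {
        "sensitivity": tp / (tp + fn),          # true positive rate
        "specificity": tn / (tn + fp),          # true negative rate
        "false_positive_rate": fp / (tn + fp),  # equals 1 - specificity
        "precision": tp / (tp + fp),
    }


# Made-up counts for 100 emails
metrics = evaluation_rates(tp=40, fn=10, fp=5, tn=45)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts, sensitivity is 0.8, specificity is 0.9, the false positive rate is 0.1, and precision is roughly 0.889.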
The rate of success (that is, the accuracy) can be calculated as
r = (TN+TP)/(TP+TN+FP+FN)
If that’s too many equations to digest, here’s a perfect illustration explaining the confusion matrix:
Notably, the confusion matrix is not limited to binary classifiers. Its size is determined by the number of classes we want to predict: a problem with N classes gives an N x N matrix.
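As a rough sketch of the multi-class case, the snippet below builds a 3 x 3 matrix with the standard library only. The three email classes and all labels are made up for illustration, and `confusion_matrix` here is our own small helper, not a library function.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Return a matrix where row i is the actual class and column j the predicted class."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


# Hypothetical three-class email problem
labels = ["spam", "social", "primary"]
actual = ["spam", "spam", "social", "social",
          "primary", "primary", "spam", "social"]
predicted = ["spam", "social", "social", "social",
             "primary", "spam", "spam", "primary"]

matrix = confusion_matrix(actual, predicted, labels)

for label, row in zip(labels, matrix):
    print(f"{label:>8}: {row}")
```

The diagonal holds the correct predictions for each class, and every off-diagonal cell counts one specific kind of confusion, for example actual "spam" predicted as "social".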
In summary, the Confusion Matrix tells you what your machine learning algorithm did right and what it did wrong. Here’s an interesting course which may help you explore it further: <link>