Precision, Recall and F1 Score For Data Scientists

In this blog, we will learn about precision and recall. You may have come across these terms if you are learning machine learning. Like accuracy, they are metrics used to validate a model.

By the end of this blog, you will know:

  1. What is Recall?
  2. What is Precision?
  3. How are recall and precision useful? Why should we care about them?
  4. What is an F1 Score? Why do we need it?
  5. Why is it important to balance recall and precision?

I suggest you read this blog first if you don’t know anything about the confusion matrix. I have explained it there in detail.

Still, as a quick revision, here are some notes that we need for this blog. Skip ahead if you already know them.

True Positive, True Negative, False Positive and False Negative

There are four values that we need to know.

  • True Positive: the sample is positive and our model predicted it as positive
  • True Negative: the sample is negative and our model predicted it as negative
  • False Positive: the sample is negative but our model predicted it as positive
  • False Negative: the sample is positive but our model predicted it as negative

Here’s code to calculate these values,


# y_true is the list of actual values
# y_pred is the list of predicted values

def true_positive(y_true, y_pred):
    tp = 0
    for yt, yp in zip(y_true, y_pred):
        if yt == 1 and yp == 1:
            tp += 1
    return tp

def true_negative(y_true, y_pred):
    tn = 0
    for yt, yp in zip(y_true, y_pred):
        if yt == 0 and yp == 0:
            tn += 1
    return tn

def false_positive(y_true, y_pred):
    fp = 0
    for yt, yp in zip(y_true, y_pred):
        if yt == 0 and yp == 1:
            fp += 1
    return fp

def false_negative(y_true, y_pred):
    fn = 0
    for yt, yp in zip(y_true, y_pred):
        if yt == 1 and yp == 0:
            fn += 1
    return fn
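
To see these helpers in action, here is a quick sanity check on a small set of made-up labels (these lists are just for illustration):

y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 1]

print(true_positive(y_true, y_pred))   # 3
print(true_negative(y_true, y_pred))   # 2
print(false_positive(y_true, y_pred))  # 2
print(false_negative(y_true, y_pred))  # 1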

Confusion Matrix

The confusion matrix is a matrix that represents these four values intuitively. We will see how and why precision and recall are related to the confusion matrix.

[Figure: Confusion Matrix]
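
As a minimal sketch, we can assemble the confusion matrix ourselves from the four helper functions above (here rows are actual classes and columns are predicted classes, an arbitrary but common convention):

def confusion_matrix(y_true, y_pred):
    # rows: actual class (0, 1), columns: predicted class (0, 1)
    tn = true_negative(y_true, y_pred)
    fp = false_positive(y_true, y_pred)
    fn = false_negative(y_true, y_pred)
    tp = true_positive(y_true, y_pred)
    return [[tn, fp],
            [fn, tp]]

# using the same made-up labels as before
print(confusion_matrix(y_true, y_pred))  # [[2, 2], [1, 3]]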

Why do we need precision or recall?

Suppose you have built a binary classifier and found the model’s accuracy to be 80% on skewed data.

That skewed data contains 80% positive samples (the positive class, say dogs) and 20% negative samples (say cats). If your model is poorly trained and always predicts positive (always dogs), it will still have an accuracy of 80%.
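
Here is a minimal sketch of that situation (the labels below are made up to match the 80/20 split described above):

# 80 positive samples (dogs = 1) and 20 negative samples (cats = 0)
y_true = [1] * 80 + [0] * 20

# a badly trained model that always predicts "dog"
y_pred = [1] * 100

correct = sum(1 for yt, yp in zip(y_true, y_pred) if yt == yp)
print(correct / len(y_true))  # 0.8 -> 80% accuracy, even though the model learned nothing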

So you cannot always use accuracy as the metric to validate your model.

That is where precision and recall come in.

What is the precision of the model?

Precision is the number that tells us how many of the samples our model predicted as positive are actually positive.

See the following formula for the precision,

precision = true_positive / (true_positive + false_positive)

Now, what does it signify? How can we interpret it?

[Figure: Precision]

As you can see, precision lies in the predicted-positive column. It is simply the ratio of true positives (correctly predicted positives) to the total number of predicted positives, both correct and incorrect (TP and FP).

Suppose we have a cat-vs-dog example (in which dog is 1 and cat is 0), and our model predicts 70 dogs correctly out of a total of 80 dogs and 10 cats correctly out of 20. Then we have an accuracy of 80%.
Let’s calculate precision for our model.


tp = 70  # was dog, predicted dog
fp = 10  # was cat, predicted dog (20 cats - 10 predicted correctly)
tn = 10  # was cat, predicted cat
fn = 10  # was dog, predicted cat (80 dogs - 70 predicted correctly)

precision = 70 / (70 + 10) = 0.875

That means when our model predicts a sample as positive, it is correct 87.5% of the time. See the code below to calculate precision,


def precision(y_true, y_pred):
    tp = true_positive(y_true, y_pred)
    fp = false_positive(y_true, y_pred)
    precision = tp / (tp + fp)
    return precision
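
To check this against the cat-vs-dog numbers above, we can rebuild that example as per-sample labels (the ordering below is arbitrary; only the counts matter):

# 80 dogs (1): 70 predicted as dog, 10 as cat
# 20 cats (0): 10 predicted as dog, 10 as cat
y_true = [1] * 80 + [0] * 20
y_pred = [1] * 70 + [0] * 10 + [1] * 10 + [0] * 10

print(precision(y_true, y_pred))  # 0.875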

But this is not the end of the story. Precision says nothing about the positive samples our model missed, i.e. the ones it predicted as negative. How can we account for those too?

What is the recall of the model?

Recall is a value that tells us how many of the actual positive samples our model predicts correctly.

Don’t get confused: we are still considering the positive class, but now we are not calculating how trustworthy a positive prediction is; rather, we are calculating how many of the actual positive samples our model manages to find.

recall = true_positive / (true_positive + false_negative)

We can see that recall depends on a row of the confusion matrix (unlike precision, which depends on a column). This means it tells us how the model performs on the data that is actually positive, regardless of whether the model predicted it as positive or negative.

[Figure: Recall]

Here’s code to calculate recall of the model,


def recall(y_true, y_pred):
    tp = true_positive(y_true, y_pred)
    fn = false_negative(y_true, y_pred)
    recall = tp / (tp + fn)
    return recall
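
Running it on the same per-sample cat-vs-dog labels as in the precision example:

y_true = [1] * 80 + [0] * 20
y_pred = [1] * 70 + [0] * 10 + [1] * 10 + [0] * 10

print(recall(y_true, y_pred))  # 0.875 -> 70 of the 80 actual dogs were found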

But why do we need these values in the first place?

Why should we care about precision and recall?

In any binary classification problem, if we train our model on skewed data (that is, if the number of positive samples is significantly higher than the number of negative ones, or vice versa), our model will learn to predict the majority class better than the minority class. We want to detect this. We could compute the accuracy of each class individually, but that alone will not give us a satisfactory answer, because we want a metric that combines both classes to expose this bias (not the perfect word, but it works for our understanding).

That is what we do while calculating precision and recall values.

Now we are still left with two values. We want to combine them further into one. That is where the F1 score comes in.

What is an F1 score?

The F1 score combines precision and recall by simply taking their harmonic mean. We want a metric like the F1 score because we want one value that describes how good our model is. A good model will have both high precision and high recall.


# P is precision, R is recall

F1 = 2 * P * R / (P + R)

# in terms of tp, fp and fn the formula becomes

F1 = 2 * tp / (2 * tp + fp + fn)

See the following code to calculate the F1 score,


def f1_score(y_true, y_pred):
    tp = true_positive(y_true, y_pred)
    fn = false_negative(y_true, y_pred)
    fp = false_positive(y_true, y_pred)

    p = tp / (tp + fp)
    r = tp / (tp + fn)
    score = 2 * p * r / (p + r)
    return score
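
Running this on the cat-vs-dog example gives the same value for precision, recall and F1 here, since precision and recall happen to be equal. If you have scikit-learn installed, you can also cross-check against its built-in metric:

y_true = [1] * 80 + [0] * 20
y_pred = [1] * 70 + [0] * 10 + [1] * 10 + [0] * 10

print(f1_score(y_true, y_pred))  # 0.875

# optional cross-check with scikit-learn
from sklearn import metrics
print(metrics.f1_score(y_true, y_pred))  # 0.875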

I hope that helped. If it didn’t, or if something looks ambiguous to you, don’t forget to give feedback or reach out at [email protected]!
