Precision and Recall in Python


Let’s talk about Precision and Recall in today’s article. Whenever we build a classification model (e.g., a decision tree) to classify data points, some points are misclassified.

Even though accuracy gives a general idea about how good the model is, we need more robust metrics to evaluate our model.

Let’s consider an example.

Suppose you are a data scientist working at a firm, and you have been assigned the task of identifying fraudulent transactions as they occur. You build a model that seems to give good accuracy, but there’s a catch.

I would like you to imagine two scenarios in this problem.

  • Scenario 1: Your model classified a non-fraudulent transaction as fraudulent.
  • Scenario 2: Your model classified a fraudulent transaction as non-fraudulent.

Of these two scenarios, which one deserves more attention, given that fraudulent transactions can cause huge losses?

I hope you guessed correctly.

It’s Scenario 2. If your model classifies a fraudulent transaction as non-fraudulent, your organization can suffer a significant loss. You don’t want that, do you? 🙂

Accuracy alone can’t help with such problems, because it counts every misclassification the same way and doesn’t tell the two kinds of error apart.
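To see why accuracy can mislead here, consider a small sketch with made-up numbers. The data below is hypothetical (1,000 transactions, 10 of them fraud), and the "model" simply labels everything as non-fraud.

```python
# Hypothetical data: 1 = fraud, 0 = non-fraud (1,000 transactions, 10 are fraud).
y_true = [1] * 10 + [0] * 990

# A useless "model" that labels every transaction as non-fraud.
y_pred = [0] * 1000

# Accuracy: fraction of predictions that match the true label.
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

# Fraud cases that slipped through (Scenario 2 above).
missed_fraud = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

print(f"Accuracy: {accuracy:.2%}")                              # Accuracy: 99.00%
print(f"Missed fraud cases: {missed_fraud} of {sum(y_true)}")   # Missed fraud cases: 10 of 10
```

The accuracy looks excellent even though the model never catches a single fraudulent transaction.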

In this article, we will see how we can deal with such problems by gaining knowledge about Precision and Recall.

Understanding the confusion matrix

Before diving into precision and recall, we must understand the confusion matrix.

The confusion matrix for a binary classification problem looks like this. We either classify points correctly or we don’t, and the misclassified points can be further divided into False Positives and False Negatives.

Figure: Confusion Matrix

Let’s understand the terminology now.

  • True Positive (TP): The actual positive class is predicted positive.
  • True Negative (TN): The actual negative class is predicted negative.
  • False Positive (FP): The actual class is negative but predicted as Positive.
  • False Negative (FN): The actual class is positive but predicted as negative.
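As a rough illustration of these four terms, the sketch below counts them with scikit-learn's confusion_matrix. The label lists are made up for the example, with 1 standing for the positive class (fraud) and 0 for the negative class.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels: 1 = fraud (positive), 0 = non-fraud (negative).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]

# With labels=[0, 1], scikit-learn lays the matrix out as [[TN, FP], [FN, TP]],
# so ravel() returns the counts in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")  # TP=2, TN=4, FP=1, FN=1
```

Note that scikit-learn puts the true labels on the rows and the predictions on the columns, which may differ from the layout used in other references.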

Both precision and recall can be interpreted from the confusion matrix. Let’s see what they are.

What is Precision?

In the simplest terms, Precision is the ratio between the True Positives and all the points that are classified as Positives.

To calculate a model’s precision, we need the True Positive and False Positive counts from the confusion matrix.

Precision = TP/(TP + FP)

At first glance precision looks like just another fancy mathematical ratio, but what in the world does it mean?

Referring to our fraudulent transaction example from above, precision tells us how many of the transactions classified as positive (fraud) are actually positive.
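The formula can be sketched as a small helper. The function name and the counts are hypothetical, chosen only to illustrate the ratio, and returning 0.0 when nothing is predicted positive is a convention picked for this example.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted positives that are actually positive."""
    if tp + fp == 0:
        # Convention chosen for this sketch: no positive predictions -> 0.0.
        return 0.0
    return tp / (tp + fp)


# Hypothetical counts: the model flagged 50 transactions as fraud;
# 40 were real fraud (TP) and 10 were legitimate (FP).
print(precision(tp=40, fp=10))  # 0.8
```

In words: 80% of the transactions the model flagged were really fraud.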

What is Recall?

To put it simply, Recall measures how well our model identifies the actual positives. It is also called the True Positive Rate.

It is the ratio of True Positives to the sum of True Positives and False Negatives. In other words, of all the points that are actually positive, what fraction did we correctly predict as positive?

Recall = TP/(TP + FN)

Referring to our example from before, recall tells us how many of the transactions that were actually fraud we predicted as fraud.
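A matching sketch for recall follows. Again, the helper name and the counts are hypothetical, and the 0.0 fallback for "no actual positives" is a convention chosen here.

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that the model predicted as positive."""
    if tp + fn == 0:
        # Convention chosen for this sketch: no actual positives -> 0.0.
        return 0.0
    return tp / (tp + fn)


# Hypothetical counts: there were 100 actual fraud transactions;
# the model caught 40 (TP) and missed 60 (FN).
print(recall(tp=40, fn=60))  # 0.4
```

A model can therefore be fairly precise about what it flags and still miss most of the fraud.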

Figure: Recall Intuition

What is the F1 Score?

F1-score is the harmonic mean of Precision and Recall.

It can be calculated as:

F1 Score = 2 * (Precision * Recall)/(Precision + Recall)

F1-score is a better metric than accuracy when the classes are imbalanced. It is useful when you want to strike a balance between Precision and Recall.

Most real-life classification problems have an imbalanced class distribution, so F1-score is often a better metric than accuracy for evaluating our model.
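The harmonic mean is pulled toward the smaller of the two values, which a short sketch can make concrete. The helper name and the precision and recall values below are hypothetical.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Convention chosen for this sketch: both zero -> 0.0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical values: decent precision, poor recall.
p, r = 0.8, 0.4

print(round(f1_from(p, r), 3))   # 0.533
print(round((p + r) / 2, 3))     # 0.6 (plain average, for comparison)
```

The F1-score (0.533) sits below the plain average (0.6), so a model cannot hide a weak recall behind a strong precision, or the other way around.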

Calculating Precision and Recall in Python

Let’s see how we can calculate precision and recall using Python on a classification problem.

We’ll make use of the sklearn.metrics module.

# Import the required libraries
from sklearn import datasets
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import PrecisionRecallDisplay
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
import matplotlib.pyplot as plt

# Load the data (target: 0 = malignant, 1 = benign)
data = datasets.load_breast_cancer()
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    df.iloc[:, :-1], df.iloc[:, -1], test_size=0.3, random_state=42)

# Initialize and fit the model
model = LogisticRegression().fit(X_train, y_train)

# Make predictions on the test set
pred = model.predict(X_test)

# Calculate precision and recall
precision = precision_score(y_test, pred)
recall = recall_score(y_test, pred)

print('Precision: ', precision)
print('Recall: ', recall)

# Plot the precision-recall curve
# (PrecisionRecallDisplay.from_estimator replaces plot_precision_recall_curve,
# which was removed in scikit-learn 1.2)
disp = PrecisionRecallDisplay.from_estimator(model, X_test, y_test)
plt.show()
Precision:  0.963963963963964
Recall:  0.9907407407407407
Figure: Precision Recall Curve

The precision_score() and recall_score() functions from the sklearn.metrics module take the true labels and the predicted labels as input arguments and return the precision and recall scores, respectively.
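As a quick check of how these two functions behave, here is a small sketch on made-up labels. By default both functions treat the label 1 as the positive class; the pos_label argument changes that.

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical labels: 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]

# True labels go first, predicted labels second.
p = precision_score(y_true, y_pred)  # TP / (TP + FP) = 2 / (2 + 2)
r = recall_score(y_true, y_pred)     # TP / (TP + FN) = 2 / (2 + 1)

print(round(p, 3))  # 0.5
print(round(r, 3))  # 0.667

# Treat label 0 as the positive class instead.
p0 = precision_score(y_true, y_pred, pos_label=0)
print(round(p0, 3))  # 0.75
```

This matters for the breast cancer dataset above, where the label 1 means benign, so the reported scores describe how well the model finds benign cases.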


High values of both Precision and Recall are always desirable, but it’s difficult to achieve both at once. Depending on the type of application, we need to prioritize either Precision or Recall. This article was all about understanding these two crucial model evaluation metrics.
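One common way to favor one metric over the other is to move the decision threshold applied to the model's predicted probabilities. The sketch below uses hypothetical labels and scores rather than a trained model. In practice the scores might come from something like model.predict_proba(X_test)[:, 1].

```python
# Hypothetical true labels and predicted probabilities of the positive class.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.95, 0.80, 0.55, 0.30, 0.60, 0.40, 0.20, 0.10, 0.05, 0.02]


def precision_recall_at(threshold):
    """Precision and recall when scores >= threshold are predicted positive."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


for threshold in (0.25, 0.5, 0.7):
    p, r = precision_recall_at(threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
# threshold=0.25: precision=0.67, recall=1.00
# threshold=0.5: precision=0.75, recall=0.75
# threshold=0.7: precision=1.00, recall=0.50
```

On this toy data, lowering the threshold catches every positive at the cost of more false alarms, while raising it does the opposite. A fraud detector would usually lean toward the lower threshold.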

Happy Learning!