Bayesian Inference in Python: A Comprehensive Guide with Examples


Data-driven decision-making has become essential across various fields, from finance and economics to medicine and engineering. Understanding probability and statistics is crucial for making informed choices today. Bayesian inference, a powerful tool in probabilistic reasoning, allows us to update our beliefs about an event based on new evidence.

Bayes’s theorem, a fundamental concept in probability theory, forms the foundation of Bayesian inference. This theorem provides a way to calculate the probability of an event occurring given prior knowledge and observed data. Combining our initial beliefs with the likelihood of the evidence, we can arrive at a more accurate posterior probability.

Bayesian inference is a statistical method based on Bayes’s theorem, which updates the probability of an event as new data becomes available. It is widely used in various fields, such as finance, medicine, and engineering, to make predictions and decisions based on prior knowledge and observed data. In Python, Bayesian inference can be implemented using libraries like NumPy and Matplotlib to generate and visualize posterior distributions.

This article will explore Bayesian inference and its implementation using Python, a popular programming language for data analysis and scientific computing. We will start by understanding the fundamentals of Bayes’s theorem and formula, then move on to a step-by-step guide on implementing Bayesian inference in Python. Along the way, we will discuss a real-world example of predicting website conversion rates to illustrate the practical application of this powerful technique.

Recommended: Python and Probability: Simulating Blackjack Card Counting with Python Code

Recommended: Understanding Joint Probability Distribution with Python

What is Bayesian Inference?

Bayesian inference is based on Bayes’s theorem, which is based on the prior probability of an event. As events happen, the probability of the event keeps updating. Let us look at the formula of Baye’s theorem.

Bayes Theorem Formula
Bayes Theorem Formula

Implementing Bayesian Inference in Python

Let us try to implement the same in Python with the code below.

import numpy as np
import matplotlib.pyplot as plt

# Generate some synthetic data
true_mu = 5
true_sigma = 2
data = np.random.normal(true_mu, true_sigma, size=100)

# Define the prior hyperparameters
prior_mu_mean = 0
prior_mu_precision = 1  # Variance = 1 / precision
prior_sigma_alpha = 2
prior_sigma_beta = 2  # Beta = alpha / beta

# Update the prior hyperparameters with the data
posterior_mu_precision = prior_mu_precision + len(data) / true_sigma**2
posterior_mu_mean = (prior_mu_precision * prior_mu_mean + np.sum(data)) / posterior_mu_precision

posterior_sigma_alpha = prior_sigma_alpha + len(data) / 2
posterior_sigma_beta = prior_sigma_beta + np.sum((data - np.mean(data))**2) / 2

# Calculate the posterior parameters
posterior_mu = np.random.normal(posterior_mu_mean, 1 / np.sqrt(posterior_mu_precision), size=10000)
posterior_sigma = np.random.gamma(posterior_sigma_alpha, 1 / posterior_sigma_beta, size=10000)

# Plot the posterior distributions
plt.figure(figsize=(10, 4))
plt.subplot(1, 2, 1)
plt.hist(posterior_mu, bins=30, density=True, color='skyblue', edgecolor='black')
plt.title('Posterior distribution of $\mu$')

plt.subplot(1, 2, 2)
plt.hist(posterior_sigma, bins=30, density=True, color='lightgreen', edgecolor='black')
plt.title('Posterior distribution of $\sigma$')


# Calculate summary statistics
mean_mu = np.mean(posterior_mu)
std_mu = np.std(posterior_mu)
print("Mean of mu:", mean_mu)
print("Standard deviation of mu:", std_mu)

mean_sigma = np.mean(posterior_sigma)
std_sigma = np.std(posterior_sigma)
print("Mean of sigma:", mean_sigma)
print("Standard deviation of sigma:", std_sigma)

Let us look at the output of the same.

Bayes Theorem Plot
Bayes Theorem Plot
Bayes Theorem Output
Bayes Theorem Output

Real-World Example: Predicting Website Conversion Rates with Bayesian Inference

Let us now look at the case of a website. We are trying to predict how many people will buy the product to the ratio of the number of visitors. We have also created some parameters.

import numpy as np
import matplotlib.pyplot as plt

# Observed data
num_visitors = 1000  # Total number of visitors to the website
num_conversions = 50  # Number of conversions (desired actions)

# Prior hyperparameters for the Beta distribution
prior_alpha = 1  # Shape parameter
prior_beta = 1   # Shape parameter

# Update the prior with the observed data to get the posterior parameters
posterior_alpha = prior_alpha + num_conversions
posterior_beta = prior_beta + (num_visitors - num_conversions)

# Generate samples from the posterior Beta distribution
posterior_samples = np.random.beta(posterior_alpha, posterior_beta, size=10000)

# Plot the posterior distribution
plt.figure(figsize=(8, 6))
plt.hist(posterior_samples, bins=30, density=True, color='skyblue', edgecolor='black', alpha=0.7)
plt.title('Posterior Distribution of Conversion Rate')
plt.xlabel('Conversion Rate')
plt.xlim(0, 0.1)  # Limiting x-axis to focus on conversion rates close to zero

# Calculate summary statistics
mean_conversion_rate = posterior_alpha / (posterior_alpha + posterior_beta)
mode_conversion_rate = (posterior_alpha - 1) / (posterior_alpha + posterior_beta - 2)  # Mode of the Beta distribution

print("Mean conversion rate:", mean_conversion_rate)
print("Mode conversion rate:", mode_conversion_rate)

Let us look at the output of the above code and try to deduce some information from it.

Bayesian Inference Output
Bayesian Inference Output

According to our predictions, the probability of conversion is around 5%.


Here you go! Now, you know a lot more about Bayesian inference. In this article, we learned what Bayesian inference is and also touched upon how to implement it in the Python programming language. We also learned about a very simple case study where we calculated the probability of customers’ conversions if they visited a particular website. Bayesian inference, as mentioned, is also used heavily in the fields of finance, economics, and engineering.

Hope you enjoyed reading it!!

Recommended: Monte-Carlo Simulation to find the probability of Coin toss in python

Recommended: 5 Ways to Detect Fake Dollar Bills Using Python Machine Learning