Calculating Maximum a Posteriori (MAP) Estimation in Python

Maximum a posteriori ( MAP ) is an estimation technique that estimates the most probable value ( mode ) of an unknown quantity of some data. In this article, we calculate MAP using the combination of prior distribution and likelihood function which is Posterior Distribution. The posterior distribution reflects the belief about the unknown quantity after considering both the prior knowledge and the observed data. Let us look at its formula.

Maximum A Posterior Formula

In this article, we will calculate MAP using two different methods. We will start with just the Numpy library to estimate parameters, then combine the Numpy and Scipy libraries to calculate MAP.

Estimating MAP using NumPy in Python

We use the Numpy library in the code below to calculate the MAP parameter. We have defined the mean of a normal distribution and then randomly created data for the distribution. We then calculate the likelihood and prior probabilities and thereafter calculate the posterior probability. We plot the normalized graph of all the points and return the estimated mean value (the largest value in the calculated distribution ).

import numpy as np

# Define the true parameter (mean of a normal distribution)
true_mean = 5

# Generate random data from the normal distribution
data = np.random.normal(true_mean, 1, size=100)

# Define the prior distribution (another normal distribution) as prior knowledge
prior_mean = 4
prior_std = 2

def pdf_normal(x, mean, std):
  # Calculate the probability density function of a normal distribution
  return 1 / (std * np.sqrt(2 * np.pi)) * np.exp(-(x - mean)**2 / (2 * std**2))

def posterior(x):
  # Likelihood (product of individual data point probabilities)
  likelihood = 1
  for datum in data:
    likelihood *= pdf_normal(datum, x, 1)

  # Prior probability
  prior = pdf_normal(x, prior_mean, prior_std)

  # Posterior probability (product of likelihood and prior)
  return likelihood * prior

# Discretize the parameter space
x_grid = np.linspace(0, 10, 100)  # Adjust range for better fit

# Calculate posterior for each grid point
posterior_values = np.zeros_like(x_grid)
for i, x in enumerate(x_grid):
  posterior_values[i] = posterior(x)

# Normalize the posterior to get a probability distribution
posterior_normalized = posterior_values / np.sum(posterior_values)

# Find the MAP estimate (index of maximum value)
map_estimate = x_grid[np.argmax(posterior_normalized)]

print("True mean:", true_mean)
print("MAP estimate:", map_estimate)

Let us look at the output for the above code.

We get the value of the MAP estimate as 5.1515……

MAP Estimation with SciPy and NumPy

We now calculate MAP using the combination of Scipy and Numpy libraries of Python programming language. In the code below, we define the mean as 5. All the calculation steps remain the same, the only difference is that this code uses the Scipy library norm module to plot the distributions. Moreover, we optimize the calculated distributions to calculate the MAP estimate in this method. The code below is more efficient in the calculation of MAP.

import numpy as np
from scipy.stats import norm

# Define the true parameter (mean of a normal distribution)
true_mean = 5

# Generate random data from the normal distribution
data = norm.rvs(true_mean, 1, size=100)

# Define the prior distribution (another normal distribution) as prior knowledge
prior_mean = 4
prior_std = 2

def posterior(x):
  # Likelihood function (normal distribution)
  likelihood = norm.pdf(data, x, 1)

  # Prior probability density function (normal distribution)
  prior = norm.pdf(x, prior_mean, prior_std)

  # Posterior probability proportional to likelihood * prior
  posterior_prop = likelihood * prior

  # Normalize the posterior (optional, not required for finding MAP)
  # posterior = posterior_prop / np.sum(posterior_prop)

  return posterior_prop

# Use optimization to find the value of x that maximizes the posterior
map_estimate = np.argmax(posterior(np.linspace(0, 10, 100)))  # Adjust range for better fit

print("True mean:", true_mean)
print("MAP estimate:", map_estimate)

Let us look at the output of the above code.

The output we get is that the MAP estimate is 4.9494…

Conclusion

You now know how to use Maximum a Posteriori (MAP) estimation to find the most likely value for something unknown. Python’s NumPy and SciPy libraries make it easy to estimate MAP. Play around with different starting beliefs and ways of looking at the data to see what works best. The more you use MAP, the more you’ll see how cool and helpful it is. So go out there and estimate like a pro! Have fun figuring out what your data is trying to tell you.

Recommended: How To Calculate Power Statistics?