Hello learners! In this tutorial, we will learn about the Softmax function and how to calculate it in Python using NumPy. We will also look at frameworks that have built-in methods for Softmax. So let's get started.
What is the Softmax function?
Softmax is a mathematical function that takes as input a vector of numbers and normalizes it to a probability distribution, where the probability for each value is proportional to the relative scale of each value in the vector.
Before applying the softmax function, the elements of the vector can lie anywhere in the range (-∞, ∞): some elements can be negative while others can be positive. After applying the softmax function, each value will lie in the range (0, 1), and the values will sum to 1 so that they can be interpreted as probabilities.
The formula for softmax calculation is

softmax(x_i) = exp(x_i) / Σ_j exp(x_j)

where we first find the exponential of each element in the vector and then divide each by the sum of the exponentials calculated.
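As a quick sanity check of the formula, it can be evaluated by hand using only the standard math module (the input vector here is just an illustrative example):

```python
import math

vector = [1.0, 3.0, 2.0]
exponentials = [math.exp(x) for x in vector]      # e^1, e^3, e^2
total = sum(exponentials)                         # denominator: sum of exponentials
probabilities = [e / total for e in exponentials]
print(probabilities)  # each value lies in (0, 1), and they sum to 1
```

The largest input (3.0) receives the largest probability, as expected.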
The softmax function is most commonly used as an activation function for multi-class classification problems, where you have a range of scores and need to find the probability of each class. It is typically used in the output layer of neural network models that predict a multinomial probability distribution.
Implementing Softmax function in Python
Now that we know the formula for calculating softmax over a vector of numbers, let's implement it. We will use NumPy's exp() method to calculate the exponential of our vector and NumPy's sum() method to calculate the denominator.
```python
import numpy as np

def softmax(vec):
    exponential = np.exp(vec)
    probabilities = exponential / np.sum(exponential)
    return probabilities

vector = np.array([1.0, 3.0, 2.0])
probabilities = softmax(vector)
print("Probability Distribution is:")
print(probabilities)
```
Output:

```
Probability Distribution is:
[0.09003057 0.66524096 0.24472847]
```
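A practical refinement worth knowing (not part of the snippet above) is the numerically stable variant: for large inputs, np.exp() can overflow, so implementations commonly subtract the maximum element first. The result is mathematically identical because the common factor exp(max) cancels in the ratio. A minimal sketch:

```python
import numpy as np

def softmax_stable(vec):
    # Subtracting the max before exponentiating prevents overflow for
    # large inputs; the probabilities are unchanged because the shared
    # factor exp(max(vec)) cancels between numerator and denominator.
    shifted = vec - np.max(vec)
    exponential = np.exp(shifted)
    return exponential / np.sum(exponential)

vector = np.array([1000.0, 1001.0, 1002.0])  # naive np.exp() would overflow here
print(softmax_stable(vector))
```

The output matches softmax applied to [0.0, 1.0, 2.0], since softmax is invariant to adding a constant to every element.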
Using frameworks to calculate softmax
Many frameworks provide methods to calculate softmax over a vector to be used in various mathematical models.
1. TensorFlow
You can use tensorflow.nn.softmax to calculate softmax over a vector, as shown below.
```python
import tensorflow as tf
import numpy as np

vector = np.array([5.5, -13.2, 0.5])
probabilities = tf.nn.softmax(vector).numpy()
print("Probability Distribution is:")
print(probabilities)
```
Output:

```
Probability Distribution is:
[9.93307142e-01 7.51236614e-09 6.69285087e-03]
```
2. SciPy
The SciPy library can be used to calculate softmax using scipy.special.softmax, as shown below.
```python
import numpy as np
import scipy.special  # the special subpackage must be imported explicitly

vector = np.array([1.5, -3.5, 2.0])
probabilities = scipy.special.softmax(vector)
print("Probability Distribution is:")
print(probabilities)
```
Output:

```
Probability Distribution is:
[0.3765827  0.00253739 0.62087991]
```
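scipy.special.softmax also accepts an axis argument, which is handy when each row of a 2-D array holds a separate set of scores. A small sketch (the matrix here is just an illustrative example):

```python
import numpy as np
from scipy.special import softmax

matrix = np.array([[1.0, 3.0, 2.0],
                   [5.5, -13.2, 0.5]])
# axis=1 applies softmax independently to each row,
# so every row of the result sums to 1
row_probs = softmax(matrix, axis=1)
print(row_probs)
```

This row-wise form is the usual way to convert a batch of model scores into per-sample probability distributions.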
3. PyTorch
You can use PyTorch's torch.nn.Softmax(dim) to calculate softmax, specifying the dimension over which to calculate it, as shown below.
```python
import torch

vector = torch.tensor([1.5, -3.5, 2.0])
probabilities = torch.nn.Softmax(dim=-1)(vector)
print("Probability Distribution is:")
print(probabilities)
```
Output:

```
Probability Distribution is:
tensor([0.3766, 0.0025, 0.6209])
```
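PyTorch also provides a functional form, torch.nn.functional.softmax, which computes the same result without constructing a module object first:

```python
import torch
import torch.nn.functional as F

vector = torch.tensor([1.5, -3.5, 2.0])
# dim=-1 applies softmax over the last dimension of the tensor
probabilities = F.softmax(vector, dim=-1)
print(probabilities)
```

The module form (torch.nn.Softmax) is convenient inside an nn.Sequential model; the functional form is the usual choice inside a forward() method.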
Conclusion
Congratulations! You have now learned about the softmax function and several ways to implement it, and you can use it in your multi-class classification problems in machine learning.
Thanks for reading!!