Hello learners! In this tutorial, we will learn about the Softmax function and how to calculate it in Python using NumPy. We will also look at frameworks that have built-in methods for Softmax. So let's get started.
What is the Softmax function?
Softmax is a mathematical function that takes as input a vector of numbers and normalizes it to a probability distribution, where the probability for each value is proportional to the relative scale of each value in the vector.
Before applying the softmax function, the elements of the vector can lie anywhere in the range (-∞, ∞): some elements can be negative while others can be positive. After applying the softmax function, each value will lie in the range (0, 1), and the values will sum to 1 so that they can be interpreted as probabilities.
The formula for softmax calculation is

softmax(x_i) = exp(x_i) / Σ_j exp(x_j)

where we first find the exponential of each element in the vector and then divide each by the sum of the exponentials calculated.
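As a quick sanity check of the formula, it can be evaluated by hand using only the standard math module (the input vector here is just an illustrative example):

```python
import math

vector = [1.0, 3.0, 2.0]
exponentials = [math.exp(x) for x in vector]      # e^1, e^3, e^2
total = sum(exponentials)                         # denominator: sum of exponentials
probabilities = [e / total for e in exponentials]
print(probabilities)  # each value lies in (0, 1), and they sum to 1
```

The largest input (3.0) receives the largest probability, as expected.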
The softmax function is most commonly used as an activation function for multi-class classification problems, where you have a range of scores and need to find the probability of each class. It is typically used in the output layer of neural network models that predict a multinomial probability distribution.
Implementing Softmax function in Python
Now that we know the formula for calculating softmax over a vector of numbers, let's implement it. We will use NumPy's exp() method to calculate the exponential of our vector and NumPy's sum() method to calculate the denominator.
```python
import numpy as np

def softmax(vec):
    exponential = np.exp(vec)
    probabilities = exponential / np.sum(exponential)
    return probabilities

vector = np.array([1.0, 3.0, 2.0])
probabilities = softmax(vector)
print("Probability Distribution is:")
print(probabilities)
```
Output:

```
Probability Distribution is:
[0.09003057 0.66524096 0.24472847]
```
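A practical refinement worth knowing (not part of the snippet above) is the numerically stable variant: for large inputs, np.exp() can overflow, so implementations commonly subtract the maximum element first. The result is mathematically identical because the common factor exp(max) cancels in the ratio. A minimal sketch:

```python
import numpy as np

def softmax_stable(vec):
    # Subtracting the max before exponentiating prevents overflow for
    # large inputs; the probabilities are unchanged because the shared
    # factor exp(max(vec)) cancels between numerator and denominator.
    shifted = vec - np.max(vec)
    exponential = np.exp(shifted)
    return exponential / np.sum(exponential)

vector = np.array([1000.0, 1001.0, 1002.0])  # naive np.exp() would overflow here
print(softmax_stable(vector))
```

The output matches softmax applied to [0.0, 1.0, 2.0], since softmax is invariant to adding a constant to every element.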
Using frameworks to calculate softmax
Many frameworks provide methods to calculate softmax over a vector to be used in various mathematical models.
1. TensorFlow
You can use tensorflow.nn.softmax to calculate softmax over a vector, as shown below.
```python
import tensorflow as tf
import numpy as np

vector = np.array([5.5, -13.2, 0.5])
probabilities = tf.nn.softmax(vector).numpy()
print("Probability Distribution is:")
print(probabilities)
```
Output:

```
Probability Distribution is:
[9.93307142e-01 7.51236614e-09 6.69285087e-03]
```
2. SciPy
The SciPy library can be used to calculate softmax using scipy.special.softmax, as shown below.
```python
import numpy as np
import scipy.special  # the special subpackage must be imported explicitly

vector = np.array([1.5, -3.5, 2.0])
probabilities = scipy.special.softmax(vector)
print("Probability Distribution is:")
print(probabilities)
```
Output:

```
Probability Distribution is:
[0.3765827  0.00253739 0.62087991]
```
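scipy.special.softmax also accepts an axis argument, which is handy when each row of a 2-D array holds a separate set of scores. A small sketch (the matrix here is just an illustrative example):

```python
import numpy as np
from scipy.special import softmax

matrix = np.array([[1.0, 3.0, 2.0],
                   [5.5, -13.2, 0.5]])
# axis=1 applies softmax independently to each row,
# so every row of the result sums to 1
row_probs = softmax(matrix, axis=1)
print(row_probs)
```

This row-wise form is the usual way to convert a batch of model scores into per-sample probability distributions.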
3. PyTorch
You can use PyTorch's torch.nn.Softmax(dim) to calculate softmax, specifying the dimension over which to calculate it, as shown below.
```python
import torch

vector = torch.tensor([1.5, -3.5, 2.0])
probabilities = torch.nn.Softmax(dim=-1)(vector)
print("Probability Distribution is:")
print(probabilities)
```
Output:

```
Probability Distribution is:
tensor([0.3766, 0.0025, 0.6209])
```
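PyTorch also provides a functional form, torch.nn.functional.softmax, which computes the same result without constructing a module object first:

```python
import torch
import torch.nn.functional as F

vector = torch.tensor([1.5, -3.5, 2.0])
# dim=-1 applies softmax over the last dimension of the tensor
probabilities = F.softmax(vector, dim=-1)
print(probabilities)
```

The module form (torch.nn.Softmax) is convenient inside an nn.Sequential model; the functional form is the usual choice inside a forward() method.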
Conclusion
Congratulations! You have now learned about the softmax function and several ways to implement it, and you can use it in your multi-class classification problems in machine learning.
Thanks for reading!!