Hello learners! In this tutorial, we will learn about the Softmax function and how to calculate it in Python using NumPy. We will also look at frameworks that provide built-in methods for Softmax. So let's get started.
What is the Softmax function?
Softmax is a mathematical function that takes as input a vector of numbers and normalizes it to a probability distribution, where the probability for each value is proportional to the relative scale of each value in the vector.
Before the softmax function is applied, the elements of the vector can be any real numbers: some can be negative while others are positive. After applying the softmax function, each value lies in the range [0, 1], and the values sum to 1 so that they can be interpreted as probabilities.
The formula for softmax calculation is

softmax(x_i) = exp(x_i) / Σ_j exp(x_j)

where we first take the exponential of each element in the vector and then divide it by the sum of all the exponentials.
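As a quick sanity check of the formula, we can apply it directly to a small vector containing a negative value and confirm the two properties described above (every output in [0, 1], outputs summing to 1):

```python
import numpy as np

# Apply the softmax formula directly to a small example vector.
vec = np.array([-1.0, 0.0, 2.0])
exponentials = np.exp(vec)
probs = exponentials / exponentials.sum()

print(probs)        # every value lies in [0, 1]
print(probs.sum())  # sums to 1 (up to floating-point rounding)
```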
The softmax function is most commonly used as an activation function for multi-class classification problems, where you have a range of values and need to find the probability of occurrence of each class. It is used in the output layer of neural network models that predict a multinomial probability distribution.
Implementing Softmax function in Python
Now that we know the formula for calculating softmax over a vector of numbers, let's implement it. We will use NumPy's exp() method to calculate the exponential of our vector and NumPy's sum() method to calculate the denominator.
```python
import numpy as np

def softmax(vec):
    exponential = np.exp(vec)
    probabilities = exponential / np.sum(exponential)
    return probabilities

vector = np.array([1.0, 3.0, 2.0])
probabilities = softmax(vector)
print("Probability Distribution is:")
print(probabilities)
```
```
Probability Distribution is:
[0.09003057 0.66524096 0.24472847]
```
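One caveat worth knowing: for large inputs, np.exp() can overflow and produce nan values. A common trick is to subtract the maximum element from the vector before exponentiating; this does not change the result, because the extra factor cancels out in the numerator and denominator. A minimal sketch of this numerically stable variant:

```python
import numpy as np

def softmax_stable(vec):
    # Subtracting the maximum before exponentiating prevents overflow
    # for large inputs; the result is mathematically identical.
    shifted = vec - np.max(vec)
    exponential = np.exp(shifted)
    return exponential / np.sum(exponential)

# The naive version would overflow on this input (exp(1000) is too large
# for a float64); the stable version handles it without any nan values.
vector = np.array([1000.0, 1001.0, 1002.0])
print(softmax_stable(vector))
```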
Using frameworks to calculate softmax
Many frameworks provide methods to calculate softmax over a vector to be used in various mathematical models.
You can use tensorflow.nn.softmax to calculate softmax over a vector as shown below.
```python
import tensorflow as tf
import numpy as np

vector = np.array([5.5, -13.2, 0.5])
probabilities = tf.nn.softmax(vector).numpy()
print("Probability Distribution is:")
print(probabilities)
```
```
Probability Distribution is:
[9.93307142e-01 7.51236614e-09 6.69285087e-03]
```
The SciPy library can be used to calculate softmax using scipy.special.softmax as shown below.
```python
import scipy.special  # "import scipy" alone does not expose scipy.special
import numpy as np

vector = np.array([1.5, -3.5, 2.0])
probabilities = scipy.special.softmax(vector)
print("Probability Distribution is:")
print(probabilities)
```
```
Probability Distribution is:
[0.3765827  0.00253739 0.62087991]
```
You can use PyTorch's torch.nn.Softmax(dim) to calculate softmax, specifying the dimension over which to calculate it, as shown.
```python
import torch

vector = torch.tensor([1.5, -3.5, 2.0])
probabilities = torch.nn.Softmax(dim=-1)(vector)
print("Probability Distribution is:")
print(probabilities)
```
```
Probability Distribution is:
tensor([0.3766, 0.0025, 0.6209])
```
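The dim argument matters once you move beyond a single vector. For a 2-D tensor of scores (a hypothetical batch, where each row is one example), dim=-1 normalizes each row independently so every row forms its own probability distribution:

```python
import torch

# Hypothetical 2x3 batch of scores: each row is one example.
batch = torch.tensor([[1.0, 2.0, 3.0],
                      [1.0, 1.0, 1.0]])

# dim=-1 applies softmax along the last dimension, i.e. per row.
row_probs = torch.nn.Softmax(dim=-1)(batch)
print(row_probs)
print(row_probs.sum(dim=-1))  # each row sums to 1
```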
Congratulations! You have now learned about the softmax function and several ways to implement it, and you can use it for your multi-class classification problems in Machine Learning.
Thanks for reading!!