A Complete Guide to Keras Loss Functions


Deep learning is one of today’s most active technologies, with recent developments such as deepfakes and autonomous vehicles. In this post, we will look at one crucial element of building deep learning models: the loss function.

Suppose you have finished building your model and want to check whether it works as expected. Loss functions help in exactly this scenario: they tell you how close the model’s results are to the expected outputs.

In other words, a loss function (also called an objective function or a cost function) measures the difference between the actual and predicted values of a model. Loss functions also help to assess the model’s performance and how well it fits the training data.

Loss functions play an important role in backpropagation, where the gradient of the loss with respect to the model’s parameters is propagated backward through the network and used to update those parameters.

Through this article, we will understand loss functions thoroughly and focus on the types of loss functions available in the Keras library.


What Is a Loss Function?

A loss function, just as the name suggests, calculates the loss, that is, the difference between the model’s predicted values and the actual target values. When training a model, our main objective is to minimize this loss to obtain an optimized model.

During training, the weights and biases of a deep learning model are updated repeatedly to minimize this loss.
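A single update step can be sketched as follows. This is a minimal illustration, assuming TensorFlow 2.x; the toy data, the one-neuron linear model, the learning rate, and the helper `loss_value` are all hypothetical choices made for this example.

```python
import tensorflow as tf

# Toy data following y = 2x + 1 (hypothetical values, for illustration only).
x = tf.constant([[0.0], [1.0], [2.0], [3.0]])
y = tf.constant([[1.0], [3.0], [5.0], [7.0]])

# Trainable weight and bias of a one-neuron linear model.
w = tf.Variable([[0.0]])
b = tf.Variable([0.0])

mse = tf.keras.losses.MeanSquaredError()
learning_rate = 0.05  # assumed value


def loss_value():
    """Loss of the current model on the toy data."""
    return mse(y, tf.matmul(x, w) + b)


loss_before = float(loss_value())

# One gradient-descent step: compute the gradients of the loss with respect
# to w and b, then move both parameters against their gradients.
with tf.GradientTape() as tape:
    loss = loss_value()
grad_w, grad_b = tape.gradient(loss, [w, b])
w.assign_sub(learning_rate * grad_w)
b.assign_sub(learning_rate * grad_b)

loss_after = float(loss_value())
print(loss_before, loss_after)  # the loss should drop after the step
```

In practice, a Keras optimizer and `model.fit` perform these steps for you; the sketch only makes the role of the loss explicit.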

The general loss function or cost function can be considered as below.

[Figure: Loss Function]

J is the loss function, w^T is the (transposed) weight vector, and b is the bias applied to the network. ŷ is the predicted value and y is the actual value. Coming to the topic at hand, let us take a look at the loss functions the Keras library has to offer.
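One common way to write such a cost function, consistent with the symbols described above, is shown below. The number of training examples m, the per-example loss L, the activation σ, and the example index (i) are assumptions introduced here for illustration and may differ from the figure this section refers to.

```latex
J(w, b) = \frac{1}{m} \sum_{i=1}^{m} L\left(\hat{y}^{(i)}, y^{(i)}\right),
\qquad
\hat{y}^{(i)} = \sigma\left(w^{T} x^{(i)} + b\right)
```

Read this way, J averages the per-example loss over the training set, and training searches for the w and b that make J as small as possible.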

Keras Loss Functions

The Keras library provides a Pythonic, high-level interface for building deep learning models, which can then be deployed to targets such as smartphones and the web. It is open source and offers an extensive set of loss functions for different use cases.

This article covers two types of losses, probabilistic and regression, each of which includes a variety of loss functions.

Probabilistic Losses

Probabilistic losses can be used for both classification and regression tasks. They suit models whose output is a probability, or can be interpreted as one.

These are the available probabilistic losses. You might notice that the list looks repetitive. That is because each loss can be used in the form of a class or a function. Both forms serve the same purpose but differ in their names: the class uses CamelCase (BinaryCrossentropy), while the function uses snake_case (binary_crossentropy).

  • BinaryCrossentropy class
  • CategoricalCrossentropy class
  • SparseCategoricalCrossentropy class
  • Poisson class
  • binary_crossentropy function
  • categorical_crossentropy function
  • sparse_categorical_crossentropy function
  • poisson function
  • KLDivergence class
  • kl_divergence function
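The difference between the class form and the function form can be sketched as follows. This is a minimal illustration, assuming TensorFlow 2.x with default loss settings; the label and prediction values are made up for the example.

```python
import numpy as np
import tensorflow as tf

# Two samples with two binary labels each (illustrative values).
y_true = np.array([[0.0, 1.0], [0.0, 0.0]], dtype="float32")
y_pred = np.array([[0.6, 0.4], [0.4, 0.6]], dtype="float32")

# Class form: create a loss object, then call it.
# With default settings it returns one scalar averaged over the batch.
bce = tf.keras.losses.BinaryCrossentropy()
loss_from_class = bce(y_true, y_pred).numpy()

# Function form: call the function directly.
# It returns one loss value per sample.
loss_per_sample = tf.keras.losses.binary_crossentropy(y_true, y_pred).numpy()

print(loss_from_class)
print(loss_per_sample)
```

The class form is convenient when passing a configured loss to `model.compile`, while the function form is handy when per-sample values are needed.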

Let us see the usage of each loss function.

BinaryCrossentropy class

The binary cross-entropy loss computes the cross-entropy between the true labels and the predicted labels. It is used for classification problems with a binary target (0 or 1).

Let us see an example of using this loss function.

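import tensorflow as tf

y_true = [0, 1, 1, 0]                  # actual labels
y_pred = [-18.6, 0.51, 2.94, -12.8]    # raw model outputs (logits)
bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
bce(y_true, y_pred).numpy()            # mean loss over the four samples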

There are two lists: the actual values (y_true) and the predicted values (y_pred). The variable bce holds an instance of the binary cross-entropy loss class, and calling it computes the loss between the predicted and actual values. Because from_logits=True, y_pred is interpreted as raw logits rather than probabilities.

[Figure: Binary cross-entropy loss class]
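To see what the class computes, the same loss can be reproduced by hand. The sketch below assumes TensorFlow 2.x and NumPy; it applies the sigmoid to the logits and then the standard binary cross-entropy formula, and compares the result with Keras.

```python
import numpy as np
import tensorflow as tf

y_true = np.array([0.0, 1.0, 1.0, 0.0], dtype="float32")
logits = np.array([-18.6, 0.51, 2.94, -12.8], dtype="float32")

# Convert logits to probabilities with the sigmoid function
# (float64 keeps the very small probabilities from underflowing).
probs = 1.0 / (1.0 + np.exp(-logits.astype("float64")))

# Binary cross-entropy: -[y*log(p) + (1-y)*log(1-p)], averaged over samples.
manual_loss = -np.mean(
    y_true * np.log(probs) + (1.0 - y_true) * np.log(1.0 - probs)
)

keras_loss = tf.keras.losses.BinaryCrossentropy(from_logits=True)(
    y_true, logits
).numpy()

print(manual_loss, keras_loss)  # the two values should agree closely
```

Passing logits with from_logits=True lets Keras compute the loss in a numerically stable way, which matters for extreme values such as -18.6.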

In the same way, the binary cross-entropy function can be called using the following syntax.

tf.keras.losses.binary_crossentropy(
    y_true, y_pred, from_logits=False, label_smoothing=0.0, axis=-1
)

CategoricalCrossentropy class

The categorical cross-entropy loss is used when there are more than two class labels. The labels must be provided in one-hot encoded form: each label is a vector with a 1 at the position of the true class and 0 everywhere else.
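One-hot encoding can be sketched as follows. This is a minimal illustration, assuming TensorFlow 2.x; it uses the Keras utility `to_categorical`, and the integer labels are made up for the example.

```python
import numpy as np
import tensorflow as tf

# Integer class labels for two samples (0-based class indices).
int_labels = np.array([1, 2])

# Convert to one-hot vectors over three classes.
one_hot_labels = tf.keras.utils.to_categorical(int_labels, num_classes=3)

print(one_hot_labels)
# [[0. 1. 0.]
#  [0. 0. 1.]]
```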

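import tensorflow as tf

y_true = [[0, 1, 0], [0, 0, 1]]                 # one-hot labels: second class, then third class
y_pred = [[0.05, 0.95, 0], [0.1, 0.8, 0.95]]    # predicted score per class
cce = tf.keras.losses.CategoricalCrossentropy()
cce(y_true, y_pred).numpy()                     # mean loss over the two samples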

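There are two instances and three classes: the first instance belongs to the second class, and the second instance belongs to the third class. The y_pred array gives the predicted probability of each instance belonging to each class. Strictly, each row of y_pred should sum to 1. The second row here does not, and Keras rescales each row to sum to 1 before computing the loss.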

[Figure: Categorical cross-entropy class]

The categorical cross-entropy function can be called from the Keras framework as below.

tf.keras.losses.categorical_crossentropy(
    y_true, y_pred, from_logits=False, label_smoothing=0.0, axis=-1
)

SparseCategoricalCrossentropy class

This class is used when the labels are integer class indices rather than one-hot vectors (for example, 0, 1, 2). Compared with the categorical cross-entropy class, only the y_true variable changes.

The function can be called from Keras in a similar way.

tf.keras.losses.sparse_categorical_crossentropy(
    y_true, y_pred, from_logits=False, axis=-1, ignore_class=None
)
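The relationship between the sparse and the one-hot variant can be sketched as follows. This is a minimal illustration, assuming TensorFlow 2.x; the predictions and labels are made up, and the two label arrays describe the same classes in different encodings.

```python
import numpy as np
import tensorflow as tf

# Predicted class probabilities for two samples and three classes.
y_pred = np.array([[0.05, 0.95, 0.0], [0.1, 0.1, 0.8]], dtype="float32")

# The same labels, once as integer class indices and once one-hot encoded.
y_true_int = np.array([1, 2])
y_true_onehot = np.array([[0, 1, 0], [0, 0, 1]], dtype="float32")

sparse_loss = tf.keras.losses.SparseCategoricalCrossentropy()(
    y_true_int, y_pred
).numpy()
dense_loss = tf.keras.losses.CategoricalCrossentropy()(
    y_true_onehot, y_pred
).numpy()

print(sparse_loss, dense_loss)  # both encodings should give the same loss
```

The sparse variant avoids building large one-hot arrays, which helps when the number of classes is large.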

Poisson Class

The Poisson loss is used mainly when predicting count data. It suits regression tasks such as predicting the number of customers purchasing a product.

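The Poisson class and function can be called using the syntax below. Here tf_keras refers to the standalone Keras 2 package; the same losses are also available under tf.keras.losses.

Poisson class:
tf_keras.losses.Poisson(reduction="auto", name="poisson")

Poisson function:
tf_keras.losses.poisson(y_true, y_pred)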
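A usage sketch for the Poisson loss follows. It assumes TensorFlow 2.x and uses made-up counts and predicted rates. The manual check relies on the formula given in the Keras documentation, loss = y_pred - y_true * log(y_pred), averaged over the last axis; Keras also adds a tiny epsilon inside the logarithm, which is ignored here.

```python
import numpy as np
import tensorflow as tf

y_true = np.array([[2.0, 0.0, 5.0]], dtype="float32")  # observed counts
y_pred = np.array([[1.5, 0.5, 4.0]], dtype="float32")  # predicted rates

# Class form with default settings.
keras_loss = tf.keras.losses.Poisson()(y_true, y_pred).numpy()

# Manual computation: mean of y_pred - y_true * log(y_pred).
manual_loss = np.mean(y_pred - y_true * np.log(y_pred))

print(keras_loss, manual_loss)  # the two values should agree closely
```

Note that the Poisson loss can be negative, as it is for these values, so only its relative size across models is meaningful.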

KL Divergence Loss

In general, the Kullback-Leibler (KL) divergence measures how one probability distribution differs from another. The KL divergence loss class and function compute the KL loss between the actual and predicted values.

The KL loss is calculated as follows, with the sum taken over the classes (the last axis):

loss = sum(y_true * log(y_true / y_pred), axis=-1)

The KL divergence class and function can be called in the same way as the other losses.

KL divergence class:
tf_keras.losses.KLDivergence(reduction="auto", name="kl_divergence")

KL divergence function:
tf_keras.losses.kl_divergence(y_true, y_pred)
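A usage sketch for the KL divergence loss follows. It assumes TensorFlow 2.x; the two distributions are made up and strictly positive, so the clipping Keras applies internally has no visible effect on the result.

```python
import numpy as np
import tensorflow as tf

# Two probability distributions over three classes (illustrative values).
y_true = np.array([[0.1, 0.6, 0.3]], dtype="float32")
y_pred = np.array([[0.2, 0.5, 0.3]], dtype="float32")

# Class form with default settings.
keras_loss = tf.keras.losses.KLDivergence()(y_true, y_pred).numpy()

# Manual computation: sum y_true * log(y_true / y_pred) over the classes,
# then average over the samples.
manual_loss = np.mean(np.sum(y_true * np.log(y_true / y_pred), axis=-1))

print(keras_loss, manual_loss)  # the two values should agree closely
```

The loss is zero when the two distributions are identical and grows as the prediction drifts away from the target distribution.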

Regression Losses

The regression losses are used for regression problems, which typically predict a continuous numerical value.

Similar to the probabilistic losses, the regression losses can also be used in both class and function representations.

These are the loss functions Keras provides for regression tasks.

  • MeanSquaredError class or mean_squared_error function
  • MeanAbsoluteError class or mean_absolute_error function
  • MeanAbsolutePercentageError class or mean_absolute_percentage_error function
  • MeanSquaredLogarithmicError class or mean_squared_logarithmic_error function
  • CosineSimilarity class or cosine_similarity function

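These losses follow the same calling conventions as the probabilistic losses: create an instance of the class and call it on y_true and y_pred, or call the function directly, for example tf.keras.losses.mean_squared_error(y_true, y_pred).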
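A minimal sketch of two regression losses in both forms is shown below, assuming TensorFlow 2.x. The target and prediction values are made up for the example.

```python
import numpy as np
import tensorflow as tf

y_true = np.array([[1.0, 2.0, 3.0]], dtype="float32")
y_pred = np.array([[1.5, 2.0, 2.0]], dtype="float32")

# Class form: one scalar averaged over the batch.
mse_class = tf.keras.losses.MeanSquaredError()(y_true, y_pred).numpy()
mae_class = tf.keras.losses.MeanAbsoluteError()(y_true, y_pred).numpy()

# Function form: one value per sample.
mse_function = tf.keras.losses.mean_squared_error(y_true, y_pred).numpy()

print(mse_class)     # (0.25 + 0 + 1) / 3
print(mae_class)     # (0.5 + 0 + 1) / 3
print(mse_function)  # array with a single entry, one per sample
```

Because the squared error penalizes the one large miss more heavily than the absolute error does, MSE is the more outlier-sensitive of the two.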




To recapitulate, we have discussed what loss functions are and looked at the types of loss functions available in the Keras library. The right choice of loss function depends on the use case and on the type of variable being predicted.


Keras Documentation