Backpropagation in Python – A Quick Guide

Backpropagation

Sometimes you need to improve the accuracy of your neural network model, and backpropagation is exactly what helps you achieve that. The backpropagation algorithm adjusts the model's weights so that the network makes better predictions. In this article, we will learn how the backpropagation algorithm works and how to implement it in Python.

What is backpropagation and why is it necessary?

The backpropagation algorithm is a supervised learning method for artificial neural networks in which we fine-tune the weights to improve the accuracy of the model. It employs gradient descent to reduce the cost function, i.e. the mean squared error between the predicted and the actual outputs. This type of algorithm is generally used for training feed-forward neural networks on data whose classifications are already known to us.

You can also think of backpropagation as the backward spread of errors through the network in order to achieve higher accuracy. If the prediction from a neural network model differs greatly from the actual output, we apply the backpropagation algorithm to reduce that error.
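As a rough sketch (with made-up numbers, purely to illustrate the idea), gradient descent repeatedly nudges each weight in the direction that lowers the cost:

import numpy as np

learning_rate = 0.1
weights = np.array([0.5, -0.3])                # current weights of the model (illustrative values)
gradient = np.array([0.2, -0.1])               # d(cost)/d(weights), which backpropagation computes
weights = weights - learning_rate * gradient   # step against the gradient to reduce the cost
print(weights)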

Note: Feed-forward neural networks are generally multi-layered neural networks (MLN). The data travels from the input layer to the hidden layer to the output layer.

How Does Backpropagation in Python Work?

Now let’s get some intuition about how the algorithm actually works. There are three main kinds of layers in a backpropagation model: the input layer, the hidden layer, and the output layer. The main steps of the algorithm are as follows:

  • Step 1: The input layer receives the input.
  • Step 2: The inputs are combined with the current weights and passed forward through the network.
  • Step 3: Each hidden layer processes the signal until the output layer produces a prediction. The difference between this predicted output and the desired output is referred to as the “Error”.
  • Step 4: The algorithm then moves back through the hidden layers to adjust the weights and reduce the error, as sketched in the example below.
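To make these steps concrete, here is a minimal, purely illustrative sketch for a single sigmoid neuron. The values of x, w, and target are made up for demonstration and are not part of the implementation later in this article.

import numpy as np

x = np.array([0.5, 0.1, 0.4])       # Step 1: the input layer receives the input
w = np.array([0.2, 0.8, 0.6])       # current weights connecting the input to the neuron
z = np.dot(x, w)                    # Step 2: combine the inputs with the weights
output = 1 / (1 + np.exp(-z))       # Step 3: the neuron produces a prediction (sigmoid activation)
target = 1.0                        # the desired output
error = target - output             # Step 3: the "Error"
# Step 4: move backward and nudge the weights to reduce the error
w = w + 0.1 * error * output * (1 - output) * x
print(error, w)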

Types of Backpropagation in Python

There are mainly two types of backpropagation methods: static backpropagation and recurrent backpropagation. Let’s look at what each of the two types actually means. In static backpropagation, static inputs are mapped to static outputs; this is typically used for static classification problems such as Optical Character Recognition. In recurrent backpropagation, on the other hand, the activations are fed forward repeatedly until they reach a fixed value or threshold, and only then is the error propagated backward.

Implementing Backpropagation in Python

Let’s see how we can implement Backpropagation in Python in a step-by-step manner. First of all, we need to import all the necessary libraries.

1. Import Libraries

import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

Now let’s look at what dataset we will be working with.

2. Load the Dataset

We will be working with a very simple dataset today: the Iris dataset. We will load it using the load_iris() function, which is part of the scikit-learn library. The dataset consists of three classes. We will divide it into the features and the target variable.

# Loading dataset
data = load_iris()

# Dividing the dataset into target variable and features
X=data.data
y=data.target

3. Split the Dataset into Training and Test Sets

Now we will split the dataset into training and test sets using the train_test_split() function. We pass it the features, the target, the size of the test set (test_size=20 here means 20 samples are held out), and a random_state for reproducibility.

# Split dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=20, random_state=4)

In the next step, we initialize the hyperparameters: the learning rate, the number of iterations, the input size, the size of the hidden layer, and the size of the output layer.

learning_rate = 0.1
iterations = 5000
N = y_train.size

# Number of input features
input_size = 4

# Hidden layer size
hidden_size = 2

# Output layer size (number of classes)
output_size = 3

4. Initialize Weights

np.random.seed(10)

# Hidden layer
W1 = np.random.normal(scale=0.5, size=(input_size, hidden_size))   

# Output layer
W2 = np.random.normal(scale=0.5, size=(hidden_size , output_size)) 

Now we will create the helper functions we need: sigmoid, mean squared error, and accuracy.

# Helper functions

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def mean_squared_error(y_pred, y_true):
    # One-hot encode y_true (i.e., convert [0, 1, 2] into [[1, 0, 0], [0, 1, 0], [0, 0, 1]])
    y_true_one_hot = np.eye(output_size)[y_true]
    
    # Reshape y_true_one_hot to match y_pred shape
    y_true_reshaped = y_true_one_hot.reshape(y_pred.shape)
    
    # Compute the mean squared error between y_pred and y_true_reshaped
    error = ((y_pred - y_true_reshaped)**2).sum() / (2*y_pred.size)

    return error

def accuracy(y_pred, y_true):
    acc = y_pred.argmax(axis=1) ==  y_true.argmax(axis=1)
    return acc.mean()

results = pd.DataFrame(columns=["mse", "accuracy"])
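As a quick sanity check, the helpers can be tried on a tiny hand-made example (the values below are illustrative only and assume output_size = 3 as defined above):

y_true_labels = np.array([0, 2, 1])
y_pred_probs = np.array([[0.9, 0.05, 0.05],
                         [0.1, 0.2, 0.7],
                         [0.2, 0.6, 0.2]])

print(mean_squared_error(y_pred_probs, y_true_labels))              # small error, since the predictions match the labels
print(accuracy(y_pred_probs, np.eye(output_size)[y_true_labels]))   # 1.0, every argmax matches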

Now we will start building our backpropagation model.

5. Building the Backpropagation Model in Python

We will create a for loop for the given number of iterations and update the weights in each iteration. In every iteration the model goes through three phases: feedforward propagation, error calculation, and backpropagation, after which the weights are updated.

# Training loop

for itr in range(iterations):
    # Feedforward propagation
    Z1 = np.dot(X_train, W1)
    A1 = sigmoid(Z1)
    Z2 = np.dot(A1, W2)
    A2 = sigmoid(Z2)

    # Calculate error
    mse = mean_squared_error(A2, y_train)
    acc = accuracy(np.eye(output_size)[y_train], A2)
    new_row = pd.DataFrame({"mse": [mse], "accuracy": [acc]})
    results = pd.concat([results, new_row], ignore_index=True)

    # Backpropagation
    E1 = A2 - np.eye(output_size)[y_train]   # error at the output layer
    dW1 = E1 * A2 * (1 - A2)                 # delta at the output layer (error * sigmoid derivative)
    E2 = np.dot(dW1, W2.T)                   # error propagated back to the hidden layer
    dW2 = E2 * A1 * (1 - A1)                 # delta at the hidden layer (error * sigmoid derivative)

    # Update weights (dW1 drives the W2 update, dW2 drives the W1 update)
    W2_update = np.dot(A1.T, dW1) / N
    W1_update = np.dot(X_train.T, dW2) / N
    W2 = W2 - learning_rate * W2_update
    W1 = W1 - learning_rate * W1_update

Now we will plot the mean squared error and accuracy using the pandas plot() function.

results.mse.plot(title="Mean Squared Error")
plt.show()
results.accuracy.plot(title="Accuracy")
plt.show()
Plots: mean squared error and accuracy over the training iterations.

Now we will calculate the accuracy of the model on the test set.

# Test the model

Z1 = np.dot(X_test, W1)
A1 = sigmoid(Z1)
Z2 = np.dot(A1, W2)
A2 = sigmoid(Z2)
test_acc = accuracy(np.eye(output_size)[y_test], A2)
print("Test accuracy: {}".format(test_acc))

Output:

Test accuracy: 0.95

You can see that the model reaches a test accuracy of about 95%.

Putting It All Together

To make things easier, here’s the complete code for performing backpropagation.

import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

# Loading dataset
data = load_iris()
X = data.data
y = data.target

# Split dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=20, random_state=4)

# Hyperparameters
learning_rate = 0.1
iterations = 5000
N = y_train.size
input_size = 4
hidden_size = 2
output_size = 3

np.random.seed(10)
W1 = np.random.normal(scale=0.5, size=(input_size, hidden_size))
W2 = np.random.normal(scale=0.5, size=(hidden_size, output_size))

# Helper functions

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def mean_squared_error(y_pred, y_true):
    # One-hot encode y_true (i.e., convert [0, 1, 2] into [[1, 0, 0], [0, 1, 0], [0, 0, 1]])
    y_true_one_hot = np.eye(output_size)[y_true]
    
    # Reshape y_true_one_hot to match y_pred shape
    y_true_reshaped = y_true_one_hot.reshape(y_pred.shape)
    
    # Compute the mean squared error between y_pred and y_true_reshaped
    error = ((y_pred - y_true_reshaped)**2).sum() / (2*y_pred.size)

    return error

def accuracy(y_pred, y_true):
    acc = y_pred.argmax(axis=1) ==  y_true.argmax(axis=1)
    return acc.mean()

results = pd.DataFrame(columns=["mse", "accuracy"])

# Training loop

for itr in range(iterations):
    # Feedforward propagation
    Z1 = np.dot(X_train, W1)
    A1 = sigmoid(Z1)
    Z2 = np.dot(A1, W2)
    A2 = sigmoid(Z2)

    # Calculate error
    mse = mean_squared_error(A2, y_train)
    acc = accuracy(np.eye(output_size)[y_train], A2)
    new_row = pd.DataFrame({"mse": [mse], "accuracy": [acc]})
    results = pd.concat([results, new_row], ignore_index=True)

    # Backpropagation
    E1 = A2 - np.eye(output_size)[y_train]
    dW1 = E1 * A2 * (1 - A2)
    E2 = np.dot(dW1, W2.T)
    dW2 = E2 * A1 * (1 - A1)

    # Update weights
    W2_update = np.dot(A1.T, dW1) / N
    W1_update = np.dot(X_train.T, dW2) / N
    W2 = W2 - learning_rate * W2_update
    W1 = W1 - learning_rate * W1_update

# Visualizing the results

results.mse.plot(title="Mean Squared Error")
plt.show()

results.accuracy.plot(title="Accuracy")
plt.show()

# Test the model

Z1 = np.dot(X_test, W1)
A1 = sigmoid(Z1)
Z2 = np.dot(A1, W2)
A2 = sigmoid(Z2)
test_acc = accuracy(np.eye(output_size)[y_test], A2)
print("Test accuracy: {}".format(test_acc))

Output

Test accuracy: 0.95
Plots: mean squared error and accuracy over the training iterations.

Advantages of Backpropagation in Python

Backpropagation is a relatively fast and simple algorithm to implement, and it is used extensively in fields such as face recognition and speech recognition. Moreover, it is a flexible method, as it requires no prior knowledge about the network beyond its architecture.

Disadvantages of Backpropagation

The algorithm does not handle noisy and irregular data well, and its performance depends heavily on the quality of the input data.

Conclusion

We learnt that backpropagation is a great way to improve the accuracy of a feed-forward neural network model. It is a simple and flexible algorithm, but it does not work well with noisy data. It reduces the error and improves the accuracy of the model by working backwards through the network, optimizing the weights so as to minimize the loss function with the help of gradient descent.