# Optimizing Neural Networks with torch.optim in PyTorch

PyTorch is a widely used machine learning library for the Python programming language. Its torch.optim module supplies the optimization algorithms used to train neural network models.

In this article, we will look at the torch.optim module in depth and walk through its key components with Python implementations.

The torch.optim module in PyTorch provides various optimization algorithms commonly used for training neural networks. These algorithms minimize the loss function by adjusting the weights and biases of the network, ultimately improving the model’s performance.


## What is torch.optim?

As mentioned above, the torch.optim module collects the optimization algorithms most commonly used to minimize the loss function while training neural networks. In short, each optimizer takes the gradients computed during backpropagation and uses them to update the weights and biases of the network, which improves the performance of the model.
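To make the idea of "adjusting the weights" concrete, here is a minimal sketch of a single update. It assumes a hypothetical one-weight problem with the toy loss `(w - 3)^2` and a learning rate of 0.1, none of which come from the article; it only checks that plain SGD applies the rule `w = w - lr * grad`.

```python
import torch
import torch.optim as optim

# Hypothetical toy problem: one trainable weight, loss = (w - 3)^2
w = torch.tensor([1.0], requires_grad=True)
optimizer = optim.SGD([w], lr=0.1)

loss = ((w - 3.0) ** 2).sum()
loss.backward()  # gradient is 2 * (w - 3) = -4 at w = 1

# Manual version of the plain SGD rule: w - lr * grad
expected_w = w.detach().clone() - 0.1 * w.grad

# The optimizer applies the same update in place
optimizer.step()

print(w.item(), expected_w.item())
```

Both values should agree, since SGD without momentum or weight decay is exactly this rule.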

### Key Components of torch.optim

1. Optimizer Classes

torch.optim provides a class for each optimization algorithm. Some popular optimizers are SGD (stochastic gradient descent, which moves each parameter against its gradient to reduce the loss), Adam (which combines momentum with RMSprop-style adaptive learning rates), Adagrad (which adapts the learning rate of each parameter based on its accumulated past gradients), and RMSprop (an adaptive algorithm that scales the learning rate by a moving average of squared gradients).
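As a quick illustration, the sketch below constructs each of these optimizers for the same model. The tiny `nn.Linear` model and all hyperparameter values are placeholders chosen for the example, not recommendations.

```python
import torch.nn as nn
import torch.optim as optim

# Hypothetical tiny model, used only to have some parameters to optimize
model = nn.Linear(4, 2)

# Each optimizer takes the parameters to update plus its own hyperparameters
optimizers = {
    "SGD": optim.SGD(model.parameters(), lr=0.01, momentum=0.9),
    "Adam": optim.Adam(model.parameters(), lr=0.001),
    "Adagrad": optim.Adagrad(model.parameters(), lr=0.01),
    "RMSprop": optim.RMSprop(model.parameters(), lr=0.01),
}

for name, opt in optimizers.items():
    print(name, "lr =", opt.param_groups[0]["lr"])
```

In practice only one optimizer is created per model; they are built side by side here just to compare the constructors.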

2. Parameter Groups

An optimizer in PyTorch can manage multiple parameter groups. A parameter group is essentially a dictionary that holds a set of parameters together with the optimization options that apply to them. This allows users to set different learning rates, weight decay values, and other hyperparameters for different parts of the model.
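A minimal sketch of parameter groups might look like the following. The two-layer `nn.Sequential` model and the specific learning rates are assumptions made for the example; the point is that the second group overrides the default `lr` while still inheriting `momentum`.

```python
import torch.nn as nn
import torch.optim as optim

# Hypothetical two-layer model
model = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 5))

optimizer = optim.SGD(
    [
        {"params": model[0].parameters()},               # uses the default lr below
        {"params": model[2].parameters(), "lr": 0.001},  # overrides lr for this group
    ],
    lr=0.01,       # default for groups that do not set their own lr
    momentum=0.9,  # shared by both groups
)

# Each group is a dictionary with its parameters and options
for i, group in enumerate(optimizer.param_groups):
    print(i, group["lr"], group["momentum"], len(group["params"]))
```

Each `nn.Linear` layer contributes a weight and a bias, so both groups should report two parameter tensors.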

3. Learning Rate Schedulers

torch.optim also includes learning rate schedulers, in the torch.optim.lr_scheduler submodule, that adjust the learning rate during training. Some common schedulers are StepLR and MultiStepLR.

Let us now look at torch.optim in action with an example in Python.
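To show how a scheduler changes the learning rate over time, here is a small sketch using StepLR. The model, the starting rate of 0.1, and the `step_size=3`, `gamma=0.5` settings are illustrative assumptions; the loop omits the actual forward and backward passes and only records the rate used in each epoch.

```python
import torch.nn as nn
import torch.optim as optim
from torch.optim.lr_scheduler import StepLR

model = nn.Linear(4, 2)  # hypothetical model
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Multiply the learning rate by gamma every step_size epochs
scheduler = StepLR(optimizer, step_size=3, gamma=0.5)

lrs = []
for epoch in range(9):
    lrs.append(optimizer.param_groups[0]["lr"])
    # A real loop would run the forward and backward passes here
    optimizer.step()   # update parameters first
    scheduler.step()   # then advance the schedule

print(lrs)
```

With these settings the rate should stay at 0.1 for three epochs, drop to 0.05 for the next three, and then to 0.025.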


## Example: SGD Optimizer

In this example, we will create a simple neural network and train it on a dataset using the SGD optimizer. Let us look at the code.

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Define a simple neural network class
class SimpleNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

# Set random seed for reproducibility
torch.manual_seed(42)

# Define input size, hidden size, and output size
input_size = 10
hidden_size = 20
output_size = 5

# Create an instance of the SimpleNN class
model = SimpleNN(input_size, hidden_size, output_size)

# Define a synthetic dataset
input_data = torch.randn(100, input_size)
target = torch.randn(100, output_size)

# Define a loss function
criterion = nn.MSELoss()

# Define the SGD optimizer
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Training loop
epochs = 100
for epoch in range(epochs):
    # Forward pass
    output = model(input_data)

    # Compute the loss
    loss = criterion(output, target)

    # Backward pass and optimization
    optimizer.zero_grad()  # clear gradients left over from the previous step
    loss.backward()
    optimizer.step()

    # Print the loss every 10 epochs
    if (epoch + 1) % 10 == 0:
        print(f'Epoch [{epoch+1}/{epochs}], Loss: {loss.item():.4f}')
```

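Running the script prints the loss once every 10 epochs.

Thus we have used the SGD optimizer to minimize the mean squared error loss. The learning rate is set to 0.01 and the model is trained for 100 epochs. Because the whole dataset is passed through the model at once, each epoch is a single optimizer step.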
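A detail that matters in any training loop of this kind: PyTorch adds newly computed gradients to whatever is already stored in `.grad`, so the stored values are usually cleared with `optimizer.zero_grad()` before each backward pass. The following sketch, which assumes a hypothetical single-weight problem with the toy loss `3 * w`, shows the accumulation and the reset.

```python
import torch

# Hypothetical single weight with loss = 3 * w, so the gradient is always 3
w = torch.tensor([2.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

(w * 3).sum().backward()
first = w.grad.clone()        # gradient after one backward pass

(w * 3).sum().backward()
accumulated = w.grad.clone()  # second pass is added to the first

optimizer.zero_grad()         # clear the stored gradient
(w * 3).sum().backward()
after_reset = w.grad.clone()  # back to a single pass worth of gradient

print(first.item(), accumulated.item(), after_reset.item())
```

If the reset is skipped, every step would use the sum of all past gradients rather than the current one.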

Let us look at another Python example, this time using the Adam optimizer.

```python
import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt
import numpy as np

# Generate synthetic dataset
torch.manual_seed(42)  # For reproducibility

# Generate random data
X = torch.unsqueeze(torch.linspace(-1, 1, 100), dim=1)
y = 3 * X + 1 + 0.2 * torch.randn(X.size())

# Define a simple linear regression model
class LinearRegression(nn.Module):
    def __init__(self):
        super(LinearRegression, self).__init__()
        self.linear = nn.Linear(1, 1)

    def forward(self, x):
        return self.linear(x)

# Instantiate the model
model = LinearRegression()

# Define the Mean Squared Error (MSE) loss
criterion = nn.MSELoss()

# Define the Adam optimizer (lr=0.01 is an illustrative choice)
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Training loop
num_epochs = 1000
losses = []

for epoch in range(num_epochs):
    # Forward pass
    predictions = model(X)
    loss = criterion(predictions, y)

    # Backward pass and optimization
    optimizer.zero_grad()  # clear gradients left over from the previous step
    loss.backward()
    optimizer.step()

    # Save the loss for plotting
    losses.append(loss.item())

    # Print the loss every 100 epochs
    if (epoch + 1) % 100 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

# Plot the training progress
plt.plot(range(1, num_epochs+1), losses, label='Training Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training Loss over Epochs')
plt.legend()
plt.show()

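# Make predictions using the trained model (no gradient tracking is needed,
# and .numpy() cannot be called on a tensor that requires grad)
with torch.no_grad():
    predicted_y = model(X)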

# Plot the original data and the predicted values
plt.scatter(X.numpy(), y.numpy(), label='Original Data')
plt.plot(X.numpy(), predicted_y.numpy(), 'r-', label='Predicted Line')
plt.xlabel('X')
plt.ylabel('y')