What Are the Pre-trained Models Available in PyTorch?


In any machine learning or deep learning project, the most crucial step after collecting and preprocessing the data is model selection. Choosing the right model yields good accuracy on the training data and strong performance on the test data.

PyTorch offers a variety of pre-trained deep learning models, such as ResNet, AlexNet, and VGG, for computer vision tasks. We can load them easily with get_model() and use their ready-made weights to build powerful applications for image classification, segmentation, and detection without training models from scratch.

However, building a model is not a piece of cake: the weights have to be learned, the hyperparameters need plenty of tweaking, and, of course, there is the cost of computing resources. In such cases, pre-trained models come to our rescue.

Pre-trained models are regular deep learning models, such as ResNet and VGG, that have already been trained on large benchmark datasets like ImageNet and CIFAR.

In this tutorial, we are going to take a look at the PyTorch pre-trained models.


Pre-trained Models in torchvision

In this section, we cover the pre-trained models available in PyTorch for computer vision tasks, along with their use cases.

Under the torchvision package, there are many pre-trained models (with or without weights) for the following tasks: classification, semantic segmentation, object detection, instance segmentation, and keypoint detection.
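As a quick orientation, here is a minimal sketch (assuming a recent torchvision, 0.13 or later) of where the models for each task live; the builder functions shown are just a few representative examples.

import torchvision.models as models

# Classification models sit at the top level of torchvision.models
classifier = models.resnet18(weights=None)

# Semantic segmentation models live under torchvision.models.segmentation
segmenter = models.segmentation.fcn_resnet50(weights=None)

# Object detection, instance segmentation, and keypoint detection models
# live under torchvision.models.detection
detector = models.detection.fasterrcnn_resnet50_fpn(weights=None)
instance_segmenter = models.detection.maskrcnn_resnet50_fpn(weights=None)
keypoint_detector = models.detection.keypointrcnn_resnet50_fpn(weights=None)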

Available Pre-trained Classification Models

Image classification is the task of assigning a label to an image, for example classifying an image as a cat or a dog.

For this task, we have many models to choose from, such as MobileNet, ResNet, VGG, and AlexNet.

AlexNet

The AlexNet model was introduced in the paper “ImageNet Classification with Deep Convolutional Neural Networks”. As the name suggests, it was trained on the ImageNet dataset, and it was one of the first convolutional networks to be trained on GPUs.

torchvision.models.alexnet(*, weights: Optional[AlexNet_Weights] = None, progress: bool = True, **kwargs: Any) → AlexNet
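As a quick example of how any of these classification builders is used, here is a minimal sketch (assuming torchvision 0.13+) that loads AlexNet with its ImageNet weights and runs it on a dummy input:

import torch
from torchvision.models import alexnet, AlexNet_Weights

# Load AlexNet with its ImageNet weights; weights=None would give a randomly initialized model
model = alexnet(weights=AlexNet_Weights.DEFAULT)
model.eval()

# Forward pass on a dummy batch of one 224x224 RGB image
dummy = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    logits = model(dummy)
print(logits.shape)  # torch.Size([1, 1000]): one score per ImageNet class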

Convnext

ConvNeXt, introduced in the paper “A ConvNet for the 2020s”, is a convolutional architecture inspired by Vision Transformers (ViTs). PyTorch has four variants of this model: convnext_tiny, convnext_small, convnext_base, and convnext_large.

The base variant can be loaded as shown below.

torchvision.models.convnext_base(*, weights: Optional[ConvNeXt_Base_Weights] = None, progress: bool = True, **kwargs: Any) → ConvNeXt
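Each weights enum also bundles the preprocessing used during training, exposed through its transforms() method. Here is a sketch of classifying an image with ConvNeXt; the image path cat.jpg is just a placeholder for a local file.

import torch
from PIL import Image
from torchvision.models import convnext_base, ConvNeXt_Base_Weights

weights = ConvNeXt_Base_Weights.DEFAULT
model = convnext_base(weights=weights)
model.eval()

# The weights enum ships the exact resize/crop/normalize pipeline the model expects
preprocess = weights.transforms()

img = Image.open("cat.jpg")  # placeholder path to a local image
batch = preprocess(img).unsqueeze(0)

with torch.no_grad():
    probs = model(batch).softmax(dim=1)
class_id = int(probs.argmax())
print(weights.meta["categories"][class_id], float(probs[0, class_id]))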

DenseNet

The Densely Connected Convolutional Network (DenseNet) is a feed-forward CNN in which each layer is connected to every subsequent layer. It was developed to alleviate the vanishing gradient problem seen in deeper networks. The network has four variants based on the number of layers: DenseNet-121, DenseNet-161, DenseNet-169, and DenseNet-201.

torchvision.models.densenet121(*, weights: Optional[DenseNet121_Weights] = None, progress: bool = True, **kwargs: Any) → DenseNet
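The extra keyword arguments in these builders are forwarded to the underlying model class. For instance, here is a sketch of creating an untrained DenseNet-121 for a hypothetical 10-class problem:

from torchvision.models import densenet121

# num_classes is forwarded to the DenseNet constructor; this only works cleanly with
# weights=None, since the pretrained weights expect the original 1000 ImageNet classes
model = densenet121(weights=None, num_classes=10)
print(model.classifier)  # Linear(in_features=1024, out_features=10, bias=True)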

EfficientNet

EfficientNet, just as the name suggests, is an efficient convolutional neural network: it uniformly scales the network's depth, width, and input resolution with a compound coefficient, which keeps the computing requirements in check as the model grows. It has several variants, efficientnet_b0 through efficientnet_b7, which differ only in their scaling factor.

The baseline variant, efficientnet_b0, can be loaded with the following function:

torchvision.models.efficientnet_b0(*, weights: Optional[EfficientNet_B0_Weights] = None, progress: bool = True, **kwargs: Any) → EfficientNet
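To make the point about scaling concrete, here is a small sketch that compares the parameter counts of a few variants (weights=None avoids downloading any checkpoints); the variant names listed are just a sample:

from torchvision.models import get_model

# Instantiate a few EfficientNet variants and compare their sizes
for name in ["efficientnet_b0", "efficientnet_b3", "efficientnet_b7"]:
    model = get_model(name, weights=None)
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{name}: {n_params / 1e6:.1f}M parameters")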

GoogLeNet

GoogLeNet was introduced in a paper titled “Going Deeper with Convolutions” by Google researchers, and as the title hints, the network is 22 layers deep. Like other state-of-the-art networks, it was also designed with the computing budget in mind.

torchvision.models.googlenet(*, weights: Optional[GoogLeNet_Weights] = None, progress: bool = True, **kwargs: Any) → GoogLeNet

ResNet

The Residual Network (ResNet) was introduced in the 2015 paper “Deep Residual Learning for Image Recognition” and aimed to tackle the vanishing gradient problem in deep neural networks. It introduced skip (residual) connections, which have since been widely used to let information flow smoothly between layers.

The network also comes in several variants depending on the number of layers: ResNet18, ResNet34, ResNet50, ResNet101, and ResNet152.

torchvision.models.resnet18(*, weights: Optional[ResNet18_Weights] = None, progress: bool = True, **kwargs: Any) → ResNet
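Since fine-tuning was mentioned earlier, here is a sketch of the most common pattern: load the pretrained backbone and swap the final classification layer for your own task (the 10-class count below is just an assumption for illustration):

import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

# Start from ImageNet weights and keep the pretrained backbone
model = resnet18(weights=ResNet18_Weights.DEFAULT)

# Replace the final fully connected layer with a new head for a hypothetical 10-class dataset
num_classes = 10
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Only model.fc is randomly initialized now; everything else keeps its pretrained weights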

The VGG Family

The Visual Geometry Group (VGG) model is a deep convolutional neural network with multiple layers, designed for image classification and localization. It also has a few variants depending on the number of layers: VGG11, VGG13, VGG16, and VGG19.

torchvision.models.vgg11(*, weights: Optional[VGG11_Weights] = None, progress: bool = True, **kwargs: Any) → VGG
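As a companion to the fine-tuning sketch above, VGG's structure in torchvision (a features block followed by a classifier block) makes it easy to freeze the convolutional backbone and train only the classifier head:

from torchvision.models import vgg11, VGG11_Weights

model = vgg11(weights=VGG11_Weights.DEFAULT)

# Freeze the convolutional feature extractor so only the classifier head is trained
for param in model.features.parameters():
    param.requires_grad = False

# model.classifier[-1] is the final Linear layer; replace it (as with ResNet above)
# if your task has a different number of classes than ImageNet's 1000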

We now have a brief understanding of the main models available in PyTorch. Next, let's look at the utility functions that help us work with these models.

Loading Pre-trained Models in PyTorch

The get_model() function takes a model name, instantiates the model, and returns it ready for use.

torchvision.models.get_model(name: str, **config: Any) → Module

Here is an example of using this function.

from torchvision.models import get_model

# Instantiate VGG11 by name; weights=None returns an untrained model
model = get_model("vgg11", weights=None)
print(model)  # prints the layer-by-layer architecture

Retrieving Available Model Weights

The get_model_weights() function returns the weights enum class for a given model name, which lists every published set of weights for that model.

torchvision.models.get_model_weights(name: Union[Callable, str]) → Type[WeightsEnum]
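For instance, here is a sketch that lists every published weight set for resnet50 along with its parameter count (num_params is one of the metadata keys the classification weights carry):

from torchvision.models import get_model_weights

# Enumerate every published weight set for ResNet-50 and show its parameter count
for weights in get_model_weights("resnet50"):
    print(weights, weights.meta.get("num_params"))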

Find All Available PyTorch Models

We can also get a list of the names of all available models. Here's how to do it.

import torchvision
from torchvision.models import list_models

The list_models() function returns the names of all available models; passing a module restricts the listing to that submodule.

# All registered models across every task
all_models = list_models()

# Only the classification models (registered directly under torchvision.models)
classification_models = list_models(module=torchvision.models)

for model in classification_models:
    print(model)
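The module argument can be pointed at other submodules as well; here is a sketch that lists the detection and segmentation models the same way:

import torchvision
from torchvision.models import list_models

# Restrict the listing to specific task submodules
detection_models = list_models(module=torchvision.models.detection)
segmentation_models = list_models(module=torchvision.models.segmentation)

print(len(detection_models), "detection models")
print(len(segmentation_models), "segmentation models")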

Summary

Pre-trained deep learning models are great timesaving tools for building computer vision applications. With options like ResNet, AlexNet, and more readily available in PyTorch, you can get started with your code without worrying about the underlying mathematics.

So with PyTorch handling the heavy lifting of deep learning, what will you create next? Perhaps a mobile app to recognize pets? An automated threat detection system? An assistive tool for the visually impaired?

References

PyTorch Pre-trained models