Image Recognition with AI (TensorFlow)

Image recognition is the process of determining the class of an object in an image, that is, assigning a label to an image supplied as input. If the image is of a cat, the model should predict the label cat.

Image recognition can be considered a subfield of computer vision. Computer vision is a field that focuses on developing or building machines that have the ability to see and visualise the world around us just like we humans do. With recent developments in the sub-fields of artificial intelligence, especially deep learning, we can now perform complex computer vision tasks such as image recognition, object detection, segmentation, and so on.

Deep learning is a subset of machine learning built on artificial neural networks (ANNs), which are loosely modeled on the behavior of neurons in the human brain. These networks make life easier for programmers because we don’t need to hand-code every rule; the network learns the mapping from data. When supplied with input data, each layer of the network passes the data through interconnected units called neurons, and the final layer generates the output.
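To make the idea of layers and neurons concrete, here is a minimal sketch of data flowing through a tiny network, written with NumPy rather than TensorFlow. The network size, the weights, and the helper names relu and dense are all made up for illustration; a real network learns its weights during training.

```python
import numpy as np

def relu(x):
    """Rectified linear unit: negative values become 0."""
    return np.maximum(x, 0.0)

def dense(inputs, weights, bias):
    """One fully connected layer: weighted sum of the inputs plus a bias."""
    return inputs @ weights + bias

# Hypothetical toy network: 3 inputs -> 2 hidden neurons -> 1 output.
# The weights are fixed by hand here; in practice they are learned from data.
w_hidden = np.array([[0.5, -1.0],
                     [0.25, 0.5],
                     [-0.5, 1.0]])
b_hidden = np.array([0.0, 0.5])
w_out = np.array([[1.0], [2.0]])
b_out = np.array([0.1])

x = np.array([1.0, 2.0, 3.0])                 # input data
hidden = relu(dense(x, w_hidden, b_hidden))   # first layer and activation
output = dense(hidden, w_out, b_out)          # output layer
print("hidden:", hidden, "output:", output)
```

Each neuron computes a weighted sum of its inputs, and the activation function decides how much of that signal is passed on to the next layer.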

Image recognition based on AI techniques can be a rather nerve-wracking task with all the errors you might encounter while coding. In this article, we are going to look at two simple use cases of image recognition with one of the frameworks of deep learning. Let us first understand the libraries we are going to use.

TensorFlow

TensorFlow is an open-source platform for machine learning that Google originally developed for its internal use and later released publicly. It is a rich ecosystem for managing all aspects of a machine learning system. TensorFlow helps developers create and train various types of neural networks, including deep learning models, for tasks such as image classification, natural language processing, and reinforcement learning. It supports both CPU and GPU computation.

Keras

Keras is a high-level deep learning API that originally ran on top of several backends, including TensorFlow, Theano, and CNTK. The TensorFlow library now ships Keras as its high-level API, tf.keras. Because it is bundled with TensorFlow, Keras makes deep learning quick to set up: it provides built-in modules for the common neural network building blocks, such as layers, optimizers, and loss functions.

Image Recognition With TensorFlow

In this section, we are going to look at two simple approaches to building an image recognition model that labels an image provided as input to the machine.

Prerequisites

Before we move on to coding, we need to make sure TensorFlow is installed on our system.

pip install tensorflow keras

TensorFlow needs Python 3, and each TensorFlow release supports only a specific range of Python 3 versions. So make sure a compatible version is installed.

If Python is already installed, you can check the version by running this command.

python --version
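The same check can also be made from inside Python. Below is a minimal sketch, assuming a minimum of Python 3.9; that number is a placeholder picked for illustration, so check the TensorFlow install guide for the range your release actually supports. The helper name python_is_supported is hypothetical.

```python
import sys

def python_is_supported(version=None, minimum=(3, 9)):
    """Return True if `version` (major, minor) is at least `minimum`.

    `minimum` is an assumed placeholder, not TensorFlow's official requirement.
    """
    if version is None:
        version = sys.version_info[:2]   # version of the running interpreter
    return tuple(version) >= tuple(minimum)

print(sys.version.split()[0], python_is_supported())
```

Tuples compare element by element, so (3, 10) counts as newer than (3, 9) without any string parsing.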

Let’s get to coding!

How about we pass an image as input to the model and let it predict the top 5 probable labels of the image?

Let us see the image we are going to pass as input to the model.

You were able to recognize the image right away as an airplane, right? Let us see if the model we build will do the same.

Let us see the code.

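import tensorflow as tf
# Load MobileNetV2 with weights pre-trained on ImageNet
model = tf.keras.applications.MobileNetV2(weights='imagenet')
# Path to the input image
imgp = '/content/aeroplane.jpg'
# Load the image and resize it to 224x224 pixels
image = tf.keras.preprocessing.image.load_img(imgp, target_size=(224, 224))
# Convert the image to an array of shape (224, 224, 3)
input_image = tf.keras.preprocessing.image.img_to_array(image)
# Scale the pixel values the way MobileNetV2 expects
input_image = tf.keras.applications.mobilenet_v2.preprocess_input(input_image)
# Add a batch dimension: (224, 224, 3) -> (1, 224, 224, 3)
input_image = tf.expand_dims(input_image, axis=0)
# Run the model; the result holds one probability per ImageNet class
predictions = model.predict(input_image)
# Keep the five most probable classes for the only image in the batch
predicted_classes = tf.keras.applications.mobilenet_v2.decode_predictions(predictions, top=5)[0]
for class_id, class_name, probability in predicted_classes:
    print(f"{class_name} ({class_id}): {probability}")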

In the first line, we are importing the TensorFlow library we installed earlier.

We are not going to build a model from scratch. Instead, we use MobileNetV2, a ready-made model available in Keras that has been pre-trained on a dataset called ImageNet.

The path of the image we pass to the model (in this case, aeroplane.jpg) is stored in a variable called imgp.

The image is loaded and resized by tf.keras.preprocessing.image.load_img and stored in a variable called image. This image is converted into an array by tf.keras.preprocessing.image.img_to_array. This array is pre-processed according to the requirements of the model.
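For MobileNetV2, the model-specific pre-processing amounts to rescaling pixel values from the range [0, 255] to the range [-1, 1]. The sketch below reproduces that scaling with NumPy so you can see what happens to the numbers. The function name scale_to_unit_range is hypothetical, and real code should keep calling preprocess_input.

```python
import numpy as np

def scale_to_unit_range(pixels):
    """Map pixel values from [0, 255] to [-1, 1]."""
    return np.asarray(pixels, dtype=np.float32) / 127.5 - 1.0

# Darkest, middle, and brightest possible pixel values
sample = np.array([0.0, 127.5, 255.0])
print(scale_to_unit_range(sample))
```

Feeding the model unscaled pixel values would not raise an error, but the predictions would be unreliable because the model was trained on scaled inputs.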

Then we add a new first dimension to the array with tf.expand_dims, because the model expects a batch of images rather than a single image. The predictions made by the model on this image’s labels are stored in a variable called predictions.
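The effect of this step is easiest to see on the array shape. Here is a small sketch that uses NumPy's expand_dims as a stand-in for tf.expand_dims, with an all-zero array in place of a real image.

```python
import numpy as np

# Stand-in for one pre-processed RGB image (height, width, channels)
image_array = np.zeros((224, 224, 3), dtype=np.float32)

# Insert a new axis at position 0 to form a batch containing one image
batch = np.expand_dims(image_array, axis=0)
print(image_array.shape, "->", batch.shape)
```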

The variable predicted_classes stores the top 5 labels of the image provided. The for loop is used to iterate over the classes and their probabilities.
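Conceptually, picking the top labels means sorting the classes by probability and keeping the first few. The sketch below shows that idea in plain Python. It is not how decode_predictions is implemented, the helper name top_k is hypothetical, and the labels and scores are invented for illustration rather than taken from model output.

```python
def top_k(probabilities, labels, k=5):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(zip(labels, probabilities),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Invented labels and scores, for illustration only
labels = ["cat", "dog", "bird", "umbrella", "airplane", "ship"]
scores = [0.05, 0.40, 0.02, 0.30, 0.20, 0.03]

for label, probability in top_k(scores, labels, k=3):
    print(f"{label}: {probability:.2f}")
```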

Image Recognition Using A Simple Pre-trained Model

As you can see, the top 5 predictions of the image we supplied are an airliner with a 0.20 probability, a warplane with a 0.12 probability, a space shuttle with a 0.05 probability, and the last two predictions being a starfish and a wing with the least probabilities. We can say that the first prediction has the closest label.
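A top probability of 0.20 may look low, but keep in mind that the classifier spreads its probability over 1,000 ImageNet classes through a softmax output layer, so all class probabilities together sum to 1. Below is a minimal plain-Python sketch of softmax on four made-up raw scores; it is meant to illustrate the idea, not to reproduce the model's internals.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(logits)
    # Subtracting the maximum keeps exp() from overflowing on large scores
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up raw scores for four classes
probs = softmax([2.0, 1.0, 0.0, 0.0])
print([round(p, 3) for p in probs], "sum:", round(sum(probs), 6))
```

The more classes compete for the same total of 1, the smaller each individual probability tends to be, which is why the ranking matters more than the absolute value.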

This is fine, but I had to show you the image we are going to work with before the code. There is a way to display the image and its predicted labels together in the output. We can also predict the labels of two or more images in one run instead of sticking to a single image. For all this to happen, we are just going to modify the previous code a bit.

import tensorflow as tf
import matplotlib.pyplot as plt

# Load MobileNetV2 with weights pre-trained on ImageNet
model = tf.keras.applications.MobileNetV2(weights='imagenet')
img_paths = ['cats.jpg', 'dog.jpg', 'bird.jpg', 'umbrella.jpg']

for image_path in img_paths:
    # Load, convert, pre-process, and batch the image as before
    img = tf.keras.preprocessing.image.load_img(image_path, target_size=(224, 224))
    input_image = tf.keras.preprocessing.image.img_to_array(img)
    input_image = tf.keras.applications.mobilenet_v2.preprocess_input(input_image)
    input_image = tf.expand_dims(input_image, axis=0)
    predictions = model.predict(input_image)
    # Keep the ten most probable classes
    predicted_classes = tf.keras.applications.mobilenet_v2.decode_predictions(predictions, top=10)[0]
    # Display the input image without axes
    plt.imshow(img, interpolation='bicubic')
    plt.axis('off')
    plt.show()
    # Print each predicted label with its probability
    print("Predictions:")
    for _, class_name, probability in predicted_classes:
        print(f"{class_name}: {probability}")
    print()

In this version, we are taking four different classes to predict: a cat, a dog, a bird, and an umbrella. We are going to try the pre-trained model and check whether it labels these classes correctly. We are also increasing the number of top predictions to 10 so that we have 10 candidates for what the label could be.

Everything is the same until the predictions part. The new code is explained below.

The plt.imshow call displays the image. The bicubic interpolation smooths the image so that it looks clear on screen. We don’t need axes for these images, so they are turned off with plt.axis('off').

The heading Predictions: is printed once per image, followed by each predicted label along with its probability. Because plt.show() sits outside the inner loop, each image is displayed once rather than once for each of the 10 predictions.
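With ten predictions per image, raw floating-point probabilities become hard to scan. One option is to print them as aligned percentages and hide very unlikely labels. The sketch below is a plain-Python illustration; the helper name format_predictions, the 1% cutoff, and the sample tuples are all hypothetical, although the tuples follow the (class_id, class_name, probability) shape used in the loop above.

```python
def format_predictions(predictions, min_probability=0.0):
    """Format (class_id, class_name, probability) tuples as aligned text lines."""
    lines = []
    for _, class_name, probability in predictions:
        if probability >= min_probability:
            # Left-align the label in 20 columns, show probability as a percentage
            lines.append(f"{class_name:<20}{probability:6.1%}")
    return lines

# Invented tuples for illustration, not real model output
sample = [("id_1", "tabby", 0.52), ("id_2", "tiger_cat", 0.31), ("id_3", "lynx", 0.004)]

for line in format_predictions(sample, min_probability=0.01):
    print(line)
```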

Let us see the predictions.

The cat here appears to be a tabby cat based on the probabilities.

We sure know this pup here is a Chow-Chow!

Based on the probabilities, this bird is supposed to be an Indigo Bunting.

The model seems to be very confident about this one.

Conclusion

Image recognition can be a very complex AI task. The intent of this tutorial was to provide a simple approach to building an AI-based image recognition system as a starting point for the journey.

We have used TensorFlow for this task, a popular deep learning framework that is used across many fields such as NLP, computer vision, and so on. The TensorFlow library has a high-level API called Keras that makes working with neural networks easy and fun.

We have used a pre-trained model of the TensorFlow library to carry out image recognition. We have seen how to use this model to label an image with the top 5 predictions for the image.

The second example builds on the first one. We modified the code so that it gives us the top 10 predictions and also displays the image we supplied to the model along with the predictions.