In this article, we will learn how to perform Optical Character Recognition (OCR) using PyTesseract, also known as python-tesseract. Pytesseract is a Python wrapper for the Tesseract-OCR engine. Tesseract is an open-source OCR engine whose development was long sponsored by Google.
Sometimes an image contains text that we need to get onto our computer in editable form.
It is easy for us to read what is written in an image, but for a computer, understanding that text is a really difficult task.
A computer just sees an image as an array of pixels.
OCR comes in handy for this task. It detects the text in an image and converts it into encoded text that a computer can easily process.
In this article, we’ll see how to perform OCR with Python.
Implementing Basic Optical Character Recognition in Python
Install the Python wrapper for tesseract using pip.
$ pip install pytesseract
Note that pip installs only the wrapper; the Tesseract engine itself must be installed separately. You can refer to this query on Stack Overflow for details on installing the Tesseract binary and making pytesseract work.
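Since pytesseract only calls the Tesseract executable, it helps to confirm that the binary is reachable before running any OCR. The following is a minimal sketch using only the standard library. The helper name find_tesseract is hypothetical, and the tesseract_cmd override shown in the comment is only needed when the binary is not on your PATH.

```python
import shutil


def find_tesseract(binary_name="tesseract"):
    """Return the full path of the Tesseract executable, or None if it is not on PATH."""
    return shutil.which(binary_name)


path = find_tesseract()
if path is None:
    print("Tesseract binary not found on PATH; install it or point pytesseract at it.")
    # If it is installed elsewhere, pytesseract can be pointed to it explicitly:
    # pytesseract.pytesseract.tesseract_cmd = r"<full path to the tesseract executable>"
else:
    print("Tesseract found at:", path)
```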
1. Get An Image With Clearly Visible Text
Let’s now look at one sample image and extract text from it.
2. Code to Extract Text From Image
The image above is in JPEG format, and we’ll try to extract the text from it.
# Importing libraries
import cv2
import pytesseract

# Loading image using OpenCV
img = cv2.imread('sample.jpg')

# Converting to text
text = pytesseract.image_to_string(img)
print(text)
On the Insert tab, the galleries include items that are designed to coordinate with the overall look of your document. You can use these galleries to insert tables, headers, footers, lists, cover pages, and other document building blocks. When you create pictures, charts, or diagrams, they also coordinate with your current document look.
After loading the image using OpenCV, we call pytesseract's image_to_string method, which takes an image as its input argument. This single line of code converts the text in the image into an encoded string.
However, real-life OCR tasks are more challenging, and skipping preprocessing hurts results, because the accuracy of the conversion depends directly on the quality of the input image.
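One way to put a rough number on this is to look at the per-word confidence values that pytesseract.image_to_data can return, and compare their average before and after a preprocessing step. The helper mean_confidence below is hypothetical and works on the dictionary form of that output. The sample dictionary is made up for illustration and is not real OCR output.

```python
def mean_confidence(ocr_data):
    """Average the word confidences in a dict shaped like image_to_data output.

    Entries with a confidence of -1 are layout boxes without text and are skipped.
    """
    # Depending on the pytesseract version, confidences may be strings or numbers.
    scores = [float(c) for c in ocr_data["conf"]]
    word_scores = [s for s in scores if s >= 0]
    if not word_scores:
        return 0.0
    return sum(word_scores) / len(word_scores)


# Made-up data in the same shape, for illustration only.
sample = {"text": ["", "On", "the", "Insert"], "conf": ["-1", "96", "95", "91"]}
print(mean_confidence(sample))

# With a real image (requires the Tesseract binary):
# data = pytesseract.image_to_data(img, output_type=pytesseract.Output.DICT)
# print(mean_confidence(data))
```

A higher average is only a hint that the engine was more certain, not proof that the extracted text is correct.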
Implementing OCR After Preprocessing Using OpenCV
Steps we’ll use to preprocess our image:
- Convert image to Grayscale – The image eventually needs to be binary, so we first convert the colored image to grayscale.
- Thresholding – Converts the grayscale image into a binary image by checking whether each pixel value is below or above a certain threshold. With cv2.THRESH_BINARY, pixels at or below the threshold are set to black (0) and pixels above it are set to white (the maximum value, 255 here).
- Invert the image using cv2.bitwise_not, which swaps black and white.
- Apply noise reduction techniques such as eroding and dilating.
- Apply the text extraction method to the preprocessed image.
1. Find an Image With Clear Text
Let’s implement the above steps in code using the image below:
2. Complete Code to Preprocess and Extract Text from Images using Python
We’ll now follow these steps to preprocess the file and extract the text from the image above. Optical character recognition works best when the image is clear and readable enough for the machine learning algorithm to take cues from.
# Importing libraries
import cv2
import pytesseract
import numpy as np

# Loading image using OpenCV
img = cv2.imread('sample_test.jpg')

# Preprocessing image
# Converting to grayscale
gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Creating a binary image; cv2.threshold returns (threshold_used, image).
# With THRESH_OTSU the threshold is chosen automatically and 130 is ignored.
_, binary_image = cv2.threshold(gray_image, 130, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Inverting the image
inverted_bin = cv2.bitwise_not(binary_image)

# Some noise reduction
kernel = np.ones((2, 2), np.uint8)
processed_img = cv2.erode(inverted_bin, kernel, iterations=1)
processed_img = cv2.dilate(processed_img, kernel, iterations=1)

# Applying image_to_string method
text = pytesseract.image_to_string(processed_img)
print(text)
On the Insert tab, the galleries include items that are designed to coordinate with the overall look of your document. You can use these galleries to insert tables, headers, footers, lists, cover pages, and other document building blocks. When you create pictures, charts, or diagrams, they also coordinate with your current document look, You can easily change the formatting of selected text in the documenttext by choosing a look for the selected text from the Quick Styies gallery on the Home tab. You can also format text directly by using the other controls on the Home tab. Most controls offer a choice of using the look from the current theme or using a tormat that you specify directly. To change the overall look of your document, choose new Theme elements on the Page Layout tab. To change the looks available in the Quick Style gallery, use the Change Current Quick Style Set command. Both the Themes gallery and the Quick Styles gallery provide reset commands so that you can
You can learn more about OpenCV and its functions for image transformations here.
This article was all about implementing optical character recognition in Python using the PyTesseract wrapper, along with some preprocessing steps that can help you get better results.