The xlrd Module – How To Handle Excel Files In Python?

FeaImg Xlrd Module

Have you ever wondered how can one handle the MS-Excel files using a programming language? In this tutorial, we will be learning how to handle Excel files with the help of the xlrd module in the Python Programming Language.

So let’s get started!


Introduction to MS-Excel

Microsoft Excel is one of the most significant computer applications because of the critical function it plays in many industries. It is the most widely used spreadsheet application in various corporate tasks, classroom assignments, and even personal data management.

MSExcel Logo

Excel was initially introduced in 1985. Since then, it has played an important part in formula-based arithmetic and computations, as well as other activities that may necessitate mathematical calculations.

Excel has been used by many organizations, personal and institutional enterprises because of its usability and capacity to serve as a visual basis for many applications.


Introduction to xlrd module

The xlrd module may be used to obtain data from a spreadsheet. Python, for example, may be used to read, write, or alter data.

In addition, the user may be required to travel through several sheets and extract data based on some criteria, as well as edit some rows and columns and perform a great deal of labor. To extract data from a spreadsheet, use the xlrd module.

Before implementing the xlrd module, one needs to make sure that the module is imported into the program using the command line mentioned below in the CMD on your system.

pip install xlrd

Using the xlrd module in Python

We would be working on a few things with the help of the xlrd module one after another. Before doing anything let’s look at the dataset we would be using.

You can use any dataset you find suitable or create your own custom dataset. I downloaded a random finance dataset which can be downloaded here.

Financial Data Xlrd Module
Financial Data Xlrd Module

1. Loading the Excel File Into the Program

Initially, we would be importing the xlrd module and then use the open_workbook function under the xlrd module in order to load the workbook whose path is mentioned inside the function.

# Importing Module
import xlrd
# Loading Excel file
wb = xlrd.open_workbook('Financial Sample.xlsx')

2. Reading the Excel File

Now there can be multiple workbooks inside a single excel file but we want to work on the first workbook and for the same, we will make use of the sheet_by_index function and pass the index of the sheet we want to work on.

# Storing the first sheet into a variable
sheet = wb.sheet_by_index(0)

Printing value of a cell

After loading a certain workbook, we would like to print the value of a certain cell using the cell_value function which would require the row and column number of the cell we want to read.

# Printing various cell values
print("Value of 0-0 cell: ",sheet.cell_value(0, 0))
print("Value of 20-4 cell: ",sheet.cell_value(20, 4))

The output after execution is:

Value of 0-0 cell:  Segment
Value of 20-4 cell:  1006.0

Printing the Number of rows and columns

In order to print the number of rows and columns in the excel file, we would make use of the ‘nrows’ and ‘ncols’ function.

# Get max no of rows and columns
print("Number of Rows: ", sheet.nrows)
print("Number of Columns: ",sheet.ncols)

The output after execution is:

Number of Rows:  701
Number of Columns:  16

Printing the Names of All the Columns

This can be achieved using the code mentioned below. We just need to print the value of each column cell in the first row.

# Get all column names
print("ALL COLUMN NAMES ARE: ")
for i in range(sheet.ncols):
    print(sheet.cell_value(0,i))

The output after execution is shown below:

ALL COLUMN NAMES ARE: 
Segment
Country
Product
Discount Band
Units Sold
Manufacturing Price
Sale Price
Gross Sales
Discounts
 Sales
COGS
Profit
Date
Month Number
Month Name
Year

Printing the first 10 rows of 5 columns

This can be achieved using the code mentioned below. You can print multiple rows and columns if you wish to.

# Get first 10 rows for 5 columns
for i in range(11):
    for j in range(5):
        print(sheet.cell_value(i,j), end="\t\t\t")
    print()

The output is mentioned below:

Segment			Country			Product			Discount Band			Units Sold			
Government			Canada			Carretera			None			1618.5			
Government			Germany			Carretera			None			1321.0			
Midmarket			France			Carretera			None			2178.0			
Midmarket			Germany			Carretera			None			888.0			
Midmarket			Mexico			Carretera			None			2470.0			
Government			Germany			Carretera			None			1513.0			
Midmarket			Germany			Montana			None			921.0			
Channel Partners			Canada			Montana			None			2518.0			
Government			France			Montana			None			1899.0			
Channel Partners			Germany			Montana			None			1545.0	

Conclusion

Congratulations! You just learned how to load and read excel files using the xlrd module. Hope you enjoyed it! 😇

Liked the tutorial? In any case, I would recommend you to have a look at the tutorials mentioned below:

  1. Plot data from Excel Sheet using Python
  2. Read Text Files Using Pandas – A Brief Reference

Thank you for taking your time out! Hope you learned something new!! 😄