Python Read Excel- Different ways to read an excel file using Python

Python Read Excel

An Excel file is a spreadsheet file containing some cells in rows and columns (Tabular view) and can help in the arrangement, calculation, sorting, and managing of data. The data in the spreadsheet may be numeric, text, formulas, hyperlinks, functions, etc. An XLS file stores data as binary streams. It can only be created by the most popular MS Excel or other spreadsheet programs. The file format .xlsx always indicates an excel file on its own.

The following image depicts an excel file created by the MS-excel program:

Excel File By MS Excel
Excel File By MS Excel

How to read Excel files using Python

To read excel files using Python, we need to use some popular Python modules and methods. Let’s understand those as well.

Using Python xlrd module

xlrd is a python library or module to read and manage information from Excel files ( i.e. files in .xlsx format ). This Module will not be applicable for anything other than .xlsx files.

Let’s have a quick look at how to install xlrd module.

C:\Users\pc> pip install xlrd

As you are using python, You must have downloaded the pip package installer. You can also use another Python package manager of your choice.

Installing Xlrd Module
Installing Xlrd Module

In this method, We are going to use xlwings module along with the method associated with it (i.e. xlwings.Book() ).

This method will automatically open our .xlsx in the background for us in its original program (i.e. MS-Excel) where we can operate and manage our data.

#importing the xlwings module as xw
import xlwings as xw

#you can get ur excel file already been opened after execution of this command
ws = xw.Book("C:\\Users\\pc\\Desktop\\students.xlsx").sheets['Sheet1']

From the above code snippet, We can get our Excel automatically opened on our desktop where we can access it.

Using Python pandas module

Pandas is an open-source Python library or module that provides in-built high-performance data structures and data analysis tools. It is most preferably used to analyze data along with two other core python libraries- Matplotlib for data visualization and NumPy for mathematical operations.

We are going to install this module in the same way as our previous module using the pip installer as follows.

C:\Users\pc> pip install pandas

The above code snippet will install the pandas module for us as follows.

Installing Panda Module
Installing Panda Module

To read excel files, let’s run the following snippet of code.

# importing pandas module as pd
import pandas as pd

#using read_excel() method to read our excel file and storing the same in the variable named "df "
df = pd.read_excel("C:\\Users\\pc\\Desktop\\students.xlsx")

#printing our spreadsheet using print() method
print(df)

In the above method, We are using read_excel() method to read our .xlsx file. We can use this method along with the pandas module as panda.read_excel() to read the excel file data into a DataFrame object (Here it is ‘ df ‘).

The above code snippet will print our spreadsheet as follows.

Read Excel Method
Read Excel Method

Using Python openpyxl module

Openpyxl is a Python library or module used to read or write from an Excel file. This module needs to be installed to use certain methods like load_workbook(), otherwise, we can’t use those methods, It will throw error. Let’s install this module using our command prompt.

C:\Users\pc> pip install openpyxl

The above code snippet will install our openpyxl module as follows.

Installing Openpyxl Module
Installing Openpyxl Module

In our second method, We are going to use our openpyxl module along with load_workbook() method as our following code snippet.

# importing openpyxl module 
import openpyxl

#using load_workbook() method to read our excel file and storing to dataframe object table1
table1 = openpyxl.load_workbook("C:\\Users\\pc\\Desktop\\students.xlsx")

#To access the table1 we need to activate and store to an another object (Here it is table2)
table2 = table1.active

for row in range(1, table2.max_row):
    for col in table2.iter_cols(1, table2.max_column):
        print(col[row].value, end = " ")
    print("\n")

In the above code snippet, We are using load_workbook() method to read our required excel file along with openpyxl module. We couldn’t use this method without importing this library or module. Not only this, This module is responsible for reading the location as a parameter( Here it is “C:\Users\pc\Desktop\students.xlsx” ) in our read_excel() method.

After reading our excel file and assigning it to table1, It needs to be activated. Otherwise, If we print the table1, the following output occurs.

Printing Table1
Printing Table1

We are going to access table2 by using for loop as the above code snippet. We will get our results as follows.

Output By Method 2
Output By Method 2

Conclusion

In this article, We covered Different methods to read our Excel file using Python. We discussed some popular Modules along with some required methods of Python for our appropriate output. Hope You must have practiced and enjoyed our code snippets. We must visit again with some more exciting topics.