Hello Readers! In this tutorial, we are going to discuss the different ways to set the index of a Pandas DataFrame object in Python.
What do we mean by the index of a Pandas DataFrame?
When we create a Pandas DataFrame object using the pd.DataFrame() function, Pandas automatically generates labels for its rows and its columns, so that every data element in the DataFrame can be addressed by a row label and a column label.
The row labels are called the index of the DataFrame, and the column labels are simply called columns. The index identifies the rows of the DataFrame; if we do not provide one, Pandas uses the integers 0, 1, 2, and so on. Index labels are normally unique, although Pandas does not require this. Let’s start our core discussion about the different ways to set the index of a Pandas DataFrame object in Python.
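Before moving on, here is a minimal sketch of the default labels described above. The two-row DataFrame is a made-up example used only for illustration.

```python
import pandas as pd

# Hypothetical two-row DataFrame, used only for illustration
df = pd.DataFrame({'Name': ['Rajan', 'Raman'], 'Marks': [93, 88]})

# With no index given, pandas assigns a default integer index starting at 0
print(df.index)    # the row labels (the index)
print(df.columns)  # the column labels
```

The df.index and df.columns attributes are a quick way to check what a DataFrame is currently labelled with before and after changing its index.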
Set index of the DataFrame while creating
We can set the index of the DataFrame while creating it using the index parameter. In this method, we create a Python list and pass it to the index parameter of the pd.DataFrame() function. The list must contain one label for each row of the DataFrame. Let’s implement this through Python code.
# Import Pandas module
import pandas as pd
# Create a Python dictionary
data = {'Name': ['Rajan', 'Raman', 'Deepak', 'David', 'Shivam'],
'Marks': [93, 88, 95, 75, 99],
'City': ['Agra', 'Pune', 'Delhi', 'Sivan', 'Delhi']}
# Create a Python list of roll numbers
Roll = [11, 12, 13, 14, 15]
# Create a DataFrame from the dictionary
# and use the Roll list as its index
# via the index parameter of pd.DataFrame()
df = pd.DataFrame(data, index=Roll)
print(df)
Output:

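Once roll numbers are the index, rows can be looked up by roll number rather than by position. The sketch below assumes a shortened three-row version of the data above and uses the standard .loc accessor.

```python
import pandas as pd

# Shortened version of the tutorial data (an assumption for brevity)
data = {'Name': ['Rajan', 'Raman', 'Deepak'],
        'Marks': [93, 88, 95]}
roll = [11, 12, 13]
df = pd.DataFrame(data, index=roll)

# Look up a row by its index label (the roll number), not by its position
row = df.loc[13]
print(row['Name'])

# Look up a single value by row label and column label
print(df.loc[12, 'Marks'])
```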
Set index of the DataFrame using existing columns
In Python, we can easily set any existing column or columns of a Pandas DataFrame object as its index in the following ways.
1. Set column as the index (without keeping the column)
In this method, we pass the name of the column to the set_index() method of the DataFrame. By default, set_index() removes the column from the DataFrame once it has become the index (drop = True), and it returns a new DataFrame instead of modifying the original one (inplace = False). That is why we assign the result back to df, so that the old index is replaced by the column passed to set_index(). Alternatively, we can pass inplace = True to modify the DataFrame directly. Let’s implement this through Python code.
# Import Pandas module
import pandas as pd
# Create a Python dictionary
data = {'Name': ['Rajan', 'Raman', 'Deepak', 'David'],
'Roll': [11, 12, 13, 14],
'Marks': [93, 88, 95, 75]}
# Create a DataFrame from the dictionary
df = pd.DataFrame(data)
print("\nThis is the initial DataFrame:")
print(df)
# Set the Roll column as the index
# using set_index() function
df = df.set_index('Roll')
print("\nThis is the final DataFrame:")
print(df)
Output:

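The difference between reassigning the result and using inplace = True can be sketched as follows. The small DataFrame and the variable names new_df and result are chosen only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Rajan', 'Raman'], 'Roll': [11, 12]})

# Default (inplace=False): a new DataFrame is returned and df is unchanged
new_df = df.set_index('Roll')
print(list(df.columns))      # 'Roll' is still a column of df
print(list(new_df.columns))  # 'Roll' has moved to the index of new_df

# inplace=True: df itself is modified and the call returns None
result = df.set_index('Roll', inplace=True)
print(result)
print(df.index.name)
```

Because the in-place call returns None, writing df = df.set_index('Roll', inplace=True) would replace df with None, so it is safer to use one style or the other, not both.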
2. Set column as the index (keeping the column)
In this method, we make use of drop, an optional parameter of the set_index() method. By default the value of drop is True, which removes the column from the DataFrame once it has been set as the index. Here we set drop to False, so that the column which becomes the new index also remains in the DataFrame as a regular column. Let’s implement this through Python code.
# Import Pandas module
import pandas as pd
# Create a Python dictionary
data = {'Roll': [111, 112, 113, 114],
'Name': ['Rajan', 'Raman', 'Deepak', 'David'],
'Marks': [93, 88, 95, 75]}
# Create a DataFrame from the dictionary
df = pd.DataFrame(data)
print("\nThis is the initial DataFrame:")
print(df)
# Set the Name column as the index
# using set_index() function with drop
df = df.set_index('Name', drop = False)
print("\nThis is the final DataFrame:")
print(df)
Output:

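A short comparison of the two drop settings, assuming a two-row DataFrame and the illustrative names kept and dropped:

```python
import pandas as pd

df = pd.DataFrame({'Roll': [111, 112], 'Name': ['Rajan', 'Raman']})

kept = df.set_index('Name', drop=False)  # 'Name' is both index and column
dropped = df.set_index('Name')           # 'Name' is only the index

print('Name' in kept.columns)
print('Name' in dropped.columns)

# With drop=False the values can still be read as an ordinary column
print(kept.loc['Raman', 'Name'])
```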
3. Set multiple columns as the index of the DataFrame
In this method, we set multiple columns of the Pandas DataFrame object as its index by creating a list of column names and passing it to the set_index() method. The resulting index has one level per column, which is why it is called a multi-index (MultiIndex in Pandas). Let’s implement this through Python code.
# Import Pandas module
import pandas as pd
# Create a Python dictionary
data = {'Roll': [111, 112, 113, 114],
'Name': ['Rajan', 'Raman', 'Deepak', 'David'],
'Marks': [93, 88, 95, 75],
'City': ['Agra', 'Pune', 'Delhi', 'Sivan']}
# Create a DataFrame from the dictionary
df = pd.DataFrame(data)
print("\nThis is the initial DataFrame:")
print(df)
# Set the Roll & Name column as the multi-index
# using set_index() function and list of column names
df = df.set_index(['Roll', 'Name'])
print("\nThis is the final DataFrame:")
print(df)
Output:

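With a multi-index, a row is addressed by a tuple containing one label per level. The sketch below assumes a two-row version of the data and the illustrative name mi.

```python
import pandas as pd

df = pd.DataFrame({'Roll': [111, 112],
                   'Name': ['Rajan', 'Raman'],
                   'Marks': [93, 88]})
mi = df.set_index(['Roll', 'Name'])

print(mi.index.nlevels)      # number of index levels
print(list(mi.index.names))  # level names come from the column names

# Select one value using a (Roll, Name) tuple as the row label
print(mi.loc[(112, 'Raman'), 'Marks'])
```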
Set index of the DataFrame using Python objects
We can also build the index from objects that are not columns of the DataFrame, such as a Python list, a Python range, or a Pandas Series, in the following ways.
1. Python list as the index of the DataFrame
In this method, we set the index of the Pandas DataFrame object using the pd.Index() function and the set_index() method. First, we create a Python list of labels and pass it to the pd.Index() function, which returns a Pandas Index object. Then we pass that Index object to set_index() to make it the new index of the DataFrame. Let’s implement this through Python code.
# Import Pandas module
import pandas as pd
# Create a Python dictionary
data = {'Roll': [111, 112, 113, 114, 115],
'Name': ['Rajan', 'Raman', 'Deepak', 'David', 'Shivam'],
'Marks': [93, 88, 95, 75, 99],
'City': ['Agra', 'Pune', 'Delhi', 'Sivan', 'Delhi']}
# Create a DataFrame from the dictionary
df = pd.DataFrame(data)
print("\nThis is the initial DataFrame:")
print(df)
# Create a Python list of labels
# (named labels to avoid shadowing the built-in list type)
labels = ['I', 'II', 'III', 'IV', 'V']
# Create a DataFrame index object
# using pd.Index() function
idx = pd.Index(labels)
# Set the above DataFrame index object as the index
# using set_index() function
df = df.set_index(idx)
print("\nThis is the final DataFrame:")
print(df)
Output:

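The pd.Index() wrapper matters because set_index() treats a plain list as a list of column names. The sketch below illustrates this with a three-row DataFrame; the names raised, by_index and direct are illustrative, and assigning to df.index directly is shown as an alternative technique, not the method used in this tutorial.

```python
import pandas as pd

df = pd.DataFrame({'Roll': [111, 112, 113],
                   'Name': ['Rajan', 'Raman', 'Deepak']})
labels = ['I', 'II', 'III']

# A plain list is interpreted as column names, which do not exist here
try:
    df.set_index(labels)
    raised = False
except KeyError:
    raised = True
print(raised)

# Wrapping the list in pd.Index() makes pandas use it as the labels
by_index = df.set_index(pd.Index(labels))
print(list(by_index.index))

# Alternative: assign the labels to the index attribute of a copy
direct = df.copy()
direct.index = labels
print(list(direct.index))
```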
2. Python range as the index of the DataFrame
In this method, we set the index of the Pandas DataFrame object using the pd.Index() function, the built-in range() function, and the set_index() method. First, we create a sequence of numbers using range() and pass it to the pd.Index() function, which returns a Pandas Index object. Then we pass that Index object to set_index() to make it the new index of the DataFrame. Let’s implement this through Python code.
# Import Pandas module
import pandas as pd
# Create a Python dictionary
data = {'Roll': [111, 112, 113, 114, 115],
'Name': ['Rajan', 'Raman', 'Deepak', 'David', 'Shivam'],
'Marks': [93, 88, 95, 75, 99],
'City': ['Agra', 'Pune', 'Delhi', 'Sivan', 'Delhi']}
# Create a DataFrame from the dictionary
df = pd.DataFrame(data)
print("\nThis is the initial DataFrame:")
print(df)
# Create a DataFrame index object
# using pd.Index() & range() function
idx = pd.Index(range(1, 6, 1))
# Set the above DataFrame index object as the index
# using set_index() function
df = df.set_index(idx)
print("\nThis is the final DataFrame:")
print(df)
Output:

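As a variation on the range approach, the end of the range can be derived from the number of rows instead of being hard-coded, since the new index needs one label per row. This sketch uses pd.RangeIndex, the Pandas index type for integer ranges, on a three-row DataFrame assumed for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Rajan', 'Raman', 'Deepak']})

# Build a 1-based index whose length always matches the number of rows
idx = pd.RangeIndex(start=1, stop=len(df) + 1)
df = df.set_index(idx)

print(list(df.index))
```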
3. Pandas Series as the index of the DataFrame
In this method, we set the index of the Pandas DataFrame object using the pd.Series() function and the set_index() method. First, we create a Python list and pass it to the pd.Series() function, which returns a Pandas Series. Then we pass that Series to set_index(), which uses its values as the new index of the DataFrame. Let’s implement this through Python code.
# Import Pandas module
import pandas as pd
# Create a Python dictionary
data = {'Roll': [111, 112, 113, 114, 115],
'Name': ['Rajan', 'Raman', 'Deepak', 'David', 'Shivam'],
'Marks': [93, 88, 95, 75, 99],
'City': ['Agra', 'Pune', 'Delhi', 'Sivan', 'Delhi']}
# Create a DataFrame from the dictionary
df = pd.DataFrame(data)
print("\nThis is the initial DataFrame:")
print(df)
# Create a Pandas series
# using pd.Series() function & Python list
series_idx = pd.Series([5, 4, 3, 2, 1])
# Set the above Pandas series as the index
# using set_index() function
df = df.set_index(series_idx)
print("\nThis is the final DataFrame:")
print(df)
Output:
This is the initial DataFrame:
   Roll    Name  Marks   City
0   111   Rajan     93   Agra
1   112   Raman     88   Pune
2   113  Deepak     95  Delhi
3   114   David     75  Sivan
4   115  Shivam     99  Delhi

This is the final DataFrame:
   Roll    Name  Marks   City
5   111   Rajan     93   Agra
4   112   Raman     88   Pune
3   113  Deepak     95  Delhi
2   114   David     75  Sivan
1   115  Shivam     99  Delhi
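One detail worth knowing when a Series is used this way: if the Series has a name, that name becomes the name of the new index. The sketch below assumes a three-row DataFrame and a Series named 'Rank', both invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Rajan', 'Raman', 'Deepak']})

# A named Series: its name is carried over to the new index
rank = pd.Series([3, 2, 1], name='Rank')
df = df.set_index(rank)

print(df.index.name)
print(list(df.index))
```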
4. Set index of the DataFrame keeping the old index
In this method, we make use of append, an optional parameter of the set_index() method. By default the value of append is False, which means the new index replaces the old one. Here we set append to True, so that the old index is kept and the new index passed to set_index() is added to it as an additional level. The result is a multi-index. Let’s implement this through Python code.
# Import Pandas module
import pandas as pd
# Create a Python dictionary
data = {'Roll': [111, 112, 113, 114, 115],
'Name': ['Rajan', 'Raman', 'Deepak', 'David', 'Shivam'],
'Marks': [93, 88, 95, 75, 99],
'City': ['Agra', 'Pune', 'Delhi', 'Sivan', 'Delhi']}
# Create a DataFrame from the dictionary
df = pd.DataFrame(data)
print("\nThis is the initial DataFrame:")
print(df)
# Set Roll column as the index of the DataFrame
# using set_index() function & append
df = df.set_index('Roll', append = True)
print("\nThis is the final DataFrame:")
print(df)
Output:

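The effect of append = True can be checked by inspecting the levels of the resulting index. This sketch assumes a two-row DataFrame and the illustrative name appended.

```python
import pandas as pd

df = pd.DataFrame({'Roll': [111, 112], 'Name': ['Rajan', 'Raman']})

# Keep the default integer index and add 'Roll' as a second level
appended = df.set_index('Roll', append=True)

print(appended.index.nlevels)      # old index + 'Roll'
print(list(appended.index.names))  # the old index has no name

# Rows are now addressed by (old label, Roll) tuples
print(appended.loc[(1, 112), 'Name'])
```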
Conclusion
In this tutorial we have learned the following things:
- What is the index of a Pandas DataFrame object?
- How to set the index while creating a DataFrame?
- How to set existing columns of a DataFrame as its index or multi-index?
- How to set objects like a Python list, a range, or a Pandas Series as the index?
- How to set a new index while keeping the old one?