How to add a new column to Pandas DataFrame?


In this tutorial, we discuss different ways to add a new column to a pandas data frame.


What is a pandas data frame?

A pandas data frame is a two-dimensional, heterogeneous data structure that stores data in tabular form, with labeled rows (the index) and labeled columns.

Data frames are typically used to work with large datasets: once a dataset is loaded into a pandas data frame, we can quickly view a summary of it.
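As a rough illustration of what viewing a summary can look like, the sketch below builds a small data frame and inspects it. The sample values and the name df_demo are made up for this example; shape, dtypes, head() and describe() are standard pandas attributes and methods.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df_demo = pd.DataFrame({
    'name': ['Sanjay', 'Ravi', 'Shreya'],
    'roll': [55, 65, 75],
})

print(df_demo.shape)       # (number of rows, number of columns)
print(df_demo.dtypes)      # each column keeps its own data type
print(df_demo.head(2))     # first two rows
print(df_demo.describe())  # summary statistics for the numeric columns
```

On a genuinely large dataset, the same calls give a quick overview without printing every row.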

In real-world scenarios, a pandas data frame is usually created by loading a dataset from an existing CSV file, Excel file, etc.
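A minimal sketch of loading CSV data follows. To keep it self-contained, the CSV content is a made-up string wrapped in io.StringIO; with a real file, the file path would be passed to pd.read_csv() instead. The name students is chosen only for this example.

```python
import io
import pandas as pd

# Hypothetical CSV content; in practice this would be a file on disk
csv_text = "name,roll\nSanjay,55\nRavi,65\n"

# read_csv accepts a file path or any file-like object
students = pd.read_csv(io.StringIO(csv_text))
print(students)
```

Excel files can be loaded in a similar way with pd.read_excel(), which typically needs an extra engine package such as openpyxl to be installed.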

A pandas data frame can also be created from a list, a dictionary, a list of lists, a list of dictionaries, a dictionary of ndarrays/lists, etc.
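Two of the construction routes mentioned above can be sketched as follows. The variable names and sample values are illustrative assumptions, not part of the tutorial's own example.

```python
import pandas as pd

# From a list of dictionaries: each dictionary becomes one row
rows = [{'name': 'Sanjay', 'roll': 55}, {'name': 'Ravi', 'roll': 65}]
df_from_dicts = pd.DataFrame(rows)

# From a list of lists: each inner list is one row,
# so the column names are passed separately
df_from_lists = pd.DataFrame([['Sanjay', 55], ['Ravi', 65]],
                             columns=['name', 'roll'])

print(df_from_dicts)
print(df_from_lists)
```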

Installing and importing pandas

We need the pandas library to work with data frames, so we first install it and then import it into the Python program. The following commands install and import pandas:

# Installing the pandas library (run this command in a terminal, not in Python)
pip install pandas
# Importing pandas into the Python program
import pandas as pd

Before we discuss how to add a new column to an existing pandas data frame, we need a data frame to work with.

Creating a data frame from a dictionary of lists

# Creating a dictionary of lists
data = {'name': ['Sanjay', 'Ravi', 'Shreya', 'Abhishek', 'Shantanu'],
        'roll': [55, 65, 75, 85, 95]}

# Creating a pandas data frame from the above data
df = pd.DataFrame(data)
print(df)

Output:

[Image: new data frame]

Now let’s discuss ways to add a new column to the data frame we just created. There are several ways to do this, but here we cover only the three main ones.

Adding a new column using DataFrame indexing

This is the simplest way to add a new column to an existing pandas data frame. We index the data frame with the new column’s name and assign a list of values to store in that column, one value per row:

# Adding a new column named 'cgpa' to the data frame
# Using DataFrame indexing
df['cgpa'] = [8.1, 9.3, 8.2, 7.9, 7.5]
print(df)

Output:

[Image: data frame after adding the column using DataFrame indexing]
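Index assignment accepts more than a plain list. The sketch below shows a scalar value, a column computed from an existing one, and what happens when the list length does not match the number of rows. The names df_idx, year and roll_plus_100 are hypothetical and used only for illustration.

```python
import pandas as pd

df_idx = pd.DataFrame({'name': ['Sanjay', 'Ravi', 'Shreya'],
                       'roll': [55, 65, 75]})

# A scalar is broadcast to every row
df_idx['year'] = 2

# A new column can be computed from an existing one
df_idx['roll_plus_100'] = df_idx['roll'] + 100

# A list whose length differs from the number of rows raises ValueError
try:
    df_idx['cgpa'] = [8.1, 9.3]  # only 2 values for 3 rows
except ValueError as err:
    print('Could not add column:', err)

print(df_idx)
```

Note that index assignment modifies the data frame in place. If the column name already exists, its values are overwritten.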

Adding a new column to a pandas data frame using assign()

The second way to add a new column is the built-in pandas assign() method. It does not modify the existing data frame; instead, it returns a new data frame that contains all the original columns plus the added one. Let’s see the Python code to use it:

# Adding a new column named 'address' to the data frame
# Using the assign() method
# And saving the new returned data frame
df2 = df.assign(address = ['Bihar', 'Bihar', 'Jharkhand', 'UP', 'UP'])
print(df2)

Output:

[Image: data frame after adding the column using assign()]
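The following sketch checks that assign() leaves the original data frame untouched, and shows that it also accepts a callable for computed columns. The names df_a, df_b, df_c and roll_doubled are assumptions made for this example.

```python
import pandas as pd

df_a = pd.DataFrame({'name': ['Sanjay', 'Ravi'], 'roll': [55, 65]})

# assign() returns a new data frame; df_a itself is not modified
df_b = df_a.assign(address=['Bihar', 'UP'])

# A callable receives the data frame and returns the new column's values
df_c = df_b.assign(roll_doubled=lambda d: d['roll'] * 2)

print(list(df_a.columns))  # original columns only
print(list(df_c.columns))  # original columns plus the two new ones
```

Because each call returns a data frame, assign() calls can be chained, which is one reason to prefer it when the original data should stay unchanged.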

Adding a new column using the insert() method

The third way to add a new column is the insert() method. The previous two approaches always place the new column at the end of the data frame, whereas insert() lets us place it at any specified position. It modifies the data frame in place. Let’s see the Python code to use it:

# Adding a column named 'branch' to the data frame
# Using the insert() method
# First argument is the column position
# Second argument is the column name
# And third argument is the column value
df2.insert(3, 'branch', ['ECE', 'CSE', 'ECE', 'EE', 'ECE'])
print(df2)

Output:

[Image: data frame after adding the column using insert()]

In the output, the new column named branch appears at column index 3 (the fourth column, since positions are counted from zero), as specified in the Python code.
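Two details of insert() are easy to miss: it changes the data frame in place and returns None, and by default it refuses to insert a column whose name already exists. A minimal sketch, using a made-up data frame named df_ins:

```python
import pandas as pd

df_ins = pd.DataFrame({'name': ['Sanjay', 'Ravi'], 'roll': [55, 65]})

# insert() changes df_ins in place and returns None
result = df_ins.insert(1, 'branch', ['ECE', 'CSE'])  # position 1 = second column
print(result)
print(list(df_ins.columns))

# Inserting a column name that already exists raises ValueError by default
try:
    df_ins.insert(0, 'branch', ['EE', 'EE'])
except ValueError as err:
    print('Could not insert column:', err)
```

Since insert() returns None, writing df = df.insert(...) would discard the data frame, so the call should not be assigned back to the variable.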

Conclusion

In this tutorial, we have learned what a pandas data frame is, how to create a new data frame from a dictionary of lists, and three methods to add a new column to an existing data frame: DataFrame indexing, the assign() method, and the insert() method.