How to create an empty DataFrame in Python?


Hello readers! In this tutorial, we will discuss the different ways to create an empty DataFrame in Python. We will also discuss the difference between an empty DataFrame and a DataFrame with NaN values. So, let’s get started.


What is an empty DataFrame in Python?

In Python, a DataFrame is a two-dimensional data structure provided by the pandas module that stores data in tabular form, i.e. in rows and columns. An empty DataFrame is a pandas DataFrame object that holds no data because at least one of its axes has zero length: it has zero rows, zero columns, or both.

We can check whether a pandas DataFrame object is empty using its DataFrame.empty property. It returns a boolean value, i.e. True if the DataFrame object is empty and False otherwise.
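As a quick illustration of how the empty property relates to the shape of a DataFrame, here is a minimal sketch. The variable names (no_axes, rows_only, with_data) and the sample values are made up for this example.

```python
import pandas as pd

# A DataFrame with no rows and no columns
no_axes = pd.DataFrame()

# A DataFrame with three index labels but no columns
rows_only = pd.DataFrame(index=['a', 'b', 'c'])

# A DataFrame that holds one actual value
with_data = pd.DataFrame({'x': [1]})

# Print the shape (rows, columns) next to the empty flag
for name, frame in [('no_axes', no_axes),
                    ('rows_only', rows_only),
                    ('with_data', with_data)]:
    print(name, frame.shape, frame.empty)
```

The empty flag is True whenever either number in the shape is zero, and False only when the DataFrame has at least one row and one column.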

Ways to create an empty DataFrame

In Python, we can create an empty pandas DataFrame in the following ways. Let’s understand these one by one.

1. Create a completely empty DataFrame without any row or column

This is the simplest way to create an empty pandas DataFrame object. In this method, we call the pandas DataFrame class constructor, pd.DataFrame(), without any parameters, and it returns an empty pandas DataFrame object. Let’s see the Python code to implement this method.

# Method-1

# Import pandas module
import pandas as pd 

# Create an empty DataFrame without
# any row or column
# using the pd.DataFrame() constructor
df1 = pd.DataFrame()
print('This is our DataFrame with no row or column:\n')
print(df1)

# Check if the above created DataFrame
# Is empty or not using the empty property
print('\nIs this an empty DataFrame?\n')
print(df1.empty)

Output:

This is our DataFrame with no row or column:

Empty DataFrame
Columns: []
Index: []

Is this an empty DataFrame?

True

2. Create an empty DataFrame with only rows

This is another easy way to create an empty pandas DataFrame object, this time one that contains only rows. In this method, we call the pandas DataFrame class constructor with one parameter, index, and it returns an empty pandas DataFrame object with the passed list of row labels as its index. Let’s write Python code to implement this method.

# Method-2

# Import pandas module
import pandas as pd 

# Create an empty DataFrame with
# five rows but no columns
# using pd.DataFrame() with the index parameter
df2 = pd.DataFrame(index = ['R1', 'R2', 'R3', 'R4', 'R5'])
print('This is our DataFrame with rows only no columns:\n')
print(df2)

# Check if the above created DataFrame
# Is empty or not using the empty property
print('\nIs this an empty DataFrame?\n')
print(df2.empty)

Output:

This is our DataFrame with rows only no columns:

Empty DataFrame
Columns: []
Index: [R1, R2, R3, R4, R5]

Is this an empty DataFrame?

True

3. Create an empty DataFrame with only columns

To create an empty pandas DataFrame object which contains only columns, we call the pandas DataFrame class constructor with one parameter, columns, and it returns an empty pandas DataFrame object with the passed list of column names. Let’s implement this method through Python code.

# Method-3

# Import pandas module
import pandas as pd 

# Create an empty DataFrame with
# Five columns but no rows
# Using pd.DataFrame() function with columns parameter
df3 = pd.DataFrame(columns = ['C1', 'C2', 'C3', 'C4', 'C5'])
print('This is our DataFrame with columns only no rows:\n')
print(df3)

# Check if the above created DataFrame
# Is empty or not using the empty property
print('\nIs this an empty DataFrame?\n')
print(df3.empty)

Output:

This is our DataFrame with columns only no rows:

Empty DataFrame
Columns: [C1, C2, C3, C4, C5]
Index: []

Is this an empty DataFrame?

True
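A DataFrame created with only the columns parameter has no type information for its columns. If the column types are known in advance, one possible variation is to build the empty DataFrame from empty Series objects that each carry a dtype. This is only a sketch: the column names and dtypes below are hypothetical and chosen purely for illustration.

```python
import pandas as pd

# Hypothetical column names and dtypes, chosen only for illustration
typed = pd.DataFrame({
    'name': pd.Series(dtype='object'),
    'age': pd.Series(dtype='int64'),
    'score': pd.Series(dtype='float64'),
})

# The DataFrame has three columns and zero rows
print(typed.shape)

# It is still reported as empty
print(typed.empty)

# Each column keeps the dtype it was declared with
print(typed.dtypes)
```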

4. Create an empty DataFrame with both rows and columns

In this method, we create a pandas DataFrame object which contains both rows and columns but no data of our own. When we call the pandas DataFrame class constructor with two parameters, columns and index, it returns a pandas DataFrame object with the passed index and columns lists, and every cell is filled with NaN. Let’s see how to implement this method through Python code.

# Method-4

# Import pandas module
import pandas as pd 

# Create an empty DataFrame with
# Five rows and five columns
# Using pd.DataFrame() function 
# With columns & index parameters
df4 = pd.DataFrame(columns = ['C1', 'C2', 'C3', 'C4', 'C5'],
                   index = ['R1', 'R2', 'R3', 'R4', 'R5'])
print('This is our DataFrame with both rows and columns:\n')
print(df4)

# Check if the above created DataFrame
# Is empty or not using the empty property
print('\nIs this an empty DataFrame?\n')
print(df4.empty)

Output:

This is our DataFrame with both rows and columns:

     C1   C2   C3   C4   C5
R1  NaN  NaN  NaN  NaN  NaN
R2  NaN  NaN  NaN  NaN  NaN
R3  NaN  NaN  NaN  NaN  NaN
R4  NaN  NaN  NaN  NaN  NaN
R5  NaN  NaN  NaN  NaN  NaN

Is this an empty DataFrame?

False

NOTE: There is one problem with this method. As we can see in its output, the empty attribute has returned False. It means the DataFrame which we created in this method is not considered an empty DataFrame by the pandas module.

Empty DataFrame vs DataFrame with NaN values

We have seen the problem with the output of the above Python code. An empty DataFrame and a DataFrame with all NaN values are treated differently by the pandas module.

This happens because when we create a pandas DataFrame using this method, we do not provide any data, but since both the index and the columns are given, every cell gets filled with NaN values by default.

That is why the empty attribute returns False for such pandas DataFrames: both of their axes have a non-zero length.
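To make the distinction concrete, the following minimal sketch inspects a NaN-filled DataFrame. The name nan_df and the labels are made up for this example, and the isna().all().all() check is just one possible way to test whether every cell is missing.

```python
import pandas as pd

# A DataFrame with three rows and two columns but no data supplied
nan_df = pd.DataFrame(columns=['C1', 'C2'], index=['R1', 'R2', 'R3'])

# Both axes have a non-zero length
print(nan_df.shape)

# The number of cells is rows * columns
print(nan_df.size)

# So the DataFrame is not reported as empty
print(nan_df.empty)

# Checking for "all cells are NaN" is a separate question:
# isna() marks missing cells, the first all() reduces each column,
# and the second all() reduces across the columns
all_missing = nan_df.isna().all().all()
print(all_missing)
```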

So, one simple way to handle this is to drop the rows that contain the default NaN values. The dropna() function of the pandas DataFrame class returns a new DataFrame without the rows that contain NaN values (by default, a row is dropped if it has at least one NaN). Since every row here contains NaN values, no rows remain, so applying the empty property to the result returns True. Let’s implement this through Python code.

# Compare an empty DataFrame
# With a DataFrame with all NaN values

# Import pandas module
import pandas as pd 

# Create an empty DataFrame with
# Three rows and four columns
# Using pd.DataFrame() function 
# With columns & index parameters
df = pd.DataFrame(columns = ['Col-1', 'Col-2', 'Col-3', 'Col-4'],
                  index = ['Row-1', 'Row-2', 'Row-3'])
print('This is our DataFrame with NaN values:\n')
print(df)

# Check if the above created DataFrame
# Is empty or not using the empty property
print('\nIs this an empty DataFrame?\n')
print(df.empty)

# Drop the rows that contain NaN values using dropna() function
# Then apply the empty attribute/property on the resulting DataFrame
print('\nAfter removing all the NaN values:\n')
print('Is this an empty DataFrame?\n')
print(df.dropna().empty)

Output:

This is our DataFrame with NaN values:

      Col-1 Col-2 Col-3 Col-4
Row-1   NaN   NaN   NaN   NaN
Row-2   NaN   NaN   NaN   NaN
Row-3   NaN   NaN   NaN   NaN

Is this an empty DataFrame?

False

After removing all the NaN values:

Is this an empty DataFrame?

True
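One detail worth keeping in mind is that dropna() works on whole rows and returns a new DataFrame instead of changing the original one. The sketch below uses a small made-up DataFrame with one real value to show the difference between the default behaviour (how='any') and how='all'.

```python
import pandas as pd

nan = float('nan')

# A made-up DataFrame: row r1 has one real value, row r2 is all NaN
df = pd.DataFrame({'A': [1.0, nan], 'B': [nan, nan]},
                  index=['r1', 'r2'])

# Default how='any': a row is dropped if it has at least one NaN,
# so both rows are dropped here
dropped_any = df.dropna()
print(dropped_any.empty)

# how='all': a row is dropped only if every value in it is NaN,
# so row r1 is kept
dropped_all = df.dropna(how='all')
print(dropped_all.shape)

# The original DataFrame is left unchanged
print(df.shape)
```

So a result of True from df.dropna().empty only tells us that every row had at least one NaN value, not necessarily that every cell was NaN.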

Conclusion

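In this tutorial, we have learned how to create an empty pandas DataFrame object with no rows and columns, with only rows, and with only columns. We have also seen that a DataFrame created with both rows and columns is filled with NaN values and is not treated as empty, which is the difference between an empty DataFrame and a DataFrame with NaN values. Hope you have understood everything discussed above and are excited to experiment with these methods on your own. Thank you and stay tuned with us for more such exciting Python tutorials.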