Pandas fillna() Method - A Complete Guide

Data analysis is now part of everyday work in many domains, and one of its major challenges is the presence of missing values (NA or NaN) in the data. In this article, we will learn how to handle the missing values in a dataset with the help of the pandas fillna() method. Let’s get started!

What Is the Pandas fillna() Method and Why Is It Useful?

The pandas fillna() method fills the missing (NA/NaN) values in a dataset. You can fill them with a fixed value such as zero, or with any other value you supply. It often comes in handy when you are working with data loaded from CSV or Excel files, where empty cells are read in as NaN.

Don’t confuse it with the dropna() method, which removes the rows or columns that contain missing values. fillna() keeps the data in place and replaces each missing value with zero or with another value you choose.
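To make the contrast concrete, here is a minimal sketch that runs both methods on the same data. The scores DataFrame and its column names are made up for this comparison and are not part of the examples that follow.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame used only for this comparison.
scores = pd.DataFrame({"x": [1.0, np.nan, 3.0],
                       "y": [np.nan, 5.0, 6.0]})

dropped = scores.dropna()   # keeps only the rows that have no missing values
filled = scores.fillna(0)   # keeps every row and replaces each NaN with 0

print(dropped.shape)
print(filled.shape)
```

Only the last row survives dropna() here, while fillna(0) keeps all three rows.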

Let’s look at the syntax of the fillna() function.

DataFrame.fillna(value=None, method=None, axis=None, inplace=False, limit=None, downcast=None, **kwargs)
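As a rough illustration of two of these parameters, the sketch below passes a dict as value to use a different replacement for each column, and shows that with the default inplace=False the original DataFrame is left unchanged. The sales frame and its column names are made up for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not part of the examples in this article.
sales = pd.DataFrame({"units": [10.0, np.nan, 7.0],
                      "price": [np.nan, 2.5, np.nan]})

# value may be a dict that maps column names to replacement values.
filled = sales.fillna(value={"units": 0, "price": 1.0})
print(filled)

# With inplace=False (the default), fillna() returns a new DataFrame
# and the original still contains its NaN values.
print(sales.isna().sum().sum())
```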

Let’s look at some examples of how you can use the fillna() method in different scenarios.

Pandas DataFrame fillna() method

In the following example, we will replace the NaN values with zeros.

import pandas as pd
import numpy as np

df = pd.DataFrame([[np.nan, 300, np.nan, 330],
                   [589, 700, np.nan, 103],
                   [np.nan, np.nan, np.nan, 675],
                   [np.nan, 3, np.nan, np.nan]],
                  columns=list('abcd'))
print(df)

# Fill the NaN values with zeros.
print("\n")
print(df.fillna(0))

Output

       a      b   c      d
0    NaN  300.0 NaN  330.0
1  589.0  700.0 NaN  103.0
2    NaN    NaN NaN  675.0
3    NaN    3.0 NaN    NaN


       a      b    c      d
0    0.0  300.0  0.0  330.0
1  589.0  700.0  0.0  103.0
2    0.0    0.0  0.0  675.0
3    0.0    3.0  0.0    0.0

Applying fillna() method to only one column

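To fill only one column, select it first and call fillna() on the resulting Series.

df = pd.DataFrame([[np.nan, 300, np.nan, 330],
                   [589, 700, np.nan, 103],
                   [np.nan, np.nan, np.nan, 675],
                   [np.nan, 3, np.nan, np.nan]],
                  columns=list('abcd'))

print(df)

# Fill the NaN values in column 'b' only.
print("\n")
filled_b = df['b'].fillna(0)
print(filled_b)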

Output

       a      b   c      d
0    NaN  300.0 NaN  330.0
1  589.0  700.0 NaN  103.0
2    NaN    NaN NaN  675.0
3    NaN    3.0 NaN    NaN


0    300.0
1    700.0
2      0.0
3      3.0
Name: b, dtype: float64
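Note that calling fillna() on a single column returns a new Series and leaves df itself untouched. A minimal sketch of one common way to write the filled column back, rebuilding the same df so the snippet runs on its own:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[np.nan, 300, np.nan, 330],
                   [589, 700, np.nan, 103],
                   [np.nan, np.nan, np.nan, 675],
                   [np.nan, 3, np.nan, np.nan]],
                  columns=list('abcd'))

# Assign the filled Series back to the same column.
df['b'] = df['b'].fillna(0)

# Column 'b' no longer has missing values; the other columns are unchanged.
print(df)
print(df.isna().sum())
```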

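You can also use the limit parameter to cap how many NaN values are filled. When a fill value is given, limit is the maximum number of NaN entries replaced in each column, counting from the top.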

import pandas as pd
import numpy as np
df = pd.DataFrame([[np.nan, 300, np.nan, 330],
                   [589, 700, np.nan, 103],
                   [np.nan, np.nan, np.nan, 675],
                   [np.nan, 3, np.nan, np.nan]],
                  columns=list('abcd'))

print(df)

# Fill at most two NaN values per column.
print("\n")
print(df.fillna(0, limit=2))

Output

       a      b   c      d
0    NaN  300.0 NaN  330.0
1  589.0  700.0 NaN  103.0
2    NaN    NaN NaN  675.0
3    NaN    3.0 NaN    NaN


       a      b    c      d
0    0.0  300.0  0.0  330.0
1  589.0  700.0  0.0  103.0
2    0.0    0.0  NaN  675.0
3    NaN    3.0  NaN    0.0

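In the above example, we applied limit=2, which means at most two NaN values were replaced in each column, starting from the top. Columns a and c contain more than two NaN values, so their remaining NaN entries are left unchanged.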
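One way to check how limit counts is to compare the number of NaN values in each column before and after the fill. The sketch below rebuilds the same df and counts how many values were actually replaced per column; the variable names before, after, and filled_per_column are chosen only for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[np.nan, 300, np.nan, 330],
                   [589, 700, np.nan, 103],
                   [np.nan, np.nan, np.nan, 675],
                   [np.nan, 3, np.nan, np.nan]],
                  columns=list('abcd'))

before = df.isna().sum()                      # NaN count per column
after = df.fillna(0, limit=2).isna().sum()    # NaN count after the limited fill

# Number of values replaced in each column; never more than the limit.
filled_per_column = before - after
print(filled_per_column)
```

No column has more than two values replaced, and rows play no part in the count.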

Conclusion

In summary, we learned different ways to fill NaN values in a DataFrame: filling the whole DataFrame, filling a single column, and capping the number of filled values with limit. All of these will come in handy in your data analysis projects.