How to replace NaN values in a Pandas dataframe with 0?


In Python, NaN stands for Not a Number. It marks entries that are either undefined or missing from the dataset. NaN is a floating-point value, so a numeric column that contains it is stored as float by default, and NaN itself cannot be cast directly to an integer.
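The properties described above can be checked directly. The following is a minimal sketch; the variable names are only illustrative.

```python
import math
import numpy as np

# np.nan is an ordinary Python float
nan_type = type(np.nan)

# NaN never compares equal to anything, including itself,
# so detect it with math.isnan() (or pandas' isna()) instead of ==
equals_itself = (np.nan == np.nan)
detected = math.isnan(np.nan)

# Casting NaN straight to an integer raises a ValueError
try:
    int(np.nan)
    cast_failed = False
except ValueError:
    cast_failed = True

print(nan_type, equals_itself, detected, cast_failed)
```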

NaN values are not desirable, especially in machine learning models as they can lead to training an inaccurate model. These values can be replaced by a computed term like mean, median or any other suitable value based upon the dataset.
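As a quick illustration of filling with a computed value instead of a constant, here is a small sketch. The `prices` series is hypothetical and exists only for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical column, used only for illustration
prices = pd.Series([100.0, np.nan, 300.0, np.nan, 800.0])

# mean() and median() skip NaN by default,
# so they are computed from 100, 300 and 800 only
filled_mean = prices.fillna(prices.mean())      # NaN -> 400.0
filled_median = prices.fillna(prices.median())  # NaN -> 300.0

print(filled_mean.tolist())
print(filled_median.tolist())
```

Which statistic is appropriate depends on the dataset; the median is usually less sensitive to outliers than the mean.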


This tutorial will look at how we can replace NaN values with 0 in a Pandas data frame. Let’s first create a data frame to start with.


Creating a Pandas Dataframe

import pandas as pd 
import numpy as np

data = {
    'Mobile Model Number': [6, np.nan, 2, np.nan, 7, 3, 5,
                            np.nan, 21, 12, np.nan],
    'Price': [30000, 5200, 6000, np.nan, np.nan, 15000, 36000,
              np.nan, 4500, np.nan, 2300], 
    'Rating': [3.1, 3.0, np.nan, 4.6, np.nan, np.nan, 2.8, 4.7, 
               np.nan, 3.0, np.nan]
}

df = pd.DataFrame(data)

df
Data Frame

The above is a data frame consisting of 3 columns: Mobile Model Number, Price and Rating. All of these columns contain some NaN values as of now.
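To confirm how many NaN values each column holds, one option is to combine isna() with sum(). This sketch rebuilds the same data frame so that it can be run on its own.

```python
import numpy as np
import pandas as pd

data = {
    'Mobile Model Number': [6, np.nan, 2, np.nan, 7, 3, 5,
                            np.nan, 21, 12, np.nan],
    'Price': [30000, 5200, 6000, np.nan, np.nan, 15000, 36000,
              np.nan, 4500, np.nan, 2300],
    'Rating': [3.1, 3.0, np.nan, 4.6, np.nan, np.nan, 2.8, 4.7,
               np.nan, 3.0, np.nan]
}
df = pd.DataFrame(data)

# isna() marks each missing cell as True; sum() counts them per column
nan_counts = df.isna().sum()
print(nan_counts)
# Mobile Model Number: 4, Price: 4, Rating: 5
```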


Python functions to replace NaN values

Pandas provides two main methods for replacing NaN values. Both are available on data frames and on individual columns:
1. replace()
2. fillna()

Both replace() and fillna() are covered in more detail in their own articles.
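For the specific task of turning NaN into 0, the two methods should give the same result. The sketch below checks this on a small, hypothetical data frame named `small`.

```python
import numpy as np
import pandas as pd

# Hypothetical data frame, used only for illustration
small = pd.DataFrame({
    'Price': [30000, np.nan, 6000],
    'Rating': [3.1, 3.0, np.nan],
})

via_replace = small.replace(np.nan, 0)  # swap every NaN for 0
via_fillna = small.fillna(0)            # fill every missing cell with 0

# equals() compares both the values and the column dtypes
same_result = via_replace.equals(via_fillna)
print(same_result)
```

The practical difference is scope: replace() can substitute any value for another, while fillna() only targets missing values.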


Examples of replacing NaN values with 0

Let’s get started with a few examples of replacing the NaN values here and understand how that works in code.

Using replace() function:

a. Using replace() to replace NaN values in a single column with 0

data = {
    'Mobile Model Number': [6, np.nan, 2, np.nan, 7, 3, 5,
                            np.nan, 21, 12, np.nan],
    'Price': [30000, 5200, 6000, np.nan, np.nan, 15000, 36000,
              np.nan, 4500, np.nan, 2300], 
    'Rating': [3.1, 3.0, np.nan, 4.6, np.nan, np.nan, 2.8, 4.7, 
               np.nan, 3.0, np.nan]
}

# rebuild the dataframe so the example starts from the original NaN values
df = pd.DataFrame(data)

# applying the replace method on a single column
df['Rating'] = df['Rating'].replace(np.nan, 0)

df
replace(): Replace NaN in a Single Column With 0

In the above code, we applied the replace() function to replace NaN values with 0 in the ‘Rating’ column of the dataframe. As a result, this column now has 0 wherever it previously had NaN, while the other columns are left unchanged.

b. Using replace() to replace NaN values in the entire data frame with 0

data = {
    'Mobile Model Number': [6, np.nan, 2, np.nan, 7, 3, 5,
                            np.nan, 21, 12, np.nan],
    'Price': [30000, 5200, 6000, np.nan, np.nan, 15000, 36000,
              np.nan, 4500, np.nan, 2300], 
    'Rating': [3.1, 3.0, np.nan, 4.6, np.nan, np.nan, 2.8, 4.7, 
               np.nan, 3.0, np.nan]
}

# rebuild the dataframe so the example starts from the original NaN values
df = pd.DataFrame(data)

# applying the replace method on the entire dataframe
df = df.replace(np.nan, 0)

df
replace(): Replace NaN In the Entire Dataframe With 0

In this case, we replaced all the NaN values in the entire dataframe with 0 all at once.
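Between a single column and the entire dataframe there is a middle ground: replacing NaN in a chosen subset of columns. A minimal sketch, assuming a small hypothetical data frame `small` and a column list `cols`:

```python
import numpy as np
import pandas as pd

# Hypothetical data frame, used only for illustration
small = pd.DataFrame({
    'Mobile Model Number': [6, np.nan, 2],
    'Price': [30000, np.nan, 6000],
    'Rating': [3.1, 3.0, np.nan],
})

# Only these columns are cleaned; the rest keep their NaN values
cols = ['Price', 'Rating']
small[cols] = small[cols].replace(np.nan, 0)

print(small)
```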


Using fillna() function:

These examples use the fillna() function introduced above.

a. Using fillna() to replace NaN values in a single column with 0

data = {
    'Mobile Model Number': [6, np.nan, 2, np.nan, 7, 3, 5,
                            np.nan, 21, 12, np.nan],
    'Price': [30000, 5200, 6000, np.nan, np.nan, 15000, 36000,
              np.nan, 4500, np.nan, 2300], 
    'Rating': [3.1, 3.0, np.nan, 4.6, np.nan, np.nan, 2.8, 4.7, 
               np.nan, 3.0, np.nan]
}

# rebuild the dataframe so the example starts from the original NaN values
df = pd.DataFrame(data)

# applying the fillna method on a single column
df['Mobile Model Number'] = df['Mobile Model Number'].fillna(0)

df
fillna(): Replace NaN in a Single Column With 0

Here, we have replaced all the NaN values in the ‘Mobile Model Number’ column with a 0.
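Filling NaN with 0 does not change the column's data type: a column of whole numbers still shows up as 6.0, 0.0 and so on. If integers are wanted, the column can be cast once no NaN remains. A sketch with a hypothetical series `models`:

```python
import numpy as np
import pandas as pd

# Hypothetical column of whole numbers with gaps, for illustration only
models = pd.Series([6, np.nan, 2, np.nan, 7], name='Mobile Model Number')

filled = models.fillna(0)        # still float64: 6.0, 0.0, 2.0, ...
as_int = filled.astype('int64')  # safe now that no NaN is left

print(filled.dtype, as_int.dtype)
print(as_int.tolist())
```

Calling astype('int64') before filling would raise an error, because NaN has no integer representation.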

b. Using fillna() to replace NaN values in the entire dataframe with 0

data = {
    'Mobile Model Number': [6, np.nan, 2, np.nan, 7, 3, 5,
                            np.nan, 21, 12, np.nan],
    'Price': [30000, 5200, 6000, np.nan, np.nan, 15000, 36000,
              np.nan, 4500, np.nan, 2300], 
    'Rating': [3.1, 3.0, np.nan, 4.6, np.nan, np.nan, 2.8, 4.7, 
               np.nan, 3.0, np.nan]
}
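
# rebuild the dataframe so the example starts from the original NaN values
df = pd.DataFrame(data)

# applying the fillna method on the entire dataframe
# (fillna returns a new dataframe, so assign the result back)
df = df.fillna(0)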

df
fillna(): Replace NaN in the Entire Dataframe With 0

In this case, we used the fillna() function to replace all the NaN values in the dataframe with 0 all at once.
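One detail worth keeping in mind is that fillna() returns a new dataframe by default and leaves the original untouched, so the result has to be assigned back. The sketch below shows the difference on a small hypothetical data frame `small`.

```python
import numpy as np
import pandas as pd

# Hypothetical data frame, used only for illustration
small = pd.DataFrame({
    'Price': [30000, np.nan],
    'Rating': [np.nan, 4.6],
})

# The result is discarded here, so small still contains NaN
small.fillna(0)
still_has_nan = bool(small.isna().any().any())

# Assigning the result back keeps the change
small = small.fillna(0)
remaining = int(small.isna().sum().sum())

print(still_has_nan, remaining)
```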


Summary

Hence, we have seen how to replace NaN values with a 0 in a dataframe. To learn more about Pandas and other Python-related concepts, do check out our other blogs as well!

