Pandas notna(): Detect non-missing values for an array-like object


Pandas is a Python library popularly used for working with datasets. It helps us clean, analyze, and manipulate data. Data cleaning is the process of detecting incorrect data and correcting it as per the requirements.

Data cleaning requires you to know which values are missing and which are not. A dataset might contain missing data for various reasons, such as manual entry errors, users declining to fill in every field of a survey form, or data being unavailable for a particular case.

In our earlier tutorial, you have learnt how to detect missing values in an array-like object.

Recommended Read: Pandas isnull() – Detect missing values for an array-like object

In this tutorial, you will learn how to detect non-missing values for an array-like object using the Pandas notna() function.


Pandas notna()

The Pandas notna() function detects all the non-missing (i.e., existing) values in a given array-like object.
You can use it to detect non-missing values in both a data frame and a series object, as you will see later in this tutorial.


Syntax of notna()

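DataFrame.notna()
Series.notna()
pandas.notna(obj)

Returns: A boolean object of the same shape as the input (a data frame for a data frame, a series for a series), representing a missing value by ‘False’ and a non-missing value by ‘True’.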
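
The same check is available as a top-level pd.notna() function and as a Series.notna() method. The short sketch below uses made-up sample values and illustrative variable names (none of them come from this tutorial's data) to show what each form returns.

```python
import numpy as np
import pandas as pd

# Top-level function on scalars: returns a single boolean
scalar_present = pd.notna(3.5)     # an ordinary number is not missing
scalar_nan = pd.notna(np.nan)      # NaN counts as missing
scalar_none = pd.notna(None)       # None counts as missing too

# Top-level function on a NumPy array: returns a boolean array of the same shape
values = np.array([1.0, np.nan, 3.0])
array_mask = pd.notna(values)

# Method form on a Series: returns a boolean Series with the same index
series_mask = pd.Series(values).notna()

print(scalar_present, scalar_nan, scalar_none)
print(array_mask.tolist())
print(series_mask.tolist())
```

In every form, the result has the same shape as the input, so it can be used directly as a mask on that input.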


Also read: Python isna() function in Pandas

Using notna() with a Pandas DataFrame Object

Let us create a data frame first.

import numpy as np
import pandas as pd

# creating a data frame
data = {
    "car company": ["honda", np.nan, "kia", "fiat"],
    "colour": ["red", "white", np.nan, np.nan],
    "price": [1500000, np.nan, np.nan, 1200000]
}

df = pd.DataFrame(data)

df

The data frame contains both missing and existing data. You can detect the non-missing data, and in turn the missing data as well, by calling the notna() method on the data frame as shown below:

df.notna()

Output:

   car company  colour  price
0         True    True   True
1        False    True  False
2         True   False  False
3         True   False   True

The resulting data frame shows True for existing values and False for missing values.
You can also check the non-missing values in a particular column using the syntax:

DataFrame['Column Name'].notna()

For example,

df['car company'].notna()

Output:

0     True
1    False
2     True
3     True
Name: car company, dtype: bool

In the ‘car company’ column, every value except the second entry (index 1) is present.
Similarly,

df['colour'].notna()

Output:

0     True
1     True
2    False
3    False
Name: colour, dtype: bool

df['price'].notna()

Output:

0     True
1    False
2    False
3     True
Name: price, dtype: bool
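
A common way to put these column masks to work is boolean indexing, which keeps only the rows where the mask is True. The sketch below rebuilds the same data frame so it runs on its own; the names priced and complete are illustrative, not part of pandas.

```python
import numpy as np
import pandas as pd

# Same data frame as above, rebuilt so the example is self-contained
df = pd.DataFrame({
    "car company": ["honda", np.nan, "kia", "fiat"],
    "colour": ["red", "white", np.nan, np.nan],
    "price": [1500000, np.nan, np.nan, 1200000],
})

# Keep only the rows where 'price' is present
priced = df[df["price"].notna()]

# Combine two masks with & to require both columns to be present
complete = df[df["car company"].notna() & df["colour"].notna()]

print(priced.index.tolist())    # [0, 3]
print(complete.index.tolist())  # [0]
```

The filtered frames keep the original index labels, so you can still trace each row back to its position in df.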

You can also count the non-missing values in each column of a data frame as follows:

df.notna().sum()

Output:

car company    3
colour         2
price          2
dtype: int64

The above output states that the ‘car company’ column contains 3 non-missing values and the ‘colour’ and ‘price’ columns contain 2 non-missing values each.
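
The same idea extends to per-row and whole-frame counts. The sketch below, again rebuilding the data frame so it is self-contained, also compares the per-column result with DataFrame.count(), which counts non-missing values directly. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "car company": ["honda", np.nan, "kia", "fiat"],
    "colour": ["red", "white", np.nan, np.nan],
    "price": [1500000, np.nan, np.nan, 1200000],
})

per_column = df.notna().sum()         # non-missing values in each column
per_row = df.notna().sum(axis=1)      # non-missing values in each row
total = int(df.notna().sum().sum())   # non-missing values in the whole frame

# DataFrame.count() reports the same per-column numbers
matches_count = per_column.tolist() == df.count().tolist()

print(per_column.tolist())  # [3, 2, 2]
print(per_row.tolist())     # [3, 1, 1, 2]
print(total)                # 7
print(matches_count)        # True
```

Summing works here because True is treated as 1 and False as 0.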


Using notna() with a Pandas Series object

The notna() method works on a Pandas Series just as it does on a data frame, as shown below:

Example 1: Finding non-NA values in a pandas series of integers

Creating a Series object:

import numpy as np
import pandas as pd

# creating a series
sr = pd.Series([15, 20, np.nan, 6, np.nan, 65])

sr

Output:

0    15.0
1    20.0
2     NaN
3     6.0
4     NaN
5    65.0
dtype: float64

sr.notna()

Output:

0     True
1     True
2    False
3     True
4    False
5     True
dtype: bool

The series contains 6 values in total; non-missing ones are marked True and missing ones False. Note that the integers are stored as floats (dtype float64) because NaN is a floating-point value.
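
The boolean result can be used to pull out just the values that are present. This is a minimal sketch using the same series; the name present is illustrative, and the comparison with Series.dropna() is included only to show that both routes give the same result here.

```python
import numpy as np
import pandas as pd

sr = pd.Series([15, 20, np.nan, 6, np.nan, 65])

# Boolean indexing keeps only the entries where notna() is True
present = sr[sr.notna()]

print(present.tolist())        # [15.0, 20.0, 6.0, 65.0]
print(present.index.tolist())  # [0, 1, 3, 5]

# Series.dropna() produces the same values and index labels
same_as_dropna = present.equals(sr.dropna())
print(same_as_dropna)          # True
```

The original index labels are preserved, which is why positions 2 and 4 are absent from the result.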


Example 2: Finding non-NA values in a Pandas series of strings

import numpy as np
import pandas as pd

# creating a series
sr2 = pd.Series(["yellow", "green", np.nan, "blue", np.nan, np.nan])

sr2

Output:

0    yellow
1     green
2       NaN
3      blue
4       NaN
5       NaN
dtype: object

sr2.notna()

Output:

0     True
1     True
2    False
3     True
4    False
5    False
dtype: bool

sr2.notna().sum()

Output:

3

The series sr2 contains 3 non-missing values.
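
With string data it helps to know what notna() treats as missing: None counts as missing, but an empty string does not. The sketch below uses a hypothetical series sr3 (not part of the examples above) to show this, plus one possible way to exclude empty strings as well if your data calls for it.

```python
import pandas as pd

# Hypothetical sample: one None and one empty string
sr3 = pd.Series(["yellow", None, "", "blue"])

# None is reported as missing; the empty string is reported as present
mask = sr3.notna()
print(mask.tolist())       # [True, False, True, True]

# Optional stricter rule (an assumption, not pandas default behaviour):
# also treat empty strings as missing
non_empty = sr3.notna() & (sr3 != "")
print(non_empty.tolist())  # [True, False, False, True]
```

Whether an empty string should count as missing depends on the dataset, so the stricter mask is a choice rather than a rule.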


Conclusion

notna() is a function in the Pandas library in Python used to detect non-missing (i.e., existing) values for an array-like object. It returns a boolean object of the same shape as the input, a data frame for a data frame and a series for a series, with True for non-missing values and False for missing values.

