In this article, we work with pandas, a Python library used for importing and analyzing data. To detect which values in a DataFrame or other object are present rather than missing, we use the Pandas notnull() function.
Also check: Python isna() and notna() functions from Pandas
What is notnull?
notnull() is a method available in the Pandas library for data analysis in Python. It can be used on either a Pandas DataFrame or a Series, which are the two main data structures used in Pandas for storing and manipulating data. The method detects non-missing values, that is, values that are not “null” (None) or “NaN”.
When “notnull” is called on a DataFrame or Series, it returns a Boolean mask indicating whether each element in the data structure is not null (i.e., not missing). The result is a new DataFrame or Series with the same shape as the original, but containing only True or False values, depending on whether the corresponding element in the original data structure was missing or not.
Syntax of Pandas notnull
pandas.notnull(obj)
Parameter
obj : array-like or object value
Input Object to check for not null or non-missing values.
Return Value
- bool or array-like of bool
- For scalar input, returns a scalar boolean. For array input, returns an array of boolean indicating whether each corresponding element is valid.
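The two return types described above can be illustrated with a minimal sketch. The sample values and variable names here are assumptions chosen only for demonstration.

```python
import numpy as np
import pandas as pd

# Scalar input returns a single Boolean
scalar_ok = pd.notnull(5)          # an ordinary value
scalar_nan = pd.notnull(np.nan)    # NaN counts as missing
scalar_none = pd.notnull(None)     # None counts as missing
print(scalar_ok, scalar_nan, scalar_none)

# Series input returns a Boolean Series with the same shape
s = pd.Series([1.0, np.nan, 3.0])
mask = pd.notnull(s)
print(mask.tolist())
```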
Examples of notnull
Importing the required libraries.
import pandas as pd
import numpy as np
Example 1: Detecting entries in an array using Pandas notnull
We create a 2D array containing null values using np.nan and then display it. The call below uses pd.notna, which is an alias of pd.notnull, so the null values are mapped to False.
array = np.array([[10, 20, np.nan, 40], [50, 60, 70, np.nan]])
print(array)
[[10. 20. nan 40.]
 [50. 60. 70. nan]]
print(pd.notna(array))
[[ True  True False  True]
 [ True  True  True False]]
The Pandas notnull() method can also be used to filter out rows with missing values from a DataFrame. Suppose a DataFrame named “df” has missing values and the goal is to create a new DataFrame, “df_notnull”, that keeps only the rows with no missing values.
To achieve this, the Pandas notnull() method is applied to the entire DataFrame “df” and the “all” method is called on the result with the “axis=1” argument. notnull() returns a Boolean mask indicating whether each element is not null, and “all” with “axis=1” reduces that mask to one value per row, which is True only if every element in the row is not null.
The resulting Boolean Series is then used to filter the original DataFrame “df” by selecting only the rows where all elements are not null. The filtered DataFrame, “df_notnull”, contains only the rows of “df” that have no missing values.
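The filtering steps described above can be sketched as follows. The contents of “df” are hypothetical, since the article does not define them; only the names “df” and “df_notnull” come from the text.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names and values are illustrative only
df = pd.DataFrame({
    "Name": ["Chris", None, "Monica", "Steve"],
    "Age": [27, 24, 35, np.nan],
})

# Boolean mask: True where a value is present
mask = df.notnull()

# Keep only the rows in which every column is non-missing
df_notnull = df[mask.all(axis=1)]
print(df_notnull)
```

For this simple case, DataFrame.dropna() with its default arguments should select the same rows; the explicit mask is mainly useful when the condition needs to be combined with other filters.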
Example 2: Detecting entries in DataFrame using notnull()
We create and print a DataFrame ‘data’ with the columns Name, Age, Address, and Qualification and their respective values. This DataFrame has no missing data in any column.
data = pd.DataFrame({'Name': ['Chris', 'Princi', 'Monica', 'Steve'],
                     'Age': [27, 24, 35, 32],
                     'Address': ['New York', 'Oslo', 'LA', 'Boston'],
                     'Qualification': ['Msc', 'MA', 'MCA', 'Phd']})
print(data)

As there are no null values, every entry is mapped to True in the resulting Boolean DataFrame.
print(data.notnull())

Below we recreate the ‘data’ DataFrame with missing values in some columns. For example, the second value of the Name column is None and the last value of the Age column is NaN.
data = pd.DataFrame({'Name': ['Chris', None, 'Monica', 'Steve'],
                     'Age': [27, 24, 35, np.nan],
                     'Address': ['New York', None, 'LA', 'Boston'],
                     'Qualification': ['Msc', None, None, 'Phd']})
print(data)

The missing values are therefore mapped to False in the resulting Boolean DataFrame.
print(data.notnull())
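One way to summarize the mask without reading the whole table is to count the non-missing values per column. The sketch below rebuilds the modified ‘data’ DataFrame from this example; the variable names “mask”, “non_missing_per_column”, and “complete_rows” are assumptions introduced for illustration.

```python
import numpy as np
import pandas as pd

# Same DataFrame as in the modified example above
data = pd.DataFrame({'Name': ['Chris', None, 'Monica', 'Steve'],
                     'Age': [27, 24, 35, np.nan],
                     'Address': ['New York', None, 'LA', 'Boston'],
                     'Qualification': ['Msc', None, None, 'Phd']})

mask = data.notnull()

# True counts as 1, so summing the mask counts non-missing values per column
non_missing_per_column = mask.sum()
print(non_missing_per_column)

# One Boolean per row: True only if every column in that row holds a value
complete_rows = mask.all(axis=1)
print(complete_rows.tolist())
```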

Example 3: Detecting entries in an Index
index = pd.DatetimeIndex(["2017-07-05", "2017-07-06", None,"2017-07-08"])
print(index.notnull())
The value at position 2 is None, which a DatetimeIndex stores as NaT (Not a Time), so it is mapped to False.
[ True  True False  True]
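The Boolean array returned for an Index can also be used to select only the valid entries. The following is a minimal sketch using the same DatetimeIndex; the variable name “valid” is an assumption introduced here.

```python
import pandas as pd

index = pd.DatetimeIndex(["2017-07-05", "2017-07-06", None, "2017-07-08"])

# The None entry is reported as missing
print(pd.isna(index[2]))

# Use the mask to keep only the non-missing timestamps
valid = index[index.notnull()]
print(len(valid))
```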
Summary
We have applied the notnull() function to an array, an Index, and a DataFrame to detect missing values in the input. Each missing value is mapped to False in the output, and every other value is mapped to True.