Pandas isnull() – Detect missing values for an array-like object


In this article, let’s look at isnull(), one of the general functions of the pandas package. pandas is an open-source Python package for manipulating and analyzing data. Its name is derived from “panel data” and is also a play on “Python data analysis”. It provides data structures and operations for working with numerical tables and time series.

The goal of this function is to detect missing values in the given input. It is an alias of the pandas function isna(), so the two behave identically.
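
To make the relationship between the two names concrete, here is a minimal sketch. It assumes a reasonably recent pandas release, where both names are expected to refer to the same function object; the variable names are illustrative only.

```python
import pandas as pd

# In recent pandas releases, isnull is simply another name for isna.
same_function = pd.isnull is pd.isna
print(same_function)

# Either name gives the same result on the same input.
values = [1.0, None, float("nan")]  # illustrative sample data
result_isnull = pd.isnull(values)
result_isna = pd.isna(values)
print(result_isnull)
print(result_isna)
```

Because the two names are interchangeable, which one to use is mostly a matter of style; the point is to pick one and stay consistent within a codebase.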

What is the use of isnull() in Pandas?

This function determines whether values are missing from a scalar or array-like object (for example, NaN (Not a Number) in numeric arrays, None or NaN in object arrays, and NaT (Not a Time) in datetime-like arrays). In other words, isnull() is a boolean function: it checks each value and returns True wherever a value is missing.
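
As a quick illustration of which values count as missing, the sketch below checks each marker mentioned above as a scalar, alongside a few values that may look empty but are not treated as missing. The dictionary names are hypothetical and exist only for this example.

```python
import numpy as np
import pandas as pd

# Values that pandas treats as missing.
missing_markers = {
    "np.nan": np.nan,
    "None": None,
    "pd.NaT": pd.NaT,
    "pd.NA": pd.NA,
}
missing_flags = {name: pd.isnull(value) for name, value in missing_markers.items()}

# Values that look "empty" but are ordinary, non-missing values.
ordinary_values = {
    "empty string": "",
    "zero": 0,
    "the string 'NaN'": "NaN",
}
ordinary_flags = {name: pd.isnull(value) for name, value in ordinary_values.items()}

print(missing_flags)   # every entry is True
print(ordinary_flags)  # every entry is False
```

Note that an empty string or a zero must be converted to a missing marker first if they are meant to represent missing data.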

Syntax of Pandas isnull()

pandas.isnull(obj)
  • Input: a scalar or array-like object
  • Output: a single boolean for scalar input; for array-like input, an array of booleans indicating whether each corresponding element is missing.
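
The input/output behaviour listed above can be sketched as follows. The variable names are illustrative, and the Series case is included on the assumption that a pandas object is passed in, in which case the result is expected to keep that type.

```python
import numpy as np
import pandas as pd

# Scalar in, single boolean out.
scalar_result = pd.isnull(np.nan)

# NumPy array in, boolean NumPy array of the same length out.
array_result = pd.isnull(np.array([1.0, np.nan]))

# Series in, boolean Series out (the index is preserved).
series_result = pd.isnull(pd.Series([1.0, None]))

print(scalar_result)
print(array_result)
print(series_result)
```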

Implementing Pandas isnull()

Make sure to import the pandas package before using the function. To do so, run the following line first.

import pandas as pd

Example 1: Scalar Input

pd.isnull("one")

# pd.NA is the pandas missing-value marker (distinct from the float NaN)
x = pd.NA
pd.isnull(x)

Output

The first call returns False because "one" is an ordinary string; the second returns True because pd.NA marks a missing value.

Example 2: Array-like input

# creating a list (array-like input) with one missing entry
array = [1, pd.NA, 3]
pd.isnull(array)

# creating a DatetimeIndex with one missing entry (None becomes NaT)
array1 = pd.DatetimeIndex(["2005-04-03","2005-06-07",None])
pd.isnull(array1)

Output

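The list yields array([False, True, False]) and the DatetimeIndex yields array([False, False, True]): only the pd.NA and None entries are flagged as missing.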

Example 3: Dataset as Input

For the following example, the Melbourne housing dataset is used, saved locally as melb_data.csv. For ease of understanding, let’s consider only the first five rows of the dataset.

#reading the first five rows of the dataset
input_df = pd.read_csv("melb_data.csv", nrows=5)
input_df

[Figure: first five rows of the Melbourne housing dataset]

When working with data frames, the method syntax is dataframe_name.isnull().

Note that the first and fourth observations in the Building_Area and Year_Built columns are NaN (Not a Number). In the output, these values are True and all the rest are False.

output_df = input_df.isnull()
output_df

Output

[Figure: boolean data frame returned by input_df.isnull()]
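
A common next step is to summarise the boolean mask rather than read it cell by cell. The sketch below does this on a small hypothetical data frame whose column names and values are made up for illustration (it does not read melb_data.csv); it only mirrors the pattern described above, with the first and fourth rows missing in two columns.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data; the values are illustrative, not taken from the real dataset.
df = pd.DataFrame({
    "Rooms": [2, 2, 3, 3, 4],
    "Building_Area": [np.nan, 79.0, 150.0, np.nan, 142.0],
    "Year_Built": [np.nan, 1900.0, 1900.0, np.nan, 2014.0],
})

# Boolean mask with the same shape as df: True where a value is missing.
mask = df.isnull()

# True counts as 1 when summed, so this gives the number of missing values per column.
missing_per_column = mask.sum()

# Keep only the rows that contain at least one missing value.
rows_with_missing = df[mask.any(axis=1)]

print(missing_per_column)
print(rows_with_missing)
```

Summing the mask is usually enough to decide which columns need attention before choosing how to handle the missing entries.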

Summary

In conclusion, there are many ways to detect missing values in a dataset or other input. One of them is the isnull() function of the pandas package. It is a general boolean function that returns True for missing values such as NaN, None, and NaT. It works on scalars and array-like objects as well as on data frames.

Reference

https://pandas.pydata.org/docs/reference/api/pandas.isnull.html