In this article, let's look at one of the general functions of the pandas package: isnull(). pandas is an open-source Python package for manipulating and analyzing data. The name is derived from "panel data" and is also read as "Python data analysis". It provides dedicated data structures and operations for working with numerical tables and time series.
The goal of this function is to detect the missing values in a given input. It is an alias of the pandas function isna(), so the two behave identically.
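As a quick check of that equivalence, the sketch below runs both functions on the same input. The list `values` is made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative input: one regular number and two kinds of missing value
values = [1.0, np.nan, None]

null_mask = pd.isnull(values)  # boolean array, True where a value is missing
na_mask = pd.isna(values)      # same check under the other name

print(null_mask)
print(na_mask)
```

Both calls should print the same boolean array, so either name can be used.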
What is the use of isnull() in Pandas?
This function determines whether values are missing in a scalar or array-like object: NaN (Not a Number) in numeric arrays, None or NaN in object arrays, and NaT (Not a Time) in datetime-like arrays. In other words, isnull() is a boolean function: it returns True for every value that is missing and False otherwise.
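To make the list of missing-value markers concrete, here is a minimal sketch that passes each of them to isnull() as a scalar. The dictionary name `checks` is only for illustration.

```python
import numpy as np
import pandas as pd

# Each entry maps a label to the result of isnull() on one scalar
checks = {
    "NaN": pd.isnull(np.nan),         # missing float
    "None": pd.isnull(None),          # missing Python object
    "NaT": pd.isnull(pd.NaT),         # missing timestamp
    "zero": pd.isnull(0),             # a real value, not missing
    "empty string": pd.isnull(""),    # also a real value, not missing
}

for label, is_missing in checks.items():
    print(label, is_missing)
```

Note that 0 and the empty string are ordinary values, so isnull() does not flag them.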
Syntax of Pandas isnull()
pandas.isnull(obj)
- Input: obj, a scalar or array-like object
- Output: for a scalar input, a single boolean. For an array-like input, an array of booleans indicating whether each corresponding element is missing.
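The relationship between input and output can be sketched as follows. The variable names are illustrative, and the Series case is included on the assumption that you will often pass pandas objects rather than plain arrays.

```python
import numpy as np
import pandas as pd

# Scalar in, single boolean out
scalar_result = pd.isnull(np.nan)

# NumPy array in, boolean NumPy array of the same shape out
array_result = pd.isnull(np.array([1.0, np.nan, 3.0]))

# Series in, boolean Series with the same index out
series_result = pd.isnull(pd.Series([1.0, None, 3.0]))

print(scalar_result)
print(array_result)
print(series_result)
```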
Implementing Pandas isnull()
Make sure to import the pandas package before calling the function. To do so, run the following line first.
import pandas as pd
Example 1: Scalar Input
# a regular string is not a missing value
pd.isnull("one")

# creating a missing value (pd.NA)
x = pd.NA
pd.isnull(x)
Example 2: Array-like input
# creating an array-like input
array = [1, pd.NA, 3]
pd.isnull(array)

# creating a datetime-like array
array1 = pd.DatetimeIndex(["2005-04-03", "2005-06-07", None])
pd.isnull(array1)
Example 3: Dataset as Input
For the following example, the Melbourne housing dataset (melb_data.csv) is used. For ease of understanding, let's consider only the first five rows of the dataset.
# reading the first five rows of the dataset
input_df = pd.read_csv("melb_data.csv", nrows=5)
input_df
When working with data frames, isnull() is called as a method on the data frame itself:
output_df = input_df.isnull()
output_df
Note that the first and fourth observations in the Building_Area and Year_Built columns are NaN (Not a Number) in the input. In the output, these positions are True and all the others are False.
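If the CSV file is not at hand, the same behavior can be reproduced on a small hand-built data frame. The sketch below uses made-up values, not the actual Melbourne data, and goes one step further by counting the missing values per column and selecting the rows that contain any.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the housing data (values are invented)
df = pd.DataFrame({
    "Rooms": [2, 2, 3, 3],
    "Building_Area": [np.nan, 79.0, 150.0, np.nan],
    "Year_Built": [np.nan, 1900.0, 1900.0, np.nan],
})

# Boolean frame of the same shape: True where a value is missing
mask = df.isnull()

# True counts as 1, so summing gives the number of missing values per column
missing_per_column = mask.sum()

# Keep only the rows that have at least one missing value
rows_with_missing = df[mask.any(axis=1)]

print(missing_per_column)
print(rows_with_missing)
```

Summing the boolean frame is a common way to get a quick overview of where the gaps are before deciding how to handle them.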
In conclusion, there are many ways to detect missing values in a dataset or in a given input. One of them is the isnull() function of the pandas package. It is a general boolean function that returns True for missing values such as NaN. It works on scalars and array-like inputs as well as on data frames.