Usage of nan_to_num in replacing NaN and Infinity


nan_to_num is a NumPy function that replaces NaN (not a number) values with numeric values. By default, each NaN is replaced with zero.

Some datasets contain values that are missing, undefined, or cannot be represented as a number. Such values are termed Not a Number, also known as NaN.

The presence of such values can cause errors or misleading results in the calculations of a data analysis project. The function nan_to_num replaces these values so that later calculations work with ordinary finite numbers.

This function is available in the NumPy library and can also be used for replacing infinity with some number which we are going to see in the coming examples.

NumPy library

NumPy (Numerical Python) is a library for the Python programming language that provides large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.

Why use NumPy?

You may wonder: when Python lists can already act as arrays, why should we use NumPy?

While using lists in Python, you should remember a few things. Here is an article on the basic creation of lists, accessing the list elements, updating them, and so on.

Lists in Python behave much like arrays, but they are comparatively slow for numerical work. NumPy provides a special array type called ndarray, which is often described as being up to 50 times faster than lists, although the actual speedup depends on the operation and the size of the data.

In addition to the creation of ndarray, NumPy also provides a lot of functions that make working with these arrays easier.


What is NaN?

NaN stands for not a number. To elaborate, let us take an example.

Suppose you divide zero by zero. The result of such a division is undefined.

How can we represent such undefined results?

We use NaN to represent the result of such undefined or unrepresentable mathematical operations.

Not a Number (NaN) Is Not Equivalent to Infinity

Not a Number is not equivalent to infinity. The reason is simple. The term 'infinity' represents a value that is greater than any finite number, for example the floating-point result of dividing a positive number by zero.

NaN, on the other hand, represents the result of an operation that has no defined value at all. Another example of NaN would be the square root of a negative number.
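
As a small sketch of this distinction, the snippet below produces NaN and infinity from floating-point operations on NumPy arrays. The variable names are illustrative, and np.errstate is used here only to silence the runtime warnings these operations would otherwise emit.

```python
import numpy as np

# Silence the RuntimeWarnings that these operations normally emit.
with np.errstate(divide="ignore", invalid="ignore"):
    quotients = np.array([0.0, 1.0, -1.0]) / 0.0   # 0/0, 1/0, -1/0
    root = np.sqrt(np.array([-4.0]))               # square root of a negative number

print(quotients)            # NaN, positive infinity, negative infinity
print(np.isnan(quotients))  # True only for the 0/0 entry
print(np.isinf(quotients))  # True only for the two infinite entries
print(np.isnan(root))       # the square root of -4.0 is NaN

# NaN never compares equal to anything, including itself.
print(np.nan == np.nan)     # False
```

Note that only 0/0 yields NaN here; dividing a nonzero number by zero yields a signed infinity.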

nan_to_num Function to Replace Infinite Values

Apart from its primary function, which is to replace NaN with zero, nan_to_num() also replaces infinite values, either with very large finite numbers by default or with numbers of our own choice.

Infinity Can Also Be Negative

We have discussed that infinity is a value that is larger than any finite number. This is called positive infinity.

There is also negative infinity: a value that is smaller than any finite number, that is, further below zero than any finite negative number.
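
A minimal sketch of both signs of infinity, assuming 64-bit floats; sys.float_info.max is used only as a convenient "largest finite number" to compare against, and the variable names are illustrative.

```python
import sys
import numpy as np

largest_float = sys.float_info.max   # largest finite Python float

print(np.inf > largest_float)        # True
print(-np.inf < -largest_float)      # True

values = np.array([np.inf, -np.inf, 1.0])
print(np.isposinf(values))  # marks only the +inf entry
print(np.isneginf(values))  # marks only the -inf entry
```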


Exploring the numpy.nan_to_num Function

This function replaces NaN with zero (or another value we choose) and replaces infinity with very large finite numbers or with user-defined numbers.

The syntax is as follows.

numpy.nan_to_num(x, copy=True, nan=0.0, posinf=None, neginf=None) 

The arguments and their descriptions are given below.

| Argument | Description | Type / Default Value | Required/Optional |
| x | Input data | scalar or array-like | Required |
| copy | Specifies whether the original array should be duplicated before modification. If True, a copy of the array is created and the copy is modified. If False, no copy is made and the modification is done on the original array itself. | bool, default True | Optional |
| nan | The value used to replace NaN values. If no value is passed, NaN values are replaced with 0.0. | int or float, default 0.0 | Optional |
| posinf | The value used to replace positive infinity values. If no value is passed, positive infinity values are replaced with a very large number. | int or float, default None | Optional |
| neginf | The value used to replace negative infinity values. If no value is passed, negative infinity values are replaced with a very large negative number. | int or float, default None | Optional |
Arguments of nan_to_num

Return type: ndarray
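
The defaults described above can be checked with a short sketch like the one below. It assumes a float64 input array; the names data, cleaned, and limits are illustrative only.

```python
import numpy as np

data = np.array([np.nan, np.inf, -np.inf, 1.5])

# With no keyword arguments, NaN becomes 0.0 and the infinities become
# the largest and most negative finite values of the array's dtype.
cleaned = np.nan_to_num(data)
limits = np.finfo(data.dtype)
print(cleaned[1] == limits.max)  # True
print(cleaned[2] == limits.min)  # True

# copy=True (the default) leaves the input untouched.
print(np.isnan(data[0]))         # True

# copy=False overwrites the input array in place.
np.nan_to_num(data, copy=False)
print(np.isnan(data).any())      # False
```

Using copy=False avoids allocating a second array, at the cost of losing the original values.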


Filling NaN With Zero

In this example, let us replace NaN values with zeroes.

First, let us see the creation of an ndarray with NaN values.

import numpy as np
arr = np.array([np.nan, 1, 2, 3,np.nan,24,25,26])
print("Original Array:\n", arr)

Here is a quick explanation of the above snippet of code.

import numpy as np: First, we are importing the numpy library to create an array. The standard and accepted alias for numpy is np.

Next, we create an array with nan values with the help of np.nan. np.nan is used to indicate that there is a NaN value in the data.

This newly created array is stored in an object called arr.

We are printing this array in the following line.

The ndarray is obtained as follows.

Original Array1

In the output, you might have noticed a small difference: the integers we entered are printed as floats. This is because np.nan is a special floating-point value, and all elements of an array must share the same data type, so every element is converted to float.
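
A quick way to see this conversion is to compare the dtype of an array with and without np.nan. This is a small sketch with illustrative names; the exact integer dtype (for example int32 or int64) depends on the platform, so only its kind is printed.

```python
import numpy as np

ints_only = np.array([1, 2, 3])
with_nan = np.array([np.nan, 1, 2, 3])

print(ints_only.dtype.kind)  # i, meaning an integer dtype
print(with_nan.dtype)        # float64
```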

Let us see the replacement of nan values.

# Replace NaN with 0
new_arr = np.nan_to_num(arr, nan=0.0)
print("\nArray with NaN replaced by 0:\n", new_arr)

Now, we are moving to the interesting part.

In the second line, we have called the nan_to_num function. The parameters passed in this function are the array that needs to be modified and the value with which we are going to replace the NaN.

We need not specify nan=0.0 multiple times even if there are multiple NaN values in the original array. Just specifying once would do the job.

After this, we will have a new array that has NaN replaced by zeroes.

NaN Replaced With Zero
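
The nan argument is not limited to zero. As one possible variation (not part of the example above), the sketch below fills the NaN entries with the mean of the remaining values, computed with np.nanmean; fill_value and filled are illustrative names.

```python
import numpy as np

arr = np.array([np.nan, 1, 2, 3, np.nan, 24, 25, 26])

# np.nanmean ignores NaN, so it averages the six remaining entries: 81 / 6 = 13.5
fill_value = np.nanmean(arr)
filled = np.nan_to_num(arr, nan=fill_value)
print(filled)
```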

Replacing Infinite Values With posinf

In this example, let us replace positive infinity with a number of our choice using the posinf keyword argument.

The posinf argument is used to replace positive infinite values.

We can create a positive infinite value with np.inf, which is a floating-point constant just like np.nan.

Let us see the creation of an array with infinity.

arr1=np.array([1,2,3,np.inf,4,5,6,np.inf,7,8,9,np.inf])
print("Original array:\n",arr1)

In the first line, we create an array with positive infinity values and store it in an object called arr1.

In the next line, we are printing this array.

Original Array2

Since np.inf is also a floating-point value, the elements in this array are converted to float.

We are going to replace the infinity with 100.

n_arr1=np.nan_to_num(arr1,posinf=100)
print("The new array:\n",n_arr1)

The positive infinite values are replaced with 100 using the keyword posinf.

nan_to_num is called, and the array to be modified and the value with which the infinite values are to be replaced are given as parameters. This array is stored in a new object called n_arr1.

Infinity Replaced With 100
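
One detail worth keeping in mind: passing posinf changes only the replacement for positive infinity. Any NaN or negative infinity in the same array is still replaced with its default. The sketch below illustrates this with an illustrative array named mixed.

```python
import numpy as np

mixed = np.array([np.nan, np.inf, -np.inf, 5.0])
result = np.nan_to_num(mixed, posinf=100)

print(result[1])                               # 100.0, from posinf
print(result[0])                               # 0.0, the default for NaN
print(result[2] == np.finfo(mixed.dtype).min)  # True, the default for -inf
```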

Replacing Infinite Values With neginf

In this example, let us replace negative infinity with a number of our choice using the neginf keyword argument.

Let us take the same example with negative infinity.

arr2=np.array([1,2,3,-np.inf,4,5,6,-np.inf,7,8,9,-np.inf])
print("Original array:\n",arr2)

The -np.inf is used to represent a negative infinite value.

We are creating an array with negative infinite values and storing it in an object called arr2.

Next, we are printing the array with the help of print().

The array will look something like this.

Original Array3

Now let us replace the negative infinity with -100.

n_arr2=np.nan_to_num(arr2,neginf=-100)
print("The new array:\n",n_arr2)

In the above code, we called the nan_to_num function and passed it the array to be modified and the value with which negative infinity should be replaced.

The new array is:

Negative Infinity Replaced With -100
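
When neginf is omitted, the default replacement depends on the array's data type. The sketch below assumes a float32 array (an illustrative choice) and compares the default with an explicit neginf value.

```python
import numpy as np

small = np.array([-np.inf, 2.0], dtype=np.float32)

default_fill = np.nan_to_num(small)               # no neginf given
custom_fill = np.nan_to_num(small, neginf=-100)   # explicit replacement

# The default is the most negative finite float32, not the float64 one.
print(default_fill[0] == np.finfo(np.float32).min)  # True
print(custom_fill)
```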



Combination of NaN, np.inf, and -np.inf

Let us see a combined example of nan, posinf, and neginf.

The code is shown below.

#np array with nan,positive infinity and negative infinity
arr=np.array([1,3,4,np.nan,7,6,8,np.inf,2,5,9,-np.inf,21])
print("Original array:\n",arr)
#replacing the values 
newarr=np.nan_to_num(arr,nan=0,posinf=100,neginf=-100)
print("Modified array:\n",newarr)

The second line shows how we can create an array with nan, positive infinite, and negative infinite values. This array is stored in arr.

In the following line, we are printing this original array.

In the fifth line, we are calling the nan_to_num function. The parameters are the original array, replacement of NaN, positive infinity, and negative infinity. This modified array is stored in newarr.

Combination Of NaN, Positive And Negative Infinity
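
Before replacing anything, it can be useful to know how many special values an array actually contains. The following sketch counts them with np.isnan, np.isposinf, and np.isneginf, then confirms that the cleaned array is fully finite. The counts dictionary is an illustrative helper, not part of nan_to_num.

```python
import numpy as np

arr = np.array([1, 3, 4, np.nan, 7, 6, 8, np.inf, 2, 5, 9, -np.inf, 21])

# Count each kind of special value before replacing anything.
counts = {
    "nan": int(np.isnan(arr).sum()),
    "posinf": int(np.isposinf(arr).sum()),
    "neginf": int(np.isneginf(arr).sum()),
}
print(counts)  # {'nan': 1, 'posinf': 1, 'neginf': 1}

newarr = np.nan_to_num(arr, nan=0, posinf=100, neginf=-100)
print(np.isfinite(newarr).all())  # True
```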

Replacing Multiple Infinite Values With Different Numbers

Suppose you want to replace each infinite value with a different number instead of one common value. The example below does this with np.where rather than nan_to_num, specifying one replacement for each infinite term.

Let us see how we can do that.

import numpy as np
arr1 = np.array([1, 2, 3, np.inf, 4, 5, 6, np.inf, 7, 8, 9, np.inf])
print("Original array:\n", arr1)
# array of replacement values
rep = np.array([40, 50, 60])
# repeat rep until it is as long as arr1, then pick from it wherever arr1 is infinite
arr2 = np.where(arr1 == np.inf, np.tile(rep, int(len(arr1)/len(rep))), arr1)
print("Modified array:\n", arr2)

import numpy as np: In this line, we are importing the NumPy library as np.

In the following line, we create an array of twelve elements, of which three are infinite values. This array is stored in an object called arr1.

We are printing the array in the next line.

We are creating an array of replacement values for the original array. Since we have three infinite values in the original array, the replacement array has three elements. This array is stored in rep.

In the following line, we create another array called arr2 for the modified array.

The function np.where() is used to replace the infinite values at different positions with the replacement array elements.

It is similar to the WHERE clause in SQL, in which we check for a certain condition to perform a certain task.

The np.where function takes three parameters. Let us see what they are one by one.

arr1==np.inf: It is a boolean condition evaluated element by element. It is true at every position of the original array that contains positive infinity and false at every other position.

For example, for the original array (arr1), this condition is true for index 3 and false for index 0, keeping in mind that indexing in arrays starts from zero.

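np.tile(rep, int(len(arr1)/len(rep))): This argument is used to match the length of both arrays. The original array (arr1) is of length 12, but the replacement array (rep) is of length 3, so np.where cannot pair the two arrays element by element as they are.

This argument makes sure that the length of both arrays matches. len(arr1)/len(rep) returns 4. Hence, the rep array is repeated 4 times to make its length 12. np.where then takes the tiled value that sits at the same index as each infinite entry. In this example, the infinite values are at indices 3, 7, and 11, where the tiled array holds 40, 50, and 60, so each one receives a different replacement. With other positions or lengths, the values would not necessarily line up this way.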

arr1: This argument is used when the arr1==np.inf is false. That is, it ensures that the non-infinite values in the original array are retained in the modified array in the same position.

The output is shown below.

Replacement Of Infinite With Different Values
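
The np.tile approach depends on where the infinite values happen to sit. As an alternative sketch (not the method used above), a boolean mask can assign the replacements in order of appearance, whatever the positions are. It assumes rep contains exactly one value per infinite entry; mask is an illustrative name.

```python
import numpy as np

arr1 = np.array([1, 2, 3, np.inf, 4, 5, 6, np.inf, 7, 8, 9, np.inf])
rep = np.array([40, 50, 60])   # one replacement per infinite entry

mask = np.isposinf(arr1)
# Assumption: rep has exactly as many elements as there are True entries in mask.
assert mask.sum() == len(rep)

arr2 = arr1.copy()             # keep the original array unchanged
arr2[mask] = rep               # replacements are assigned in order of appearance
print(arr2)
```

If the number of replacements does not match the number of infinite values, the assertion fails before any assignment happens, which makes the mismatch easy to spot.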

Summary

To summarize, we have seen what Not a Number is and why NaN is not equivalent to infinity. We have also seen that infinity can be negative.

Next, we saw the syntax of the np.nan_to_num function and its arguments.

In the examples, first, we have seen how we can create a NaN with the help of np.nan and the replacement of NaN by zero with the help of nan_to_num. We have also seen that the array with NaN values is converted to a float datatype.

Next, we have seen how we can create a positive infinity using np.inf and the replacement of this infinity by means of posinf.

Similarly, we have seen the replacement of negative infinity by neginf.

And we have seen the conversion of an array that has all three: NaN, Positive Infinity, and Negative Infinity.

Finally, we have seen the replacement of multiple infinite values with different numbers.

References

Official Numpy Manual on nan_to_num

While working with arrays whose elements are NaN, you might have trouble replacing these values based on how you declare the array. Check out this Stack Overflow answer that might solve the problem.