NumPy nancumsum – A Complete Guide


Hello and welcome to this tutorial on NumPy nancumsum. In our previous tutorials, we learned about NumPy cumsum and NumPy nansum. In this tutorial, we will learn about the NumPy nancumsum() method and work through several examples. So let us begin!

Recommended Read: NumPy cumsum, NumPy nansum


What is NumPy nancumsum?

In Python, NaN denotes Not a Number. If we have an array that contains some NaN values and want to find its cumulative sum, we can use the nancumsum() method from NumPy.

The cumulative sum is the sequence of partial sums of a given sequence. If {a, b, c, d, e, f, …} is a sequence, then its cumulative sum is {a, a+b, a+b+c, a+b+c+d, …}.
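As a small illustration of the definition above, the following sketch builds the partial sums by hand and compares them with np.cumsum. The names seq, running_total and partial_sums are chosen here only for the example.

```python
import numpy as np

seq = np.array([1, 2, 3, 4, 5])

# Build the partial sums by hand: each entry adds the next element
# to the running total.
running_total = 0
partial_sums = []
for value in seq:
    running_total += int(value)
    partial_sums.append(running_total)

print(partial_sums)             # [1, 3, 6, 10, 15]
print(np.cumsum(seq).tolist())  # [1, 3, 6, 10, 15]
```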

The nancumsum() function in NumPy returns the cumulative sum of the array elements, treating any NaN values in the array as 0. For a 2-dimensional array, it can return the cumulative sum of the flattened array (axis=None), the cumulative sum running down each column (axis=0), or the cumulative sum running across each row (axis=1).
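One way to picture "treating NaN as 0" is to replace the NaNs yourself and then take an ordinary cumulative sum. The sketch below does that with np.where and np.isnan and checks that the result matches np.nancumsum. The variable names are illustrative only, and this is meant as a mental model rather than a description of NumPy's internals.

```python
import numpy as np

values = np.array([1.0, np.nan, 2.5, np.nan, 4.0])

# Replace every NaN with 0, then take an ordinary cumulative sum.
zero_filled = np.where(np.isnan(values), 0.0, values)
manual = np.cumsum(zero_filled)

# Let NumPy handle the NaNs directly.
builtin = np.nancumsum(values)

print(manual.tolist())                  # [1.0, 1.0, 3.5, 3.5, 7.5]
print(np.array_equal(manual, builtin))  # True
```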

We will see the examples for each of these in the upcoming section of this tutorial.


Syntax of NumPy nancumsum

numpy.nancumsum(a, axis=None, dtype=None, out=None)
| Parameter | Description | Required/Optional |
| --- | --- | --- |
| a | Input array. | Required |
| axis | Axis along which the cumulative sum is calculated. For a 2-dimensional array this is axis=0 or axis=1. The default, axis=None, returns the cumulative sum of the flattened array. | Optional |
| dtype (data type) | The data type of the returned array and of the accumulator in which the elements are summed. | Optional |
| out | An alternative output array in which to place the result. It must have the same shape and buffer length as the expected output; the type is cast if necessary. | Optional |

Returns:
A new array that contains the cumulative sum, computed by treating NaN values as zero. If out is specified, the result is written into it and a reference to out is returned instead.
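The optional dtype and out parameters can be tried out with a short sketch like the one below. The names data, as_float32, result_buffer and returned are made up for this illustration.

```python
import numpy as np

data = np.array([1.5, np.nan, 2.5])

# dtype: request a specific type for the accumulator and the result.
as_float32 = np.nancumsum(data, dtype=np.float32)
print(as_float32.dtype)  # float32

# out: write the result into a preallocated array of matching shape.
result_buffer = np.empty(data.shape, dtype=np.float64)
returned = np.nancumsum(data, out=result_buffer)

print(result_buffer.tolist())     # [1.5, 1.5, 4.0]
print(returned is result_buffer)  # True
```

Reusing an output buffer like this can be handy when the same computation is repeated many times and you want to avoid allocating a new array on each call.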


Examples of numpy.nancumsum() method

Let us now see how to use this function with the help of some examples.

The cumulative sum of a single element

import numpy as np

a = 8
ans_a = np.nancumsum(a)

b = np.nan
ans_b = np.nancumsum(b)

print("a =", a)
print("Cumulative sum of a =", ans_a)

print("b =", b)
print("Cumulative sum of b =", ans_b)

Output:

a = 8
Cumulative sum of a = [8]
b = nan
Cumulative sum of b = [0.]

The cumulative sum of a 1-dimensional array containing NaNs

import numpy as np

arr = [7, 8, np.nan, 10, np.nan, np.nan]
ans = np.nancumsum(arr)

print("arr =", arr)
print("Cumulative sum of arr =", ans)

Output:

arr = [7, 8, nan, 10, nan, nan]
Cumulative sum of arr = [ 7. 15. 15. 25. 25. 25.]

In the above code, the array contains 3 NaN values. While computing the cumulative sum, the nancumsum() method treats these values as equal to zero. The cumulative sum, therefore, is calculated as 7, 7+8, 7+8+0, 7+8+0+10, 7+8+0+10+0, 7+8+0+10+0+0, which results in 7, 15, 15, 25, 25, 25.
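To see why this matters, it may help to run the plain np.cumsum on the same array. In the sketch below, the ordinary cumulative sum becomes NaN from the first NaN onward, while np.nancumsum keeps accumulating.

```python
import numpy as np

arr = np.array([7, 8, np.nan, 10, np.nan, np.nan])

plain = np.cumsum(arr)        # NaN propagates from the first NaN onward
nan_safe = np.nancumsum(arr)  # NaN counts as 0

print(plain)     # [ 7. 15. nan nan nan nan]
print(nan_safe)  # [ 7. 15. 15. 25. 25. 25.]
```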


The cumulative sum of a 2-dimensional array containing NaNs

import numpy as np

arr = [[5, np.nan, 3], [np.nan, 2, 1]]
ans = np.nancumsum(arr)

print("arr =", arr)
print("Cumulative sum of arr =", ans)

Output:

arr = [[5, nan, 3], [nan, 2, 1]]
Cumulative sum of arr = [ 5.  5.  8.  8. 10. 11.]

In the case of a 2-dimensional array, when no axis is mentioned, the array is first flattened and then its cumulative sum is calculated by treating NaNs as 0.

In the above example, the array is first flattened row by row into [5, np.nan, 3, np.nan, 2, 1]. Its cumulative sum is then calculated as [5, 5+0, 5+0+3, 5+0+3+0, 5+0+3+0+2, 5+0+3+0+2+1], so the function returns the array [5, 5, 8, 8, 10, 11].
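The flattening step can be checked directly. The sketch below flattens the array with ravel() first and confirms that the result is the same as calling np.nancumsum on the 2-dimensional array without an axis. The variable names are chosen for the example.

```python
import numpy as np

arr = np.array([[5, np.nan, 3], [np.nan, 2, 1]])

flattened = arr.ravel()              # row by row: [5, nan, 3, nan, 2, 1]
from_flat = np.nancumsum(flattened)  # cumulative sum of the 1-D array
from_2d = np.nancumsum(arr)          # axis=None flattens first

print(from_2d.shape)                       # (6,)
print(np.array_equal(from_flat, from_2d))  # True
```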


Cumulative sum along the axis treating NaN as 0

axis=0

import numpy as np

arr = [[8, np.nan, 6], [np.nan, 10, 20]]
# cumulative sum along axis=0
ans = np.nancumsum(arr, axis=0)

print("arr =\n", arr)
print("Cumulative sum of arr =\n", ans)

Output:

arr =
 [[8, nan, 6], [nan, 10, 20]]
Cumulative sum of arr =
 [[ 8.  0.  6.]
 [ 8. 10. 26.]]

Treating NaN as 0, the first row stays as it is: 8, 0, 6. The second row holds the cumulative sums 8+0, 0+10 and 6+20, i.e. 8, 10 and 26. In other words, with axis=0 the sum runs down each column, and each row of the result holds the running totals up to that row.
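A quick sanity check for axis=0: the result keeps the shape of the input, and its last row should equal the NaN-ignoring column totals from np.nansum. The following sketch shows this with the same array.

```python
import numpy as np

arr = np.array([[8, np.nan, 6], [np.nan, 10, 20]])

running = np.nancumsum(arr, axis=0)  # running totals down each column
totals = np.nansum(arr, axis=0)      # final total of each column

print(running.shape)         # (2, 3)
print(running[-1].tolist())  # [8.0, 10.0, 26.0]
print(totals.tolist())       # [8.0, 10.0, 26.0]
```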

axis=1

import numpy as np

arr = [[8, np.nan, 6], [np.nan, 10, 20]]
# cumulative sum along axis=1
ans = np.nancumsum(arr, axis=1)

print("arr =\n", arr)
print("Cumulative sum of arr =\n", ans)

Output:

arr =
 [[8, nan, 6], [nan, 10, 20]]
Cumulative sum of arr =
 [[ 8.  8. 14.]
 [ 0. 10. 30.]]

Here, the first column stays as it is, with NaN replaced by 0: 8 and 0. The second column holds 8+0 and 0+10, i.e. 8 and 10, and the third column holds 8+0+6 and 0+10+20, i.e. 14 and 30. In other words, with axis=1 the sum runs across each row, and each column of the result holds the running totals up to that column.
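Another way to read axis=1 is as the 1-dimensional nancumsum applied to each row separately. The sketch below builds the result row by row and compares it with the axis=1 call. The names row_by_row and along_axis_1 are illustrative.

```python
import numpy as np

arr = np.array([[8, np.nan, 6], [np.nan, 10, 20]])

# Apply the 1-D cumulative sum to each row on its own.
row_by_row = np.array([np.nancumsum(row) for row in arr])

# Let NumPy do the same thing in a single call.
along_axis_1 = np.nancumsum(arr, axis=1)

print(row_by_row.tolist())                       # [[8.0, 8.0, 14.0], [0.0, 10.0, 30.0]]
print(np.array_equal(row_by_row, along_axis_1))  # True
```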


Conclusion

That’s all! In this tutorial, we learned about the NumPy nancumsum() method and worked through several examples of it. You can learn more about NumPy from our NumPy tutorials.


Reference