How to normalize a NumPy array to a unit vector?

NumPy (Numerical Python) is a free, open-source library that makes scientific computing easy in Python. It is a very powerful library that provides a huge range of functions for operating on arrays, including logical, linear-algebra and other mathematical functions for array manipulation.

A NumPy array is a matrix-like object whose elements all have the same data type. It can be created from a (nested) Python list and is indexed with square brackets, just like a list. It is an ordered data type, and its shape is stored as an immutable tuple that gives the size of the array along each dimension.
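As a small illustration of the description above, the following sketch builds an array from a nested list and inspects its shape. The variable name a and the values are chosen only for this example.

```python
import numpy as np

# Build a 2-D array from a nested Python list.
a = np.array([[1, 2, 3], [4, 5, 6]])

print(a.shape)   # (2, 3): two rows, three columns
print(a.ndim)    # 2
print(a[1, 2])   # 6: element in the second row, third column
```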

Why do we need normalization of arrays?

Normalization makes arrays whose values are on different scales easier to compare. In machine learning, many training algorithms learn faster when the input values are small and on a similar scale. When different variables span very different ranges, normalizing them first usually makes the model easier to train.

This is why normalizing arrays into unit vectors is extremely useful in data science and when training artificial intelligence models.
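To make the idea concrete: a unit vector is a vector whose length (Euclidean norm) is 1, and a vector is normalized by dividing it by its own length. The sketch below does this by hand for a small example vector; the names v, length and unit_v are only illustrative.

```python
import numpy as np

v = np.array([3.0, 4.0])

# Euclidean length: sqrt(3^2 + 4^2) = 5
length = np.sqrt(np.sum(v ** 2))

# Dividing by the length keeps the direction and makes the length 1.
unit_v = v / length

print(unit_v)                        # [0.6 0.8]
print(np.sqrt(np.sum(unit_v ** 2)))  # 1.0
```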

Properties of NumPy arrays

There are a lot of properties of NumPy arrays which are specific to this data type. Some of those are:

  • Unlike Python lists, which are structurally very similar to arrays, NumPy arrays have a fixed size.
  • All the elements of a NumPy array must have the same data type.
  • Large amounts of data can be processed efficiently using NumPy arrays.
  • There are lots of NumPy functions that can be used for scientific computations involving complex NumPy arrays.
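The first two properties can be seen in a short sketch like the one below. The example values are arbitrary; np.append is used here only to show that "growing" an array produces a new array instead of resizing the original.

```python
import numpy as np

# Mixed integers and floats are converted to one common data type.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)   # float64

# The size is fixed: np.append returns a new, larger array
# and leaves the original untouched.
ints = np.array([1, 2, 3])
bigger = np.append(ints, 4)
print(ints.size, bigger.size)   # 3 4
```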

These properties of numpy arrays must be kept in mind while dealing with this data type. To read more about numpy arrays, visit the official documentation.

Ways to normalize a NumPy array into a unit vector

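There are three ways in which we can easily normalize a NumPy array into a unit vector. The first two compute the norm of the array, which we then divide the array by; the third normalizes the array directly. They are: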

  • Using the numpy.linalg.norm() function.
  • Using the scipy.linalg.norm() function.
  • Using the scikit-learn library.

Let us explore each of those methods separately.

The numpy.linalg.norm() function

This NumPy function returns one of several different matrix norms, or one of an infinite number of vector norms, depending on the value of the ord parameter.

The syntax of the function is as follows: linalg.norm(M, ord=None, axis=None, keepdims=False)

Here, M is the input array and the ord parameter selects the order of the norm (for example, 1, 2 or np.inf); when it is None, the Frobenius norm (or the 2-norm for vectors) is used. When axis is set, the function returns the norm along that axis instead of the norm of the whole array. If keepdims is True, the axes that are normed over are kept in the result as dimensions of size one. The function returns N, which is a float or an ndarray depending on the axis argument. See the official documentation for more details.
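The following sketch shows how these parameters change the result for a small example matrix. The matrix and the variable names are assumptions made only for illustration.

```python
import numpy as np

M = np.array([[3.0, 4.0],
              [6.0, 8.0]])

# Default: Frobenius norm of the whole matrix, sqrt(9 + 16 + 36 + 64).
whole = np.linalg.norm(M)

# axis=1: one L2 norm per row.
per_row = np.linalg.norm(M, axis=1)

# ord=1 with axis=1: sum of absolute values in each row.
per_row_l1 = np.linalg.norm(M, ord=1, axis=1)

# keepdims=True keeps the reduced axis with size one.
kept = np.linalg.norm(M, axis=1, keepdims=True)

print(whole)        # about 11.18
print(per_row)      # [ 5. 10.]
print(per_row_l1)   # [ 7. 14.]
print(kept.shape)   # (2, 1)
```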

The function can be implemented in the following way:

# import the required modules
import numpy as np
from numpy import linalg as LA
# create a 3x3 matrix holding the values -1 to 7
M = (np.arange(9) - 1).reshape((3, 3))
# display the original matrix
print("original matrix=", M)
# compute the norm of the matrix (the Frobenius norm by default)
N = LA.norm(M)
# display the result
print("the matrix norm is=", N)

The output of the above code would be:

original matrix= [[-1  0  1]
 [ 2  3  4]
 [ 5  6  7]]
the matrix norm is= 11.874342087037917
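The code above only computes the norm. To actually obtain a unit vector, the array still has to be divided by that norm. A minimal sketch of this step, using the same example matrix, might look as follows; the names unit_M, row_norms and unit_rows are introduced here for illustration, and the sketch assumes that no norm is zero (dividing by a zero norm would produce nan or inf values).

```python
import numpy as np

M = (np.arange(9) - 1).reshape((3, 3)).astype(float)

# Treat the whole matrix as one vector and scale it to unit length.
unit_M = M / np.linalg.norm(M)

# Alternatively, scale every row to unit length on its own.
# keepdims=True gives shape (3, 1), so the division broadcasts per row.
row_norms = np.linalg.norm(M, axis=1, keepdims=True)
unit_rows = M / row_norms

print(np.linalg.norm(unit_M))             # 1.0, up to rounding
print(np.linalg.norm(unit_rows, axis=1))  # each entry close to 1.0
```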

The scipy.linalg.norm() function

The syntax of this function is scipy.linalg.norm(M, ord=None, axis=None, keepdims=False, check_finite=True)

This function is similar to numpy.linalg.norm() shown above. The parameters and the inputs are all the same, except for one added parameter called check_finite, which checks whether all the numbers in the matrix are finite. When it is True (the default), an input that contains infinities or NaNs raises an error.

The function also returns N, which is a float or an ndarray depending on the axis argument and the dimensions of the input matrix. Let us look at its implementation.

# import the required modules
import numpy as np
from scipy.linalg import norm
# create a 3x3 matrix holding the values -2 to 6
M = (np.arange(9) - 2).reshape((3, 3))
# display the original matrix
print("original matrix=", M)
# compute the norm of the matrix (the Frobenius norm by default)
N = norm(M)
# display the result
print("the matrix norm is=", N)

The output would be:

original matrix= [[-2 -1  0]
 [ 1  2  3]
 [ 4  5  6]]
the matrix norm is= 9.797958971132712

To know more about scipy.linalg.norm(), visit the official documentation.
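As with the NumPy version, the norm has to be divided out to get a unit vector. The sketch below shows that step and also what the check_finite parameter does when the input contains an infinite value. The example vectors and the raised flag are assumptions made for this illustration.

```python
import numpy as np
from scipy.linalg import norm

v = np.array([1.0, 2.0, 2.0])

# norm(v) is sqrt(1 + 4 + 4) = 3, so dividing gives a unit vector.
unit_v = v / norm(v)
print(unit_v)

# With check_finite=True (the default), non-finite input is rejected.
bad = np.array([1.0, np.inf])
try:
    norm(bad)
    raised = False
except ValueError:
    raised = True
print(raised)   # True
```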

Using the scikit-learn library.

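The normalize() function from scikit-learn scales each row (or column) of a matrix to unit norm. Unlike the two norm() functions above, it returns the normalized array itself and not just the norm. The function looks something like this: sklearn.preprocessing.normalize(M, norm='l2', *, axis=1, copy=True, return_norm=False)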

Here, just like in the previous examples, the first parameter M is the input matrix. The norm parameter is set to 'l2' by default, which means that each vector is scaled so that the sum of the squares of its elements equals one. It can take two other values, namely 'l1' (the absolute values sum to one) and 'max' (the largest absolute value becomes one).

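The axis parameter defines the axis along which the normalization is done: 1 (the default) normalizes each row independently, and 0 normalizes each column. The copy parameter controls whether the function works on a copy of the input (True, the default) or tries to normalize the input in place (False). If return_norm is set to True, the function also returns the computed norms. The function returns N, an array of the same shape as M that holds the normalized values.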

Let us look at the code for implementing this function:

# import the required modules
import numpy as np
from sklearn.preprocessing import normalize
# create a 3x3 matrix holding the values -5 to 3
M = (np.arange(9) - 5).reshape((3, 3))
# display the original matrix
print("original matrix=", M)
# scale each row of the matrix to unit L2 norm
N = normalize(M)
# display the result
print("normalized matrix=", N)

The output of the above code would be:

original matrix= [[-5 -4 -3]
 [-2 -1  0]
 [ 1  2  3]]
normalized matrix= [[-0.70710678 -0.56568542 -0.42426407]
 [-0.89442719 -0.4472136   0.        ]
 [ 0.26726124  0.53452248  0.80178373]]
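The other parameters can be explored with a sketch like the one below, which uses a small example matrix with positive entries. The names X, l2_rows, l2_norms, l1_rows and l2_cols are chosen only for this illustration.

```python
import numpy as np
from sklearn.preprocessing import normalize

X = np.array([[3.0, 4.0],
              [1.0, 1.0]])

# return_norm=True also returns the norm of each row.
l2_rows, l2_norms = normalize(X, norm="l2", return_norm=True)

# norm="l1": the absolute values in each row sum to one.
l1_rows = normalize(X, norm="l1")

# axis=0: normalize each column instead of each row.
l2_cols = normalize(X, norm="l2", axis=0)

print(l2_norms)                    # roughly 5 and 1.414
print(l1_rows.sum(axis=1))         # each row sums to 1
print((l2_cols ** 2).sum(axis=0))  # each column's squares sum to 1
```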

Summary

This article describes how we can easily normalize a matrix, even a very large one, into unit vectors. Normalization is an important step when dealing with huge amounts of diverse data. It is used in model training and machine learning algorithms to increase the pace of learning. Three ways can be used to normalize matrices: dividing by the norm returned by numpy.linalg.norm(), dividing by the norm returned by scipy.linalg.norm(), and calling normalize() from scikit-learn. All three are described in detail above.