NumPy linalg.lstsq - Return the least-squares solution to a linear matrix equation

The NumPy library in Python provides a powerful set of tools for numerical and scientific computing. One of the important functions in NumPy is the linalg.lstsq function, which solves the linear matrix equation using the least-squares method. This function is commonly used in a variety of applications such as regression analysis, curve fitting, and other machine learning tasks.

The linalg.lstsq function calculates the optimal solution for a given set of data points, making it a valuable tool for data analysis and modeling. In this guide, we will introduce the linalg.lstsq function and explain how it can be used to solve linear matrix equations in Python.

What is the least-squares solution to a linear matrix equation?

Consider a system of linear equations AX = B, where A is the coefficient matrix, X is the matrix of unknown variables, and B is the matrix of dependent (right-hand side) values, referred to in this article as the coordinate matrix. If the system is inconsistent, that is, no X satisfies every equation exactly, an approximate solution has to be determined instead.

This approximation of the unknown variables can be obtained using the least-squares method. The solutions obtained this way are often called “best-fit” solutions.

The least-squares solution minimizes the sum of the squared differences between the entries of A\hat{x} and B, that is, the quantity \|B - A\hat{x}\|^2, where \hat{x} is the approximate value of the unknown matrix.
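To make this definition concrete, here is a minimal sketch. The matrix and vector values are made up for illustration, and the helper name squared_error is hypothetical. It computes the minimizer from the normal equations A^T A x = A^T b, which is valid here because A has full column rank, and then checks that nudging the solution in a few directions only increases the sum of squared differences.

```python
import numpy as np

# Illustrative overdetermined system: 3 equations, 2 unknowns (values are assumptions).
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

# Normal equations: (A^T A) x = A^T b. This works because A has full column rank.
x_hat = np.linalg.solve(A.T @ A, A.T @ b)

def squared_error(x):
    """Sum of squared differences between the entries of A @ x and b."""
    r = A @ x - b
    return float(r @ r)

best = squared_error(x_hat)

# Evaluate the error at a few points slightly away from x_hat.
perturbed = [squared_error(x_hat + d) for d in (np.array([0.1, 0.0]),
                                                np.array([0.0, -0.1]),
                                                np.array([0.05, 0.05]))]

print("x_hat =", x_hat)
print("squared error at x_hat =", best)
print("squared errors nearby  =", perturbed)
```

Every nearby error is larger than the error at x_hat, which is what "best fit" means in this context.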

The numpy linalg.lstsq() function

NumPy (Numerical Python) provides a wide range of functions that automate numerical and scientific calculations on arrays and matrices.

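One such function calculates the least-squares solution to a system of linear equations of the form Ax = B. It is most useful when the system is inconsistent, but it also accepts under-determined systems and systems where A is square and of full rank, in which case the result is the exact solution up to round-off error. This function is called the linalg.lstsq() function.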

In order to use this function, NumPy must be installed on your system. If you don’t already have it, open a terminal or command prompt and run the following command:

pip install numpy

Sometimes you might face an error called “could not build wheels for numpy” while installing NumPy. It is usually caused by a mismatch between the version of Python installed on your system and the version of NumPy being installed.

Syntax and parameters of the linalg.lstsq() function

The syntax of the linalg.lstsq() function in python is as follows:

linalg.lstsq(A, B, rcond='warn')

The parameters of the function are:

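A: (array_like) : The coefficient matrix, of shape (M, N).
B: (array_like) : The coordinate matrix, that is, the dependent (right-hand side) values, of shape (M,) or (M, K). If this matrix is 2-dimensional, a least-squares solution is calculated for each of its K columns.
rcond: (float, optional) : The cut-off ratio for small singular values of A. Singular values smaller than rcond times the largest singular value of A are treated as zero when the rank is determined. In NumPy versions before 2.0, this parameter raises a FutureWarning if it is not specified, so we pass rcond=None to suppress the warning.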
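The sketch below illustrates two of the parameter behaviours described above. All numeric values are assumptions chosen for illustration. The first part passes a 2-dimensional B and compares the result with solving each column separately. The second part uses a nearly rank-deficient matrix to show how a larger rcond changes the rank that the function reports.

```python
import numpy as np

# Coefficient matrix with 3 equations and 2 unknowns.
A = np.array([[1.0, 2.0],
              [-1.0, 1.0],
              [0.0, 3.0]])

# Two right-hand sides stored as the columns of a 2-D array (illustrative values).
B = np.array([[1.0, 2.0],
              [3.0, 0.0],
              [0.0, 1.0]])

# One call solves both systems; column k of X belongs to column k of B.
X = np.linalg.lstsq(A, B, rcond=None)[0]
x_first = np.linalg.lstsq(A, B[:, 0], rcond=None)[0]
x_second = np.linalg.lstsq(A, B[:, 1], rcond=None)[0]
print("X has shape", X.shape)
print("columns match separate solves:",
      np.allclose(X[:, 0], x_first) and np.allclose(X[:, 1], x_second))

# A nearly rank-deficient matrix: the two columns differ by 1e-8 in one entry.
A_near = np.array([[1.0, 1.0],
                   [1.0, 1.0 + 1e-8],
                   [1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])

# Index 2 of the returned tuple is the rank of the coefficient matrix.
rank_default = np.linalg.lstsq(A_near, b, rcond=None)[2]
rank_loose = np.linalg.lstsq(A_near, b, rcond=1e-6)[2]
print("rank with rcond=None:", rank_default)
print("rank with rcond=1e-6:", rank_loose)
```

With rcond=1e-6 the tiny second singular value falls below the cut-off, so the matrix is treated as having rank 1 instead of 2.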

The function returns four values. They are as follows:

R: (ndarray) : The least-squares solution of the system of linear equations. If B is 2-dimensional, each column of R is the solution for the corresponding column of B.
residuals: (ndarray) : The sum of the squared residuals for each column of B. This array is empty if the rank of A is less than N or if M <= N.
RANK: (int) : The rank of the coefficient matrix A.
sing: (ndarray) : The singular values of the coefficient matrix A.

The function raises LinAlgError if the computation does not converge.
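As a rough illustration of the return values, the sketch below runs the function on three small systems. The matrices are made-up examples. It shows that residuals holds a single number for an overdetermined full-rank system, but comes back as an empty array when the system is square or when A is rank-deficient.

```python
import numpy as np

# Case 1: overdetermined (3 x 2) with full column rank.
A_tall = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])
b_tall = np.array([1.0, 1.0, 0.0])
x_tall, res_tall, rank_tall, sing_tall = np.linalg.lstsq(A_tall, b_tall, rcond=None)

# Case 2: square (2 x 2) system, so M <= N holds.
A_square = np.array([[2.0, 1.0],
                     [1.0, 3.0]])
b_square = np.array([1.0, 2.0])
x_square, res_square, rank_square, sing_square = np.linalg.lstsq(A_square, b_square, rcond=None)

# Case 3: rank-deficient (second column is twice the first).
A_def = np.array([[1.0, 2.0],
                  [2.0, 4.0],
                  [3.0, 6.0]])
b_def = np.array([1.0, 2.0, 2.0])
x_def, res_def, rank_def, sing_def = np.linalg.lstsq(A_def, b_def, rcond=None)

print("tall:   residuals =", res_tall, "rank =", rank_tall)
print("square: residuals =", res_square, "rank =", rank_square)
print("defic.: residuals =", res_def, "rank =", rank_def)
```

When residuals is empty, the squared error can still be computed by hand as np.sum((b - A @ x) ** 2).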


Examples of using numpy linalg.lstsq()

Let’s now look at some examples of using the function and how you can apply it in your own code.

Example 1: Using the numpy linalg.lstsq() function for a system of linear equations in 2 variables

Let’s take a look at one of the simplest applications of the mentioned function by running the following code in order to find out the approximate solution to some given system of linear equations.

# importing the required module
import numpy as np
# assigning the arrays
# the coefficient matrix
A = np.array([[1, 2],
              [-1, 1],
              [0, 3]])
# the coordinate matrix
B = np.array([1, 3, 0])
# displaying the original matrices
print("The coefficient matrix is =", A)
print("The coordinate matrix is =", B)
# calculating the least-squares solution
R, residuals, RANK, sing = np.linalg.lstsq(A, B, rcond=None)
# displaying the results
print("The least-squares solution is =", R)
print("The residuals are =", residuals)
print("The singular values of the coefficient matrix are =", sing)
print("The rank of the coefficient matrix is =", RANK)

The output of the above block of code would be:

The coefficient matrix is = [[ 1  2]
 [-1  1]
 [ 0  3]]
The coordinate matrix is = [1 3 0]
The least-squares solution is = [-1.22222222  0.44444444]
The residuals are = [5.33333333]
The singular values of the coefficient matrix are = [3.7527007  1.38464344]
The rank of the coefficient matrix is = 2

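The values returned in Example 1 can be cross-checked with other NumPy routines. The following is only a sanity-check sketch. The variable names prefixed with manual_ are introduced here for illustration and are not part of the original example.

```python
import numpy as np

# Same matrices as in Example 1.
A = np.array([[1, 2],
              [-1, 1],
              [0, 3]])
B = np.array([1, 3, 0])

R, residuals, RANK, sing = np.linalg.lstsq(A, B, rcond=None)

# Sum of squared residuals recomputed directly from B - A @ R.
manual_residual = np.sum((B - A @ R) ** 2)

# Rank and singular values recomputed with separate NumPy functions.
manual_rank = np.linalg.matrix_rank(A)
manual_sing = np.linalg.svd(A, compute_uv=False)

print("residual matches:", np.isclose(manual_residual, residuals[0]))
print("rank matches:", manual_rank == RANK)
print("singular values match:", np.allclose(manual_sing, sing))
```

All three checks should print True, confirming that residuals is the squared length of B - A @ R and that RANK and sing describe the coefficient matrix A.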

Example 2: Taking user input for a system of linear equations and calculating the least-squares solution

Now, let’s modify the code to accommodate user input in order to make it more useful for individual users.

# importing the required module
import numpy as np
# taking user input for the matrix dimensions
Row = int(input("Enter the number of rows for all the matrices:"))
# initializing the two matrices
A, B = [], []
# for A
Col = int(input("Enter the number of columns for A, the coefficient matrix:"))
print("Enter the entries for A row-wise:")
# reading the entries of A
for i in range(Row):  # loop over the rows
    en = []
    for j in range(Col):  # loop over the columns
        en.append(int(input()))
    A.append(en)
print("the matrix A is =")
print(A)
# for B
print("since B is a single-column matrix we don't need any extra input.")
print("Enter the entries for B row-wise:")
col1 = 1  # B has a single column
# reading the entries of B
for i in range(Row):  # loop over the rows
    en1 = []
    for j in range(col1):  # loop over the columns
        en1.append(int(input()))
    B.append(en1)
print("the matrix B is =")
print(B)
# calculating the least-squares solution
R, residuals, RANK, sing = np.linalg.lstsq(A, B, rcond=None)
# displaying the results
print("The least-squares solution is =", R)
print("The residuals are =", residuals)
print("The singular values of the coefficient matrix are =", sing)
print("The rank of the coefficient matrix is =", RANK)

The user input along with the output for the above code is given below:

Enter the number of rows for all the matrices:2
Enter the number of columns for A, the coefficient matrix:2
Enter the entries for A row-wise:
14
-3
-3
21
the matrix A is =
[[14, -3], [-3, 21]]
since B is a single-column matrix we don't need any extra input.
Enter the entries for B row-wise:
1
10
the matrix B is =
[[1], [10]]
The least-squares solution is = [[0.17894737]
 [0.50175439]]
The residuals are = []
The singular values of the coefficient matrix are = [22.10977223 12.89022777]
The rank of the coefficient matrix is = 2

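Example 2 uses a square system, which is why the residuals array comes back empty. The same call applied to an overdetermined system is a common way to fit a straight line y = m*x + c to data points, one of the curve-fitting uses mentioned in the introduction. The sketch below assumes four made-up sample points. The names x, y, slope and intercept are illustrative.

```python
import numpy as np

# Made-up sample points that lie roughly on a straight line.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([-1.0, 0.2, 0.9, 2.1])

# Each row of A is [x_i, 1], so that A @ [m, c] = m * x_i + c.
A = np.column_stack([x, np.ones_like(x)])

# Four equations, two unknowns: solve for the slope and intercept.
(slope, intercept), residuals, rank, sing = np.linalg.lstsq(A, y, rcond=None)

print("slope =", slope)
print("intercept =", intercept)
print("sum of squared residuals =", residuals)
```

For these points the fitted line has a slope of about 1.0 and an intercept of about -0.95, and residuals contains one value because the system has more equations than unknowns.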

Conclusion

In this article, we have seen how inconsistent systems of linear equations can be solved approximately using the least-squares method. The NumPy module provides the linalg.lstsq() function, which makes the calculations associated with this method far less daunting. To know more about the numpy.linalg.lstsq() function, see the official NumPy documentation.