Vectorization in Python

In this article, we’ll learn about vectorization. Many modern systems process large amounts of data, and doing so with plain Python loops can be slow compared to languages like C/C++. This is where vectorization comes into play. In this tutorial, we will vectorize operations on NumPy arrays and compare execution times to see how much faster the vectorized versions run.

Vectorization is a technique for implementing array operations without explicit for loops. Instead, we use highly optimized functions provided by libraries such as NumPy, which reduces the running time of the code. Vectorized array operations are usually much faster than their pure Python equivalents, with the biggest impact in numerical computations.

Python for loops are slower than their C/C++ counterparts. Python is an interpreted, dynamically typed language: every iteration of a loop pays for type checks, dispatch, and object creation, and the interpreter applies few of the optimizations that a C/C++ compiler would. NumPy stores array data in contiguous, typed memory and implements its array operations in C, so a vectorized operation runs its loop in compiled code.
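To make the idea concrete, here is a minimal sketch showing that a vectorized expression produces the same result as an explicit loop. The array contents and variable names are illustrative only.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5])

# Explicit loop: one interpreted iteration per element
squares_loop = np.empty_like(values)
for i in range(len(values)):
    squares_loop[i] = values[i] ** 2

# Vectorized: a single expression; the loop runs inside NumPy's compiled code
squares_vectorized = values ** 2

# Both approaches give the same array
print(np.array_equal(squares_loop, squares_vectorized))  # True
```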

Vectorized Operations using NumPy

1. Add/Subtract/Multiply/Divide by Scalar

Adding, subtracting, multiplying, or dividing an array by a scalar produces an array of the same dimensions, with the operation applied to every element. We write the operation just as we would with ordinary variables. The code is both shorter and faster than a for-loop implementation.

To measure execution time, we use the Timer class from the timeit module. It takes the statement (or callable) to execute, and its timeit() method takes the number of times to run it and returns the total elapsed time in seconds. Note that the measured times vary from run to run and depend on the hardware and other factors.
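As a small sketch of the Timer API on its own, the snippet below times a hypothetical workload (build_squares is made up for this illustration). It also uses Timer.repeat(), which is not used elsewhere in this article, to take several measurements and keep the smallest, since a single run can be noisy.

```python
from timeit import Timer

def build_squares():
    # Hypothetical workload, used only so there is something to time
    return [n * n for n in range(1000)]

timer = Timer(build_squares)

# timeit(number) returns the TOTAL seconds taken by `number` calls
total_seconds = timer.timeit(number=100)

# repeat(repeat, number) returns a list with one total per repetition
runs = timer.repeat(repeat=5, number=100)
best_per_call = min(runs) / 100

print("Total for 100 calls: %0.9f seconds" % total_seconds)
print("Best time per call:  %0.9f seconds" % best_per_call)
```

The examples in this article call timeit(1), which measures a single run; larger values of number give steadier figures at the cost of a longer benchmark.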

import numpy as np
from timeit import Timer

# Creating a large array of size 10**6
array = np.random.randint(1000, size=10**6)

# Function that adds 1 to each element using a Python-level loop
def add_forloop():
  new_array = [element + 1 for element in array]

# Function that adds 1 to each element using vectorization
def add_vectorized():
  new_array = array + 1

# Finding execution time using timeit
computation_time_forloop = Timer(add_forloop).timeit(1)
computation_time_vectorized = Timer(add_vectorized).timeit(1)

print("Computation time is %0.9f using for-loop"%computation_time_forloop)
print("Computation time is %0.9f using vectorization"%computation_time_vectorized)

Computation time is 0.001202600 using for-loop
Computation time is 0.000236700 using vectorization
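The timing example above only shows addition. The following sketch, on a tiny illustrative array, applies all four scalar operations so the results are easy to check by hand. One detail to keep in mind: / performs true division and returns a float array, while // performs floor division.

```python
import numpy as np

array = np.array([10, 20, 30, 40])

added = array + 5          # 15, 25, 35, 45
subtracted = array - 5     # 5, 15, 25, 35
multiplied = array * 2     # 20, 40, 60, 80
divided = array / 4        # true division: 2.5, 5.0, 7.5, 10.0 (float array)
floor_divided = array // 4 # floor division: 2, 5, 7, 10 (integer array)

# The result always has the same shape as the input array
print(added.shape == array.shape)  # True
```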

2. Sum and Max of array

To find the sum and the maximum element of an array, we can use a for loop or Python's built-in functions sum() and max(). Let's compare both approaches with the corresponding NumPy operations.

import numpy as np
from timeit import Timer

# Creating a large array of size 10**5
array = np.random.randint(1000, size=10**5)

def sum_using_forloop():
  sum_array=0
  for element in array:
    sum_array += element

def sum_using_builtin_method():
  sum_array = sum(array)

def sum_using_numpy():
  sum_array = np.sum(array)

time_forloop = Timer(sum_using_forloop).timeit(1)
time_builtin = Timer(sum_using_builtin_method).timeit(1)
time_numpy = Timer(sum_using_numpy).timeit(1)

print("Summing elements takes %0.9f units using for loop"%time_forloop)
print("Summing elements takes %0.9f units using builtin method"%time_builtin)
print("Summing elements takes %0.9f units using numpy"%time_numpy)

print()

def max_using_forloop():
  maximum=array[0]
  for element in array:
    if element > maximum:
      maximum = element

def max_using_builtin_method():
  maximum = max(array)

def max_using_numpy():
  maximum = np.max(array)

time_forloop = Timer(max_using_forloop).timeit(1)
time_builtin = Timer(max_using_builtin_method).timeit(1)
time_numpy = Timer(max_using_numpy).timeit(1)

print("Finding maximum element takes %0.9f units using for loop"%time_forloop)
print("Finding maximum element takes %0.9f units using builtin method"%time_builtin)
print("Finding maximum element takes %0.9f units using numpy"%time_numpy)

Summing elements takes 0.069638600 units using for loop
Summing elements takes 0.044852800 units using builtin method
Summing elements takes 0.000202500 units using numpy

Finding maximum element takes 0.034151200 units using for loop
Finding maximum element takes 0.029331300 units using builtin method
Finding maximum element takes 0.000242700 units using numpy

Here we can see that the NumPy operations are far faster than the built-in functions, which in turn are faster than explicit for loops.
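Speed only matters if the answers agree. This short sketch, on an illustrative five-element array, checks that the built-in functions and the NumPy functions return the same values. It also shows the equivalent array methods sum() and max().

```python
import numpy as np

array = np.array([3, 7, 1, 9, 4])

# Built-in functions iterate over the array element by element
total_builtin = sum(array)
maximum_builtin = max(array)

# NumPy functions run the reduction in compiled code
total_numpy = np.sum(array)
maximum_numpy = np.max(array)

# The same reductions are available as array methods
total_method = array.sum()
maximum_method = array.max()

print(total_builtin == total_numpy == total_method)        # True
print(maximum_builtin == maximum_numpy == maximum_method)  # True
```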

3. Dot product

Also known as the inner product, the dot product is an algebraic operation that takes two vectors of the same length and returns a single scalar. It is calculated as the sum of the element-wise products of the two vectors. In matrix terms, given two matrices a and b of size n x 1, the dot product is the matrix product of a^T (the transpose of a) and b.
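As a worked example of this definition, take a = (1, 2, 3) and b = (4, 5, 6): the dot product is 1*4 + 2*5 + 3*6 = 32. The sketch below computes it in three ways: as a sum of element-wise products, with np.dot, and as a^T times b using n x 1 column matrices. The variable names are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Definition: sum of the element-wise products
dot_by_definition = np.sum(a * b)

# NumPy's dot product of two 1-D vectors
dot_numpy = np.dot(a, b)

# Matrix view: reshape to n x 1 columns, then compute a^T times b
a_column = a.reshape(-1, 1)
b_column = b.reshape(-1, 1)
dot_matrix = a_column.T @ b_column  # a 1 x 1 matrix holding the scalar

print(dot_by_definition, dot_numpy, dot_matrix.item())  # 32 32 32
```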

In NumPy, we use the dot() function to find the dot product of two vectors, as shown below.

import numpy as np
from timeit import Timer

# Create 2 vectors of same length
length = 100000
vector1 = np.random.randint(1000, size=length)
vector2 = np.random.randint(1000, size=length)

# Finds dot product of vectors using for loop
def dotproduct_forloop():
  dot = 0.0
  for i in range(length):
    dot += vector1[i] * vector2[i]

# Finds dot product of vectors using numpy vectorization
def dotproduct_vectorize():
  dot = np.dot(vector1, vector2)
  

# Finding execution time using timeit
time_forloop = Timer(dotproduct_forloop).timeit(1)
time_vectorize = Timer(dotproduct_vectorize).timeit(1)

print("Finding dot product takes %0.9f units using for loop"%time_forloop)
print("Finding dot product takes %0.9f units using vectorization"%time_vectorize)

Finding dot product takes 0.155011500 units using for loop
Finding dot product takes 0.000219400 units using vectorization

4. Outer Product

The outer product of two vectors produces a rectangular matrix. Given two vectors a and b of size n x 1 and m x 1, their outer product is a matrix of size n x m.
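A small worked example may help with the shapes. With a of length 3 and b of length 2, the outer product is a 3 x 2 matrix whose entry (i, j) is a[i] * b[j]. The sketch below also shows an equivalent broadcasting expression, which is an alternative formulation and not something np.outer requires.

```python
import numpy as np

a = np.array([1, 2, 3])   # n = 3
b = np.array([10, 20])    # m = 2

outer_product = np.outer(a, b)
print(outer_product.shape)  # (3, 2)

# Equivalent formulation: a as a column (3 x 1) times b as a row (2,),
# broadcast to a 3 x 2 result
outer_broadcast = a[:, np.newaxis] * b

print(np.array_equal(outer_product, outer_broadcast))  # True
```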

In NumPy, we use the outer() function to find the outer product of two vectors, as shown below.

import numpy as np
from timeit import Timer

# Create 2 vectors of different lengths
length1 = 1000
length2 = 500
vector1 = np.random.randint(1000, size=length1)
vector2 = np.random.randint(1000, size=length2)

# Finds outer product of vectors using for loop
def outerproduct_forloop():
  outer_product = np.zeros((length1, length2), dtype='int')
  for i in range(length1):
    for j in range(length2):
      outer_product[i, j] = vector1[i] * vector2[j]

# Finds outer product of vectors using numpy vectorization
def outerproduct_vectorize():
  outer_product = np.outer(vector1, vector2)
  
# Finding execution time using timeit
time_forloop = Timer(outerproduct_forloop).timeit(1)
time_vectorize = Timer(outerproduct_vectorize).timeit(1)

print("Finding outer product takes %0.9f units using for loop"%time_forloop)
print("Finding outer product takes %0.9f units using vectorization"%time_vectorize)

Finding outer product takes 0.626915200 units using for loop
Finding outer product takes 0.002191900 units using vectorization

5. Matrix Multiplication

Matrix multiplication is an algebraic operation in which each row of the first matrix is multiplied element by element with each column of the second matrix, and the products are summed. For two matrices of dimensions p x q and r x s, the product is defined only if q == r. The resulting matrix has dimensions p x s.
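The shape rule can be checked on a small example. In this sketch a 2 x 3 matrix is multiplied by a 3 x 2 matrix, giving a 2 x 2 result, and a product with mismatched inner dimensions is rejected with a ValueError. The matrices are illustrative, and the @ operator is shown as an equivalent spelling of np.matmul for 2-D arrays.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])      # p x q = 2 x 3
b = np.array([[7, 8],
              [9, 10],
              [11, 12]])       # r x s = 3 x 2, so q == r holds

product = np.matmul(a, b)      # result is p x s = 2 x 2
print(product.shape)           # (2, 2)

# The @ operator gives the same result
print(np.array_equal(product, a @ b))  # True

# Mismatched inner dimensions: (2 x 3) times (2 x 3) is not defined
try:
    np.matmul(a, a)
    shapes_rejected = False
except ValueError:
    shapes_rejected = True
print(shapes_rejected)         # True
```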

Matrix multiplication is widely used in mathematical models, for example in machine learning. It is a computationally costly operation, so systems that rely on it need a fast implementation. In NumPy, we use the matmul() function to multiply two matrices, as shown below.

import numpy as np
from timeit import Timer

# Create 2 matrices with compatible dimensions: (n x k) and (k x m)
n = 100
k = 50
m = 70
matrix1 = np.random.randint(1000, size=(n, k))
matrix2 = np.random.randint(1000, size=(k, m))

# Multiply 2 matrices using for loop
def matrixmultiply_forloop():
  product = np.zeros((n, m), dtype='int')
  for i in range(n):
    for j in range(m):
      for z in range(k):
        product[i, j] += matrix1[i, z] * matrix2[z, j]

# Multiply 2 matrices using numpy vectorization
def matrixmultiply_vectorize():
  product = np.matmul(matrix1, matrix2)
  
# Finding execution time using timeit
time_forloop = Timer(matrixmultiply_forloop).timeit(1)
time_vectorize = Timer(matrixmultiply_vectorize).timeit(1)

print("Multiplying matrices takes %0.9f units using for loop"%time_forloop)
print("Multiplying matrices takes %0.9f units using vectorization"%time_vectorize)

Multiplying matrices takes 0.777318300 units using for loop
Multiplying matrices takes 0.000984900 units using vectorization

6. Element-Wise Product of Matrices

The element-wise product of two matrices is the algebraic operation in which each element of the first matrix is multiplied by the corresponding element of the second matrix. The two matrices should have the same dimensions.
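The element-wise product is easy to confuse with matrix multiplication, so here is a short sketch on two illustrative 2 x 2 matrices that shows both. The * operator and np.multiply give the element-wise product, while @ gives the matrix product.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

# Element-wise: entry (i, j) is a[i, j] * b[i, j]
elementwise = a * b            # [[5, 12], [21, 32]]

# np.multiply is the function form of the * operator
elementwise_function = np.multiply(a, b)

# Matrix multiplication is a different operation
matrix_product = a @ b         # [[19, 22], [43, 50]]

print(np.array_equal(elementwise, elementwise_function))  # True
print(np.array_equal(elementwise, matrix_product))        # False
```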

In NumPy, we use the * operator to find the element-wise product of two matrices, as shown below.

import numpy as np
from timeit import Timer

# Create 2 matrices of the same dimensions
n = 500
m = 700
matrix1 = np.random.randint(1000, size=(n, m))
matrix2 = np.random.randint(1000, size=(n, m))

# Multiply 2 matrices using for loop
def multiplication_forloop():
  product = np.zeros((n, m), dtype='int')
  for i in range(n):
    for j in range(m):
      product[i, j] = matrix1[i, j] * matrix2[i, j]

# Multiply 2 matrices using numpy vectorization
def multiplication_vectorize():
  product = matrix1 * matrix2

# Finding execution time using timeit
time_forloop = Timer(multiplication_forloop).timeit(1)
time_vectorize = Timer(multiplication_vectorize).timeit(1)

print("Element Wise Multiplication takes %0.9f units using for loop"%time_forloop)
print("Element Wise Multiplication takes %0.9f units using vectorization"%time_vectorize)

Element Wise Multiplication takes 0.543777400 units using for loop
Element Wise Multiplication takes 0.001439500 units using vectorization

Conclusion

Vectorization is widely used in complex systems and mathematical models because it gives faster execution and shorter code. Now that you know how to use vectorization in Python, you can apply it to make your projects run faster. Congratulations!

Thanks for reading!