In this article, we look at a Python tool that can improve the performance of computation-intensive code: the delayed() function provided by the joblib library. This function plays a key role in parallel execution by letting us describe function calls that joblib can then run simultaneously. We’ll walk through how delayed() works and how it can make your Python code more efficient.
The joblib.delayed() function creates lazy (deferred) function calls: instead of running a function immediately, it records the function and its arguments so the call can be executed later. It is most commonly used with the Parallel class to distribute computations across multiple CPU cores or, with a suitable backend, multiple machines. The benefits are most evident with large data sets and resource-demanding functions, where running tasks in parallel can significantly reduce execution time.
An Introduction to Python’s joblib Library
You might have heard that machine learning models are trained on large datasets to give effective results. Completing such computationally intensive tasks quickly is not easy. Joblib helps by running such tasks in parallel. It provides a set of tools for performing operations in parallel on large data sets and for caching the results of time- or resource-consuming functions.
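As a rough illustration of the caching side, the sketch below uses joblib's Memory class to store a function's results on disk. The function name slow_double, the calls list, and the temporary cache directory are assumptions made for this example only.

```python
import tempfile
import time

from joblib import Memory

# Hypothetical cache location: a throwaway temporary directory
cache_dir = tempfile.mkdtemp()
memory = Memory(location=cache_dir, verbose=0)

calls = []  # records every real execution of the function body


@memory.cache
def slow_double(x):
    calls.append(x)   # side effect, only happens on a cache miss
    time.sleep(0.5)   # stand-in for an expensive computation
    return 2 * x


first = slow_double(21)   # computed and written to the cache
second = slow_double(21)  # same argument: loaded from the cache instead
```

The second call returns the stored result without running the function body again, which is why calls ends up with a single entry.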
Joblib also allows you to save Python objects to disk, including trained machine learning or NLP models, so that you can resume your work later or on a different machine.
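A minimal sketch of this persistence feature, using joblib.dump() and joblib.load(), might look like the following. The dictionary stands in for a trained model, and the file name model_state.joblib is a hypothetical choice for illustration.

```python
import os
import tempfile

import joblib

# Stand-in for a trained model: any picklable Python object works
model_state = {"weights": [0.1, 0.2, 0.3], "bias": 0.5}

# Hypothetical file name inside a temporary directory
path = os.path.join(tempfile.mkdtemp(), "model_state.joblib")

joblib.dump(model_state, path)  # write the object to disk
restored = joblib.load(path)    # read it back, possibly in another session
```

Because these files are based on pickle, it is safest to load only files that come from a source you trust.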
Exploring the joblib’s delayed() Function
Definition: In Python’s joblib library, the delayed() function is used to create a lazy or deferred function call. It is commonly used in conjunction with the Parallel class to parallelize computations across multiple CPU cores or machines.
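To make "deferred" more concrete, the sketch below inspects what delayed() hands back. The add function is a hypothetical helper, and the three-part unpacking reflects how current joblib versions represent a delayed call, which is best treated as an implementation detail rather than a guaranteed interface.

```python
from joblib import delayed


def add(a, b=0):
    return a + b


# Nothing is computed here: delayed() only records the call
task = delayed(add)(2, b=3)

# In current joblib versions the recorded call unpacks into three parts
func, args, kwargs = task

result = func(*args, **kwargs)  # run the recorded call by hand
```

In normal use you would not unpack the task yourself; Parallel does this for each task it receives.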
The definition alone may be confusing, so let’s explain the usage with an example. Say we have a time-consuming function square():
import time

def square(x):
    time.sleep(1)  # simulate a time-consuming task that takes 1 second
    return x ** 2
The function simply calculates the square of a number but pauses for 1 second every time it is called. Calling it sequentially for 100 numbers would therefore take approximately 100 seconds. Is there any way to reduce that?
Speeding Up Computations with joblib.delayed()
We will start by installing the joblib library:
pip install joblib
Once the installation is done, create a Python file and start by importing the necessary functions:
from joblib import Parallel, delayed
import time
We will now define a list of numbers and apply the square() function to each element of the list, once without delayed() and once with it.
# List of numbers
numbers = [1, 2, 3, 4, 5]

# Without using delayed
start = time.time()
results_no_delayed = [square(number) for number in numbers]
end = time.time()
time_no_delayed = end - start

# Using delayed
start = time.time()
delayed_calls = [delayed(square)(number) for number in numbers]
results_delayed = Parallel(n_jobs=-1)(delayed_calls)
end = time.time()
time_delayed = end - start
numbers is a simple list of the numbers 1 to 5.
With the time library, we measure how long each of the two cases takes to execute.
time.time() returns the current time in seconds as a floating-point number.
Parallel is a class of the joblib library that executes function calls in parallel. Note that it expects an iterable of calls created with delayed(), not the results of functions that have already been called.
n_jobs=-1 is a common usage pattern that instructs Parallel to use all available CPU cores on the system.
[square(number) for number in numbers] simply calls square() on each element and returns the results in a list.
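The delayed calls do not have to be collected in a list first. As a small sketch, the variant below passes a generator expression straight to Parallel, limits the number of workers with n_jobs=2, and forwards a keyword argument through delayed(). The power function and its exponent parameter are hypothetical names chosen for this example.

```python
import time

from joblib import Parallel, delayed


def power(base, exponent=2):
    time.sleep(0.1)  # stand-in for real work
    return base ** exponent


numbers = [1, 2, 3, 4, 5]

# The generator is consumed by Parallel; n_jobs=2 uses at most two workers
cubes = Parallel(n_jobs=2)(delayed(power)(n, exponent=3) for n in numbers)
```

The results come back as a list in the same order as the inputs, even though the calls may finish in a different order.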
Tying It All Together: A Complete Example
Finally, we print our variables to see the difference. The complete code, ready to execute, is:
from joblib import Parallel, delayed
import time

def square(x):
    time.sleep(1)  # Simulating a time-consuming task
    return x ** 2

# List of numbers
numbers = [1, 2, 3, 4, 5]

# Without using delayed
start = time.time()
results_no_delayed = [square(number) for number in numbers]
end = time.time()
time_no_delayed = end - start

# Using delayed
start = time.time()
delayed_calls = [delayed(square)(number) for number in numbers]
results_delayed = Parallel(n_jobs=-1)(delayed_calls)
end = time.time()
time_delayed = end - start

print("Results without delayed:", results_no_delayed)
print("Results with delayed: ", results_delayed)
print("Time without delayed: ", time_no_delayed)
print("Time with delayed: ", time_delayed)

Results without delayed: [1, 4, 9, 16, 25]
Results with delayed: [1, 4, 9, 16, 25]
Time without delayed: 5.013111114501953
Time with delayed: 4.319849491119385
By using parallel computation, we’ve managed to shave about 0.7 seconds, roughly 14%, off our computation time in this run. The exact gain depends on how many CPU cores are available and on the overhead of starting worker processes, so larger workloads typically benefit more than this small five-task example. Parallel computation, enabled by joblib.delayed() together with Parallel, can make our Python programs faster and more efficient. Are you ready to dive deeper and discover more ways to speed up your Python programs?
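To get a feel for how much speedup to expect on a given machine, a back-of-the-envelope estimate can help. The sketch below assumes that tasks run in waves of one task per worker and ignores all startup and communication overhead, so real timings will be higher. The helper ideal_parallel_time is a hypothetical function written for this illustration, not part of joblib.

```python
import math

from joblib import cpu_count


def ideal_parallel_time(n_tasks, n_workers, seconds_per_task):
    """Best-case wall time when tasks run in waves of n_workers, ignoring overhead."""
    waves = math.ceil(n_tasks / n_workers)
    return waves * seconds_per_task


workers = cpu_count()  # number of CPUs joblib detects on this machine
estimate = ideal_parallel_time(n_tasks=5, n_workers=workers, seconds_per_task=1.0)
print(f"{workers} workers -> at best about {estimate:.1f} s for five 1-second tasks")
```

With two workers, for example, five one-second tasks need three waves, so the best case is about 3 seconds instead of 5.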
Further Reading: Explore More Concepts