Today I was working on a project where I had to run a for loop over a huge variable. Since that variable was so large, the for loop took an enormous amount of time to complete. So I decided to parallelize the for loop using multiprocessing in Python.
Before we start, if you are new to the concept of multiprocessing, I recommend reading this article: Python Multithreading vs Multiprocessing.
Course for You: Concurrent and Parallel Programming in Python
How to use multiprocessing for a Python for loop?
Normal for loop in Python
For a better explanation, let me first write a simple for loop example.
```python
import datetime

# Store current time before execution
start_time = datetime.datetime.now()

# Run for loop
for i in range(100000000):
    j = i ** 2

# Store current time after execution
end_time = datetime.datetime.now()

# Print for loop execution time
print('Time taken: ', end_time - start_time)
```
Time taken: 0:00:29.411523
In the above Python code, I am squaring each of 100 million numbers (100000000), which means the for loop runs 100 million times. The code takes around 29 seconds to complete.
for loop with Function
Now, can we parallelize this for loop so it finishes faster? Yes, we can, using Python's multiprocessing module. To do this, we first need to convert our task into a function.
In our example, the task is to square a number. So let's convert that task into a function and call it inside the for loop. Below is the code.
```python
import datetime

# Define function to square a number
def square_func(value):
    return value ** 2

# Store current time before execution
start_time = datetime.datetime.now()

# Run for loop
for i in range(100000000):
    square_func(i)

# Store current time after execution
end_time = datetime.datetime.now()

# Print for loop execution time
print('Time taken: ', end_time - start_time)
```
Time taken: 0:00:33.600378
In the code above, I converted our task (squaring a number) into a function, then called that function from the same for loop, again 100 million times.
This version takes around 33 seconds, slightly slower than before because of the function-call overhead. If you check Task Manager while the code runs, you will notice that only about 16% of the processor is being used (the exact figure will vary; I am using an Intel i7 processor).
Multiprocessing for loop in Python
Now we can convert our for loop into parallel code using Python multiprocessing. The idea is to run the function in parallel on different CPU cores, so that we fully utilize the CPU and finish execution faster.
```python
import os
import datetime
from multiprocessing import Pool

# Define function to square a number
def square_func(value):
    return value ** 2

if __name__ == '__main__':
    # Store current time before execution
    start_time = datetime.datetime.now()

    # Create a pool to use all CPU cores
    pool = Pool(processes=os.cpu_count())
    pool.map(square_func, range(100000000))

    # Close the process pool
    pool.close()
    pool.join()

    # Store current time after execution
    end_time = datetime.datetime.now()

    # Print multiprocessing execution time
    print('Time taken: ', end_time - start_time)
```

Note that the timing and printing code now also sits inside the `if __name__ == '__main__':` guard; if it were left at module level, every spawned worker process would re-run it and print its own (meaningless) "Time taken" line.
Time taken: 0:00:18.954894
As you can see in the output above, after converting our for loop to multiprocessing, the execution time dropped to roughly half of the normal for loop's time (about 19 seconds versus 33).
In the code above, notice the if __name__ == '__main__': guard. It is best practice in Python to put all your multiprocessing code inside this guard: on platforms that spawn worker processes (such as Windows), each child re-imports the main module, and the guard prevents the pool-creation code from running again in every child.
Next, we create the process pool. This pool will use all available CPU cores: the number of worker processes is set to the number of CPU cores on the system using os.cpu_count(). In my case the number of cores was 12. If you do not want to fully utilize your CPU, you can pass a smaller number such as 2, 3, or 4 to Pool() instead of os.cpu_count().
Then the map method of the pool object distributes the square_func function across the range of numbers from 0 to 99,999,999 (100 million) in parallel, splitting the numbers among the worker processes so each one squares its share.
After all the work is done, we close the pool to free up system resources.
If we run the above code, the CPU now gets fully (100%) utilized. Below is a Task Manager screenshot for your reference.
In this tutorial, I showed you how to parallelize a for loop in Python using multiprocessing, with sample code demonstrating the speedup.
Our multiprocessing code took almost half the execution time of the normal for loop, and in real-world scenarios with heavier per-item work you may see an even larger reduction. It is always a good idea to consider multiprocessing when dealing with large for loops in Python.
While implementing multiprocessing, you may sometimes get a MemoryError. This happens when your multiprocessing code tries to allocate a huge amount of memory (RAM) across its processes. One way to mitigate this is the maxtasksperchild parameter, which restarts each worker process after a fixed number of tasks so that any accumulated memory is released. Below is an example code snippet.
```python
pool = Pool(processes=4, maxtasksperchild=50)
```
That is it for this tutorial. If you have any questions or suggestions, please let me know in the comment section below. If you are new to Python, I suggest learning it through this Udemy course: Learn Python in 100 days of coding.
Hi there, I’m Anindya Naskar, Data Science Engineer. I created this website to show you what I believe is the best possible way to get your start in the field of Data Science.