Multiprocessing for loop in Python

Today I was working on a project where I had to run a for loop over a huge collection. Because the collection was so large, the loop was taking an enormous amount of time to complete. So I decided to parallelize that for loop using the multiprocessing technique in Python.

Before we start, if you are new to the concept of multiprocessing, I recommend reading this article: Python Multithreading vs Multiprocessing.


How to use Multiprocessing in Python for loop?

Normal for loop in Python

For a better explanation, let me first write a simple example for loop.

import datetime

# Store current time before execution
start_time = datetime.datetime.now()

# Run for loop
for i in range(100000000):
    j = i**2

# Store current time after execution
end_time = datetime.datetime.now()

# Print for loop execution time
print('Time taken: ', end_time-start_time)
Time taken:  0:00:29.411523

In the above Python code, I am squaring 100 million numbers (100000000), which means the for loop runs 100 million times. The code takes around 29 seconds to complete.
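As a side note, `datetime.now()` works for coarse timing, but the standard-library `time.perf_counter()` is the clock intended for benchmarking code. A minimal sketch (using a smaller one-million run so it finishes quickly):

```python
import time

# perf_counter() is a monotonic, high-resolution clock made for timing code
start = time.perf_counter()
for i in range(1_000_000):
    j = i ** 2
elapsed = time.perf_counter() - start

print(f'Time taken: {elapsed:.3f} seconds')
```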

for loop with Function

Now, can we parallelize this for loop to complete the execution faster? Yes, we can, using Python's multiprocessing module. To do this, we first need to convert our task into a function.

In our example, the task is to square a number. So let's wrap that task in a function, then call the function inside our for loop. Below is the code.

import datetime

# Define function to square a number
def square_func(value):
    return value ** 2

# Store current time before execution
start_time = datetime.datetime.now()

# Run for loop
for i in range(100000000):
    square_func(i)

# Store current time after execution
end_time = datetime.datetime.now()

# Print for loop execution time
print('Time taken: ', end_time-start_time)
Time taken:  0:00:33.600378

So in the above code, I converted our task (squaring a number) into a function, then called that function from the same for loop, again running it 100 million times.


The above code takes around 33 seconds to complete the entire for loop (the function-call overhead adds a few seconds). If you check Task Manager while the code runs, you will notice that only about 16% of the processor is being used (the exact figure may vary for you; I am using an Intel i7 processor).

Multiprocessing for loop in Python

Now we can convert our for loop into parallel code using Python multiprocessing. The idea is to run the function in parallel on different CPU cores, so that we fully utilize the CPU and complete the execution faster.

import os
import datetime
from multiprocessing import Pool

# Define function to square a number
def square_func(value):
    return value ** 2

if __name__ == '__main__':
    # Store current time before execution
    start_time = datetime.datetime.now()

    # Create a pool to use all cpus
    pool = Pool(processes=os.cpu_count())
    pool.map(square_func, range(100000000))
    # Close the process pool
    pool.close()

    # Store current time after execution
    end_time = datetime.datetime.now()

    # Print multiprocessing execution time
    print('Time taken: ', end_time-start_time)
Time taken:  0:00:18.954894

As you can see in the above output, after converting our for loop to multiprocessing, the execution time dropped to almost half of the normal for loop's time (about 19 seconds versus 33).

In the above Python multiprocessing code, everything except the imports and the function definition sits inside an if __name__ == '__main__': block. This is essential, not just best practice: on Windows and macOS, each worker process re-imports the script, and without the guard every worker would try to execute the pool-creation (and timing) code itself.

Next, we create the process pool. This line creates a multiprocessing pool that will use all available CPU cores: the number of worker processes is set to the number of CPU cores on the system using os.cpu_count(). In my case, the number of cores was 12.


If you do not want to fully utilize your CPU, you can pass a smaller number (2, 3, 4, etc.) to Pool() instead of os.cpu_count().

The pool.map call distributes the square_func function across the numbers from 0 to 99,999,999 (100 million) in parallel. Internally, map splits the range into chunks and hands each chunk to a worker process, so every number gets squared in one of the pool's workers.
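When the per-item work is this cheap, the cost of shipping arguments and results between processes matters. pool.map accepts a chunksize argument that controls how many items are sent to a worker in one batch; below is a minimal sketch (the numbers here are illustrative, not tuned):

```python
import os
from multiprocessing import Pool

def square_func(value):
    return value ** 2

def run_squares(n, chunk):
    # A larger chunksize sends fewer, bigger batches to each worker,
    # cutting inter-process communication overhead for cheap tasks.
    with Pool(processes=os.cpu_count()) as pool:
        return pool.map(square_func, range(n), chunksize=chunk)

if __name__ == '__main__':
    results = run_squares(1_000_000, chunk=10_000)
    print(results[:5])  # [0, 1, 4, 9, 16]
```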

After all the work is done, pool.close() tells the pool that no more tasks are coming, so the worker processes can exit and free up system resources.
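A more idiomatic way to guarantee this cleanup is to use the pool as a context manager. On exit, the with block calls terminate(), which is safe here because map() blocks until every result is back. A minimal sketch:

```python
import os
from multiprocessing import Pool

def square_func(value):
    return value ** 2

if __name__ == '__main__':
    # The with block tears the pool down even if an exception is raised,
    # so no manual close()/terminate() bookkeeping is needed.
    with Pool(processes=os.cpu_count()) as pool:
        squares = pool.map(square_func, range(10))
    print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```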

If we run the above code, our CPU now gets fully (100%) utilized. Below is the screenshot for your reference.

Final Note

In this tutorial, I showed you how to parallelize a for loop using the multiprocessing technique in Python, along with sample code demonstrating how multiprocessing can speed up loop execution.

Our multiprocessing code takes almost half the execution time of the normal for loop. In real-world scenarios, where each iteration does more work, the reduction can be even larger. Keep in mind that multiprocessing pays off when each task is CPU-bound and substantial; for very cheap per-item work, the overhead of moving data between processes can eat into the gains.

While implementing multiprocessing, you may sometimes get a MemoryError. This happens because the multiprocessing code tries to allocate a huge amount of memory (RAM) across its processes. To mitigate this, you can use the maxtasksperchild parameter to restart individual worker processes after a fixed number of tasks. Below is an example code snippet.

pool = Pool(processes=4, maxtasksperchild=50)
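Here is a fuller sketch combining maxtasksperchild with pool.imap_unordered, which streams results back one at a time instead of building a single 100-million-element list in memory (the parameter values are illustrative, not tuned):

```python
import os
from multiprocessing import Pool

def square_func(value):
    return value ** 2

def total_of_squares(n):
    total = 0
    # maxtasksperchild replaces each worker after a fixed number of tasks,
    # releasing any memory it accumulated; imap_unordered yields results
    # as they finish instead of collecting them all in one giant list.
    with Pool(processes=os.cpu_count(), maxtasksperchild=1000) as pool:
        for value in pool.imap_unordered(square_func, range(n), chunksize=100):
            total += value
    return total

if __name__ == '__main__':
    print(total_of_squares(10))  # 285
```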

That is it for this tutorial. If you have any questions or suggestions, please let me know in the comment section below. If you are new to Python, I suggest learning it through this Udemy course: Learn Python in 100 days of coding.