The `multiprocessing` module in Python
This article introduces practical tips for writing safe and efficient parallel processing code using Python's `multiprocessing` module.
Basics: Why use multiprocessing?
`multiprocessing` enables parallelization on a per-process basis, so you can parallelize CPU-bound tasks without being restricted by Python's GIL (Global Interpreter Lock). For I/O-bound tasks, `threading` or `asyncio` may be simpler and more suitable.
Simple usage of Process
First, here is a basic example of running a function in a separate process using Process. This demonstrates how to start a process, wait for its completion, and pass arguments.
```python
# Explanation:
# This example starts a separate process to run `worker`, which prints messages.
# It demonstrates starting a process, joining it, and passing arguments.

from multiprocessing import Process
import time

def worker(name, delay):
    # Print a few messages, sleeping between iterations
    for i in range(3):
        print(f"Worker {name}: iteration {i}")
        time.sleep(delay)

if __name__ == "__main__":
    p = Process(target=worker, args=("A", 0.5))
    p.start()
    print("Main: waiting for worker to finish")
    p.join()
    print("Main: worker finished")
```
This code shows the flow where the main process launches a child process running `worker` and waits for its completion using `join()`. You can pass arguments using `args`.
Simple parallelization with Pool (high-level API)
Pool.map is useful when you want to apply the same function to multiple independent tasks. It manages worker processes internally for you.
```python
# Explanation:
# Use Pool.map to parallelize a CPU-bound function across available processes.
# Good for "embarrassingly parallel" workloads.

from multiprocessing import Pool, cpu_count
import math
import time

def is_prime(n):
    # Check primality (inefficient but CPU-heavy for demo)
    if n < 2:
        return False
    for i in range(2, int(math.sqrt(n)) + 1):
        if n % i == 0:
            return False
    return True

if __name__ == "__main__":
    nums = [10_000_000 + i for i in range(50)]
    start = time.time()
    with Pool(processes=cpu_count()) as pool:
        results = pool.map(is_prime, nums)
    end = time.time()
    print(f"Found primes: {sum(results)} / {len(nums)} in {end-start:.2f}s")
```
`Pool` automatically manages the number of worker processes, and `map` returns results in the original order.
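When there are very many small tasks, per-task dispatch overhead can dominate; the `chunksize` argument of `map` batches tasks before sending them to workers. A minimal sketch (the chunk size of 100 and the toy `double` function are arbitrary illustrative choices):

```python
# Sketch: batching many small tasks with chunksize to reduce dispatch overhead.
# The chunk size of 100 and the toy function are arbitrary illustrative choices.

from multiprocessing import Pool

def double(x):
    return 2 * x

if __name__ == "__main__":
    with Pool() as pool:
        results = pool.map(double, range(10_000), chunksize=100)
    print(results[:5])  # [0, 2, 4, 6, 8]
```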
Interprocess communication: Producer/Consumer pattern using Queue
`Queue` is a first-in, first-out (FIFO) queue that safely transfers objects between processes. Below is a typical producer/consumer pattern.
```python
# Explanation:
# Demonstrates a producer putting items into a Queue
# and a consumer reading them.
# This is useful for task pipelines between processes.

from multiprocessing import Process, Queue
import time
import random

def producer(q, n):
    for i in range(n):
        item = f"item-{i}"
        print("Producer: putting", item)
        q.put(item)
        time.sleep(random.random() * 0.5)
    q.put(None)  # sentinel to signal consumer to stop

def consumer(q):
    while True:
        item = q.get()
        if item is None:
            break
        print("Consumer: got", item)
        time.sleep(0.2)

if __name__ == "__main__":
    q = Queue()
    p = Process(target=producer, args=(q, 5))
    c = Process(target=consumer, args=(q,))
    p.start()
    c.start()
    p.join()
    c.join()
    print("Main: done")
```
`Queue` allows you to safely pass data between processes. It is common to use a special sentinel value such as `None` to signal termination.
Shared memory: Value and Array
You can use Value and Array when you want to share small numbers or arrays between processes. You need to use locks to avoid conflicts.
```python
# Explanation:
# Use Value to share a single integer counter
# and Array for a small numeric array.
# Shows how to use a Lock to avoid race conditions.

from multiprocessing import Process, Value, Array, Lock

def increment(counter, lock, times):
    for _ in range(times):
        with lock:
            counter.value += 1

def update_array(arr):
    for i in range(len(arr)):
        arr[i] = arr[i] + 1

if __name__ == "__main__":
    lock = Lock()
    counter = Value('i', 0)  # 'i' = signed int
    shared_arr = Array('i', [0, 0, 0])

    p1 = Process(target=increment, args=(counter, lock, 1000))
    p2 = Process(target=increment, args=(counter, lock, 1000))
    a = Process(target=update_array, args=(shared_arr,))

    p1.start(); p2.start(); a.start()
    p1.join(); p2.join(); a.join()

    print("Counter:", counter.value)
    print("Array:", list(shared_arr))
```
`Value` and `Array` share data through lower-level mechanisms (C-level shared memory) rather than ordinary Python objects. They are therefore well suited to quickly reading and writing small amounts of data, but not to handling large amounts of data.
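For larger raw buffers, the standard library also provides `multiprocessing.shared_memory` (Python 3.8+). Below is a minimal sketch; the buffer size and the bytes written are arbitrary illustrative values.

```python
# Sketch: multiprocessing.shared_memory (Python 3.8+) for larger raw buffers.
# The buffer size and the bytes written are arbitrary illustrative values.

from multiprocessing import Process, shared_memory

def writer(shm_name):
    shm = shared_memory.SharedMemory(name=shm_name)  # attach to the existing block
    shm.buf[:5] = b"hello"                           # write directly into shared memory
    shm.close()

if __name__ == "__main__":
    shm = shared_memory.SharedMemory(create=True, size=1024)
    p = Process(target=writer, args=(shm.name,))
    p.start()
    p.join()
    print(bytes(shm.buf[:5]))  # b'hello'
    shm.close()
    shm.unlink()  # release the segment
```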
Advanced sharing: Shared objects (dicts, lists) with Manager
If you want to use more flexible shared objects like lists or dictionaries, use Manager().
```python
# Explanation:
# Manager provides proxy objects like dict/list
# that can be shared across processes.
# Good for moderate-size shared state and an easier programming model.

from multiprocessing import Process, Manager

def worker(shared_dict, key, value):
    shared_dict[key] = value

if __name__ == "__main__":
    with Manager() as manager:
        d = manager.dict()
        processes = []
        for i in range(5):
            p = Process(target=worker, args=(d, f"k{i}", i*i))
            p.start()
            processes.append(p)
        for p in processes:
            p.join()
        print("Shared dict:", dict(d))
```
`Manager` is convenient for sharing dictionaries and lists, but every access sends data between processes and requires `pickle` conversion, so frequently updating large amounts of data will slow processing down.
Synchronization mechanisms: How to use Lock and Semaphore
Use Lock or Semaphore to control concurrent access to shared resources. You can use them concisely with the with statement.
```python
# Explanation:
# Demonstrates using Lock to prevent simultaneous access to a critical section.
# Locks are necessary when shared resources are not atomic.

from multiprocessing import Process, Lock, Value

def safe_add(counter, lock):
    for _ in range(10000):
        with lock:
            counter.value += 1

if __name__ == "__main__":
    lock = Lock()
    counter = Value('i', 0)
    p1 = Process(target=safe_add, args=(counter, lock))
    p2 = Process(target=safe_add, args=(counter, lock))
    p1.start(); p2.start()
    p1.join(); p2.join()
    print("Counter:", counter.value)
```
Locks prevent data races, but if the locked region is too large, parallel performance decreases. Protect only the necessary parts as a critical section.
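The example above uses only `Lock`. A `Semaphore` instead limits how many processes may enter a section at the same time; here is a minimal sketch (the limit of 2, the worker count, and the sleep time are arbitrary illustrative values).

```python
# Sketch: Semaphore limits how many processes run a section concurrently.
# The limit of 2, worker count, and sleep time are arbitrary illustrative values.

from multiprocessing import Process, Semaphore
import os
import time

def use_resource(sem):
    with sem:  # acquire; released automatically when the block exits
        print(f"Process {os.getpid()} is using the resource")
        time.sleep(0.5)

if __name__ == "__main__":
    sem = Semaphore(2)  # at most 2 processes hold the semaphore at once
    procs = [Process(target=use_resource, args=(sem,)) for _ in range(5)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```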
Differences between fork on UNIX and behavior on Windows
On Linux, child processes are created with fork by default, which keeps memory use efficient through copy-on-write (macOS has defaulted to spawn since Python 3.8). Windows always starts processes with spawn, which re-imports modules in the child, so you need to take care with entry-point protection and global initialization.
```python
# Explanation: Check the start method (fork/spawn) and set it if needed.
# Useful for debugging platform-dependent behavior.

from multiprocessing import get_start_method, set_start_method

if __name__ == "__main__":
    print("Start method:", get_start_method())

    # uncomment to force spawn on Unix for testing
    # set_start_method('spawn')
```
`set_start_method` can only be called once, at the start of your program. It's safer not to change it arbitrarily within libraries.
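If a specific start method is needed only for part of a program (for example inside a library), `multiprocessing.get_context()` returns a context object with its own start method instead of changing the global setting. A minimal sketch (the choice of 'spawn' here is only for illustration):

```python
# Sketch: use a dedicated context instead of changing the global start method.
# The choice of 'spawn' here is only for illustration.

import multiprocessing as mp

def square(x):
    return x * x

if __name__ == "__main__":
    ctx = mp.get_context("spawn")        # context with its own start method
    with ctx.Pool(processes=2) as pool:  # Pool bound to that context
        print(pool.map(square, [1, 2, 3, 4]))
```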
Practical example: Benchmarking CPU-bound workloads (comparison)
Below is a script that simply compares how much faster processing can be with parallelization using multiprocessing. Here, we use Pool.
```python
# Explanation:
# Compare sequential vs parallel execution times for a CPU-bound task.
# Helps understand speedup and overhead.

import time
from multiprocessing import Pool, cpu_count
import math

def heavy_task(n):
    s = 0
    for i in range(1, n):
        s += math.sqrt(i)
    return s

def run_sequential(nums):
    return [heavy_task(n) for n in nums]

def run_parallel(nums):
    with Pool(processes=cpu_count()) as p:
        return p.map(heavy_task, nums)

if __name__ == "__main__":
    nums = [2000000] * 8  # heavy tasks
    t0 = time.time()
    run_sequential(nums)
    seq = time.time() - t0
    t1 = time.time()
    run_parallel(nums)
    par = time.time() - t1
    print(f"Sequential: {seq:.2f}s, Parallel: {par:.2f}s")
```
Depending on task load and the number of processes, parallelization can sometimes be ineffective due to overhead. The heavier and more independent the tasks, the greater the benefit.
Important basic rules
Below are the basic points for using multiprocessing safely and efficiently.
- On Windows, modules are re-imported when child processes start, so you must protect your script's entry point with `if __name__ == "__main__":`.
- Inter-process communication is serialized (via `pickle` conversion), so transferring large objects becomes costly.
- Since `multiprocessing` creates processes, it's common to decide the number of processes based on `multiprocessing.cpu_count()`.
- Creating another `Pool` within a worker becomes complex, so avoid nesting `Pool` instances as much as possible.
- Exceptions occurring in child processes are hard to detect from the main process, so implement logging and error handling explicitly (see the sketch after this list).
- Set the number of processes according to the CPU, and consider using threads for I/O-bound tasks.
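As one way to make child-process errors visible, `Pool.apply_async` returns an `AsyncResult` whose `get()` re-raises any exception from the worker in the main process. Below is a minimal sketch; `risky_task` and its failure condition are hypothetical.

```python
# Sketch: surfacing exceptions raised in worker processes.
# `risky_task` and its failure condition are hypothetical examples.

from multiprocessing import Pool

def risky_task(n):
    if n == 3:
        raise ValueError(f"bad input: {n}")
    return n * n

if __name__ == "__main__":
    with Pool(processes=2) as pool:
        async_results = [pool.apply_async(risky_task, (i,)) for i in range(5)]
        for i, res in enumerate(async_results):
            try:
                print(i, "->", res.get(timeout=10))  # re-raises the child's exception here
            except Exception as exc:
                print(i, "-> failed:", exc)
```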
Practical design advice
Below are some useful concepts and patterns for designing parallel processing.
- It is efficient to split work into pipelined stages with distinct roles, such as input reading (I/O), preprocessing (CPU-parallel), and aggregation (serial).
- To simplify debugging, first check the operation in a single process before parallelizing.
- For logging, separate log outputs per process (e.g., include the PID in file names) to make isolating issues easier.
- Prepare retry and timeout mechanisms so you can safely recover even if a process hangs (a timeout sketch follows this list).
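As one simple recovery pattern, you can wait for a child with a timeout and terminate it if it has not finished. Below is a minimal sketch; the 2-second timeout and the deliberately slow worker are illustrative values.

```python
# Sketch: recover from a worker that takes too long.
# The 2-second timeout and the deliberately slow worker are illustrative values.

from multiprocessing import Process
import time

def slow_worker():
    time.sleep(60)  # simulates a hung or overly slow task

if __name__ == "__main__":
    p = Process(target=slow_worker)
    p.start()
    p.join(timeout=2)          # wait at most 2 seconds
    if p.is_alive():
        print("Worker timed out, terminating")
        p.terminate()          # force-stop the child process
        p.join()
    print("Exit code:", p.exitcode)
```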
Summary (Key points you can use right away)
Parallel processing is powerful, but it's important to correctly judge the nature of tasks, data size, and inter-process communication cost. multiprocessing is effective for CPU-bound processing, but poor design or synchronization mistakes can reduce performance. If you follow the basic rules and patterns, you can build safe and efficient parallel programs.
You can follow along with this article using Visual Studio Code on our YouTube channel, so please check it out.