The `threading` Module in Python
This article explains the threading module in Python.
The `threading` module is part of Python's standard library and supports multithreaded programming. Threads let multiple tasks run concurrently within a single process, which can improve performance when a program spends time in blocking operations such as I/O waits. Because of the Global Interpreter Lock (GIL) in CPython, multithreading offers little benefit for CPU-bound work, but it is effective for I/O-bound work.
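As a rough illustration of the I/O-bound case, the sketch below compares running four blocking waits one after another with running them in threads. The helper names `fake_io`, `run_sequential`, and `run_threaded` are hypothetical, and `time.sleep` merely stands in for a real blocking I/O call.

```python
import threading
import time

def fake_io(delay):
    # Assumption: time.sleep stands in for a blocking I/O call
    time.sleep(delay)

def run_sequential(n, delay):
    # Run the waits one after another and return the elapsed time
    start = time.perf_counter()
    for _ in range(n):
        fake_io(delay)
    return time.perf_counter() - start

def run_threaded(n, delay):
    # Run the waits in separate threads and return the elapsed time
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io, args=(delay,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

sequential = run_sequential(4, 0.2)
threaded = run_threaded(4, 0.2)
print(f"Sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```

The threaded run should finish in roughly the time of a single wait, because the waits overlap instead of adding up.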
The following sections explain the basics of using the threading module and how to control threads.
Basic Usage of Threads
Creating and Running Threads
To run code concurrently, create a `threading.Thread` object. Pass the function to run as the `target` argument, then call `start()` to begin execution.
```python
import threading
import time

# Function to be executed in a thread
def worker():
    print("Worker thread started")
    time.sleep(2)
    print("Worker thread finished")

# Create and start the thread
thread = threading.Thread(target=worker)
thread.start()

# Processing in the main thread
print("Main thread continues to run")

# Wait for the thread to finish
thread.join()
print("Main thread finished")
```

- In this example, the `worker` function is executed in a separate thread while the main thread continues to run. Calling the `join()` method makes the main thread wait for the worker thread to complete.
Naming Threads
Giving threads meaningful names makes logging and debugging easier. You can set the name with the `name` argument.
```python
import threading
import time

# Function to be executed in a thread; it receives the values passed via args
def worker(label, delay):
    print(f"{label} started")
    time.sleep(delay)
    print(f"{label} finished")

t = threading.Thread(
    target=worker,
    args=("named-worker", 0.3),
    name="MyWorkerThread"
)

t.start()

print("Active threads:", threading.active_count())
for th in threading.enumerate():
    print(" -", th.name)

t.join()
```

- `threading.enumerate()` returns a list of all currently alive threads, which is useful for debugging and monitoring state.
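To show how thread names help with logging, here is a minimal sketch that includes the thread name in every log line through the standard `%(threadName)s` format attribute. The `task` function, the `fetch-N` names, and the `seen_names` list are hypothetical and exist only for illustration.

```python
import logging
import threading

# %(threadName)s is a standard logging format attribute
logging.basicConfig(level=logging.INFO, format="[%(threadName)s] %(message)s")

seen_names = []  # hypothetical list recording which threads ran

def task():
    # current_thread() returns the Thread object for the calling thread
    seen_names.append(threading.current_thread().name)
    logging.info("running")

threads = [threading.Thread(target=task, name=f"fetch-{i}") for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(seen_names))
```

Each log line is prefixed with the name of the thread that emitted it, so interleaved output from several threads stays readable.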
Inheriting the Thread Class
If you want to customize how a thread behaves, you can define a new class that inherits from `threading.Thread`.
```python
import threading
import time

# Inherit from the Thread class
class WorkerThread(threading.Thread):
    def __init__(self, name, delay, repeat=3):
        super().__init__(name=name)
        self.delay = delay
        self.repeat = repeat
        self.results = []

    def run(self):
        for i in range(self.repeat):
            msg = f"{self.name} step {i+1}"
            print(msg)
            self.results.append(msg)
            time.sleep(self.delay)

# Create and start the threads
t1 = WorkerThread("Worker-A", delay=0.4, repeat=3)
t2 = WorkerThread("Worker-B", delay=0.2, repeat=5)

t1.start()
t2.start()

t1.join()
t2.join()

print("Results A:", t1.results)
print("Results B:", t2.results)
```

- In this example, the `run()` method is overridden to define the thread's behavior, and each thread object keeps its own data in instance attributes. This is useful when threads perform complex processing or when you want each thread to have its own independent data.
Synchronization Between Threads
When multiple threads access shared resources simultaneously, data races may occur. To prevent this, the threading module provides several synchronization mechanisms.
Lock (Lock)
The `Lock` object is used to implement mutual exclusion between threads. While one thread holds the lock, any other thread that tries to acquire it blocks until the lock is released.
```python
import threading

lock = threading.Lock()
shared_resource = 0

def worker():
    global shared_resource
    with lock:  # Acquire the lock
        local_copy = shared_resource
        local_copy += 1
        shared_resource = local_copy

threads = [threading.Thread(target=worker) for _ in range(5)]

for t in threads:
    t.start()

for t in threads:
    t.join()

print(f"Final value of shared resource: {shared_resource}")
```

- In this example, five threads access a shared resource, but the `Lock` is used to prevent multiple threads from modifying the data simultaneously.
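The same idea can be packaged so that the lock travels with the data it protects. The following is a minimal sketch, assuming a hypothetical `Counter` class and `bump` helper that are not part of the `threading` module.

```python
import threading

class Counter:
    """Hypothetical thread-safe counter used for illustration."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self):
        # The read-modify-write on self.value happens while holding the lock
        with self._lock:
            self.value += 1

def bump(counter, times):
    for _ in range(times):
        counter.increment()

counter = Counter()
threads = [threading.Thread(target=bump, args=(counter, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)  # 40000
```

Keeping the lock private to the class means callers cannot forget to acquire it before updating the value.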
Reentrant Lock (RLock)
If the same thread needs to acquire a lock multiple times, use an `RLock` (reentrant lock). This is useful for recursive calls, or when a function that holds the lock calls another function that acquires the same lock.
```python
import threading

rlock = threading.RLock()
shared = []

def outer():
    with rlock:
        shared.append("outer")
        inner()

def inner():
    with rlock:
        shared.append("inner")

t = threading.Thread(target=outer)
t.start()
t.join()
print(shared)
```

- With `RLock`, the same thread can reacquire a lock it already holds, which helps prevent deadlocks in nested lock acquisition.
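The difference from a plain `Lock` can be observed directly with a non-blocking acquire. This is a small sketch; the variable names `second_lock_attempt` and `second_rlock_attempt` are chosen only for illustration.

```python
import threading

lock = threading.Lock()
rlock = threading.RLock()

lock.acquire()
# A plain Lock is not reentrant: a second acquire from the same thread fails
second_lock_attempt = lock.acquire(blocking=False)
lock.release()

rlock.acquire()
# An RLock can be acquired again by the thread that already holds it
second_rlock_attempt = rlock.acquire(blocking=False)
# An RLock must be released once per successful acquire
rlock.release()
rlock.release()

print(second_lock_attempt)   # False
print(second_rlock_attempt)  # True
```

With a blocking acquire, the second call on the plain `Lock` would wait forever, which is the deadlock that `RLock` avoids.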
Condition (Condition)
A `Condition` lets threads wait until a specific condition is met. When a thread makes the condition true, it calls `notify()` to wake one waiting thread, or `notify_all()` to wake all waiting threads.
Below is an example of a producer and consumer using a Condition.
```python
import threading

condition = threading.Condition()
shared_data = []

def producer():
    with condition:
        shared_data.append(1)
        print("Produced an item")
        condition.notify()  # Notify the consumer

def consumer():
    with condition:
        # Re-check the state in a loop: the producer may already have run,
        # and wait() can return without the condition being true
        while not shared_data:
            condition.wait()  # Wait until the condition is met

        item = shared_data.pop(0)
        print(f"Consumed an item: {item}")

# Create the threads
producer_thread = threading.Thread(target=producer)
consumer_thread = threading.Thread(target=consumer)

consumer_thread.start()
producer_thread.start()

producer_thread.join()
consumer_thread.join()
```

- This code uses a `Condition` so that the producer notifies when data is added and the consumer waits until data is available before retrieving it, which synchronizes the two threads.
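For more than one item or more than one consumer, `wait_for()` with a predicate and `notify_all()` can be combined. The sketch below is one possible arrangement under stated assumptions: the `items` and `consumed` lists and the `done` flag are hypothetical names, and the shutdown protocol (a flag set by the producer) is only one of several reasonable designs.

```python
import threading

condition = threading.Condition()
items = []
consumed = []
done = False  # hypothetical flag: set once the producer has finished

def producer(count):
    global done
    for i in range(count):
        with condition:
            items.append(i)
            condition.notify()      # wake one waiting consumer
    with condition:
        done = True
        condition.notify_all()      # wake every consumer so it can exit

def consumer():
    while True:
        with condition:
            # wait_for() re-checks the predicate after every wake-up
            condition.wait_for(lambda: items or done)
            if items:
                consumed.append(items.pop(0))
            elif done:
                return

consumers = [threading.Thread(target=consumer) for _ in range(2)]
producer_thread = threading.Thread(target=producer, args=(5,))

for t in consumers:
    t.start()
producer_thread.start()

producer_thread.join()
for t in consumers:
    t.join()

print(sorted(consumed))  # [0, 1, 2, 3, 4]
```

Because the predicate is checked while holding the condition's lock, a consumer never pops from an empty list, regardless of which thread starts first.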
Thread Daemonization
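Daemon threads do not keep the program alive. A Python program exits once all non-daemon threads, including the main thread, have finished, and any daemon threads still running at that point are stopped abruptly. Normal threads, by contrast, are waited for before the program exits.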
```python
import threading
import time

def worker():
    while True:
        print("Working...")
        time.sleep(1)

# Create a daemon thread
thread = threading.Thread(target=worker)
thread.daemon = True  # Set as a daemon thread

thread.start()

# Processing in the main thread
time.sleep(3)
print("Main thread finished")
```

- In this example, the `worker` thread is a daemon thread, so it is stopped when the main thread ends and the program exits.
Thread Management with ThreadPoolExecutor
Apart from the `threading` module, you can use `ThreadPoolExecutor` from the `concurrent.futures` module to manage a pool of threads and run tasks concurrently.
```python
from concurrent.futures import ThreadPoolExecutor
import time

def worker(seconds):
    print(f"Sleeping for {seconds} second(s)")
    time.sleep(seconds)
    return f"Finished sleeping for {seconds} second(s)"

with ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(worker, i) for i in range(1, 4)]
    for future in futures:
        print(future.result())
```

- `ThreadPoolExecutor` creates a thread pool and reuses its threads to process tasks. Specify the maximum number of threads that run concurrently with `max_workers`.
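Two other common ways to collect results from a pool are `executor.map()`, which yields results in input order, and `as_completed()`, which yields futures as they finish. The sketch below uses hypothetical delays of 0.3, 0.1, and 0.2 seconds to make the difference visible.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def worker(seconds):
    time.sleep(seconds)
    return seconds

delays = [0.3, 0.1, 0.2]  # hypothetical task durations

with ThreadPoolExecutor(max_workers=3) as executor:
    # map() returns results in the same order as the inputs
    mapped = list(executor.map(worker, delays))

    # as_completed() yields each future once it has finished
    futures = [executor.submit(worker, s) for s in delays]
    completed = [f.result() for f in as_completed(futures)]

print(mapped)     # [0.3, 0.1, 0.2]
print(completed)  # ordered by completion time, typically shortest first
```

`map()` is convenient when results must line up with inputs, while `as_completed()` suits cases where each result should be handled as soon as it is available.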
Event Communication Between Threads
Using `threading.Event`, one thread can set a flag to notify other threads that an event has occurred.
```python
import threading
import time

event = threading.Event()

def worker():
    print("Waiting for event to be set")
    event.wait()  # Wait until the event is set

    print("Event received, continuing work")

thread = threading.Thread(target=worker)
thread.start()

time.sleep(2)
print("Setting the event")
event.set()  # Set the event and notify the thread
```

- This code demonstrates a mechanism where the worker thread waits for the `Event` signal and resumes processing when the main thread calls `event.set()`.
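An `Event` can also serve as a stop signal for a looping worker, since `wait()` accepts a timeout and returns `True` once the event is set. The following is a minimal sketch; the names `stop_event` and `ticks` and the timing values are assumptions made for illustration.

```python
import threading
import time

stop_event = threading.Event()
ticks = []  # hypothetical record of the work done by the loop

def worker():
    # wait() returns False on timeout and True once the event is set,
    # so the loop does one unit of work per timeout until asked to stop
    while not stop_event.wait(timeout=0.05):
        ticks.append("tick")

thread = threading.Thread(target=worker)
thread.start()

time.sleep(0.3)
stop_event.set()   # ask the worker to stop
thread.join()      # the worker leaves its loop and the thread finishes

print(len(ticks), thread.is_alive())
```

Unlike a daemon thread, a worker stopped this way finishes its current iteration and exits cleanly.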
Exception Handling and Thread Termination in Threads
When exceptions occur in threads, they are not directly propagated to the main thread, so a pattern is needed to capture and share exceptions.
```python
import threading
import queue

def worker(err_q):
    try:
        raise ValueError("Something bad")
    except Exception as e:
        err_q.put(e)

q = queue.Queue()
t = threading.Thread(target=worker, args=(q,))
t.start()
t.join()
if not q.empty():
    exc = q.get()
    print("Worker raised:", exc)
```

- By putting exceptions into a `Queue` and retrieving them in the main thread, you can reliably detect failures. If you use `concurrent.futures.ThreadPoolExecutor`, exceptions are re-raised by `future.result()`, making them easier to handle.
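The `ThreadPoolExecutor` behavior mentioned above can be sketched as follows. The `worker` function and its failure on the input `2` are hypothetical and exist only to trigger an exception in one task.

```python
from concurrent.futures import ThreadPoolExecutor

def worker(n):
    # Hypothetical task that fails for one particular input
    if n == 2:
        raise ValueError(f"bad input: {n}")
    return n * 10

results = []
errors = []

with ThreadPoolExecutor(max_workers=2) as executor:
    futures = [executor.submit(worker, n) for n in range(4)]
    for future in futures:
        try:
            # result() re-raises any exception raised inside the task
            results.append(future.result())
        except ValueError as exc:
            errors.append(str(exc))

print(results)  # [0, 10, 30]
print(errors)   # ['bad input: 2']
```

The exception surfaces in the calling thread at the `result()` call, so an ordinary `try`/`except` is enough to handle it.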
The GIL (Global Interpreter Lock) and Its Effects
Because of the GIL (Global Interpreter Lock) in CPython, only one thread executes Python bytecode at a time within a process. For CPU-intensive tasks such as heavy computations, multiprocessing is recommended instead. For I/O-bound tasks such as file reading or network communication, threading works well because the GIL is released while a thread waits on I/O.
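For the CPU-bound case, one option is `ProcessPoolExecutor` from `concurrent.futures`, which offers the same interface as `ThreadPoolExecutor` but runs tasks in separate processes. The sketch below assumes a hypothetical `count_primes` helper as the CPU-heavy task; the limits are arbitrary.

```python
from concurrent.futures import ProcessPoolExecutor

def count_primes(limit):
    """Hypothetical CPU-bound task: count primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

def main():
    limits = [20_000, 30_000, 40_000]  # arbitrary workloads for illustration
    # Each call runs in its own process, so the GIL of one does not block the others
    with ProcessPoolExecutor() as executor:
        for limit, total in zip(limits, executor.map(count_primes, limits)):
            print(f"primes below {limit}: {total}")

# The guard keeps worker processes from re-running main() when they import this module
if __name__ == "__main__":
    main()
```

Because arguments and results are passed between processes, this approach pays off mainly when each task does enough computation to outweigh that overhead.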
Summary
Using Python's `threading` module, you can implement multithreaded programs and run multiple tasks concurrently. With synchronization mechanisms like `Lock` and `Condition`, you can safely access shared resources and coordinate threads. In addition, daemon threads and `ThreadPoolExecutor` make thread management and efficient concurrent processing easier.