The `threading` Module in Python


This article explains the threading module in Python.


The threading module in Python is a standard library module that supports multithreaded programming. Using threads allows multiple tasks to run concurrently within a single process, which can improve the performance of programs, especially in cases involving blocking operations like I/O waits. Due to Python's Global Interpreter Lock (GIL), the effectiveness of multithreading is limited for CPU-bound operations, but it works well for I/O-bound operations.

The following sections explain the basics of using the threading module and how to control threads.

Basic Usage of Threads

Creating and Running Threads

To create a thread and perform concurrent processing, use the threading.Thread class. Pass the function to execute as the target argument, then start the thread with start().

import threading
import time

# Function to be executed in a thread
def worker():
    print("Worker thread started")
    time.sleep(2)
    print("Worker thread finished")

# Create and start the thread
thread = threading.Thread(target=worker)
thread.start()

# Processing in the main thread
print("Main thread continues to run")

# Wait for the thread to finish
thread.join()
print("Main thread finished")
  • In this example, the worker function is executed in a separate thread while the main thread continues to run. By calling the join() method, the main thread waits for the worker thread to complete.

Naming Threads

Giving threads meaningful names makes logging and debugging easier. Specify a name with the name argument, and pass arguments to the target function with the args argument.

import threading
import time

# Function to be executed in a thread; it receives a label and a delay via args
def worker(label, delay):
    print(f"{label}: worker thread started")
    time.sleep(delay)
    print(f"{label}: worker thread finished")

t = threading.Thread(
    target=worker,
    args=("named-worker", 0.3),
    name="MyWorkerThread"
)

t.start()

print("Active threads:", threading.active_count())
for th in threading.enumerate():
    print(" -", th.name)

t.join()
  • threading.enumerate() returns a list of current threads, which is useful for debugging and monitoring state.

Inheriting the Thread Class

If you want to customize a thread's behavior, you can define your own class by inheriting from the threading.Thread class and overriding its run() method.

import threading
import time

# Inherit from the Thread class
class WorkerThread(threading.Thread):
    def __init__(self, name, delay, repeat=3):
        super().__init__(name=name)
        self.delay = delay
        self.repeat = repeat
        self.results = []

    def run(self):
        for i in range(self.repeat):
            msg = f"{self.name} step {i+1}"
            print(msg)
            self.results.append(msg)
            time.sleep(self.delay)

# Create and start the threads
t1 = WorkerThread("Worker-A", delay=0.4, repeat=3)
t2 = WorkerThread("Worker-B", delay=0.2, repeat=5)

t1.start()
t2.start()

t1.join()
t2.join()

print("Results A:", t1.results)
print("Results B:", t2.results)
  • In this example, the run() method is overridden to define the thread's behavior, allowing each thread to maintain its own data. This is useful when threads perform complex processing or when you want each thread to have its own independent data.

Synchronization Between Threads

When multiple threads access shared resources simultaneously, data races may occur. To prevent this, the threading module provides several synchronization mechanisms.

Lock

The Lock object is used to implement exclusive control of shared resources between threads. While one thread holds the lock, other threads that try to acquire it block until it is released.

import threading

lock = threading.Lock()
shared_resource = 0

def worker():
    global shared_resource
    with lock:  # Acquire the lock
        local_copy = shared_resource
        local_copy += 1
        shared_resource = local_copy

threads = [threading.Thread(target=worker) for _ in range(5)]

for t in threads:
    t.start()

for t in threads:
    t.join()

print(f"Final value of shared resource: {shared_resource}")
  • In this example, five threads access a shared resource, but the Lock is used to prevent multiple threads from modifying the data simultaneously.

Reentrant Lock (RLock)

If the same thread needs to acquire a lock multiple times, use an RLock (reentrant lock). This is useful for recursive calls, or when a method that holds the lock calls another method that also acquires it.

import threading

rlock = threading.RLock()
shared = []

def outer():
    with rlock:
        shared.append("outer")
        inner()

def inner():
    with rlock:
        shared.append("inner")

t = threading.Thread(target=outer)
t.start()
t.join()
print(shared)
  • With RLock, the same thread can reacquire a lock it already holds, which helps prevent deadlocks in nested lock acquisition.

Condition

A Condition lets threads wait until a specific condition is met. When a thread makes the condition true, it can call notify() to wake one waiting thread, or notify_all() to wake all waiting threads.

Below is an example of a producer and consumer using a Condition.

import threading

condition = threading.Condition()
shared_data = []

def producer():
    with condition:
        shared_data.append(1)
        print("Produced an item")
        condition.notify()  # Notify the consumer

def consumer():
    with condition:
        # wait_for() avoids missing a notification sent before wait() is called
        condition.wait_for(lambda: len(shared_data) > 0)

        item = shared_data.pop(0)
        print(f"Consumed an item: {item}")

# Create the threads
producer_thread = threading.Thread(target=producer)
consumer_thread = threading.Thread(target=consumer)

consumer_thread.start()
producer_thread.start()

producer_thread.join()
consumer_thread.join()
  • This code uses a Condition: the producer notifies after adding data, and the consumer uses wait_for() to wait until data is actually available before retrieving it. Waiting on a predicate with wait_for() prevents the consumer from missing a notification that is sent before it starts waiting.

Thread Daemonization

Daemon threads are forcibly terminated when the main thread ends. The interpreter waits for normal (non-daemon) threads to finish before exiting, but daemon threads are simply killed once all non-daemon threads have finished.

import threading
import time

def worker():
    while True:
        print("Working...")
        time.sleep(1)

# Create a daemon thread
thread = threading.Thread(target=worker)
thread.daemon = True  # Set as a daemon thread

thread.start()

# Processing in the main thread
time.sleep(3)
print("Main thread finished")
  • In this example, the worker thread is daemonized, so it is forcibly terminated when the main thread ends.

Thread Management with ThreadPoolExecutor

Apart from the threading module, you can use the ThreadPoolExecutor from the concurrent.futures module to manage a pool of threads and execute tasks in parallel.

from concurrent.futures import ThreadPoolExecutor
import time

def worker(seconds):
    print(f"Sleeping for {seconds} second(s)")
    time.sleep(seconds)
    return f"Finished sleeping for {seconds} second(s)"

with ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(worker, i) for i in range(1, 4)]
    for future in futures:
        print(future.result())
  • ThreadPoolExecutor creates a thread pool and efficiently processes tasks. Specify the number of threads to run concurrently with max_workers.
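
If you want to handle results in the order tasks finish rather than the order they were submitted, you can combine the same pool with concurrent.futures.as_completed. Below is a minimal sketch of that variation, reusing a worker function like the one above.

from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def worker(seconds):
    time.sleep(seconds)
    return f"Finished sleeping for {seconds} second(s)"

with ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(worker, i) for i in (3, 1, 2)]
    # as_completed yields futures as they finish,
    # so shorter tasks are reported first
    for future in as_completed(futures):
        print(future.result())

  • Unlike iterating over the futures list in submission order, as_completed() lets you process each result as soon as it is ready.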

Event Communication Between Threads

Using threading.Event, threads can communicate through a simple flag: one thread sets the flag, and any threads waiting on it are notified that the event has occurred.

import threading
import time

event = threading.Event()

def worker():
    print("Waiting for event to be set")
    event.wait()  # Wait until the event is set

    print("Event received, continuing work")

thread = threading.Thread(target=worker)
thread.start()

time.sleep(2)
print("Setting the event")
event.set()  # Set the event and notify the thread
  • This code demonstrates a mechanism where the worker thread waits for the Event signal and resumes processing when the main thread calls event.set().
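
An Event can also work as a stop flag, letting the main thread ask a worker to shut down cleanly instead of running it as a daemon. The following is a minimal sketch of this pattern; the stop_event name is just for illustration.

import threading
import time

stop_event = threading.Event()

def worker():
    # Keep working until the main thread requests a stop
    while not stop_event.is_set():
        print("Working...")
        time.sleep(0.5)
    print("Worker shutting down cleanly")

thread = threading.Thread(target=worker)
thread.start()

time.sleep(2)
stop_event.set()  # Ask the worker to stop
thread.join()

  • Because the worker checks the flag on each iteration, it exits its loop and finishes normally instead of being killed mid-operation.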

Exception Handling in Threads

When exceptions occur in threads, they are not directly propagated to the main thread, so a pattern is needed to capture and share exceptions.

import threading
import queue

def worker(err_q):
    try:
        raise ValueError("Something bad")
    except Exception as e:
        err_q.put(e)

q = queue.Queue()
t = threading.Thread(target=worker, args=(q,))
t.start()
t.join()
if not q.empty():
    exc = q.get()
    print("Worker raised:", exc)
  • By putting exceptions into a Queue and retrieving them in the main thread, you can reliably detect failures. If you use concurrent.futures.ThreadPoolExecutor, exceptions are rethrown with future.result(), making them easier to handle.
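
For comparison, here is a minimal sketch of the ThreadPoolExecutor approach mentioned above, where an exception raised in the worker is re-raised in the main thread when future.result() is called.

from concurrent.futures import ThreadPoolExecutor

def worker():
    raise ValueError("Something bad")

with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(worker)
    try:
        future.result()  # Re-raises the worker's exception here
    except ValueError as e:
        print("Worker raised:", e)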

The GIL (Global Interpreter Lock) and Its Effects

Due to the GIL (Global Interpreter Lock) in CPython, only one thread executes Python bytecode at a time within a process, so threads do not run Python code truly in parallel. For CPU-intensive tasks, such as heavy computations, it is recommended to use multiprocessing. On the other hand, for I/O-bound tasks such as file reading or network communication, threading works effectively.
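
For reference, the following is a minimal sketch of offloading a CPU-bound task to worker processes with multiprocessing.Pool; the cpu_task function and the pool size are only illustrative.

from multiprocessing import Pool

# A CPU-bound task: threads would not speed this up because of the GIL,
# but each worker process has its own interpreter and its own GIL
def cpu_task(n):
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        results = pool.map(cpu_task, [1_000_000] * 4)
    print(results)

  • The if __name__ == "__main__" guard is required on platforms that start new processes by spawning, such as Windows and macOS.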

Summary

Using Python's threading module, you can implement multithreaded programs and execute multiple tasks concurrently. With synchronization mechanisms like Lock and Condition, you can safely access shared resources and coordinate more complex interactions between threads. In addition, daemon threads and ThreadPoolExecutor make thread management and efficient concurrent processing easier.

You can follow along with this article using Visual Studio Code on our YouTube channel, so please check out the channel as well.
