Multithreading in Python
Multithreading is similar to running several programs simultaneously. Using threads in Python offers the following advantages:
Threads allow long-duration tasks in programs to be processed in the background.
The user interface can be more responsive; for example, when a user clicks a button to trigger certain events, a progress bar can be displayed while the work runs in the background.
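The speed of program execution can increase, mainly for I/O-bound work; in standard CPython the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time, so CPU-bound code usually gains little.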
Threads are useful in situations such as waiting for user input, file reading/writing, and network data transmission. In these cases, a thread that is blocked waiting does not occupy the CPU, so other threads can keep running.
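A minimal sketch of the waiting case described above, assuming time.sleep stands in for a blocking I/O call. The function fake_download and the file names are hypothetical and exist only for illustration.

```python
import threading
import time

def fake_download(name, seconds, results):
    # time.sleep stands in for a blocking I/O call (hypothetical workload)
    time.sleep(seconds)
    results.append(name)

results = []
start = time.perf_counter()

# Three waits of 0.3 s each, every one in its own thread
threads = [
    threading.Thread(target=fake_download, args=("file-%d" % i, 0.3, results))
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

elapsed = time.perf_counter() - start
print("finished %d tasks in %.2f s" % (len(results), elapsed))
```

Because the three waits overlap, the elapsed time should be close to 0.3 s rather than the 0.9 s a sequential loop would need.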
Each thread has its own entry point, execution sequence, and exit, but threads cannot run independently and must be controlled by an application that manages multiple threads.
Each thread has its own set of CPU registers, referred to as the thread’s context. This context reflects the CPU registers' state the last time the thread was running.
The instruction pointer and stack pointer registers are the two most important registers in the thread context. A thread always runs in the context of the process that owns it, so both registers hold addresses within that process's address space.
Threads can be preempted (interrupted). Threads can also be temporarily suspended (also known as sleeping) while other threads are running; this is called thread yielding.
Threads can be divided into:
Kernel threads: Created and managed by the operating system kernel.
User threads: Created within user programs without kernel support.
In Python 3, two commonly used threading modules are:
_thread
threading (recommended)
The thread module has been deprecated. Users should use the threading module instead. For compatibility, Python 3 renames thread to _thread.
Getting Started with Python Threads
There are two ways to use threads in Python: by using functions or by encapsulating thread objects in classes.
Function-Based Threads
To create a new thread, call the start_new_thread() function from the _thread module. The syntax is as follows:
_thread.start_new_thread(function, args[, kwargs])
function: The thread function.
args: The arguments passed to the thread function, which must be a tuple.
kwargs: Optional keyword arguments.
Example
```python
#!/usr/bin/python3
import _thread
import time

# Define a function for the thread
def print_time(threadName, delay):
    count = 0
    while count < 5:
        time.sleep(delay)
        count += 1
        print("%s: %s" % (threadName, time.ctime(time.time())))

# Create two threads
try:
    _thread.start_new_thread(print_time, ("Thread-1", 2))
    _thread.start_new_thread(print_time, ("Thread-2", 4))
except RuntimeError:
    print("Error: Unable to start thread")

# Keep the main thread alive; sleeping avoids a CPU-burning busy loop
while True:
    time.sleep(1)
```
The output of the above program will be as follows:
```
Thread-1: Wed Jan 5 17:38:08 2024
Thread-2: Wed Jan 5 17:38:10 2024
Thread-1: Wed Jan 5 17:38:10 2024
Thread-1: Wed Jan 5 17:38:12 2024
Thread-2: Wed Jan 5 17:38:14 2024
...
```
You can press Ctrl-C to exit the program.
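The example above keeps the main thread alive with an endless loop. As a hedged alternative, the sketch below passes keyword arguments through the optional kwargs dictionary and uses a lock from _thread.allocate_lock() so the main thread can wait for the worker to finish. The names worker, repeat, and messages are hypothetical.

```python
import _thread
import time

def worker(name, delay, messages, done_lock, repeat=1):
    for _ in range(repeat):
        time.sleep(delay)
        messages.append(name)
    # Releasing the lock signals the main thread that the work is finished
    done_lock.release()

messages = []
done = _thread.allocate_lock()
done.acquire()  # hold the lock until the worker releases it

# args must be a tuple; kwargs is an optional dictionary
_thread.start_new_thread(worker, ("Thread-1", 0.1, messages, done), {"repeat": 3})

done.acquire()  # blocks here until the worker calls release()
print(messages)  # ['Thread-1', 'Thread-1', 'Thread-1']
```

The threading module covers the same need with join(), which is one reason it is the recommended interface.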
Threading Module
Python 3 provides threading support through the _thread and threading modules.
The _thread module offers a lower-level interface, while threading provides a higher-level interface with additional functionality:
threading.current_thread(): Returns the current thread object.
threading.enumerate(): Returns a list of all active threads.
threading.active_count(): Returns the number of active threads.
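A short sketch of the three helper functions listed above. The worker names and the Event used to keep the workers alive are assumptions made for illustration.

```python
import threading

def idle(stop_event):
    # Block until the main thread sets the event
    stop_event.wait()

stop = threading.Event()
workers = [
    threading.Thread(target=idle, args=(stop,), name="worker-%d" % i)
    for i in range(2)
]
for w in workers:
    w.start()

current_name = threading.current_thread().name
active_names = sorted(t.name for t in threading.enumerate())
active = threading.active_count()
print(current_name, active_names, active)

# Let the workers finish
stop.set()
for w in workers:
    w.join()
```

While the workers are blocked, enumerate() includes both of them together with the thread that started them, and active_count() matches the length of that list.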
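To create a thread, use the threading.Thread class:
threading.Thread(group=None, target=None, name=None, args=(), kwargs={}, *, daemon=None)
Pass target, args, kwargs, and daemon as keyword arguments; group is reserved and should stay None.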
Example
```python
import threading
import time

def print_numbers():
    for i in range(5):
        time.sleep(1)
        print(i)

# Create thread
thread = threading.Thread(target=print_numbers)

# Start thread
thread.start()

# Wait for thread to finish
thread.join()
```
The output will be:
```
0
1
2
3
4
```
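The example above uses only target. A minimal sketch of the remaining constructor arguments follows; the function greet and its parameters are hypothetical.

```python
import threading

def greet(collected, greeting, name, punctuation="!"):
    collected.append("%s, %s%s" % (greeting, name, punctuation))

collected = []

thread = threading.Thread(
    target=greet,
    args=(collected, "Hello", "World"),  # positional arguments, as a tuple
    kwargs={"punctuation": "?"},         # keyword arguments, as a dict
    daemon=True,                         # daemon threads do not block interpreter exit
)
thread.start()
thread.join()

print(collected)  # ['Hello, World?']
```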
Creating a Thread by Inheriting from threading.Thread
You can create a new subclass of threading.Thread and override the run() method. Calling start() then executes run() in the new thread.
Example
```python
#!/usr/bin/python3
import threading
import time

exitFlag = 0

class MyThread(threading.Thread):
    def __init__(self, threadID, name, delay):
        super().__init__()
        self.threadID = threadID
        self.name = name
        self.delay = delay

    def run(self):
        print("Starting thread:", self.name)
        print_time(self.name, self.delay, 5)
        print("Exiting thread:", self.name)

def print_time(threadName, delay, counter):
    while counter:
        if exitFlag:
            # threadName is a string, so leave the function to end the thread
            return
        time.sleep(delay)
        print("%s: %s" % (threadName, time.ctime(time.time())))
        counter -= 1

# Create new threads
thread1 = MyThread(1, "Thread-1", 1)
thread2 = MyThread(2, "Thread-2", 2)

# Start new threads
thread1.start()
thread2.start()

# Wait for both threads to finish
thread1.join()
thread2.join()
print("Exiting main thread")
```
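The exitFlag variable in the example above is a plain global that is never set. One possible way to stop a thread cooperatively is a threading.Event; the sketch below assumes a hypothetical StoppableThread class with a stop() method and is not part of the original example.

```python
import threading
import time

class StoppableThread(threading.Thread):
    def __init__(self, name, delay):
        super().__init__(name=name)
        self.delay = delay
        self.ticks = 0
        self.stop_requested = threading.Event()

    def run(self):
        # Event.wait returns True as soon as the event is set,
        # and False if the timeout expires first
        while not self.stop_requested.wait(self.delay):
            self.ticks += 1

    def stop(self):
        self.stop_requested.set()

worker = StoppableThread("Thread-1", 0.05)
worker.start()
time.sleep(0.3)  # let the worker tick a few times
worker.stop()
worker.join()
print(worker.ticks, worker.is_alive())
```

Waiting on the event instead of calling time.sleep lets the thread react to the stop request immediately rather than after the full delay.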
Thread Synchronization
When multiple threads modify the same data concurrently, unpredictable results can occur. To ensure data accuracy, synchronization between multiple threads is required.
Simple thread synchronization can be achieved with the Lock and RLock objects from the threading module. Both objects have acquire and release methods. For data that should only be operated on by one thread at a time, place the operations between the acquire and release calls, as the example later in this section shows.
The advantage of multithreading is that it allows multiple tasks to run simultaneously (or at least it seems that way). However, when threads need to share data, data synchronization issues may arise.
Consider this scenario: All elements in a list are initially set to 0. The "set" thread changes all elements from the end of the list to 1, while the "print" thread reads the list from the beginning and prints the elements.
If the "print" thread prints while the "set" thread is modifying the list, the output may be a mix of 0s and 1s, demonstrating data synchronization issues. To avoid this, the concept of a lock is introduced.
A lock has two states: locked and unlocked. Whenever a thread, like "set", needs to access shared data, it must first acquire the lock. If another thread, like "print", already holds the lock, the "set" thread pauses and waits for the lock to be released, i.e., it enters synchronized blocking. Once the "print" thread finishes and releases the lock, the "set" thread can continue its work.
This process ensures that when printing the list, the output will either be all 0s or all 1s, avoiding the inconsistent results.
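The "set" and "print" scenario above can be sketched as follows. This is a minimal illustration, assuming a list of 1000 elements and several reader threads that record snapshots instead of printing; the names set_ones, read_list, and snapshots are hypothetical.

```python
import threading

def set_ones(data, lock):
    # The "set" thread: change every element to 1, starting from the end
    with lock:
        for i in range(len(data) - 1, -1, -1):
            data[i] = 1

def read_list(data, lock, snapshots):
    # The "print" thread: copy the list while holding the lock
    with lock:
        snapshots.append(list(data))

data = [0] * 1000
data_lock = threading.Lock()
snapshots = []

setter = threading.Thread(target=set_ones, args=(data, data_lock))
readers = [
    threading.Thread(target=read_list, args=(data, data_lock, snapshots))
    for _ in range(5)
]

# Start one reader before the setter so both orderings can occur
all_threads = [readers[0], setter] + readers[1:]
for t in all_threads:
    t.start()
for t in all_threads:
    t.join()

# Every snapshot holds a single distinct value: all 0s or all 1s
consistent = all(len(set(s)) == 1 for s in snapshots)
print(consistent)  # True
```

The with statement calls acquire() on entry and release() on exit, even if the body raises an exception.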
Example
```python
#!/usr/bin/python3
import threading
import time

class myThread(threading.Thread):
    def __init__(self, threadID, name, delay):
        super().__init__()
        self.threadID = threadID
        self.name = name
        self.delay = delay

    def run(self):
        print("Starting thread: " + self.name)
        # Acquire lock for thread synchronization
        threadLock.acquire()
        try:
            print_time(self.name, self.delay, 3)
        finally:
            # Release lock for the next thread, even if an error occurs
            threadLock.release()

def print_time(threadName, delay, counter):
    while counter:
        time.sleep(delay)
        print("%s: %s" % (threadName, time.ctime(time.time())))
        counter -= 1

threadLock = threading.Lock()
threads = []

# Create new threads
thread1 = myThread(1, "Thread-1", 1)
thread2 = myThread(2, "Thread-2", 2)

# Start new threads
thread1.start()
thread2.start()

# Add threads to the thread list
threads.append(thread1)
threads.append(thread2)

# Wait for all threads to complete
for t in threads:
    t.join()
print("Exiting the main thread")
```
Output:
```
Starting thread: Thread-1
Starting thread: Thread-2
Thread-1: Wed Jan 5 17:36:50 2024
Thread-1: Wed Jan 5 17:36:51 2024
Thread-1: Wed Jan 5 17:36:52 2024
Thread-2: Wed Jan 5 17:36:54 2024
Thread-2: Wed Jan 5 17:36:56 2024
Thread-2: Wed Jan 5 17:36:58 2024
Exiting the main thread
```
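The example above uses Lock. RLock differs in that the thread holding it may acquire it again without blocking, which helps when one locked method calls another. A minimal sketch, assuming a hypothetical Counter class:

```python
import threading

class Counter:
    def __init__(self):
        self._lock = threading.RLock()
        self.value = 0

    def increment(self):
        with self._lock:
            self.value += 1

    def increment_twice(self):
        with self._lock:
            # The lock is already held by this thread; an RLock allows
            # re-acquiring it, whereas a plain Lock would deadlock here
            self.increment()
            self.increment()

def hammer(counter, rounds):
    for _ in range(rounds):
        counter.increment_twice()

counter = Counter()
threads = [threading.Thread(target=hammer, args=(counter, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)  # 8000
```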
Thread Priority Queue (Queue)
Python's queue module (named Queue in Python 2) provides synchronized, thread-safe queue classes, including the FIFO (First In First Out) Queue, the LIFO (Last In First Out) LifoQueue, and PriorityQueue.
These queues implement locking mechanisms and can be directly used in multithreading to achieve thread synchronization.
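The three queue classes differ only in the order in which items come back out. A short sketch with the same three integers put into each queue; the helper drain is hypothetical.

```python
import queue

def drain(q):
    # Remove and return every item currently in the queue
    items = []
    while not q.empty():
        items.append(q.get())
    return items

fifo = queue.Queue()
lifo = queue.LifoQueue()
prio = queue.PriorityQueue()

for item in (3, 1, 2):
    fifo.put(item)
    lifo.put(item)
    prio.put(item)

fifo_order = drain(fifo)  # [3, 1, 2]  insertion order
lifo_order = drain(lifo)  # [2, 1, 3]  most recent first
prio_order = drain(prio)  # [1, 2, 3]  smallest first

print(fifo_order, lifo_order, prio_order)
```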
Common Methods in the queue Module:
Queue.qsize(): Returns the approximate size of the queue.
Queue.empty(): Returns True if the queue is empty, otherwise False.
Queue.full(): Returns True if the queue is full, otherwise False. The queue is full when it holds maxsize items, where maxsize is the parameter passed to the constructor.
Queue.get([block[, timeout]]): Retrieves and removes an item from the queue; the optional timeout specifies how long to wait.
Queue.get_nowait(): Equivalent to Queue.get(False).
Queue.put(item[, block[, timeout]]): Writes an item to the queue; the optional timeout specifies how long to wait.
Queue.put_nowait(item): Equivalent to Queue.put(item, False).
Queue.task_done(): Signals that processing of an item retrieved from the queue is complete.
Queue.join(): Blocks until all items in the queue have been retrieved and processed.
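A minimal sketch of the non-blocking and timeout variants listed above, assuming a queue with maxsize=2. The queue.Full and queue.Empty exceptions are what put and get raise when they cannot complete.

```python
import queue

q = queue.Queue(maxsize=2)
q.put_nowait("a")
q.put_nowait("b")
is_full = q.full()  # True: the queue holds maxsize items

try:
    q.put_nowait("c")  # no free slot, so this raises immediately
    overflow = False
except queue.Full:
    overflow = True

first = q.get_nowait()       # "a"
second = q.get(timeout=0.1)  # "b"

try:
    q.get(timeout=0.1)  # nothing left; waits 0.1 s, then raises
    underflow = False
except queue.Empty:
    underflow = True

print(is_full, overflow, first, second, underflow)
```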
Example
```python
#!/usr/bin/python3
import queue
import threading
import time

exitFlag = 0

class myThread(threading.Thread):
    def __init__(self, threadID, name, q):
        super().__init__()
        self.threadID = threadID
        self.name = name
        self.q = q

    def run(self):
        print("Starting thread: " + self.name)
        process_data(self.name, self.q)
        print("Exiting thread: " + self.name)

def process_data(threadName, q):
    while not exitFlag:
        # The lock keeps the empty() check and the get() call together
        queueLock.acquire()
        if not q.empty():
            data = q.get()
            queueLock.release()
            print("%s processing %s" % (threadName, data))
        else:
            queueLock.release()
        time.sleep(1)

threadList = ["Thread-1", "Thread-2", "Thread-3"]
nameList = ["One", "Two", "Three", "Four", "Five"]
queueLock = threading.Lock()
workQueue = queue.Queue(10)
threads = []
threadID = 1

# Create new threads
for tName in threadList:
    thread = myThread(threadID, tName, workQueue)
    thread.start()
    threads.append(thread)
    threadID += 1

# Fill the queue
queueLock.acquire()
for word in nameList:
    workQueue.put(word)
queueLock.release()

# Wait for the queue to be emptied; sleep briefly instead of spinning
while not workQueue.empty():
    time.sleep(0.1)

# Notify threads to exit
exitFlag = 1

# Wait for all threads to complete
for t in threads:
    t.join()
print("Exiting the main thread")
```
Output:
```
Starting thread: Thread-1
Starting thread: Thread-2
Starting thread: Thread-3
Thread-3 processing One
Thread-1 processing Two
Thread-2 processing Three
Thread-3 processing Four
Thread-1 processing Five
Exiting thread: Thread-3
Exiting thread: Thread-2
Exiting thread: Thread-1
Exiting the main thread
```
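The example above polls the queue and uses an exit flag. As a hedged alternative, the sketch below relies on the blocking get() together with task_done() and join(), so no extra lock or polling loop is needed. The None sentinel used to stop the workers and the upper-casing step are assumptions made for illustration.

```python
import queue
import threading

def worker(q, processed):
    while True:
        item = q.get()  # blocks until an item is available
        if item is None:
            # Sentinel: no more work for this thread
            q.task_done()
            break
        processed.append(item.upper())
        q.task_done()  # mark this item as fully processed

work_queue = queue.Queue()
processed = []
workers = [
    threading.Thread(target=worker, args=(work_queue, processed))
    for _ in range(3)
]
for w in workers:
    w.start()

for word in ["One", "Two", "Three", "Four", "Five"]:
    work_queue.put(word)

# Blocks until task_done() has been called once for every item put so far
work_queue.join()

# One sentinel per worker, then wait for the threads to exit
for _ in workers:
    work_queue.put(None)
for w in workers:
    w.join()

print(sorted(processed))  # ['FIVE', 'FOUR', 'ONE', 'THREE', 'TWO']
```

The order in which the workers pick up items is not deterministic, so the result is sorted before printing.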