Multithreading in Python often seems slower than in other languages like Java and C++. This is due to something called the Global Interpreter Lock (GIL). In this article, we'll understand what the GIL is, why it makes Python multithreading slow, and explore some workarounds to speed up parallel processing in Python.
What is the Global Interpreter Lock (GIL)?
The CPython implementation of Python uses a construct called the Global Interpreter Lock or GIL. This lock prevents multiple threads from running Python bytecodes at once. Essentially, it makes CPython execution single-threaded.
At any point, only one thread holds the GIL and runs Python bytecodes. Other threads cannot start execution until the current thread releases the GIL. This serialized execution is why Python multithreading often seems slow.
The GIL was introduced in CPython to avoid problems with non-threadsafe memory management and bindings to non-threadsafe third-party libraries. It avoids race conditions where multiple threads access the same resource without synchronization.
Why the GIL Makes Multithreading Slow
The GIL causes two major performance issues for Python multithreading:
- Only one thread executes at a time: As we saw earlier, only one Python thread can execute bytecodes at once. Others have to wait until the current thread yields execution. This serialized execution limits parallelism and hurts performance.
- Cannot utilize multiple CPU cores: The GIL prevents multiple Python threads from running bytecodes in parallel. So even with multiple cores, only one core executes the Python process at a time. The GIL underutilizes available computing resources.
As a result, CPU-bound Python programs don't see much speedup from multithreading. The threads end up taking turns on one CPU core instead of utilizing all cores.
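To see this in practice, here is a minimal sketch (the countdown function and the iteration count are just illustrative) that runs the same pure-Python CPU-bound loop sequentially and then in two threads. On CPython, the threaded version is usually no faster, because the two threads take turns holding the GIL:
import time
from threading import Thread

def countdown(n):
    # Pure-Python, CPU-bound loop
    while n > 0:
        n -= 1

N = 10_000_000

# Sequential: run the countdown twice in the main thread
start = time.perf_counter()
countdown(N)
countdown(N)
print("sequential:", round(time.perf_counter() - start, 2), "s")

# Threaded: run the same two countdowns in two threads
start = time.perf_counter()
threads = [Thread(target=countdown, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("threaded:  ", round(time.perf_counter() - start, 2), "s")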
However, the GIL doesn't affect I/O-bound tasks as much. Threads can release the GIL when performing blocking I/O operations like file, network or database access. So threads performing a lot of I/O can run in parallel.
Workarounds to Speed up Python Multithreading
There are a few ways to work around the GIL and speed up parallel processing in Python:
1. Multi-processing
We can use multiple Python processes instead of threads. The GIL is limited to one process, so multiple processes can utilize multiple cores.
The multiprocessing module makes this straightforward. For example, a process Pool can map a function over inputs in parallel:
import multiprocessing

def worker(x):
    # Do some CPU-bound work
    return x * 2

if __name__ == "__main__":
    pool = multiprocessing.Pool(processes=4)
    inputs = [1, 2, 3, 4]
    outputs = pool.map(worker, inputs)
    print(outputs)  # [2, 4, 6, 8], processed in parallel
So for CPU-bound work, multiprocessing is faster than multithreading.
2. Multi-threading for I/O-bound tasks
For I/O-bound tasks, we can still use threads for parallelism. As mentioned earlier, threads can release the GIL while waiting for I/O, so multithreading works well for network, file and database access.
Web scraping is an example of I/O-bound work. Threads can scrape multiple webpages in parallel, releasing the GIL while pages load.
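As a minimal sketch using only the standard library (the URLs are placeholders), a thread pool can fetch several pages concurrently; each thread releases the GIL while it waits for the network:
import concurrent.futures
import urllib.request

urls = [
    "https://example.com",
    "https://example.org",
    "https://example.net",
]

def fetch(url):
    # The GIL is released while this thread waits for the network response
    with urllib.request.urlopen(url, timeout=10) as resp:
        return url, len(resp.read())

with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    for url, size in executor.map(fetch, urls):
        print(url, size)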
3. Multi-threading in external C/C++ libraries
We can implement CPU-intensive sections in external C/C++ libraries. These libraries release the GIL during execution.
So they can run computations in parallel across multiple threads. Python code calls these libraries for parallelism.
For instance, the Numerical Python (NumPy) library implements its array operations in optimized C code, and its linear algebra routines can use multi-threaded BLAS libraries. Many NumPy calls release the GIL, enabling good parallelism.
import numpy as np
a = np.random.rand(5,5)
b = np.random.rand(5,5)
# Matrix multiplication runs in NumPy's C code, which releases the GIL
c = np.dot(a, b)
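To sketch how this enables thread-level parallelism (the matrix sizes and worker count are arbitrary), we can launch several NumPy multiplications from ordinary Python threads; because NumPy releases the GIL inside these calls, the work can overlap across cores:
import concurrent.futures
import numpy as np

matrices = [np.random.rand(1000, 1000) for _ in range(4)]

def multiply(m):
    # NumPy releases the GIL inside dot, so these calls can run concurrently
    return np.dot(m, m)

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(multiply, matrices))

print([r.shape for r in results])  # four (1000, 1000) results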
4. Newer Python Versions
Upcoming Python versions may have better multithreading capabilities, and there is ongoing work in the CPython project to reduce the impact of the GIL.
There are also alternative Python implementations like Jython and IronPython that don't use the GIL. They can achieve better parallel throughput for multithreaded code.
Key Takeaways
- The GIL allows only one thread to execute Python bytecode at a time, so CPU-bound threads cannot run in parallel across cores.
- For CPU-bound work, use multiprocessing so each process can run on its own core.
- For I/O-bound work, multithreading still helps because threads release the GIL while waiting on I/O.
- CPU-intensive code in C/C++ extensions such as NumPy can release the GIL and run in parallel.
So while the GIL limits CPU-bound multithreading, there are workarounds to enable parallel processing where needed. Python continues to be a versatile language for building modern applications.