Arnav Khare
All things interesting...
Friday, 18 December 2009
Thursday, 17 December 2009
Saturday, 12 December 2009
Friday, 4 December 2009
Just remembered the time when double-decker buses used to seem so cool...
how cool it was to have stairs and a whole other floor inside a bus... :)
Wednesday, 2 December 2009
Saturday, 31 October 2009
Multi-threading in Python and GIL
Ever wonder why the hell your multi-threaded Python program doesn't use all the cores of your CPU?
I just ran across this great presentation by David Beazley about the inner workings of the Python Global Interpreter Lock (GIL). I always knew the threading library in Python had problems, but...
it is worse than you could imagine... :(
SUMMARY
Python creates its threads as real operating system threads: PThreads (POSIX threads) on Linux and other Unix-like systems, native Windows threads on Windows. Python does not do the heavy lifting of threading itself, so it does not manage its own thread scheduling (round robin, priority-based scheduling, etc.); that is left entirely to the OS. Yes, Python does not have any thread scheduling! All it does is maintain some lightweight state info about each thread it creates, plus a pointer to the thread that is currently running...
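To see that Python threads really are OS threads, here is a minimal sketch. It assumes Python 3.8 or later, where `threading.get_native_id()` exists; the helper name `collect_native_ids` is made up for this illustration. Each worker asks for the thread ID that the kernel assigned to it:

```python
import threading

def collect_native_ids(n_threads=3):
    """Start n_threads threads and return the OS-assigned ID of each one.

    Hypothetical helper, written only for this illustration.
    """
    ids = []
    ids_lock = threading.Lock()
    # The barrier keeps every worker alive until all have reported,
    # so the OS cannot hand the same ID to two of them.
    barrier = threading.Barrier(n_threads)

    def worker():
        native_id = threading.get_native_id()  # ID assigned by the kernel
        with ids_lock:
            ids.append(native_id)
        barrier.wait()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ids

if __name__ == "__main__":
    print("main thread:", threading.get_native_id())
    print("workers:    ", collect_native_ids())
```

On Linux the printed numbers should match the thread IDs that tools like `top -H` show for the process.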
GIL:
The interpreter keeps a C variable called "_PyThreadState_Current", which points to the PyThreadState structure of the currently running thread. The GIL is basically just the lock that guards this variable and the rest of the interpreter state: every Python thread has to acquire it to run...
The lock itself is implemented (on Unix) with one of these:
- PThread condition variables (paired with a mutex)
- POSIX semaphores
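To get a feel for the condition-variable flavour, here is a toy lock written in Python rather than C. It is only a sketch of the general technique, not CPython's actual code; the names `CondLock` and `run_counter` are made up for this illustration.

```python
import threading

class CondLock:
    """Toy lock built from a mutex plus a condition variable."""

    def __init__(self):
        self._cond = threading.Condition(threading.Lock())
        self._locked = False

    def acquire(self):
        with self._cond:
            # Sleep until the current holder signals that the lock is free.
            while self._locked:
                self._cond.wait()
            self._locked = True

    def release(self):
        with self._cond:
            self._locked = False
            # Wake one waiter; the OS decides when it actually gets to run.
            self._cond.notify()

def run_counter(n_threads=4, n_increments=1000):
    """Increment a shared counter from several threads under CondLock."""
    lock = CondLock()
    total = [0]

    def worker():
        for _ in range(n_increments):
            lock.acquire()
            total[0] += 1
            lock.release()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]

if __name__ == "__main__":
    print(run_counter())
```

Note that `release()` only signals a waiter. Nothing stops the releasing thread from calling `acquire()` again before the waiter has woken up.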
Now, the Python interpreter performs a "check" every 100 "ticks": it stops the currently running thread and makes it release the GIL, but only for an instant. In the very next C statement (literally), the same thread tries to re-acquire it... This gives the OS just a tiny window to hand the lock to some other thread...
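The check can be modelled with a toy loop. This is a sketch under stated assumptions, not the real interpreter loop: `run_ticks` and `CHECK_INTERVAL` are hypothetical names, and an ordinary `threading.Lock` stands in for the GIL. It models the tick-based behaviour described in the talk; Python 3.2 and later use a time-based switch interval instead (see `sys.getswitchinterval()`).

```python
import threading

CHECK_INTERVAL = 100  # ticks between checks, the figure quoted above

def run_ticks(gil, n_ticks, check_interval=CHECK_INTERVAL):
    """Toy interpreter loop: return how many times the lock was handed back."""
    checks = 0
    gil.acquire()
    try:
        for tick in range(1, n_ticks + 1):
            # ... one bytecode instruction would execute here ...
            if tick % check_interval == 0:
                gil.release()  # tiny window for another thread
                gil.acquire()  # grabbed again on the very next statement
                checks += 1
    finally:
        gil.release()
    return checks

if __name__ == "__main__":
    print(run_ticks(threading.Lock(), 1000))  # prints 10
```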
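The "ticks" themselves are not time-based; they correspond roughly to Python byte-code instructions. A tick is not interruptible, so a single long-running instruction blocks every other thread until it finishes. The lock "release" and "acquire" are signalled through the OS threading primitives (the condition variable or semaphore above), not through Unix signals. So, for every check there is a signal.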
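To see what a byte-code instruction looks like, the standard `dis` module can list the instructions a function compiles to. This is a minimal sketch; `add_one` and `instruction_names` are made-up names, and the exact opcodes differ between Python versions, so no expected output is shown.

```python
import dis

def add_one(x):
    return x + 1

def instruction_names(func):
    """Return the opcode names the function compiles to, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]

if __name__ == "__main__":
    # Each name is one byte-code instruction, roughly one "tick".
    for name in instruction_names(add_one):
        print(name)
```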
MultiCore Scenario:
In the 2 threads/1 CPU case, this works fine: the OS passes the signal on to the other thread (after applying its scheduling algorithm), wakes it up and lets it acquire the GIL. This is how it was intended to work when the GIL was written into Python.
But in the 2 threads/2 CPUs case, even though the OS can run the second thread on the 2nd CPU, that thread is not able to acquire the lock: by the time it receives the signal and wakes up, the first thread on the 1st CPU has already re-acquired the lock... :( So, the 2nd thread just keeps on waiting (and waiting)...
In this case, the two threads are actually slugging it out for the GIL... And if one of them is I/O-bound, it usually loses (even though it should have a higher priority)...
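You can try this on your own machine with a small CPU-bound experiment: run the same countdown twice in a row, then in two threads at once, and compare the wall-clock times. This is a rough sketch; the function names are made up for this illustration, and the timings depend on the interpreter version and the machine, so no expected numbers are given.

```python
import threading
import time

def count_down(n):
    """Pure CPU-bound work: no I/O, nothing that gives up the GIL willingly."""
    while n > 0:
        n -= 1

def time_sequential(n):
    """Run the countdown twice, one after the other, in the main thread."""
    start = time.perf_counter()
    count_down(n)
    count_down(n)
    return time.perf_counter() - start

def time_threaded(n):
    """Run the same two countdowns in two threads at the same time."""
    t1 = threading.Thread(target=count_down, args=(n,))
    t2 = threading.Thread(target=count_down, args=(n,))
    start = time.perf_counter()
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    n = 5_000_000
    print("sequential: %.2f s" % time_sequential(n))
    print("threaded:   %.2f s" % time_threaded(n))
```

If the GIL behaves as described above, the threaded run should be no faster than the sequential one, even on a multi-core machine.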
Great explanation, watch the video...
Labels: gil, interpreter, multithreading, python, scheduling, threads