Thursday 17 December 2009

It's snowing in Edinburgh! Yay!!!

I think it's the first snow of the season... :D

Posted via email from arnav's posterous

Friday 4 December 2009

Wednesday 2 December 2009

Saturday 31 October 2009

Multi-threading in Python and GIL

Ever wonder why the hell your multi-threaded Python program doesn't use all the cores of your CPU?

I just ran across this great presentation by David Beazley about the inner workings of the Python Global Interpreter Lock (GIL). I always knew the threads library in Python had problems, but....

it is worse than you could imagine... :(

SUMMARY


Python creates its threads as real operating system threads: POSIX threads (pthreads) on Linux and other Unix-like systems, native threads on Windows. Python does not do the heavy lifting of the threads itself, so it does not manage its own thread scheduling (round robin, priority-based scheduling, etc.). Yes, Python does not have any thread scheduler! All it does is maintain some lightweight state info about each thread it has created, plus a pointer to the thread that is currently running...
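A quick way to see that Python threads are real OS threads. This is only a sketch, assuming Python 3.8 or newer (where threading.get_native_id() exists); the helper name native_thread_ids is made up for this example:

```python
import threading

def native_thread_ids(n=3):
    """Start n threads and return the OS-level thread id of each one."""
    ids = []
    ids_lock = threading.Lock()
    barrier = threading.Barrier(n)  # keeps all n threads alive at the same time

    def worker():
        with ids_lock:
            ids.append(threading.get_native_id())  # id assigned by the kernel
        barrier.wait()

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ids

if __name__ == "__main__":
    print("main thread:", threading.get_native_id())
    print("workers:    ", native_thread_ids())
```

Each worker reports a different kernel thread id, and none of them matches the main thread, so the OS really is the one scheduling them.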

GIL:
The GIL is basically just a lock guarding a C variable called "_PyThreadState_Current", which points to the ThreadState structure of the currently running thread. Every Python thread has to acquire this lock before it can run...

The lock itself is implemented as one of these, depending on the platform:
  • a pthreads mutex plus a condition variable
  • POSIX semaphores

Now, the Python interpreter performs a "check" every 100 "ticks": it pauses the currently running thread and releases the GIL very briefly. Then, in the very next C statement (literally), the same thread tries to re-acquire it... This gives the OS only a tiny window to hand the lock to someone else...
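The release-and-reacquire pattern can be modelled in a few lines. This is a toy model, not the interpreter's actual code: toy_gil is an ordinary threading.Lock standing in for the GIL, and run_ticks and CHECK_INTERVAL are names invented for this sketch.

```python
import threading

CHECK_INTERVAL = 100          # ticks between checks, as described above
toy_gil = threading.Lock()    # stand-in for the real GIL (an assumption)

def run_ticks(total_ticks, log):
    """Run total_ticks fake ticks, giving up the toy GIL at every check."""
    releases = 0
    toy_gil.acquire()
    try:
        for tick in range(1, total_ticks + 1):
            # ... one bytecode instruction would execute here ...
            if tick % CHECK_INTERVAL == 0:
                toy_gil.release()   # brief window for another thread
                toy_gil.acquire()   # immediately try to take it back
                releases += 1
    finally:
        toy_gil.release()
    log.append(releases)            # record how many checks this thread made

if __name__ == "__main__":
    log = []
    threads = [threading.Thread(target=run_ticks, args=(1000, log))
               for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(log)  # each thread performs 1000 / 100 = 10 checks
```

Note how small the gap between release() and acquire() is: nothing happens in between, which is exactly why another thread has a hard time sneaking in.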

The "ticks" themselves are not time-slice based, but correspond roughly to Python byte-code instructions. A tick is not interruptible. Releasing the lock signals (through the condition variable or semaphore) any thread waiting on it, so every check produces a signal that the OS has to deliver.
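To get a feel for what a byte-code instruction is, the standard dis module can list the ones the compiler produced for a function. A small sketch follows; countdown and count_instructions are just names picked for the example, and the exact instructions differ between Python versions:

```python
import dis

def countdown(n):
    while n > 0:
        n -= 1

def count_instructions(func):
    """Number of byte-code instructions the compiler produced for func."""
    return sum(1 for _ in dis.get_instructions(func))

if __name__ == "__main__":
    dis.dis(countdown)  # human-readable listing of the byte-code
    print(count_instructions(countdown), "instructions in the compiled function")
```

This is a static count. The loop body runs again on every iteration, so the number of ticks actually executed grows with n.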

MultiCore Scenario:

In the 2 threads/1 CPU case, this works fine, as the OS can pass the signal on to the other thread (after applying its scheduling algorithm), wake it up and let it acquire the GIL. This is how it was intended to work when the GIL was written into Python.

But in the 2 threads/2 CPUs case, even though the OS can try to run the second thread on the 2nd CPU, that thread is not able to acquire the lock, because by the time it receives the signal, the first thread on the 1st CPU has already re-acquired it... :( So, the 2nd thread just keeps on waiting (and waiting)...

In this case, the two threads are actually slugging it out to get hold of the GIL... And when one of them is I/O bound, it usually loses to the CPU-bound one (even though it should have the higher priority)...
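You can try this on your own machine with a simple CPU-bound countdown, run twice in a row and then in two threads. This is a rough sketch, not a proper benchmark; the function names and the value of N are arbitrary choices for the example, and the numbers depend heavily on the Python version and on how many cores you have:

```python
import threading
import time

def countdown(n):
    # Pure CPU work: no I/O, so the thread never gives up the GIL voluntarily
    while n > 0:
        n -= 1

def time_sequential(n):
    """Seconds taken to run countdown(n) twice, one after the other."""
    start = time.perf_counter()
    countdown(n)
    countdown(n)
    return time.perf_counter() - start

def time_threaded(n):
    """Seconds taken to run countdown(n) in two threads at once."""
    t1 = threading.Thread(target=countdown, args=(n,))
    t2 = threading.Thread(target=countdown, args=(n,))
    start = time.perf_counter()
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    N = 5_000_000
    print(f"sequential:  {time_sequential(N):.2f}s")
    print(f"two threads: {time_threaded(N):.2f}s")
```

If threads could really use both cores, the second number would be about half of the first. With the GIL in the way, don't expect that.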

Great explanation, watch the video...

Django!

I love Django! ;)