2009-06-16 15:34:47
First, the bad news. I didn't make as much progress over the weekend as I'd hoped. In fact, I really achieved only one thing from that whole post and it was a side-effect of some other developments.
I threaded the Canto interface not because I wanted to take advantage of multiple cores or get a speedup in the actual task of filtering and sorting items. I threaded it to make the interface more responsive. To an extent, it worked. The interface became more responsive, but only because implementing the thread model forced me to implement partial updates. That effectively removed the problem with long lists by letting you use the interface when only part of the total work was done.
I never really stopped to ponder why overall performance didn't improve until I read David Beazley's slides on the GIL (PDF). It really clicked with me why I was still running into trouble, even with the threads.
Now, the good news. I converted the entirety of the threading model to use processes, which avoids GIL contention altogether: each process runs its own interpreter, and they communicate through traditional pipes. No GIL locking required. Worst case scenario is a moderate speedup (because the locking overhead is completely gone). Best case scenario is a hefty speedup brought on by today's multi-core processors. I've noticed a big difference on my Core2 even when one core is being dominated by another process.
There are still some issues in the implementation (interested parties can glance at the git log). But as a whole, the process-based code is not that much different from the thread-based code. Everything is still protocolized, but the protocol has to be composed entirely of pickle-able objects.
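To give a feel for what "protocolized" means here, this is a minimal sketch of a worker process that talks over a pipe using nothing but plain tuples, lists and dicts, all of which pickle cleanly. It is not Canto's actual code: the command names ("filter_sort", "partial", "done", "quit"), the item layout and CHUNK_SIZE are all made up for illustration, and it assumes the 2.6+ "multiprocessing" module.

```python
from multiprocessing import Pipe, Process

CHUNK_SIZE = 2  # hypothetical: how many items go into each partial update


def worker(conn):
    """Dumb worker: read (command, payload) tuples and reply in kind."""
    while True:
        command, payload = conn.recv()
        if command == "quit":
            break
        if command == "filter_sort":
            # Keep unread items and sort them by title.
            kept = sorted(
                (item for item in payload if not item["read"]),
                key=lambda item: item["title"],
            )
            # Send the result back in chunks so the interface can draw early.
            for start in range(0, len(kept), CHUNK_SIZE):
                conn.send(("partial", kept[start:start + CHUNK_SIZE]))
            conn.send(("done", len(kept)))


def collect(conn):
    """Interface side: gather partial updates until the worker says done."""
    items = []
    while True:
        kind, payload = conn.recv()
        if kind == "done":
            return items
        items.extend(payload)


def run_demo(items):
    """Start the worker in its own process and run one request through it."""
    parent_conn, child_conn = Pipe()
    proc = Process(target=worker, args=(child_conn,))
    proc.start()
    parent_conn.send(("filter_sort", items))
    result = collect(parent_conn)
    parent_conn.send(("quit", None))
    proc.join()
    return result


if __name__ == "__main__":
    stories = [
        {"title": "beta", "read": False},
        {"title": "alpha", "read": False},
        {"title": "gamma", "read": True},
    ]
    print([story["title"] for story in run_demo(stories)])
```

Everything crossing the pipe is pickled on one side and unpickled on the other, which is why the messages stay as simple built-in types. A real interface would redraw after each "partial" message instead of waiting for "done".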
I don't anticipate too many problems directly related to the new process-based code, but it's possible. Also, you're going to take a hit in memory usage because the same set of objects has to be synced between the two processes, but the dumb process doesn't keep too much state by default, so the hit has been minimized. I also don't think it's enough to outweigh the memory gains made by keeping more content on disk, so overall memory performance is worse than with threading but should still be better than 0.6.x.
New dependency for 2.5
To ease the use of processes, Python 2.5 users will have to install python-processing. Fortunately, this module has been integrated into the Python standard library in 2.6+ as the "multiprocessing" module.
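For anyone wondering how one codebase can cope with both versions, a common approach is a guarded import like the sketch below. This is an illustration rather than a copy of what Canto does: the alias "mp" and the helper make_pipe() are made up, and it assumes the python-processing package installs itself under the module name "processing". The two modules are close relatives, but I wouldn't assume every name matches, so test on 2.5 as well.

```python
try:
    import multiprocessing as mp  # in the standard library from Python 2.6 on
except ImportError:
    import processing as mp  # assumed module name of python-processing on 2.5


def make_pipe():
    """Return the two connected ends of a pipe from whichever module loaded."""
    return mp.Pipe()
```

The rest of the code then only refers to "mp" and never needs to know which of the two modules is underneath.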
Testing
As always, the code is in the experimental branch. If you 'git' it, feel free to report on the mailing lists or #canto on irc.freenode.net.
Have fun, submit bugs!