2009-04-15 12:09:00
You can grab the 0.6.9 release in tar or OSX portfile format from the download page.
- Fix setup.py generating null bytes in const.py
- Add 30 second timeout to canto-fetch
- Make selection data persist for hooks / filters
- Unset signals before exit (avoid shell garbage)
- Set User-Agent to Canto/x.y.z (fixes some 403'd feeds)
- Fix subtle corruption bug with multiple canto-fetch instances
- Sync docs now that the site runs out of git
I wasn't really planning on putting out another bugfix release, but so it goes. This release is a combination of fixes and minor improvements.
The 30 second timeout was implemented to keep canto-fetch from hanging on non-responsive feeds for a very long time. I think 30 seconds is long enough: we're usually talking about a handful of kB per feed, and even grabbing that over a (now practically non-existent) dial-up line or a torrent-saturated broadband line shouldn't take that long.
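As a rough illustration of the timeout idea (not Canto's actual code), one common way to cap network waits in Python is a process-wide socket default. This is a minimal sketch using Python 3's standard library; the name `FETCH_TIMEOUT` and the helper function are invented for the example.

```python
import socket

# Hypothetical constant for this sketch; the 30 seconds comes from the release notes.
FETCH_TIMEOUT = 30


def apply_fetch_timeout(seconds=FETCH_TIMEOUT):
    """Give every socket created from now on a default timeout, in seconds."""
    socket.setdefaulttimeout(seconds)
    # Return the value now in effect so callers can confirm it took hold.
    return socket.getdefaulttimeout()


if __name__ == "__main__":
    print(apply_fetch_timeout())
```

Note that a socket timeout like this bounds each blocking socket operation (connect, read), not the total time spent on a feed, so a server that trickles data could still take longer than 30 seconds overall.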
Unsetting the signals keeps the shell from sometimes printing "zsh: alarm canto" after you exit.
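A minimal sketch of that kind of cleanup, assuming the signal involved is SIGALRM (the "alarm" in the shell message suggests so, but this is a guess about the mechanism, not Canto's code). The function name is hypothetical, and the example is Unix-only.

```python
import signal


def quiet_alarm_before_exit():
    """Cancel any pending alarm and ignore SIGALRM from here on.

    Returns the number of seconds that were left on the cancelled alarm
    (0 if none was scheduled).
    """
    remaining = signal.alarm(0)  # alarm(0) cancels a scheduled SIGALRM
    # Ignore stray alarms so the process is not killed by one during
    # shutdown, which is what makes the shell report "alarm".
    signal.signal(signal.SIGALRM, signal.SIG_IGN)
    return remaining


if __name__ == "__main__":
    signal.alarm(60)
    print(quiet_alarm_before_exit())
```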
The User-Agent addition has been a long time coming, especially since it's essentially a one-liner to add. I just never had any reason to mess with it since converting canto to feedparser. Then I discovered that a number of sites, particularly Wikipedia, reject requests carrying urllib2's default User-Agent, so I changed it to "Canto/x.y.z".
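To show what setting a custom User-Agent looks like, here is a small sketch using `urllib.request`, the Python 3 successor to urllib2. It is not how Canto itself does it (Canto goes through feedparser); the function name and the example URL are made up for illustration.

```python
import urllib.request


def build_feed_request(url, version="0.6.9"):
    """Build a request that identifies itself as Canto/<version>."""
    agent = "Canto/" + version
    # The headers dict replaces the library's default Python-urllib agent string.
    return urllib.request.Request(url, headers={"User-Agent": agent})


if __name__ == "__main__":
    req = build_feed_request("http://example.com/feed.xml")
    # Request stores header names capitalized as "User-agent".
    print(req.get_header("User-agent"))
```

Nothing is sent over the network here; the request object would still have to be passed to `urllib.request.urlopen` to actually fetch anything.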
The last thing I'll mention is the subtle corruption bug. 99% of you would probably never run into it. It took me running 12 simultaneous, synchronized canto-fetch daemons trying to update every one of my feeds every minute to get the bug to reproduce regularly, but if you've ever had Canto magically forget an entire feed's worth of item state, this was probably the culprit. However, I will say that if you're running multiple canto-fetch daemons at a time over the same feed directory, you might want to stop that. Not because it won't work (because it should and does) but just because it's a waste. This is why I initially designed canto-fetch to be run as a cron job and not a daemon.
Now, before you ask why oh why I'm doing homebrew Python disk storage when I could be using sqlite or something, let me explain. The first thing is that using cPickle instead of a real database gives me Python objects right off the bat: no populating, no encoding changes. I can literally read or write the feed state on disk with a single call to the cPickle module after I get a lock on the file. There is no restructuring of the data each time. I imagine this could be achieved by storing the pickled data in a sqlite database, but at that point you're just using sqlite to handle locking for you. The second thing is that there is no advanced querying going on. Every operation would get every item from a feed (table). If I wanted to use a SQL-like query language for filters, etc., this would be a different story. Lastly, I just don't fucking like SQL.
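The lock-then-pickle pattern described above can be sketched roughly like this. It is an illustration only, not Canto's storage code: it uses Python 3's `pickle` (cPickle's successor) and assumes `fcntl.flock` advisory locks on a Unix system. The function names and the shape of the state are hypothetical.

```python
import fcntl
import os
import pickle


def save_state(path, state):
    """Write the whole feed state in one pickle call, under an exclusive lock."""
    # Open without truncating, so the old data stays intact until we hold the lock.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+b") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until no one else holds the lock
        f.seek(0)
        f.truncate()
        pickle.dump(state, f, pickle.HIGHEST_PROTOCOL)
        f.flush()
        # Closing the file at the end of the with block releases the lock.


def load_state(path):
    """Read the whole feed state in one pickle call, under a shared lock."""
    with open(path, "rb") as f:
        fcntl.flock(f, fcntl.LOCK_SH)  # readers can share; writers are kept out
        return pickle.load(f)
```

Because these locks are advisory, the pattern only protects the file if every process touching it takes the lock first, which is exactly the sort of thing that bites when several fetchers run over the same feed directory.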
Anyway, now that that's straightened out, have fun! I'm back to working on new stuff, rather than fixing old stuff.