daemon 0.9.6, curses 0.9.7

Before you know it, it’s been almost a year since the last release and there are a handful of bugfixes that need to be pushed out.

daemon 0.9.6

  • Fixed hanging file descriptors eventually killing the daemon (which fixes canto-remote misbehaving sometimes among other things)
  • Fixed occasional hang from writing to closed sockets
  • Better sync-inoreader behavior: killing unusable “???” items (me) and pushing changes back to Inoreader (thanks Fraterius)
  • SIGINT and SIGTERM paths are now identical, where before SIGTERM attempted to exit ASAP without regard to running threads
  • Daemon now dumps feed data into a temporary file and then moves it to replace the original data, to help prevent corruption if the daemon is killed in the middle of writing to the disk.
  • Daemon now marches on if it can’t use the feed data because of corruption, instead of requiring the user to delete it.
  • Minor Python 3.2 compatibility change (thanks Sathors)
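
The write-to-a-temporary-file-then-move approach can be sketched roughly as follows. This is only an illustration of the general technique, not the daemon's actual code: the function name atomic_write_json and the plain JSON payload are assumptions made for the example.

```python
import json
import os
import tempfile


def atomic_write_json(path, data):
    """Replace the file at `path` with `data` without exposing a partial write.

    Hypothetical helper: the name and the JSON format are illustrative only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the final rename stays on
    # one filesystem, which is what makes the replacement atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        # Readers see either the old file or the new one, never a mix.
        os.replace(tmp_path, path)
    except BaseException:
        # Leave the original untouched and clean up the leftover temp file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the process is killed before os.replace runs, the original file is still intact; the worst case is a stray temporary file in the same directory.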

curses 0.9.7

  • Fix tab completing empty command lines (closes issue 37)

I’ll get the Debian repos updated soon.

Repos updated, new pub key

Administering packages for a distro you don’t use is a clusterfuck. Anyway, someone kindly pointed out that my Debian sid packages were still built for Python 3.4, and they’ve sometime since moved up to Python 3.5. Of course I’m oblivious to this on Arch.

That’s not the problem though. It seems like the Python package to .deb path is constantly having little tweaks made to it such that it actually takes effort to support new versions of a distro for no other reason than whatever tool you’ve decided to use has fallen out of favor. Ugh. Give me a PKGBUILD already so I can escape the million stupid Debian only binaries I need to put this shit together. Could this process be any more baroque? I literally have to use a special utility just to update the fucking changelog because its format is so locked down.

Then, of course, the version of GnuPG that Debian variants are currently using is different than the version Arch has so suddenly my actual ~/.gnupg breaks the entire signing process. Which is just as well because I lost the key I used to sign the old packages when my last laptop drive went south, but regardless now I have a separate directory of keys that are only for Debian that I have to keep track of.

Anyway, the repos have been updated. They now include the latest Ubuntu variants (wily and xenial), but they’ve obviously been only very lightly tested.

You will have to re-import the repo public key:

curl http://codezen.org/static/canto-pub.gpg | sudo apt-key add -

daemon 0.9.5, curses 0.9.6

Yet more maintenance type fixes.

daemon 0.9.5

  • Minor cleanup to excessive debug info and error paths in sync-inoreader
  • The ItemLimit transform is now included by default, so you can, for example do “:set tag.transform ItemLimit(10)” to only show the first 10 items of the selected feed.

curses 0.9.6

  • Fix some old usage of tag_updater that caused weirdness like syncs happening when they shouldn’t and making your cursor jump.
  • The above also fixed direct usage of :transform/:filter/:sort
  • Fix waiting on pending configs (again) – hooks need to be called after the changes have been made.

daemon 0.9.4, curses 0.9.5

More fixes.

daemon 0.9.4

  • More fixes for sync-inoreader’s bad behavior with missing items (if you see ??? items that won’t go away, this is the fix)
  • Socket changes to avoid deadlocks when both server and client sockets are full.
  • Quiet exceptions on disconnects. These were mostly harmless, but scary in the log.
  • Minor changes to lock returns

curses 0.9.5

  • Fix some breakage and deadlocking caused by forcing threads to wait for written config changes to be processed (e.g. tag config changes like collapsed were broken in 0.9.4). Ironically my test suite actually caught some of the more obvious breakage, if only I’d actually run the fucking thing.
  • h/left and l/right are now bound, by default, to setting items read and unread respectively.

Short and sweet.

Thanks to everyone that submitted bugs.

daemon 0.9.3

Quick bugfix release for the daemon sync plugins.

  • sync-inoreader: stop leaving dead items
  • sync-rsync: update to new database format

Debian repos will be updated shortly.

daemon 0.9.2, curses 0.9.4

… And then, everything got faster. Analysis to follow the changelog.

daemon 0.9.2

  • Inoreader Sync. The sync-inoreader.py plugin landed, allowing you to synchronize with Inoreader. It requires the python3-requests package to be installed (most distros have a package for this). It also requires a real Inoreader account (not an OAuth Google/Facebook login). The details are given at the top of the plugin file. I enumerated some of the trade offs in the last post, but in short the plugin tries to give you access to the most items, so when items show up only in Inoreader data or the data canto gathered but not both, they’ll be displayed even though they’re not synchronized. This also means that if you’re synchronizing multiple canto-daemons with Inoreader, some items won’t be synchronized. For “perfect” synchronization of multiple cantos, sync-rsync.py is the better option.
  • XDG support. The default location of canto files is now $XDG_CONFIG_HOME/canto/ (which is usually ~/.config/canto). This is only relevant for fresh installs; if ~/.canto-ng exists, it will continue to be used.
  • Feed file format. This has been converted to a gzipped JSON dump. The reasoning for this is twofold. First, the daemon uses basically no DB features, except the caching (which really amounts to the database code just holding everything in memory for our usecase) and yet suffered from having to manage database code (like requiring reorganize() on GDBM) and deal with incompatibilities between distros. Second, a gzipped JSON file not only takes far less disk space, but is also platform agnostic – versus the Python shelf that uses Python-only serialization techniques. Old feed files will be migrated on the first use of the new version.
  • Protocol changes. The protocol to communicate between daemon and client has changed. Instead of using fragments of data and searching for a message terminator, a leading 8-byte header has been added with a size in bytes. This was a relatively minor change, but it means that we don’t suffer from messages getting stuck in the buffer waiting on a read to time out and, as a bonus, don’t need to worry about fragmented messages. In addition, ITEMS responses are now always in single blocks rather than 100-item pages. ITEMSDONE is still sent, although obsolete.
  • Fetching is thread limited. By default, the daemon will only spawn one fetch thread per processor core. In the end, this turned out to be more of a memory issue (since the Python heap will expand and basically never contract) than anything, but obviously having a thousand threads waiting on two cores is a waste of time.
  • canto-remote status. This remote command can be used to query item counts, as you might use in a status bar. See canto-remote help status for more info.
  • filter_read is now default.
  • Performance. Various changes were made to increase performance. Chiefly, the feed index function was refactored such that global transforms are applied on tag changes instead of “on-the-fly” when building an ITEMS response.
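
The size-header framing described in the protocol item above might look something like this. It is a sketch of length-prefixed framing in general: the post only says the header is 8 bytes and holds a size in bytes, so the big-endian unsigned encoding and the helper names (pack_message, read_exact, read_message) are assumptions, not the real wire format.

```python
import io
import struct

# Assumption: the 8-byte header is a big-endian unsigned 64-bit size.
# The real protocol may encode the size differently.
HEADER = struct.Struct(">Q")


def pack_message(payload: bytes) -> bytes:
    """Prefix the payload with its length so the reader knows where it ends."""
    return HEADER.pack(len(payload)) + payload


def read_exact(stream, count):
    """Read exactly `count` bytes, looping over short reads."""
    chunks = []
    remaining = count
    while remaining:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream closed mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_message(stream):
    """Read one framed message: the header first, then exactly `size` bytes."""
    (size,) = HEADER.unpack(read_exact(stream, HEADER.size))
    return read_exact(stream, size)


# Two messages back to back come out as two distinct payloads.
stream = io.BytesIO(pack_message(b"ITEMS") + pack_message(b"ITEMSDONE"))
first = read_message(stream)
second = read_message(stream)
```

Because the reader always knows how many bytes to expect, it never has to scan for a terminator or wait for a read timeout to decide that a message is complete.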

curses 0.9.4

  • Theming. The appearance of canto-curses is the same, but it’s defined in Python directly now, instead of a thousand characters of ternaries and escapes. It’s much clearer now, and as such there is a theme-default.py that functions as a plugin and can be modified to your tastes.
  • Color system. The color config has been shaken up a bit. Now, all available colors (1-8 or 1-256 depending on your terminal) will be initialized to that color on a black background. You can still change color codes directly; however, :color can now be used to change specific colors by element name (e.g. :color unread green) instead of messing with gibberish color codes. See :help color for a list of elements you can change. On first run, canto-curses will attempt to migrate your old color scheme and it should work well for simple changes. If it butchers your colors, use :reset-config color to restore to default.
  • Style system. Similar to :color, :style has been implemented that allows you to change the curses style (bold, dim, reverse, standout, underline) of a specific element. Also note that how these styles appear is entirely up to your terminal, so results may vary between them. See :help style for details.
  • xdg-open is now the default browser.
  • cleantitle.py is a plugin that allows you to strip content out of story titles. From annoying newlines to HTML fragments, this can help clean up content from feeds that are poorly defined.
  • Tab completion tweaks. Works more like Bash now.
  • Update style: prepend. You can now use “:set update.style prepend” to get new items to be added to the top of the feed on update. The other options are “maintain” for sorting, and “append” for adding to the end.
  • :help set will now list various common options and their settings / uses. It’s entirely static (i.e. won’t show current values), but should help to familiarize the user with some of the lower level options.
  • Performance improvements. Various changes have been made to speed up many parts of the code. It still isn’t perfect, but a lot of larger operations are broken into smaller pieces, functions have been tweaked to run faster, and commands modified to be smarter. In my experience, after everything is loaded, canto runs very well, so I’ve put in a lot of effort to minimize the annoying pauses caused by loading feeds with massive amounts of post-filtered items. The initialization process has also been significantly reworked to make more sense.

All in all, a healthy changelog for a dot release. Subjectively, the performance has improved quite a lot, and the feature list isn’t too shabby either. I am a bit concerned with the invasiveness of some of the performance changes, but they’ve passed the (admittedly anemic) test suites and seem to be rock solid from my personal use. In other words, it’s time to push it out there and see how you guys break it.

I’m particularly surprised at how simple some of the performance changes were. A lot of the algorithms I used were naively implemented, particularly when trying to sort items into new, current, and old. That has to be done with linear complexity, because once you break into O(n^2) you’re fucked as soon as you get more than a handful of items. Other places it was the classic “cheap operations are expensive”; for example, I was trying to clean up the entire hook stack every time a hook callee was unregistered. This meant that a single story receiving die() and unregistering itself could cause a search of 1000s of possible hook callees. Yikes. Consider killing whole tags at once (say because of a config change filtering most of them) and you get into some serious computational load for what should be almost instantaneous (like it is now).
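
To illustrate the linear-versus-quadratic point, here is one way to split items into new, current, and old in a single pass each. It is a sketch under assumptions: the post doesn't define the three groups, so this takes “new” to mean fetched but not previously known, “current” to mean in both, and “old” to mean previously known but no longer fetched. The function name partition_items is made up for the example.

```python
def partition_items(previous_ids, fetched_ids):
    """Split item ids into (new, current, old) in roughly linear time.

    Assumed meanings (not spelled out in the post):
      new     - fetched now, not seen before
      current - fetched now and seen before
      old     - seen before, missing from this fetch
    """
    # Set membership is O(1) on average, so each comprehension below is one
    # linear pass. Testing `item in some_list` instead would make every
    # lookup O(n) and the whole thing O(n^2).
    previous = set(previous_ids)
    fetched = set(fetched_ids)

    new = [item for item in fetched_ids if item not in previous]
    current = [item for item in fetched_ids if item in previous]
    old = [item for item in previous_ids if item not in fetched]
    return new, current, old


new, current, old = partition_items(["a", "b", "c"], ["b", "c", "d"])
```

The same idea applies to the hook case: keeping callees in a structure keyed by the callee itself lets a single unregistration touch only its own entries instead of scanning every registered hook.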

On the other hand, I’m pleased that I was able to get a lot of the lead out of the system without causing any incompatibilities or other headaches.

Anyway, it’s not perfect, but there were too many changes pending release to delay any longer. If there are serious bugs that need to be ironed out, well, 0.9.5 and 0.9.3 aren’t exactly big deals.

Debian repos will be updated shortly.

Have fun!

Submit bugs!

In the Pipe 6/6

So I’ve been sprinting lately and have perhaps been taking too much of a breathless approach to committing. Anyway, here’s an overview of what is already in git and what I plan on doing before the next dot releases.

Inoreader

The big one is Inoreader support has landed in canto-next git. It’s gone through a couple of revisions, but it seems to be working pretty well.

The first revision used Inoreader as the source for all Canto feed data. That worked very well and was very tightly synchronized, but it had some downsides. In particular, Inoreader seems to have trouble fetching some feeds on a regular basis (personal Reddit feeds, for example), so it finds far fewer items than Canto does. When it does find items, Inoreader’s content has been filled with ads (obviously, with a free account) and otherwise sterilized, which means it’s missing interesting feed-specific information that I’m not particularly happy with losing.

The second revision uses Canto’s standard fetch, and then grabs Inoreader’s data and correlates them. This means we’re not serving Inoreader ads, we don’t break reader-extras by losing content, and we can fetch as fast as possible… but as I discovered after writing this revision, on some fast moving, problematic feeds (Reddit, again) we get a different set of items between Canto and Inoreader. The end result is that you could have no unread items in Canto or Inoreader, but then the other would still have a bunch of unread items simply because they’re getting two different versions of the same feed. Of course, the silver lining there is that the unread items in one, you haven’t actually seen in the other so in the end you’re getting a broader range of items put in front of you… but at the same time it doesn’t feel synchronized when you’ve marked everything read in one place and the other has 200 items waiting. In addition, if you’re using Inoreader as a service to synchronize multiple Cantos, it can only sync items it knows about so you’d still have to use sync-rsync to get perfect synchronization between the Cantos.

Right now, I’m planning a third revision which will compromise between the two approaches. Canto will still fetch the data, and the true data will still be the primary data source (so for most feeds we don’t receive ads and still have custom content), but any items that Inoreader knows about and Canto doesn’t will be integrated (ads and all). This will fix the problem of marking everything as read in Canto, then going to Inoreader to find hundreds of waiting items and, most importantly, puts the most content in front of the user. Unfortunately, I’m pretty sure the opposite problem (marking everything in Inoreader read, and going to Canto to find unread items) is impossible to fix (without discarding items, which would be stupid) since Canto can’t advertise items to Inoreader. As such we’ll have to live with that. The only remaining foible, multiple Cantos becoming desynchronized on the items Inoreader doesn’t know about, is also impossible to fix in this plugin, but is possible to work around with sync-rsync if you really must.

Curses Color Config

Another improvement currently in git is that the curses client has a much easier to use :color command, allowing you to do stuff like :color unread green instead of having to figure out what color pair is unread like before. This requires a configuration change, but current git should migrate most simple color changes on the first run of the new version. If it butchers your crazy color config, sorry, and please note the 0 at the front of the version number.

Finer Plugin Control

This keeps plugins not designed to work with a specific program from loading. Previously, every plugin would be loaded entirely even for uses of canto-remote. Because of the plugin architecture this wasn’t really a functional problem, but Python was generating some useless code. The end user will probably only notice that fewer plugins are listed in the logs (since incompatible plugins aren’t listed anymore) and maybe a small downtick in memory usage and startup time.

Non-debug CPU usage improved

A mass conversion to better use Python’s log.debug function to avoid doing the string formatting when the message won’t even be used. This causes a noticeable speedup when running the daemon/curses without the -v flag. In other news, I may be a log information addict.
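
For anyone curious what that kind of conversion looks like with the standard logging module, here is a small sketch. The logger name and the Expensive class are made up for the demonstration, and it assumes the change amounts to passing arguments to log.debug instead of pre-formatting the string.

```python
import logging

log = logging.getLogger("canto-sketch")  # hypothetical logger name
log.addHandler(logging.NullHandler())
log.setLevel(logging.INFO)  # debug output disabled, like running without -v


class Expensive:
    """Stand-in for an object that is costly to turn into a string."""

    calls = 0

    def __str__(self):
        Expensive.calls += 1
        return "big dump of feed data"


item = Expensive()

# Eager: the % operator builds the string before log.debug is even called,
# so the formatting cost is paid although the message is then thrown away.
log.debug("item: %s" % item)

# Lazy: the arguments are handed to the logger, which only formats them if
# the DEBUG level is enabled. Here it is not, so __str__ is never called.
log.debug("item: %s", item)

format_calls = Expensive.calls  # only the eager call formatted the item
```

The lazy form costs one level check per call when debugging is off, which adds up when debug statements sit in hot paths.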

Coming Soon

These are features that are either half-way implemented in git, or will be done before the next dot releases.

  • Easier theming. The convoluted, Bash style escape sequences and shit are already gone. Themes will be implemented like plugins, overriding a function internal to the Story/Tag/Reader objects, allowing their appearance to be defined in regular old Python instead of a thousand character long tangle of codes. Along with this change will come the ability to manipulate Story/Tag appearance to filter or add content, similar to the abilities that already exist for the Reader.
  • Better internal documentation of config options. :help set should list known variables that it makes sense to use with :set (i.e. rather than :bind or :color).

  • Some old canto stuff that shouldn’t have been lost. Particularly, canto-remote should provide a way to get item read/unread/tag information out on the command line for integration into status bars, etc. as well as the cycles (i.e. old [ and ] binds to switch filters, sorts or visible tags).
  • Further Inoreader tweaks as mentioned above.

On Sync

This sync thing is turning into a major headache.

The Trouble with Feedly

I’m thinking of abandoning Feedly, which is a shame since it seems to have the best mobile ecosystem, but it’s just not developer friendly on the command line. I went over this the last time sync came up, but there are too many hurdles.

  • OAuth. I like OAuth, but it’s really not a solution for authenticating from the command line. In particular, the Feedly authentication requires you to be running a webserver (doable since localhost is okay, but totally out of the way for a sync plugin) and interact with a browser. The interaction is the real deal breaker, as I want the daemon to be fully runnable headless, not spawning browsers to click around on. In this case, it’s not really Feedly’s fault because it’s pretty clear they’re targeting other web services and mobile devices where clicking on OAuth sites isn’t an issue. Really, it’s more that it just doesn’t fill Canto’s specific niche.
  • Developer Tokens. The workaround for OAuth is using their dev system which basically gives you an authentication token directly, which would be the end result of using OAuth. The unfortunate part there is that it expires without warning and in only 3 months. This is the path I’ve been pursuing albeit with trepidation because that expiration is really a pain in the ass.
  • Excessive Rate Limiting. The straw that broke the camel’s back was that the developer tokens only allow you to do 250 requests a day. That might seem like a lot, about 10 an hour, or one every 5-6 minutes, but if the sync plugin is going to immediately inform Feedly of read items that limit is absurdly small. This could be made to work, say by making synchronization explicit but I really don’t want to have to have the user ask to sync manually. And even then, that’s a limit that could be hit and it’s yet another case where the plugin fails out of the blue. Not to mention that developing with that rate limit is really annoying – I’ve hit the limit in under an hour each time I’ve been testing. Of course, that rate limit only applies to the developer token logins, and if I was doing a web service / mobile with the OAuth login then I could register my application and get a decent rate limit. Once again, the daemon gets shafted.
  • Documentation. The Feedly API documentation is very sparse and it doesn’t seem to really cover how to use it in a straightforward manner. For example, I had to look at some half-way implemented API bindings just to find that you get content of feeds from the “Stream” API, instead of the “Feed” API. The Stream API talks about streamIds, but never defines them or how to derive them from a feed (I guess it’s *implied* that you use the feedId?). It also has pagination built into it with a “continuation” header, but doesn’t define under what circumstances that’s present (e.g. you still get ‘continuation’ even if there are no more items). These aren’t really killer, but it does make figuring out how to use the API a chore.

Alternative: Inoreader

GitHub user romaintb mentioned an alternative service, Inoreader.

Inoreader hits the same notes that Feedly does in terms of platform support. All I really care about is having a free Android app and a web service and those both look good even if there isn’t the same client diversity Feedly offers (yet?).

More importantly, however, the API seems a bit more sane. I was able to register an application and then log in with a standard Inoreader account (not an OAuth account; I tried). There’s still a rate limit, but it’s 10,000 requests a day for one part of the API and 50,000 a day for another (so basically 240x the API calls, although it’s per application instead of per user). That’s still a limit, but it’s a lot more relaxed and I can request an increase if it becomes necessary (haven’t looked into this yet, but there’s a link for it).

The sane(r) API and the elimination of the developer token 3 month expiration is a pretty compelling reason to switch to researching Inoreader.

Alternative: The Old Reader

A commenter on this site put me on to The Old Reader too. I chose to pursue Inoreader based on the web interface, but The Old Reader might also be an option, although I haven’t really delved into the API. It seems like everyone is mimicking the old Google Reader API (as unofficial as it was) so it might be trivial to do both.

Let’s not get ahead of ourselves however.

curses 0.9.3

Another bugfix release.

  • Fix hang when scrolling between feeds when a daemon is misbehaving or slow
  • Fix handling of “invalid” multi-byte characters, which exhibited symptoms like breaking line wrapping and other graphical weirdness
  • Fix borders (:set taglist.border True) and their color
  • Fix :color command being hidden from help

A few minor fixes. The big one is obviously the hang, which is an odd one. Multiple NEWTAG responses were being generated and if that happened early in canto-curses startup it would cause tags to be duplicated, which confused the taglist logic. That shouldn’t happen, but this was apparently reproducible on some machines.

The “invalid” multi-byte characters have been an annoyance for a while too. If you’ve ever had certain items that you’ve scrolled past that seem to break the interface, then this is probably the cause.

I’ve also updated the apt repos already, and expanded the Ubuntu build to include the forthcoming Vivid release.