daemon 0.9.4, curses 0.9.5

More fixes.

daemon 0.9.4

  • More fixes for sync-inoreader’s bad behavior with missing items (if you see ??? items that won’t go away, this is the fix)
  • Socket changes to avoid deadlocks when both server and client sockets are full.
  • Quiet exceptions on disconnects. These were mostly harmless, but scary in the log.
  • Minor changes to lock returns

curses 0.9.5

  • Fix some breakage and deadlocking caused by forcing threads to wait for written config changes to be processed (e.g. tag config changes like collapsed were broken in 0.9.4). Ironically my test suite actually caught some of the more obvious breakage, if only I’d actually run the fucking thing.
  • h/left and l/right are now bound, by default, to setting items read and unread respectively.

Short and sweet.

Thanks to everyone that submitted bugs.

daemon 0.9.2, curses 0.9.4

… And then, everything got faster. Analysis to follow the changelog.

daemon 0.9.2

  • Inoreader Sync. The sync-inoreader.py plugin landed, allowing you to synchronize with Inoreader. It requires the python3-requests package to be installed (most distros have a package for this). It also requires a real Inoreader account (not an OAuth Google/Facebook login). The details are given at the top of the plugin file. I enumerated some of the trade offs in the last post, but in short the plugin tries to give you access to the most items, so when items show up only in Inoreader data or the data canto gathered but not both, they’ll be displayed even though they’re not synchronized. This also means that if you’re synchronizing multiple canto-daemons with Inoreader, some items won’t be synchronized. For “perfect” synchronization of multiple cantos, sync-rsync.py is the better option.
  • XDG support. The default location of canto files is now $XDG_CONFIG_HOME/canto/ (which is usually ~/.config/canto). This only matters for fresh installs; if ~/.canto-ng exists, it will continue to be used.
  • Feed file format. This has been converted to a gzipped JSON dump. The reasoning for this is twofold. First, the daemon uses basically no DB features except caching (which, for our use case, really amounts to the database code holding everything in memory), and yet it suffered from having to manage database code (like requiring reorganize() on GDBM) and deal with incompatibilities between distros. Second, a gzipped JSON file not only takes far less disk space, but is also platform agnostic, unlike the Python shelf, which uses Python-only serialization techniques. Old feed files will be migrated on the first use of the new version.
  • Protocol changes. The protocol used between daemon and client has changed. Instead of sending fragments of data and searching for a message terminator, each message now carries a leading 8-byte header containing its size in bytes. This was a relatively minor change, but it means that messages no longer get stuck in the buffer waiting on a read to time out and, as a bonus, we don’t need to worry about fragmented messages. In addition, ITEMS responses are now always sent as single blocks rather than in 100-item pages. ITEMSDONE is still sent, although obsolete.
  • Fetching is thread limited. By default, the daemon will only spawn a fetch thread per processor core. In the end, this turned out to be more of a memory issue (since the Python heap will expand and basically never contract) than anything, but obviously having a thousand threads waiting on two cores is a waste of time.
  • canto-remote status. This remote command can be used to query item counts, as you might use in a status bar. See canto-remote help status for more info.
  • filter_read is now default.
  • Performance. Various changes were made to increase performance. Chiefly, the feed index function was refactored such that global transforms are applied on tag changes instead of “on-the-fly” when responding to an ITEMS request.
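
The size-header framing described under “Protocol changes” can be sketched roughly as below. This is only an illustration, not canto’s actual wire format: the changelog says the header is 8 bytes and holds a size in bytes, but the big-endian unsigned encoding and the names frame(), read_exact() and read_message() are assumptions made for the example.

```python
import io
import struct

# Assumption: the 8-byte header is a big-endian unsigned integer.
HEADER = struct.Struct(">Q")


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its size in bytes."""
    return HEADER.pack(len(payload)) + payload


def read_exact(stream, n: int) -> bytes:
    """Read exactly n bytes, looping over short reads."""
    chunks = []
    while n > 0:
        chunk = stream.read(n)
        if not chunk:
            raise EOFError("stream closed mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def read_message(stream) -> bytes:
    """Read one complete message: header first, then exactly `size` bytes."""
    (size,) = HEADER.unpack(read_exact(stream, HEADER.size))
    return read_exact(stream, size)


if __name__ == "__main__":
    # Two messages back to back; no terminator search is needed.
    wire = io.BytesIO(frame(b"first") + frame(b"second"))
    print(read_message(wire))
    print(read_message(wire))
```

Because the reader always knows how many bytes are still owed, it never has to wait on a timeout to decide that a message is complete.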

curses 0.9.4

  • Theming. The appearance of canto-curses is the same, but it’s defined in Python directly now, instead of a thousand characters of ternaries and escapes. It’s much clearer now, and as such there is a theme-default.py that functions as a plugin and can be modified to your tastes.
  • Color system. The color config has been shaken up a bit. Now, all available colors (1-8 or 1-256 depending on your terminal) will be initialized to that color on a black background. You can still change color codes directly; however, :color can now be used to change specific colors by element name (e.g. :color unread green) instead of messing with gibberish color codes. See :help color for a list of elements you can change. On first run, canto-curses will attempt to migrate your old color scheme, and it should work well for simple changes. If it butchers your colors, use :reset-config color to restore the defaults.
  • Style system. Similar to :color, :style has been implemented that allows you to change the curses style (bold, dim, reverse, standout, underline) of a specific element. Also note that how these styles appear is entirely up to your terminal, so results may vary between them. See :help style for details.
  • xdg-open is now the default browser.
  • cleantitle.py is a plugin that allows you to strip content out of story titles. From annoying newlines to HTML fragments, this can help clean up content from feeds that are poorly defined.
  • Tab completion tweaks. Works more like Bash now.
  • Update style: prepend. You can now use “:set update.style prepend” to get new items to be added to the top of the feed on update. The other options are “maintain” for sorting, and “append” for adding to the end.
  • :help set will now list various common options and their settings / uses. It’s entirely static (i.e. won’t show current values), but should help to familiarize the user with some of the lower level options.
  • Performance improvements. Various changes have been made to speed up many parts of the code. It still isn’t perfect, but a lot of larger operations are broken into smaller pieces, functions have been tweaked to run faster, and commands modified to be smarter. In my experience, after everything is loaded, canto runs very well, so I’ve put in a lot of effort to minimize the annoying pauses caused by loading feeds with massive amounts of post-filtered items. The initialization process has also been significantly reworked to make more sense.

All in all, a healthy changelog for a dot release. Subjectively, the performance has improved quite a lot, and the feature list isn’t too shabby either. I am a bit concerned with the invasiveness of some of the performance changes, but they’ve passed the (admittedly anemic) test suites and seem to be rock solid from my personal use. In other words, it’s time to push it out there and see how you guys break it.

I’m particularly surprised at how simple some of the performance changes were. A lot of the algorithms I used were naively implemented, particularly when trying to sort items into new, current, and old. That has to be done with linear complexity, because once you break into O(n^2) you’re fucked as soon as you get more than a handful of items. In other places it was the classic “cheap operations are expensive”. For example, I was trying to clean up the entire hook stack every time a hook callee was unregistered. This meant that a single story receiving die() and unregistering itself could cause a search of 1000s of possible hook callees. Yikes. Consider killing whole tags at once (say because of a config change filtering most of them) and you get into some serious computational load for what should be almost instantaneous (like it is now).
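
To illustrate the linear-versus-quadratic point, here is a minimal sketch of sorting item ids into new, current, and old using set lookups. It isn’t the daemon’s actual code: the function name classify() and the exact meaning given to the three buckets (new = only in the latest fetch, current = in both, old = only in the stored data) are assumptions for the example.

```python
def classify(previous_ids, fetched_ids):
    """Split item ids into (new, current, old) in linear time.

    Set membership is O(1) on average, so each pass below is O(n).
    Writing `i in previous_ids` against a plain list inside the loop
    would make the whole thing O(n^2).
    """
    previous = set(previous_ids)
    fetched = set(fetched_ids)

    new = [i for i in fetched_ids if i not in previous]
    current = [i for i in fetched_ids if i in previous]
    old = [i for i in previous_ids if i not in fetched]
    return new, current, old


if __name__ == "__main__":
    print(classify(["a", "b", "c"], ["b", "c", "d"]))
```

The same idea applies to the hook cleanup: keeping callees in a structure keyed by the callee means unregistering one of them doesn’t need a scan over all the others.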

On the other hand, I’m pleased that I was able to get a lot of the lead out of the system without causing any incompatibilities or other headaches.

Anyway, it’s not perfect, but there were too many changes pending release to delay any longer. If there are serious bugs that need to be ironed out, well, 0.9.5 and 0.9.3 aren’t exactly big deals.

Debian repos will be updated shortly.

Have fun!

Submit bugs!

In the Pipe 6/6

So I’ve been sprinting lately and have perhaps been taking too much of a breathless approach to committing. Anyway, here’s an overview of what is already in git and what I plan on doing before the next dot releases.

Inoreader

The big one is Inoreader support has landed in canto-next git. It’s gone through a couple of revisions, but it seems to be working pretty well.

The first revision used Inoreader as the source for all Canto feed data. That worked very well and was very tightly synchronized, but it had some downsides. In particular, Inoreader seems to have trouble fetching some feeds on a regular basis (personal Reddit feeds, for example), so it finds far fewer items than Canto does. When it does find items, Inoreader’s content has been filled with ads (obviously, with a free account) and otherwise sterilized, which means it’s missing interesting feed-specific information that I’m not particularly happy to lose.

The second revision uses Canto’s standard fetch, and then grabs Inoreader’s data and correlates them. This means we’re not serving Inoreader ads, we don’t break reader-extras by losing content, and we can fetch as fast as possible… but as I discovered after writing this revision, on some fast moving, problematic feeds (Reddit, again) we get a different set of items between Canto and Inoreader. The end result is that you could have no unread items in Canto or Inoreader, but then the other would still have a bunch of unread items simply because they’re getting two different versions of the same feed. Of course, the silver lining there is that the unread items in one, you haven’t actually seen in the other so in the end you’re getting a broader range of items put in front of you… but at the same time it doesn’t feel synchronized when you’ve marked everything read in one place and the other has 200 items waiting. In addition, if you’re using Inoreader as a service to synchronize multiple Cantos, it can only sync items it knows about so you’d still have to use sync-rsync to get perfect synchronization between the Cantos.

Right now, I’m planning a third revision which will compromise between the two approaches. Canto will still fetch the data, and the true data will still be the primary data source (so for most feeds we don’t receive ads and still have custom content), but any items that Inoreader knows about and Canto doesn’t will be integrated (ads and all). This will fix the problem of marking everything as read in Canto, then going to Inoreader to find hundreds of waiting items and, most importantly, it puts the most content in front of the user. Unfortunately, I’m pretty sure the opposite problem (marking everything in Inoreader read, and going to Canto to find unread items) is impossible to fix (without discarding items, which would be stupid) since Canto can’t advertise items to Inoreader. As such, we’ll have to live with that. The only remaining foible, multiple Cantos becoming desynchronized on the items Inoreader doesn’t know about, is also impossible to fix in this plugin, but it is possible to work around with sync-rsync if you really must.
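
The planned merge can be pictured with a small sketch. This is a guess at the shape of the logic, not the plugin’s code: merge_items() is a hypothetical name, and it assumes both sides can be expressed as dictionaries keyed by some shared item id.

```python
def merge_items(canto_items, inoreader_items):
    """Merge two {item_id: data} mappings, preferring Canto's copy.

    Returns (merged, unsynced_ids). `unsynced_ids` lists the items only
    Canto knows about, which Inoreader therefore cannot synchronize.
    """
    merged = dict(canto_items)
    for item_id, data in inoreader_items.items():
        # Only fill in items Canto didn't fetch itself (ads and all).
        merged.setdefault(item_id, data)

    unsynced_ids = [i for i in canto_items if i not in inoreader_items]
    return merged, unsynced_ids


if __name__ == "__main__":
    canto = {"a": "canto-a", "b": "canto-b"}
    inoreader = {"b": "ino-b", "c": "ino-c"}
    print(merge_items(canto, inoreader))
```

The unsynced list is exactly the set of items that would drift apart between multiple Cantos, which is why sync-rsync remains the workaround for that case.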

Curses Color Config

Another improvement currently in git is that the curses client has a much easier to use :color command, allowing you to do stuff like :color unread green instead of having to figure out which color pair is unread like before. This requires a configuration change, but current git should migrate most simple color changes on the first run of the new version. If it butchers your crazy color config, sorry, and please note the 0 at the front of the version number.

Finer Plugin Control

This keeps plugins not designed to work with a specific program from loading. Previously, every plugin would be loaded entirely even for uses of canto-remote. Because of the plugin architecture this wasn’t really a functional problem, but Python was generating some useless code. The end user will probably only notice that fewer plugins are listed in the logs (since incompatible plugins aren’t listed anymore) and maybe a small downtick in memory usage and startup time.

Non-debug CPU usage improved

A mass conversion to make better use of Python’s log.debug function, so that the string formatting is skipped when the message won’t even be used. This causes a noticeable speedup when running the daemon/curses without the -v flag. In other news, I may be a log information addict.
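
As an illustration of the difference, here is a small sketch using the standard logging module. The logger name and the Expensive class are made up for the example; the point is only that passing arguments to debug() defers formatting until the record is actually going to be emitted.

```python
import logging

log = logging.getLogger("canto-example")  # hypothetical logger name


class Expensive:
    """Counts how many times it gets turned into a string."""

    def __init__(self):
        self.formatted = 0

    def __str__(self):
        self.formatted += 1
        return "expensive repr"


def eager(obj):
    # The % operator runs before debug() is called, so the string is
    # always built, even when debug output is disabled.
    log.debug("item: %s" % obj)


def lazy(obj):
    # logging only formats the message if DEBUG is enabled.
    log.debug("item: %s", obj)


if __name__ == "__main__":
    log.setLevel(logging.INFO)  # debug messages are dropped
    a, b = Expensive(), Expensive()
    eager(a)
    lazy(b)
    print(a.formatted, b.formatted)
```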

Coming Soon

These are features that are either half-way implemented in git, or will be done before the next dot releases.

  • Easier theming. The convoluted, Bash style escape sequences and shit are already gone. Themes will be implemented like plugins, overriding a function internal to the Story/Tag/Reader objects, allowing their appearance to be defined in regular old Python instead of a thousand character long tangle of codes. Along with this change will come the ability to manipulate Story/Tag appearance to filter or add content, similar to the abilities that already exist for the Reader.
  • Better internal documentation of config options. :help set should list known variables that it makes sense to use with :set (i.e. rather than :bind or :color).

  • Some old canto stuff that shouldn’t have been lost. Particularly, canto-remote should provide a way to get item read/unread/tag information out on the command line for integration into status bars, etc. as well as the cycles (i.e. old [ and ] binds to switch filters, sorts or visible tags).
  • Further Inoreader tweaks as mentioned above.

On Sync

This sync thing is turning into a major headache.

The Trouble with Feedly

I’m thinking of abandoning Feedly, which is a shame since it seems to have the best mobile ecosystem, but it’s just not developer friendly on the command line. I went over this the last time sync came up, but there are too many hurdles.

  • OAuth. I like OAuth, but it’s really not a solution for authenticating from the command line. In particular, the Feedly authentication requires you to be running a webserver (doable since localhost is okay, but totally out of the way for a sync plugin) and interact with a browser. The interaction is the real deal breaker, as I want the daemon to be fully runnable headless, not spawning browsers to click around on. In this case, it’s not really Feedly’s fault because it’s pretty clear they’re targeting other web services and mobile devices where clicking on OAuth sites isn’t an issue. Really, it’s more that it just doesn’t fill Canto’s specific niche.
  • Developer Tokens. The workaround for OAuth is using their dev system which basically gives you an authentication token directly, which would be the end result of using OAuth. The unfortunate part there is that it expires without warning and in only 3 months. This is the path I’ve been pursuing albeit with trepidation because that expiration is really a pain in the ass.
  • Excessive Rate Limiting. The straw that broke the camel’s back was that the developer tokens only allow you to do 250 requests a day. That might seem like a lot, about 10 an hour, or one every 5-6 minutes, but if the sync plugin is going to immediately inform Feedly of read items that limit is absurdly small. This could be made to work, say by making synchronization explicit but I really don’t want to have to have the user ask to sync manually. And even then, that’s a limit that could be hit and it’s yet another case where the plugin fails out of the blue. Not to mention that developing with that rate limit is really annoying – I’ve hit the limit in under an hour each time I’ve been testing. Of course, that rate limit only applies to the developer token logins, and if I was doing a web service / mobile with the OAuth login then I could register my application and get a decent rate limit. Once again, the daemon gets shafted.
  • Documentation. The Feedly API documentation is very sparse and it doesn’t seem to really cover how to use it in a straightforward manner. For example, I had to look at some half-way implemented API bindings just to find that you get content of feeds from the “Stream” API, instead of the “Feed” API. The Stream API talks about streamIds, but never defines them or how to derive them from a feed (I guess it’s *implied* that you use the feedId?). It also has pagination built into it with a “continuation” header, but doesn’t define what circumstances that’s present (i.e. you still get ‘continuation’ even if there are no more items). These aren’t really killer, but it does make figuring out how to use the API a chore.

Alternative: Inoreader

GitHub user romaintb mentioned an alternative service, Inoreader.

Inoreader hits the same notes that Feedly does in terms of platform support. All I really care about is having a free Android app and a web service, and those both look good even if there isn’t the same client diversity Feedly offers (yet?).

More importantly, however, the API seems a bit more sane. I was able to register an application and then log in with a standard Inoreader account (not an OAuth account; I tried). There’s still a rate limit, but it’s 10,000 requests a day for one part of the API and 50,000 a day for another (so basically 240x the API calls, although it’s per application instead of per user). That’s still a limit, but it’s a lot more relaxed and I can request an increase if it becomes necessary (haven’t looked into this yet, but there’s a link for it).
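
For the curious, the back-of-the-envelope numbers quoted for the two services work out as follows. The constants come straight from the limits mentioned in these posts; the helper names are just for the example.

```python
MINUTES_PER_DAY = 24 * 60

FEEDLY_DEV_LIMIT = 250              # requests per day, per developer token
INOREADER_LIMITS = (10_000, 50_000)  # requests per day, two parts of the API


def per_hour(requests_per_day):
    """Average number of requests available per hour."""
    return requests_per_day / 24


def minutes_between(requests_per_day):
    """Average spacing between requests, in minutes."""
    return MINUTES_PER_DAY / requests_per_day


if __name__ == "__main__":
    print(round(per_hour(FEEDLY_DEV_LIMIT), 1))         # about 10 an hour
    print(round(minutes_between(FEEDLY_DEV_LIMIT), 2))  # one every 5-6 minutes
    print(sum(INOREADER_LIMITS) / FEEDLY_DEV_LIMIT)     # the "240x" figure
```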

The sane(r) API and the elimination of the three-month developer token expiration are pretty compelling reasons to switch to researching Inoreader.

Alternative: The Old Reader

A commenter on this site put me on to The Old Reader too. I chose to pursue Inoreader based on the web interface, but The Old Reader might also be an option; I haven’t really delved into its API yet. It seems like everyone is mimicking the old Google Reader API (as unofficial as it was), so it might be trivial to do both.

Let’s not get ahead of ourselves however.

curses 0.9.3

Another bugfix release.

  • Fix hang when scrolling between feeds when a daemon is misbehaving or slow
  • Fix handling of “invalid” multi-byte characters, which exhibited symptoms like breaking line wrapping and other graphical weirdness
  • Fix borders (:set taglist.border True) and their color
  • Fix :color command being hidden from help

A few minor fixes. The big one is obviously the hang, which is an odd one. Multiple NEWTAG responses were being generated and if that happened early in canto-curses startup it would cause tags to be duplicated, which confused the taglist logic. That shouldn’t happen, but this was apparently reproducible on some machines.

The “invalid” multi-byte characters have been an annoyance for a while too. If you’ve ever had certain items that you’ve scrolled past that seem to break the interface, then this is probably the cause.

I’ve also updated the apt repos already, and expanded the Ubuntu build to include the forthcoming Vivid release.

0.9.1

I’ve just bumped the versions of canto-curses and canto-daemon to 0.9.1. The apt repos will be updated shortly.

Daemon Changes

  • A number of tests for various important parts of the daemon. Particularly the index() function that converts the on-disk database into functional data structures.
  • Fixed improper tagging of old items
  • Don’t forget items that should be kept when we expect a feed to be empty (i.e. on startup or sync).
  • Workaround feedparser bug #238, fixing basic authentication for feeds.

Not a very long list, but still some bad behavior that’s been corrected. All of these are accompanied with specific tests as well.

Curses Changes

  • A new indicator has been added to the tag header, in gray, noting the number of pending items that will be displayed when you :refresh (Ctrl-r) or :update (F5 or \)
  • Commands that take single items will now use the first item automatically, if no selection has been made.
  • Relaxed some locking, improving performance and removing at least one hardlock-on-resize case.
  • Color escapes (%1 -> %8) now match directly with their curses counterparts. Color 0 is the default color and can be used by unsetting other colors with %0 for each enabled color. This won’t affect you unless you’ve set your own format string, but if you have you just subtract one from each color code higher than 1 and balance your colors.
  • Deferred all graphical logging until the GUI thread is released (caused a lot of non-fatal errors to lock)
  • Enabled completion for categories and user tags
  • A handful of minor fixes and optimizations designed to get content to your screen faster, like triggering a mini-sync when an empty tag gets its first item.
  • A small amount of core tests and infrastructure.

In short, these are the bug fixes for the bugs I got over the holidays and since 0.9.0. Nothing too fancy, but working out some post-release kinks.

Some sort of service sync plugin is still coming in the future, and I’ve got a local branch using elinks to format the reader but it’s not quite there yet (I want to have output with the color scheme instead of just raw text).

Dec 17th, 2014

Happy Holidays, all. Here’s a status update.

Testing

The biggest change is that tests have started appearing. I’ve written some very core tests for the daemon, testing its indexing which is by far the most important part (i.e. turning on-disk storage into data structures properly).

Canto-curses is by far more complex, dealing directly with user input, so it’s started to get test coverage as well. For now it’s still pretty basic, but the trick was in the infrastructure rather than the tests themselves. I’ve written a neat little fake curses bit of Python that has the interesting property of being able to print a canto-curses screen in ASCII, but it mostly serves to let me run a test thread with a full c-c instance running without dumping on the screen and without needing actual human interaction.

Unsurprisingly, every test I’ve written has uncovered some bad (or at least unexpected) behavior. A small trickle of fixes and tweaks has followed them, including a couple of reported daemon bugs that I wrote failing tests for and then fixed. Feels good =).

Indicators

A minor change to canto-curses is that the tag status line now includes a grey indicator (similar to the blue indicator for unread items) for items that have been updated, but haven’t been displayed yet. Because the new default policy is to not automatically update the screen, this lets you finish reading items that are currently shown but still know that there are updates pending for when you update (F5) or refresh (Ctrl+r) next.

Style changes

The built in style formats have been updated so that canto-curses colors and curses colors are identical, instead of being off by one. (i.e. %1 is now curses pair 1, not curses pair 0). If you’ve messed with your format strings, you need to subtract one from each color (%8 -> %7 etc.) and if you were using %1 for curses default pair 0, you need to turn off previous colors (%0 disables last color) to get back to the default, or temporarily clear it %C .. %c.
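
If you do have a custom format string, the “subtract one” step could be automated with something like the sketch below. It is a rough helper, not something shipped with canto: shift_color_codes() is a hypothetical name, it assumes color escapes are a percent sign followed by a single digit, and it deliberately leaves %1 alone because an old %1 (default pair 0) has to be replaced by hand with the appropriate %0 resets.

```python
import re


def shift_color_codes(fmt):
    """Rewrite %2..%8 as %1..%7, leaving %0, %1 and other escapes alone."""
    return re.sub(
        r"%([2-8])",
        lambda m: "%" + str(int(m.group(1)) - 1),
        fmt,
    )


if __name__ == "__main__":
    print(shift_color_codes("%8title%0 %2x%0 %1y"))
```

Any %1 left in the output should be reviewed manually, since there is no mechanical way to tell how many colors need to be turned off at that point.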

I don’t expect many people have done this, considering it’s entirely undocumented =P.

If you haven’t dug around in this, the update with the changed format strings should go unnoticed.

Feedly and TT-RSS

With 0.9.0 out, I’ve started looking at using canto to drive mobile and web interfaces. TT-RSS seems to be a natural fit here, being the sort of homebrew heir to Google Reader. I wrote a little proof of concept TT-RSS server compatibility plugin that supports enough that the official TT-RSS mobile app can read (but not write!) information from the canto-daemon.

I was about to look at the inverse (which would be using a TT-RSS server to power canto-daemon), but I was rather disappointed in TT-RSS’s client selection (the Android app costs money to upgrade from a 7-day trial, and other than that it seems like you’re stuck on the web). I wasn’t impressed with having to have a full web server, database, and PHP running with a domain (or dynamic DNS) just to use them either, especially when canto’s requirements are so slim.

So, instead of TT-RSS I started looking at Feedly. Feedly is nice, has a bunch of clients on different platforms (and, of course, on the web). It also has a nicer API, in my opinion, although they’re very similar. Best of all, Feedly is a free service so there’s no need to have web facing servers running yourself.

The problem here is that Feedly doesn’t offer a great way to do hands-off authentication. There are two methods. The first is intended for a web platform and requires a user to have an active cookie, or to click around on an OAuth page like Google/Facebook/Twitter. It then pings a registered web server with token information. Clearly this method doesn’t make sense for a sync plugin, but it’s obvious that this is the better supported use case for Feedly.

The other method is using developer tokens, which is basically a way to get an API key. This seems perfect at first glance, and in fact this is what a curses based Feedly client (Feednix) does, but these keys expire every three months without warning. Ugh. It’s a simple thing to have the daemon throw a warning to a client that connects to say “Hey, your Feedly token expired! Go get a new one and restart the daemon!” but it’s really fucking annoying that it’s going to break unexpectedly.

It seems clear that Feedly is the way to go (free accounts, no setup, lots of other clients) but I’m a bit disappointed that it won’t Just Work.

0.9.0

A year and a half after the last version, six months of work, and about 370 commits, 0.9.0 is here with an impressive changelog.

Daemon Changes

  • Daemon is now fully threaded.
  • Daemon now has better support for multiple simultaneous connections.
  • Performance has been enhanced significantly.
  • Many bugfixes for instability and lost data.
  • Many features polished, some stripped away.
  • Config compatible with 0.8.x; you should be able to just update and restart.
  • Protocol updated to 0.9 (incompatible with old clients).
  • Much faster initial startup.
  • Optional systemd user daemon config included.
  • New sync and sync-rsync plugins to enable file-based syncing with rsync to a remote server, or to a local filesystem (e.g. for outside NFS/sshfs/Dropbox support).
  • Improved reddit plugin performance.
  • Control over caching (which is disabled by default) for systems that have a good amount of memory, but potentially slow disks.
  • Fine grain plugin control.
  • Better manpage.

Curses Changes

  • Curses client is now fully threaded.
  • Performance has been enhanced significantly.
  • Command infrastructure overhauled.
  • Command line now supports tab completion.
  • :help (hit ‘?’ by default) now provides general help as well as command specific help and keybind listing
  • Command line now supports readline-enabled line editing.
  • Better support for 256 color config
  • :set to provide support for changing all manner of configuration for daemon and curses client
  • More natural support for tag categories (or folders).
  • New plugin: autocmd.py to automatically run certain commands on startup, potentially based on environment or other settings.
  • New plugin: reader-extras.py to add additional information to reader output (such as author, or feed specific content) as well as debug options to show all content included in the story.
  • New plugin: xterm-title.py to set the xterm title for canto-curses terminals.
  • New plugin: favorites.py, a proof of concept plugin for allowing the user to mark, group and change the appearance of certain items.
  • New plugin: smartlink.py adds a reworked :fetch command to use content sensitive handlers to open links based on the URL or the output of file. Useful for using media enclosures or reading PDFs or showing images in a dedicated program.
  • Fine grain plugin control.
  • Better manpage

This release does a lot to blow the dust out of some dark corners of canto and bring performance and stability up a few notches. However, it still needs a lot of testing and there are still miles to go on the usability front. At some point, you’ve got to rein in the changes.

The Plan

0.9.0 isn’t perfect, but within the parameters set by the previous versions it’s pretty good. Canto-curses still needs more work. The command line is much nicer, but it needs to be augmented with more graphical information. This is going to require more GUI work, which is beyond the scope of this release which already gutted and rebuilt a lot of the core infrastructure.

Some ideas that I’ve had for subsequent 0.9.x releases:

  • Indicators, or perhaps a status line for information like what categories you’re showing, or your filter settings, or notification of waiting updates or working status.
  • Graphical selectors for keying between filter settings etc.
  • 256-color default theme (I’m open to input on the current one too =P).
  • Shifting keybinds a bit.

In addition, now that threading has been accomplished, and the command system overhauled, and the protocol tweaked, it’s time to formalize some testing which is a hard requirement for an eventual 1.0.0.

For now though, bugfixes, bugfixes, bugfixes and a nice rest before any more feature work.

Updates to the apt repo will be up soon.