Inspecting and hacking E-Readers

I just got a Kobo Clara HD expressly because it’s more open than a Kindle. The main thing I want is the ability to export my highlights and annotations in a format that’s easy to pipe to my website. This thread is mainly to talk about what I’m doing, in case anyone else has one and is interested, and to prompt a conversation about novel ways to interact with e-readers.

Initial Findings

  • A majority of the books in the Kobo store are also sold with DRM. I’m unsure whether Kobo enforces this or the publishers do.

  • There’s a very easy-to-read sqlite database that contains data on each book (weirdly stored as individual chapters). Books are referenced by their filename everywhere in the database. For each chapter there’s the time last read, the read status, and a ton of other data I’ve yet to decipher.

  • There are two ways to access annotations

    1. You can pull them from a table in the sqlite database called bookmark
    2. There’s a top level folder called Digital Editions that contains a folder called annotations. For each book you have there’s a file ending with epub.annot containing a bunch of XML which should be pretty easy to parse.
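For anyone who wants to poke at both routes, here is a rough Python sketch. It deliberately assumes nothing about the schema: `dump_table` returns whatever columns the named table happens to have, and `leaf_texts` lists every XML element that carries text, so you can see the structure of an .annot file before writing a real parser. Both function names are my own, not anything Kobo ships.

```python
import sqlite3
import xml.etree.ElementTree as ET


def dump_table(conn, table):
    """Return every row of `table` as a list of dicts keyed by column name."""
    # Resolve the name against sqlite_master first: it is interpolated into
    # the query, and SQLite table names are case-insensitive.
    names = {row[0].lower(): row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")}
    actual = names.get(table.lower())
    if actual is None:
        raise ValueError(f"no such table: {table}")
    cursor = conn.execute(f'SELECT * FROM "{actual}"')
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor]


def leaf_texts(xml_source):
    """List (tag, text) for every element with non-blank text, namespaces dropped."""
    root = ET.fromstring(xml_source)
    found = []
    for element in root.iter():
        local_name = element.tag.rsplit("}", 1)[-1]  # strip "{namespace}" prefix
        if element.text and element.text.strip():
            found.append((local_name, element.text.strip()))
    return found
```

Usage would be something like `dump_table(sqlite3.connect(path_to_copied_db), "bookmark")` on a copy of the database, and `leaf_texts(open(path_to_annot, encoding="utf-8").read())` on one of the epub.annot files. Working on copies rather than the files on the device seems like the safer habit.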

Future things to explore

  • there’s a pocket integration, it would be interesting to see what data is stored there and what

  • Reading stats (avg absolute time from start to finish for example)
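As a starting point for the start-to-finish stat, here is a minimal sketch of the arithmetic only. It assumes you can somehow get a (started, finished) pair of ISO 8601 timestamps per book; I don’t know yet whether the database records both, so the input shape and the function name are hypothetical.

```python
from datetime import datetime, timedelta


def average_reading_span(spans):
    """Mean elapsed wall-clock time over (started, finished) ISO 8601 pairs."""
    durations = [
        datetime.fromisoformat(finished) - datetime.fromisoformat(started)
        for started, finished in spans
    ]
    if not durations:
        return None  # nothing finished yet, so no average to report
    return sum(durations, timedelta()) / len(durations)
```

This measures calendar time between first opening a book and finishing it, not time spent actually reading, which would need per-session data.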

Questions

Anyone have any ideas for interesting things to do with the database? There’s also the option of side-loading software like KOReader. Also, if you use Pocket, what are your thoughts there?


This is very cool! Love the aim of experimenting here. Just checking out that Kobo reader and looks nice / basically identical to a Kindle haha…I’ve thought about getting one but honestly not sure about extricating myself from the Kindle ecosystem. I have so many ebooks on Kindle already and some services like Readwise seem to work just with Kindle (albeit in a hacky way; I think Readwise has you log in with your Amazon account and basically scrapes the highlight data somehow).

But I use Pocket all the time and actually feel like it might be worth having a Kobo even just as a dedicated Pocket reader, if the integration is nice. I’ve tried this web app I think called like “Pocket 2 Kindle” that works pretty well to send a batch of Pocket articles to read on Kindle, but a direct integration seems awesome. I wonder if you can highlight arbitrarily on articles with Pocket + Kobo or if it has the same limitation where you have to pay for the premium subscription to unlock certain features like > 3 highlights per article…

(Aside: I see Kobo is now owned by Rakuten, a Japanese megacorp I don’t know much about…but I see they also acquired OverDrive, the company that handles much of the “ebook fulfillment for libraries” market, a few years ago. May be nice for using Kobo w/ library books! I kinda thought the writing was on the wall with Amazon completely dominating the e-reader game, and that may still be the case, but at least an indication Kobo has the resources to compete for a while yet.)

I’m also not all in on ebooks in general, probably still read majority physical books. Haven’t hit on a great solution for saving notes / highlights from those books, though I’ve seen a couple apps lately that do OCR from photos which I’d like to try out. Would be cool to end up with some kind of system for personal highlights from multiple sources — Pocket articles, ebooks, physical books — all collated into one searchable archive. (Maybe even web sources too with e.g. Hypothesis / Pinboard)

Traveling now w/o computer but I’d be interested in taking a look at the data Kobo makes available, both for ebooks and with the Pocket integration. And perhaps comparing to what’s possible to access from Kindle (assuming people have hacked some way to do that…) Sounds like the actual book text and the reading metadata (e.g. read location / time) are kind of two separate things, with highlights and annotations somewhere in the middle? Not sure about the implications of DRM status for how accessible highlights would be, for example. Lots to explore here!

I hear the Library integration is pretty seamless. I still need to get a library card so I can try it out :sweat_smile:

Highlighting seems to be a no-go on Pocket articles, which is pretty frustrating. Converting pocket articles to epubs and then putting them on it would enable it, but that’s convoluted. Perhaps there would be a neat way to automate it? I need to see if there’s a way to run scripts on the kobo in the background.

There definitely seems to be a more robust software ecosystem around kindles. I hadn’t heard of Readwise but it looks very cool! It might be a nice first step to see if that could be replicated with the kobo database, perhaps tying in Anki or another SR app. I guess that’s the appeal for the kobo, I probably won’t get around to doing a tenth of the things I’d be keen on, but just the possibility of hacking and scripting it gets me excited!


This piece is very cool and includes some relevant Kobo workflow notes! —

Interesting to note that Kobo highlights apparently work much better w/ their proprietary .kepub format (vs. regular .epub) so he uses this tool kepubify to convert. And a couple other scripts to export and copy over to a laptop.

Also some great little tools for saving content from web articles / PDFs, as well as a tool for tagging, reviewing, and organizing highlights.

Yeah, I’m using a calibre plugin to convert, as I use it for handling my ebook library anyway (actually, come to think of it, I don’t do much other than converting and managing folders with it, so maybe I should look into setting up something more minimal).

I’m not too sure how proprietary the kepub is or why they can’t just do it with epub. It was a little frustrating at first when the files I dropped on the device didn’t have time estimates or wouldn’t save my highlights properly.

The database is a little funky to deal with. It uses internal IDs to reference chapters and books, so when you pull an annotation out of the annotations table you then have to take those IDs and look them up in another table to get the human-readable names.
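That lookup can be done in a single query with two joins. A minimal sketch, with loud caveats: the table names (`Bookmark`, `content`) and column names (`VolumeID`, `ContentID`, `Title`, `Text`) are my assumptions about the schema, so check them against your own copy of the database before trusting the output.

```python
import sqlite3


def highlights_with_titles(conn):
    """Return (book title, chapter title, highlight text) for each annotation."""
    # ASSUMED schema: Bookmark.VolumeID points at the book's row in `content`,
    # and Bookmark.ContentID points at the chapter's row in the same table.
    query = """
        SELECT book.Title    AS book_title,
               chapter.Title AS chapter_title,
               b.Text        AS highlight
        FROM Bookmark AS b
        LEFT JOIN content AS book    ON book.ContentID = b.VolumeID
        LEFT JOIN content AS chapter ON chapter.ContentID = b.ContentID
        ORDER BY b.rowid
    """
    return conn.execute(query).fetchall()
```

The `LEFT JOIN`s keep a highlight in the results even when one of the lookups finds no match, in which case the corresponding title comes back as `None`.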

Also, this is the second little web tool I’ve seen today for highlighting, the first being @tomcritchlow’s . Seems like a pretty handy thing to have in the arsenal.
