Building a decentralized GoodReads - a V0.1 spec for library.json files

tomcritchlow · April 15, 2020, 9:02pm

Hey folks - I’m in the early stages of thinking through an open spec for “library.json” files so that we can recreate GoodReads out of a thousand indie sites - powered by the web of books.

I wrote up a post here with ideas, a draft spec and more:

Would love your feedback and comments!

Brendan · April 15, 2020, 9:57pm

Awesome, thanks for sharing, this looks like a great start and it’s nice to have something tangible to look at here. Some initial thoughts (also a Twitter thread I just posted haha):

- Personal website book lists = one of the great things on the web; love the idea of building something to support a “web of books”, with a strong focus on open tooling
- Ha, probably a good idea to not focus on monetization, though not sure I fully agree “the only path to monetization is Amazon” - it’s one of the clearest, but there have to be others! Subscription in certain cases…though I think that requires a strong niche / value prop
- Decentralized architecture sounds great; a common data format everyone can agree on sounds potentially very tricky! I like the simple JSON prototype here and the idea of a bookshelf feed reader. And great starting point with the spec, definitely useful to have something concrete (and relatively simple) to play with
- Not sure yet re: what the centralized hub would look like; feels like that’d just be kind of one example of the most maximalist feed, but probably the most value would be in more personal curated feeds. Book club use case is very interesting here.
Thoughts on the spec itself:
- “Library” composed of one or many “lists” seems like a good starting point
- Good book data is def a big challenge…not only unified ID but URL / image
- Flexible array of notes seems useful
- Beyond default data…support for custom fields?
Thoughts for next steps:
- I’d be interested in more research into e.g. seeing what professional library solutions may exist that attempt to address some of the thorny challenges around universal book IDs
- Goodreads seems to have decent API + export functionality, could be a starting point for getting more sample data

tomcritchlow · April 16, 2020, 1:25am

I’m all ears! Would love to figure out something else - but especially for a decentralized idea subs etc seem hard to pull off… With great branding you could perhaps monetize via merch but it’s a bit of a stretch…

> Not sure yet re: what the centralized hub would look like; feels like that’d just be kind of one example of the most maximalist feed, but probably the most value would be in more personal curated feeds. Book club use case is very interesting here.

I really think this is part of the magic - the centralized hub is like a proof of concept but the infinite extension and re-mixability is a cool feature

> support for custom fields?

For sure! The cool thing about JSON is you can just ignore the bits you don’t want and use the bits you do. For example, I don’t have star ratings on my bookshelf, but they’d be super easy to add, and it’s easy to imagine how someone else might use them.
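To make the "ignore the bits you don’t want" point concrete, here is a minimal Python sketch of a tolerant reader. The field names (`lists`, `books`, `title`, `author`, `rating`, `shelf_color`) are assumptions made up for illustration, not taken from the draft spec.

```python
import json

# Hypothetical library.json content. Field names are illustrative
# assumptions, not the draft spec.
SAMPLE = """
{
  "lists": [
    {
      "name": "Read in 2020",
      "books": [
        {"title": "Book One", "author": "Author A", "rating": 4},
        {"title": "Book Two", "author": "Author B", "shelf_color": "green"}
      ]
    }
  ]
}
"""


def summarize(library_text):
    """Return (title, author, rating-or-None) per book, ignoring unknown fields."""
    library = json.loads(library_text)
    rows = []
    for book_list in library.get("lists", []):
        for book in book_list.get("books", []):
            # .get() returns None for missing optional fields such as "rating";
            # extra fields such as "shelf_color" are simply never read.
            rows.append((book.get("title"), book.get("author"), book.get("rating")))
    return rows


for title, author, rating in summarize(SAMPLE):
    stars = "no rating" if rating is None else f"{rating}/5"
    print(f"{title} by {author}: {stars}")
```

A site that publishes ratings and a site that doesn’t can both be read by the same code; custom fields only matter to the readers that know about them.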

> Goodreads seems to have decent API + export functionality, could be a starting point for getting more sample data

I’ve never played with the Goodreads API, though I’d be nervous trying to build anything robust on it… More exploration needed here…

eshnil · April 16, 2020, 4:47pm

Do make sure to check GoodReads terms and conditions for their API.

Schema.org already has a data-type for Book which you can adopt: https://schema.org/Book
Microformats.org doesn’t have a standard for books, but their mailing lists will surely have discussed books. Might be worth checking out. Also, consider working with Open Library, which is very similar in scope.
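As a rough idea of what adopting the Schema.org type could look like, here is a small Python sketch that emits a JSON-LD `Book` object. It uses only a handful of Schema.org properties (`name`, `author`, `isbn`, `url`, `image`); the function name and the choice of which properties to include are assumptions for illustration.

```python
import json


def to_schema_org_book(title, author, isbn=None, url=None, image=None):
    """Build a minimal Schema.org Book as a JSON-LD-style dict."""
    doc = {
        "@context": "https://schema.org",
        "@type": "Book",
        "name": title,
        "author": {"@type": "Person", "name": author},
    }
    # Only include optional properties that were actually provided.
    for key, value in (("isbn", isbn), ("url", url), ("image", image)):
        if value:
            doc[key] = value
    return doc


# Placeholder values, not a real book record.
print(json.dumps(to_schema_org_book("Book One", "Author A", isbn="9780000000002"), indent=2))
```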

BillSeitz · April 18, 2020, 10:05pm

some old links: http://webseitz.fluxent.com/wiki/OpenReview

Brendan · April 19, 2020, 12:48am

Totally, I could imagine some kind of multi-tiered subscription thing perhaps, like a low Patreon-type level for casual supporters, and higher tiers for bigger sites. But yeah this is tricky!

Totally, I think a baseline of common data for every book + standard-ish format for custom stuff could be great. And this:

…seems like an interesting starting point, if rather complex (largely b/c of how many properties it inherits from Thing and CreativeWork it seems)

I wasn’t thinking of building something totally reliant on it, more just a way to more easily get some solid seed data and easy onramp for users to generate the library.json file — could be even like a one-time manual thing to start, potentially a more ongoing integration later. For folks w/ tons of data in Goodreads already, option to quickly generate this data from their existing shelves could be cool.
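A one-time conversion like this could be sketched in Python roughly as follows. Everything here is an assumption for illustration: the CSV column names (`Title`, `Author`, `ISBN13`, `My Rating`, `Exclusive Shelf`) should be checked against a real export file, and the output shape (`lists` / `books`) is made up rather than taken from the draft spec.

```python
import csv
import io
import json
from collections import defaultdict

# Tiny stand-in for an exported shelf file. Column names are assumed.
SAMPLE_CSV = '''Title,Author,ISBN13,My Rating,Exclusive Shelf
Book One,Author A,9780000000002,4,read
Book Two,Author B,,0,to-read
'''


def csv_to_library(csv_text):
    """Group exported rows by shelf into a library-style dict."""
    shelves = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        book = {"title": row["Title"], "author": row["Author"]}
        # Assumption: ISBNs may arrive wrapped like ="978..."; strip that wrapper.
        isbn = (row.get("ISBN13") or "").strip('="')
        if isbn:
            book["isbn13"] = isbn
        # Assumption: a rating of 0 means "not rated", so it is left out.
        rating = int(row.get("My Rating") or 0)
        if rating > 0:
            book["rating"] = rating
        shelves[row.get("Exclusive Shelf") or "unshelved"].append(book)
    return {"lists": [{"name": name, "books": books} for name, books in shelves.items()]}


print(json.dumps(csv_to_library(SAMPLE_CSV), indent=2))
```

Because it works from an export file rather than the live API, nothing downstream depends on the API staying available.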

Welcome Bill! Can you say more about what you’re sharing here? Almost all the links on that page are broken, unfortunately.

Brendan · April 19, 2020, 1:06am

Linking as well to this great reply from Matt Webb building on Tom’s initial proposal:

http://interconnected.org/home/2020/04/16/rss_for_books

The gist:

- Ability to subscribe to others’ lists would be awesome
- Might make sense to just use RSS rather than some new format
- And overall could split into two formats: one for a library (collection of lists), one for the book list itself (collection of multiple book objects)
- This way, could leverage existing RSS tools / ecosystem
- And for the library, may be able to use OPML (way to group RSS feeds)
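To give a feel for the library-as-OPML idea, here is a short Python sketch that builds an OPML document pointing at one RSS feed per book list. The function name, list names, and feed URLs are hypothetical; only the basic OPML structure (`head`/`title`, `body`, `outline` with `type`, `text`, `xmlUrl`) is standard.

```python
import xml.etree.ElementTree as ET


def library_to_opml(title, feeds):
    """Build an OPML string where each outline points at one book-list feed.

    feeds is a list of (list_name, feed_url) pairs.
    """
    opml = ET.Element("opml", {"version": "2.0"})
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for name, feed_url in feeds:
        ET.SubElement(body, "outline", {"type": "rss", "text": name, "xmlUrl": feed_url})
    return ET.tostring(opml, encoding="unicode")


# Hypothetical feed URLs for illustration only.
print(library_to_opml("My Library", [
    ("Read in 2020", "https://example.com/books/read-2020.xml"),
    ("To read", "https://example.com/books/to-read.xml"),
]))
```

Any feed reader that imports OPML could then subscribe to every list in the library in one step.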

To which the main counterpoint is probably:

- RSS is annoying to work with
- Might feel too heavyweight, less fun and loose for experimentation

My thoughts…

I don’t know enough about RSS / OPML to have super specific opinions on implementation, but separating the sub / pub mechanism from the book data spec & piggybacking on existing tech where possible sounds like a good direction to explore. I wonder if there’s a way to keep it simpler and more lightweight for now, but with an eye towards how it might later be integrated into RSS-y ecosystem stuff.

One other thing that comes to mind is separating the source data from the aggregated “hub” data — also relates to my above comment about using Goodreads to jumpstart initial data.

I could imagine e.g. a way to allow a user to pull from their Goodreads lists as a canonical source, and then parse that into whatever format (JSON / RSS / etc) that a) others can subscribe to from the hub site or b) the user could embed or somehow pull back to their own personal site etc. if they like.

Basically the hub site could support converting / aggregating some of the most popular existing book lists out there (Goodreads, LibraryThing, Open Library, etc.) as a way to support more data sources upfront and merge into a collection of libraries with more standard form…
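A very rough Python sketch of that aggregation step, assuming each source has already been converted into a list of simple book dicts. The field names, the source names, and the matching rule (ISBN-13 first, falling back to lowercase title + author) are all assumptions for illustration, not a proposal for how matching should really work.

```python
def merge_libraries(sources):
    """Merge books from several sources, recording where each one was seen.

    sources maps a source name to a list of book dicts with "title",
    "author", and optionally "isbn13".
    """
    merged = {}
    for source_name, books in sources.items():
        for book in books:
            # Prefer ISBN-13 as the key; otherwise fall back to title + author.
            key = book.get("isbn13") or (book["title"].lower(), book["author"].lower())
            entry = merged.setdefault(
                key, {"title": book["title"], "author": book["author"], "sources": []}
            )
            if book.get("isbn13"):
                entry["isbn13"] = book["isbn13"]
            entry["sources"].append(source_name)
    return list(merged.values())


# Placeholder data, not real records.
combined = merge_libraries({
    "goodreads": [{"title": "Book One", "author": "Author A", "isbn13": "9780000000002"}],
    "openlibrary": [
        {"title": "Book One", "author": "Author A", "isbn13": "9780000000002"},
        {"title": "Book Two", "author": "Author B"},
    ],
})
for book in combined:
    print(book["title"], "seen in", ", ".join(book["sources"]))
```

Note the obvious gap: the same book listed with an ISBN in one source and without one in another will not be matched, which is exactly the universal-ID problem raised earlier in the thread.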