Automatic Metadata Retrieval

Hey everyone! I just got Anytype today, and I’m loving it so far! It seems like an amazing concept with a lot of potential, and I can’t wait to see where it goes!

One thing that stuck out to me as I was tinkering around is that for every object I add that has any sort of metadata associated with it (e.g. a book has an author, a cover image, and a page count; a movie has a director, writer, producer, lead actors, etc.), I have to add all of that stuff in manually, even though it’s readily available online in an easy-to-parse format (from places like Goodreads and IMDb for those two examples). If I’m making a reading list or a movie watch list, I don’t want to add all that stuff in myself. It’s worth noting that in all of the demo videos, objects like this come pre-populated with all this metadata, which gives the impression that the knowledge base is a lot easier to work with than it actually is in practice. That makes sense: the point of the videos is to showcase the underlying functionality quickly, and adding all that metadata in manually would be pretty time-consuming and boring to watch. But that’s kinda the point. All this info is available on the internet. Wouldn’t it be cool if you didn’t have to go through all those steps yourself?

I realize this is a pretty vague request, which is why I’m making it as a discussion rather than a feature request. I have no idea what this would look like in implementation. Would it be a set of plugins, one for each database source? Would it be an Anytype-hosted database that’s user-sourced? Would it be a feature of the web-clipper, when that becomes a thing? I don’t know. I’m just curious what y’all’s thoughts are about this.

4 Likes

I was just thinking about this the other day.

Something like what Calibre does for book management is what I was thinking of, but I have no idea how this could be implemented in Anytype. I’ve only seen examples with Notion.

I have a set for games and another one for books, but only for those items that I don’t have on my Kindle or in Steam. It would be great if there was an option to populate the relations that I have in those sets. I would love a feature like this to save time, so I can start writing notes for my books or adding different mods for my games.

2 Likes

Yes @Composer3, it’s a great idea. We’ve discussed it before and I think we’re edging in that direction, especially once our webclipper is up and running. It would be a super convenient way to quickly build or simply import a library, which reminds me of the ID3 tags in MP3 files that contain all the info about the file (i.e. artist name, release year, duration, record label, etc.). This has existed for decades. I think the difference is that audio files are (for the most part) universally standardized to be read the same way on any media player. Without uploading the media itself, it’s hard for me to imagine how it would work with just the descriptions of books or movies, since most vendors have their own proprietary way of presenting that info. It would almost certainly have to be a plugin that can read many different sources of content.

There’s definite potential for natively populating all the criteria by extracting it from the media file itself when it’s uploaded into Anytype, like what is described in this Feature Request: “Use music's cover from the metadata as a Cover in Anytype”. However, most people aren’t doing that; they’re basically just creating wikis about their books, etc. :thinking:
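To give a rough idea of what "extracting it from the media file itself" could look like, here is a minimal Python sketch that reads the old ID3v1 tag, which sits in the last 128 bytes of an MP3 file as fixed-width fields. The function name `read_id3v1` is made up for illustration, and nothing here reflects how Anytype actually handles uploads. Cover art is not part of ID3v1 at all; it lives in ID3v2 frames, which would need a proper tag-parsing library.

```python
from typing import Optional


def read_id3v1(data: bytes) -> Optional[dict]:
    """Parse an ID3v1 tag from the raw bytes of an MP3 file, if one is present."""
    # ID3v1 occupies the final 128 bytes of the file and starts with b"TAG".
    if len(data) < 128 or data[-128:-125] != b"TAG":
        return None
    tag = data[-128:]

    def text(start: int, length: int) -> str:
        # Fields are fixed-width and padded with NUL bytes or spaces.
        raw = tag[start:start + length]
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(3, 30),
        "artist": text(33, 30),
        "album": text(63, 30),
        "year": text(93, 4),
    }
```

In practice you would call it with something like `read_id3v1(open(path, "rb").read())` and map the resulting fields onto relations.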

2 Likes

That would be pretty cool if we could have that. I think it can definitely be done; for instance, a movie/TV show collection could be built using TMDB’s API, or Goodreads for books.

In the case of TMDB, I think it would play nicely with Anytype’s relations and objects, as the properties there, such as director, actors, etc., can be classified as relations, and the actors/directors themselves can be considered “human” types of objects.

2 Likes

I’ll look into the WebClipper to see if it is close to what I’m hoping to do, but my first idea for a plugin would be to import metadata for objects from various open APIs: books, even research materials like patents, etc. I’m also imagining something like how Calibre does it, where you enter one piece of information (the title, for example) and the plugin searches the source material (wherever that is: TMDB, Goodreads, or the USPTO), parses it, fills in the appropriate relations and, ideally, fills in the template for the object. For books, for example, I can see it pulling down the authors, creating a ‘Human’ for each author, associating it with the book, pulling down the cover, etc. Hopefully the API will support this kind of workflow.
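To make that workflow a bit more concrete, here is a rough Python sketch of the "enter a title, look it up, fill in relations" idea. Everything in it is an assumption for illustration: `AnyObject`, `import_book`, and `fake_book_source` are hypothetical names, the catalogue entry is invented, and the lookup is an in-memory stand-in rather than a call to any real TMDB, Goodreads, or USPTO API. It says nothing about what the actual Anytype plugin API will look like.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class AnyObject:
    """Hypothetical stand-in for an Anytype object: a type, a name, and relations."""
    type: str
    name: str
    relations: Dict[str, object] = field(default_factory=dict)


# A "source" is any function that takes a title and returns a raw record, or None.
Source = Callable[[str], Optional[dict]]


def fake_book_source(title: str) -> Optional[dict]:
    """Made-up in-memory catalogue standing in for a real lookup service."""
    catalogue = {
        "example book": {
            "title": "Example Book",
            "authors": ["Ada Author", "Ben Writer"],
            "pages": 123,
        }
    }
    return catalogue.get(title.strip().lower())


def import_book(title: str, source: Source,
                humans: Dict[str, AnyObject]) -> Optional[AnyObject]:
    """Look up a title and build a Book object with its relations filled in."""
    record = source(title)
    if record is None:
        return None  # nothing found; the user would fill things in by hand

    authors = []
    for name in record.get("authors", []):
        # Reuse an existing Human if we already have one, otherwise create it.
        if name not in humans:
            humans[name] = AnyObject(type="Human", name=name)
        authors.append(humans[name])

    return AnyObject(
        type="Book",
        name=record["title"],
        relations={"Author": authors, "Page count": record.get("pages")},
    )
```

Keeping the source as a plain function is one way to let books, movies, and patents share the same import step, with one small adapter per database. The `humans` dictionary is there so that importing two books by the same author links both to a single ‘Human’ instead of creating duplicates.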

1 Like