Explore Your Listening Data Using SQLite & Datasette

heyarne · October 17, 2021, 9:58am

I posted this in the IRC channels and people suggested I should post this here as well, so here we go.

I created a tool that goes through your listenbrainz listen history and saves it in an SQLite database. In the process of doing so it fetches additional information from Musicbrainz and saves that as well. The result gives me a really neat way to explore my musical history using datasette, here’s a screenshot showing a graph of different releases I listened to on each day:

If you think this sounds interesting, feel free to check it out! You can find the code on https://github.com/heyarne/listenbrainz-to-sqlite.

heyarne · October 17, 2021, 10:12am

Here’s a really neat query I just wrote that your can use to figure out what to play next based on your previous choices!

SELECT
  COUNT(*) count,
  rela.release_artists,
  rec.title recording_title,
  rel.title release_title,
  following_rela.release_artists following_release_artists,
  following_rec.title as following_recording_title,
  following_rel.title as following_release_title,
  l.recording_mbid,
  l.release_mbid,
  following_recording_mbid,
  following_release_mbid
FROM
  (
    SELECT
      *,
      LAG(l.release_mbid) OVER (
        ORDER BY
          listened_at DESC
      ) following_release_mbid,
      LAG(l.recording_mbid) OVER (
        ORDER BY
          listened_at DESC
      ) following_recording_mbid
    FROM
      listens l
  ) l
JOIN
  recordings rec on l.recording_mbid = rec.mbid,
  recordings following_rec on l.following_recording_mbid = following_rec.mbid,
  releases rel on l.release_mbid = rel.mbid,
  releases following_rel on l.following_release_mbid = following_rel.mbid,
  release_artists_label rela on l.release_mbid = rela.release_mbid,
  release_artists_label following_rela on l.following_release_mbid = following_rela.release_mbid
WHERE
  following_recording_mbid IS NOT NULL
  AND LOWER(recording_title) LIKE LOWER('%' || :current_track || '%')
GROUP BY
  l.release_mbid,
  l.following_recording_mbid
ORDER BY
  count DESC;

alastairp · October 21, 2021, 10:19am

Hi @heyarne
This is really neat, thanks for sharing it. I’ve been following a bit of the development of datasette, and I think it’s a neat tool. How do you find it for this kind of analysis?
As I understand, it works only with a local sqlite database, is that right? This means that if you wanted to join anything against it (for example to get more metadata), you’d need to make sure that also exists in the database?..
I’m thinking mostly along the lines of things like your query that uses the recording title to find matches - it’d be nice to use MBIDs here (and now that we’ve fixed the server error bug that you reported, most of your listens should have them), and then look up additional metadata in MusicBrainz based on these items.

Keep in mind also that we have user stats here, if you come up with any interesting stats then please let us know, we’d love to add more graphs to this page!

Regarding the recommendations for new music to listen to, also take a look at the beginning of our recommendation engine, troi: https://github.com/metabrainz/troi-recommendation-playground/. This is designed to help people build pipelines that take data in from many different sources and use it to generate a playlist of songs to listen to.

thanks again for the demo!

heyarne · October 21, 2021, 11:00am

Thank you for your input and your links! I am aware of the stats page and use it often, troi also looks very interesting.

Datasette works with a local SQLite database, that is correct. I am using the MBIDs to fetch information on artists, releases, and recordings and store it in this database. This creates a small Musicbrainz dump if you will that can be explored based on what your listening habits are. My personal motivation was that I really like the yearly retrospect Spotify gives to listeners, but I want to avoid Spotify itself. I thought “I can do this myself” and this tool is one building block for that.

I remember stumbling over a Jira ticket that listed some of Last.fm’s features that would be nice-to-have for Listenbrainz (this one), and I imagine it to be fairly straightforward to prototype some of those (“When did I start listening to XYZ”), which were on that list, plus maybe more (“What’s the kind of music I listen to during night times / when taking a shower in the morning / …?”). It’s also a playground for me for some of the things that aren’t easily doable in the Listenbrainz interface, so if I hit something interesting there, I’ll definitely share it! Maybe others will enjoy it as well.