How to Get the Full MusicBrainz Dataset with ISRCs (for Spotify API Matching)?

Hey,

I’m working on a research project for my bachelor’s thesis on music recommendation systems. One part of my analysis compares seed songs with their recommendations on Spotify, and for that I need a large set of songs with valid ISRCs to cross-match against the Spotify Web API. (As far as I understand, ISRCs are the best way to do this match?)

So far, I’ve worked with the public JSON recording dump, but it only contains a few thousand recordings with ISRCs, far from the millions that MusicBrainz is known to have (see Database statistics - Timeline graph - MusicBrainz).

Here’s what I’m trying to understand:

  1. Is there a way to access a more complete or full dataset of MusicBrainz recordings that include ISRCs?
  • Are ISRCs only available through a full PostgreSQL DB setup?
  • Is there an alternative dataset or dump that includes them?
  2. What’s the recommended way to extract this information efficiently (recording title + artist + ISRC)?
  • Should I set up the full PostgreSQL DB and query recording + isrc + artist_credit tables?
  • Or is there a lighter-weight method that gives sufficient coverage?
  3. Any tips for matching MusicBrainz data to Spotify tracks via the Spotify API?
  • I’m currently searching for tracks by ISRC, but in many cases Spotify returns no match. Sometimes the song exists on Spotify under a different ISRC or a slightly different artist name, and sometimes it is not on Spotify at all.
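
For concreteness, the workflow in questions 2 and 3 might look roughly like the sketch below. It is only an illustration under assumptions, not working project code: it assumes the MusicBrainz schema links `isrc.recording` to `recording.id` and `recording.artist_credit` to `artist_credit.id`, and that Spotify's search endpoint is queried with the `isrc:` field filter. The helper names (`fetch_rows`, `normalize_isrc`, `spotify_isrc_search_url`) are made up for this example, and the OAuth token handling and the actual HTTP request are left out.

```python
import re
from urllib.parse import urlencode

# Assumed MusicBrainz tables and columns:
#   recording(id, name, artist_credit), isrc(recording, isrc),
#   artist_credit(id, name)
EXTRACT_SQL = """
SELECT r.name AS title, ac.name AS artist, i.isrc AS isrc
FROM isrc AS i
JOIN recording AS r ON r.id = i.recording
JOIN artist_credit AS ac ON ac.id = r.artist_credit
ORDER BY i.isrc
"""

# 2-letter country code, 3 alphanumeric registrant characters,
# 2-digit year and 5-digit designation code (12 characters in total).
ISRC_PATTERN = re.compile(r"^[A-Z]{2}[A-Z0-9]{3}[0-9]{7}$")


def fetch_rows(conn):
    """Run the extraction query on any DB-API connection (e.g. psycopg2)."""
    cur = conn.cursor()
    cur.execute(EXTRACT_SQL)
    return cur.fetchall()


def normalize_isrc(raw):
    """Strip hyphens and whitespace, uppercase, and reject malformed codes."""
    code = re.sub(r"[\s-]", "", raw).upper()
    return code if ISRC_PATTERN.match(code) else None


def spotify_isrc_search_url(isrc):
    """Build a Spotify Web API search URL using the isrc: field filter."""
    query = urlencode({"q": f"isrc:{isrc}", "type": "track", "limit": 5})
    return f"https://api.spotify.com/v1/search?{query}"
```

Each row from `fetch_rows` would be passed through `normalize_isrc` and then looked up with a GET request to the URL from `spotify_isrc_search_url`, sent with a bearer token. This sketch does not address the no-match cases described above; those would still need a fallback, for example a title + artist search.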

I’d appreciate any insights, best practices, or recommended tools you’ve used for similar large-scale MusicBrainz extraction and ISRC-Spotify mapping workflows.

Thanks a lot in advance!