How to Get the Full MusicBrainz Dataset with ISRCs (for Spotify API Matching)?

yclhsn · May 29, 2025, 11:28pm

Hey,

I’m working on a research project for my bachelor’s thesis on music recommendation systems. One part of my analysis compares seed songs with their recommendations on Spotify, and for that I need a large set of songs with valid ISRCs to cross-match against the Spotify Web API. (I assumed ISRCs are the best way to do the match?)

So far, I’ve worked with the public JSON recording dump, but it only contains a few thousand recordings with ISRCs, far from the millions that MusicBrainz is known to have (see Database statistics - Timeline graph - MusicBrainz).

Here’s what I’m trying to understand:

Is there a way to access a more complete or full dataset of MusicBrainz recordings that include ISRCs?

Are ISRCs only available through a full PostgreSQL DB setup?
Is there an alternative dataset or dump that includes them?

What’s the recommended way to extract this information efficiently (recording title + artist + ISRC)?

Should I set up the full PostgreSQL DB and query recording + isrc + artist_credit tables?
Or is there a lighter-weight method that gives sufficient coverage?
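
To make the PostgreSQL option concrete, here is a rough sketch of the join I have in mind. It runs against an in-memory SQLite stand-in rather than a real MusicBrainz mirror. The column names (isrc.recording, recording.artist_credit, artist_credit.name) are my understanding of the MusicBrainz schema and should be checked against the actual database; the sample rows and ISRC values are invented placeholders.

```python
import sqlite3

# In-memory stand-in for a MusicBrainz mirror, reduced to the three
# tables and the few columns this extraction needs (assumed layout).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE artist_credit (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE recording (id INTEGER PRIMARY KEY, name TEXT, artist_credit INTEGER);
CREATE TABLE isrc (id INTEGER PRIMARY KEY, recording INTEGER, isrc TEXT);

-- Invented placeholder rows, not real MusicBrainz data.
INSERT INTO artist_credit VALUES (1, 'Example Artist');
INSERT INTO recording VALUES (10, 'Example Track A', 1);
INSERT INTO recording VALUES (11, 'Example Track B', 1);
INSERT INTO isrc VALUES (100, 10, 'ZZTST2500001');
INSERT INTO isrc VALUES (101, 10, 'ZZTST2500002');
""")

# One output row per (recording, ISRC) pair. A recording can carry
# several ISRCs, and recordings without any ISRC are dropped by the
# inner join on the isrc table.
QUERY = """
SELECT r.name AS title, ac.name AS artist, i.isrc AS isrc
FROM isrc AS i
JOIN recording AS r ON r.id = i.recording
JOIN artist_credit AS ac ON ac.id = r.artist_credit
ORDER BY i.isrc
"""

rows = conn.execute(QUERY).fetchall()
for title, artist, code in rows:
    print(title, artist, code)
```

If the real tables look like this, the same SELECT should carry over to PostgreSQL mostly unchanged, and the result could be exported to CSV for the Spotify matching step.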

Any tips for matching MusicBrainz data to Spotify tracks via the Spotify API?

I’m currently searching for tracks by ISRC, but in many cases Spotify returns no match. Sometimes the song exists on Spotify under a different ISRC or with a slightly different artist name, and sometimes it isn’t on Spotify at all.
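
For context, this is roughly how the lookup could be structured: try the ISRC first, then fall back to a title and artist search. The sketch only builds request URLs for the Spotify Web API search endpoint using the isrc:, track: and artist: field filters; it does not send requests or handle authentication. The helper names normalize_isrc and build_search_urls are hypothetical, and the fallback order is an assumption rather than a recommended practice.

```python
import re
from urllib.parse import urlencode

SEARCH_URL = "https://api.spotify.com/v1/search"

# ISRC layout: 2-letter country code, 3 alphanumeric registrant
# characters, then 7 digits (2 for the year, 5 for the designation).
ISRC_RE = re.compile(r"^[A-Z]{2}[A-Z0-9]{3}[0-9]{7}$")


def normalize_isrc(raw):
    """Strip hyphens/spaces and uppercase; return None if malformed."""
    candidate = raw.replace("-", "").replace(" ", "").upper()
    return candidate if ISRC_RE.match(candidate) else None


def build_search_urls(isrc, title, artist):
    """Return search URLs to try in order: ISRC first, then title + artist."""
    urls = []
    code = normalize_isrc(isrc) if isrc else None
    if code:
        params = {"q": f"isrc:{code}", "type": "track", "limit": 1}
        urls.append(SEARCH_URL + "?" + urlencode(params))
    # Fallback for tracks that Spotify lists under a different ISRC.
    params = {"q": f"track:{title} artist:{artist}", "type": "track", "limit": 5}
    urls.append(SEARCH_URL + "?" + urlencode(params))
    return urls


# Placeholder values, not a real recording.
for url in build_search_urls("us-abc-25-00001", "Example Title", "Example Artist"):
    print(url)
```

The fallback results would still need a similarity check on title and artist before being counted as a match, which is the part I’m least sure how to do well.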

I’d appreciate any insights, best practices, or recommended tools you’ve used for similar large-scale MusicBrainz extraction and ISRC-Spotify mapping workflows.

Thanks a lot in advance!