Greetings! I love the MusicBrainz data dump. When I downloaded it, I noticed that most songs are missing from artists that have special characters (e.g. Guns ‘N’ Roses, A-Ha). Is this a known issue?
How are you using the dump? Are you importing it into PostgreSQL, and if so, what query are you running?
I’m just looking at a smaller sample dump, but I see
Guns N’ Roses included in the
artist table (ID 1440) and referenced in various places in the
I downloaded the dump. I saved the recording and artist files as .csv (comma delimited), then imported it into mySQL. I still have the original “recording” file from the dump (it was the Jan 28 dump). When I search the original file on Guns N Roses’ id (1440) it does nothing comes up in the column with the artist IDs. In the SQL file , “SELECT * from recordings WHERE artist_id = 1440” yields nothing.
Is artist_credit_name a better table to use? I would like artists’ names and titles of the songs they did, along with necessary IDs. Anything additional is a bonus.
Please let me know if my understanding/guessing is correct:
- You downloaded a dump from one of the sources listed at https://musicbrainz.org/doc/MusicBrainz_Database/Download.
- You converted the
mbdump/recordingfiles from their original format (which I think is TSV with null columns represented as
\N) to CSV.
- You created
recordingtables in MySQL and imported the CSV data into them.
Is there a script somewhere that you used to import the dump? The data might be getting mangled somewhere in the process of getting it into MySQL.
Also, I’m not very knowledgeable about the MB database schema, but the
recording table doesn’t link directly to the
artist table (and doesn’t have an
artist_id column), so I wouldn’t expect to see the
1440 Guns N’ Roses artist ID referenced directly in
recording. See https://github.com/metabrainz/musicbrainz-server/blob/50a7a283cc3d62bedd8d391436f54c7e54ed0443/admin/sql/CreateTables.sql#L3230.
recording can be joined with the
artist_credit table, which can then be joined with
artist_credit_name.artist is where you’d see the
1440 ID corresponding to Guns N’ Roses.
So a query to get all recordings credited to Guns N’ Roses might might look something like this (untested):
SELECT ... FROM recording INNER JOIN artist_credit ON recording.artist_credit = artist_credit.id INNER JOIN artist_credit_name ON artist_credit.id = artist_credit_name.artist_credit WHERE artist_credit_name.artist = 1440 GROUP BY recording.id
Yes, those were the steps I took. I don’t recall the script or wizard I used for the conversion. But I opened the original database dump for recordings in Notepad++ and I could not find the Guns N Roses artist id in its id column.
Recordings don’t have a column with artist ID. Recordings are linked to an artist_credit, which is linked to one or more artist_credit_name, and that has an artist column containing the artist ID.
See also the schema at MusicBrainz Database / Schema - MusicBrainz and the SQL derat posted above.