Data quality in MusicBrainz?

Whilst automating the enrichment, standardisation and clean-up of metadata that matters to me and directly affects how I interact with my music, I’ve noticed some data issues I’m curious about.

To add MBIDs to my existing music collection, I pulled a dump of the MusicBrainz database about two weeks back and then pulled the MBID and artist name from the artists table.

Doing some basic housekeeping, I found 19 entries in the artists table where an MBID has no associated name/text:

Am I missing something or are these indeed anomalies?

After you have merged some entities, the old MBID will always redirect to the merged MBID.
Maybe it’s that.

Not sure I’m following.

After eliminating namesakes there are 2,080,637 unique names in the artists table, and there are 328,326 entries where a name appears more than once. These are clearly namesakes, each with their own MBID, which is what enables the music server to differentiate one from another when presenting an artist’s discography, their appearances on VA albums and albums where they’re not the albumartist.
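For anyone wanting to reproduce counts like these, here is a minimal Python sketch. It assumes the (MBID, name) pairs have already been extracted from the dump, and it takes "entries where a name appears more than once" to mean rows whose name is shared by at least one other MBID. The sample rows and the `namesake_stats` helper are made up for illustration and are not part of MusicBrainz.

```python
from collections import Counter

def namesake_stats(rows):
    """Return (unique_names, rows_sharing_a_name) for (mbid, name) pairs."""
    counts = Counter(name for _mbid, name in rows)
    unique_names = len(counts)
    # Rows whose name is used by more than one MBID, i.e. namesakes.
    shared_rows = sum(n for n in counts.values() if n > 1)
    return unique_names, shared_rows

# Hypothetical sample data, not real MBIDs.
sample = [
    ("mbid-1", "Nirvana"),
    ("mbid-2", "Nirvana"),
    ("mbid-3", "Unwound"),
]
print(namesake_stats(sample))
```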

Here’s another example of what looks like a data quality issue:
c69b34a4-3082-4a2f-b063-bd6f177e025f Unwound: A Tribute to George Strait appears in the artists table. That’d seemingly be an album/release as opposed to an artist.

I get no results for that MBID:

No MusicBrainz entities match the MBID c69b34a4-3082-4a2f-b063-bd6f177e025f. Either it’s incorrect, it was for an entity that has since been removed, or it is an ID for something else than an entity (for example, a relationship type).

And the MBIDs from your screenshot all lead to artists starting with | (U+007C: VERTICAL LINE). That doesn’t look like a coincidence.
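A minimal sketch of how that could produce empty names, assuming the MBID and name were exported as `mbid|name` lines and then split on every `|`. The export format is a guess, since it isn't stated above, and the row below is invented.

```python
def split_naive(line):
    """Split on every '|' and keep the first two fields (lossy)."""
    parts = line.split("|")
    return parts[0], parts[1]

def split_once(line):
    """Split only on the first '|', so a '|' inside the name survives."""
    mbid, name = line.split("|", 1)
    return mbid, name

# Hypothetical row: the artist name itself starts with a vertical line.
line = "00000000-0000-0000-0000-000000000000||example"
print(split_naive(line))  # the name comes back empty
print(split_once(line))   # the name keeps its leading '|'
```

Splitting on the first delimiter only works here because an MBID never contains `|`; names that contain whatever delimiter the export uses would need the same care.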


Has little Bobby Tables released an album? :smile:


That’s no coincidence, I should’ve known better than to presume nobody is stupid enough to use a vertical line in their band name, let alone start with one. Wonder how many have started their names with or included U+0009 then.
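On the U+0009 question: if the dump files are in PostgreSQL's `COPY` text format (an assumption on my part; check the dump's own documentation), a tab inside a value is written as the two characters `\t`, so a literal tab only ever separates columns. A rough sketch of decoding one such line follows; `decode_copy_line` is a hypothetical helper, and it ignores the octal and hex escapes that format also allows.

```python
import re

# Backslash escapes used by PostgreSQL's COPY text format (subset).
_ESCAPES = {"b": "\b", "f": "\f", "n": "\n", "r": "\r",
            "t": "\t", "v": "\v", "\\": "\\"}

def decode_field(field):
    """Decode one column; a bare backslash-N marks NULL."""
    if field == r"\N":
        return None
    # Replace backslash + char; an unknown escape yields the char itself.
    return re.sub(r"\\(.)",
                  lambda m: _ESCAPES.get(m.group(1), m.group(1)),
                  field)

def decode_copy_line(line):
    """Split a COPY text line on literal tabs, then decode each column."""
    return [decode_field(f) for f in line.rstrip("\n").split("\t")]

# Hypothetical row: the name contains a vertical line and an escaped tab.
row = "some-mbid\t|a\\tb\t\\N\n"
print(decode_copy_line(row))
```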


I’m pretty certain there is a post from mayhem bemoaning these creative artist names (in jest of course) for the confusion they bring :stuck_out_tongue:

Someone’s been tracking a few of them:
https://musicbrainz.org/tag/sillyname


What really beats me is that someone cares enough about their music to have gone to the trouble of adding it to MusicBrainz.

Well the aim of the project is to encompass all publicly released music so if that’s released by someone with an affinity for unicode then so be it :grin:
