Metadata is the biggest little problem plaguing the music industry

Oh look, there is even a mockup:


I posted this on the wrong thread, so I am reposting here, as appropriate:

Yes, as a collective, changes have been discussed in numerous places. Hence the issue I had with a direct question … redefine this. I believe what @Zas was proposing is that I offer a suggestion of sorts. I have many ideas, but I would like what I contribute to be a community effort. If I were to put together some form of data flow diagram and structure, is there a format that works for everyone here?

I am not a dev or maintainer of the database here … but I believe the next step is a proposed structure that shows a different inheritance and attribute placement. IMO, it is best to start with the current chart. Is this available? Either way, I can start, but as a community, we can put together a UML class diagram, DFD, etc. for a new way for a release to be built.

Rather than rambling, let’s do it.

I am (was) one of those editors. I was making an effort at this, but I found the fruits of my labor had gone rotten.

EDIT: Meaning, I would thusly start again to add all of the ISRC references I have. As some know, ISRC is important to me, so this is a long list.

No, @chaban meant it was a pity only a few editors were referencing their ISRC edits with edit notes. :stuck_out_tongue_winking_eye:


@thwaller - thanks for your thoughts!

I will let the rest of my team (thanks @zas, @reosarevok) run with the discussion about the releases.

As for your points about the music industry, I agree – it is overly complex and MB does have a hard time getting beyond what I call public metadata (the data that is freely available – like on the back of a CD). ISRCs are a great example of this – the centralised ISRC database is missing and MB end users do not clearly understand what an ISRC is for.

But really, this is just the tip of the iceberg when it comes to problems in the industry. Lack of good data, a lack of a fair/level playing field for new artists, opaque accounting systems, cronyism and many more issues make the current industry an absolute minefield for anyone trying to enter the space. A perfect nightmare, really.

I’ve seen many companies come in with the best of intentions to try and “fix” the problems in the industry. Most recently a wave of companies all wielding blockchains had all the answers and one by one they all died out – mostly kept out by the industry itself. An industry resistant to any sort of change.

So my question is this: How do we build a parallel universe music industry? Rather than change the existing system, can we imagine a better system with fewer intermediaries who are taking money without adding real value? Can we pay artists and curators fairly? Can we make it more interesting for new entrants into the market, whether they are artists, curators or computer geeks?

If you have thoughts towards this goal, I would love to hear them!


@rob - sorry for writing a book.

IMO, this project, if it is to succeed, will require close attention, so I would suggest we start really basic. Given a release that I, a consumer of music, purchase, what attributes are important, and not only that, but why are they important? This can help develop relationships within the data. I think MB is quite good at physical media, but it is worth looking to see if something could be tweaked. Digital releases are a true nightmare to catalog in combined form, meaning that we do not simply duplicate each store’s separate catalog as separate releases. Looking at those releases, I think we can start with what they have in common:

  1. Artist for release
  2. Title of release
  3. The release event (as MB terms it)
  4. The core songs included + extras – meaning that the core is the portion common to all versions (if it applies) and the extras are bonus tracks or extra tracks that make them deluxe. The order does not matter, just the content in common, as sometimes the deluxe version does not order the core in the same way as the standard.
  5. Identification…

Identification, I believe, is where the release object needs to be a child of release events by store. We can have a release (made of artist, tracks, title, etc.) released by any number of stores, but the release itself on a descriptive level is the same. Once we have a child object, which can be applied to a release of all types and mediums (CD, digital, etc.), we then define the specifics of the release, which for digital is commonly the store’s release. I see it as similar to the BMG releases where the barcode is/was replaced by the BMG number, thus making it a different release, but the same in almost all other ways.

So I might have a release object created, then go here to the parents:

  1. iTunes
    1a. iTunes release ID
    1b. iTunes store ID
    1c. etc…
  2. Amazon
    2a. ASIN
    2b. Store ID
    2c. etc…

We can then include on those parent release objects the details of the release, like the bitrate, sample rate, container or anything else that describes what that store released under its ID. When adding those attributes, I believe it wise to have the option (set true by default) to apply them to all recordings or not. This will normally be the case, but sometimes a release is not that way. There might be things like bonus tracks, videos, etc. that have different attributes than the rest of the bunch.
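To make the "apply to all recordings by default" idea concrete, here is a minimal sketch, not a proposal for the actual MB schema. The class names `Release` and `StoreRelease`, the field names, and all the sample values are hypothetical placeholders; it only shows store-level attributes being inherited by every track unless a track overrides them.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Release:
    """Store-independent description of a release."""
    artist: str
    title: str
    tracks: List[str]


@dataclass
class StoreRelease:
    """What one store released under its own IDs (hypothetical model)."""
    release: Release
    store: str
    identifiers: Dict[str, str] = field(default_factory=dict)
    # Attributes applied to all recordings by default.
    default_attributes: Dict[str, str] = field(default_factory=dict)
    # Exceptions for individual tracks, e.g. a bonus video.
    track_overrides: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def attributes_for(self, track: str) -> Dict[str, str]:
        """Return the defaults merged with any override for this track."""
        if track not in self.release.tracks:
            raise KeyError(f"{track!r} is not on {self.release.title!r}")
        merged = dict(self.default_attributes)
        merged.update(self.track_overrides.get(track, {}))
        return merged


# All names and identifiers below are made-up placeholders.
album = Release("Example Artist", "Example Album", ["Song A", "Song B", "Bonus Video"])
itunes = StoreRelease(
    release=album,
    store="iTunes",
    identifiers={"release_id": "placeholder-1", "store_id": "placeholder-2"},
    default_attributes={"container": "m4a", "bitrate": "256 kbps"},
    track_overrides={"Bonus Video": {"container": "m4v"}},
)
print(itunes.attributes_for("Song A"))
print(itunes.attributes_for("Bonus Video"))
```

The same `album` object could be attached to an Amazon `StoreRelease` with an ASIN in `identifiers`, so the descriptive part is entered once and only the store specifics differ.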

Trying to keep this short (not really working), the same concepts apply for recordings. Forgive my likely misuse of terms as per MB, but follow the idea. We start with a work. We then get a performance of that work which is performed by a group of artists. We then use that performance as a recording to apply to releases, and when this is done, the ISRC applied is the proper ISRC used for that recording on that release (this was the idea mentioned prior). So on and so on…
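The work, performance, recording and ISRC chain described above could be sketched roughly like this. It is only an illustration under my own assumptions: the class names are hypothetical, they do not match MB's actual entities, and the ISRC strings are placeholders rather than real codes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Work:
    title: str


@dataclass(frozen=True)
class Performance:
    """One performance of a work by a group of artists."""
    work: Work
    performers: Tuple[str, ...]


@dataclass
class Release:
    title: str
    # ISRC used for each performance on this particular release.
    isrc_for: Dict[Performance, str] = field(default_factory=dict)


def isrcs_across_releases(performance: Performance, releases: List[Release]) -> Dict[str, str]:
    """Map release title -> ISRC for every release that uses the performance."""
    return {r.title: r.isrc_for[performance] for r in releases if performance in r.isrc_for}


song = Work("Example Song")
take = Performance(song, ("Example Artist", "Example Band"))
# The ISRC strings below are made-up placeholders, not real codes.
standard = Release("Example Album", {take: "PLACEHOLDER-ISRC-1"})
deluxe = Release("Example Album (Deluxe)", {take: "PLACEHOLDER-ISRC-2"})
print(isrcs_across_releases(take, [standard, deluxe]))
```

Keeping the ISRC on the link between a performance and a release, rather than on the performance itself, is the design choice that lets the same performance carry a different ISRC on a different release.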

That is by no means a full plan for anything, just starting the idea. I also keep in mind complexity. As it stands, editors in MB do a lot of things wrong, and I mean that without judgment. When I first made edits they were a mess. Compared to then, my edits now (after many auto editors knocked me around :)) are better. But still, my edits are not as complete as they could be, and I am sure there are still mistakes, sometimes inadvertent and sometimes because I do not know I am making a mistake. I also look at databases like Wikipedia and I am amazed at how much editor time is spent reversing other editors’ edits. That is why I think a more object orientated approach could be better. If someone wants to just add a list of recordings, they can do so. I can add release A, from artist B, containing these 10 recordings. I need not specify more; someone else can always create parents from it to show the attributes of the specific release.

Then you could have a set of moderation, which I believe is loosely there right now. You could have editors that watch for releases added with no parents and add one or many, etc. and even have editors that watch specifically for things like store releases to glance over the data entered. There is one auto editor who comes to mind for me that does/did this with release labels. I would have a good feeling knowing that an editor like that was watching over edits being made including my own. This means not that one verifies each and every thing, just that the obvious is looked for and all gets an even random once over.

The last item I wanted to add before completing the novel is references. References are an important part of a release and a very difficult one at that. Is Discogs a valid reference, is Wikipedia a valid reference, etc? Well it all depends on what their reference is. I could literally go to Discogs and say that Drake’s real name is mine, then go to MB and use that reference to say the same. Also, I believe that references should have the ability to be applied to specific aspects. For example, I might use a digital reference for a CD release. Example of this might be to list performers. I then mark this as a reference for performers, thus others can ignore that it is the wrong release medium. References are sometimes hard to find, but I think that all references (assuming we speak of valid ones) are valid.
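As a rough sketch of references that apply only to specific aspects: the `Reference` class, the aspect names, and the example.org URLs below are all hypothetical. The point is just that a reference records what it is being used for, so a digital source can back the performer credits of a CD release without being mistaken for a reference to the CD itself.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Reference:
    url: str
    medium: str                # medium of the source itself, e.g. "digital"
    supports: FrozenSet[str]   # aspects this reference is used for


def references_for(aspect: str, refs: List[Reference]) -> List[Reference]:
    """Return only the references marked as supporting the given aspect."""
    return [r for r in refs if aspect in r.supports]


# References attached to a hypothetical CD release; URLs are placeholders.
refs = [
    Reference("https://example.org/digital-release", "digital", frozenset({"performers"})),
    Reference("https://example.org/cd-scan", "CD", frozenset({"barcode", "tracklist"})),
]
print([r.url for r in references_for("performers", refs)])
```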

EDIT: I failed to address making sure credits are applied appropriately. With the creation of the more detailed objects, MB can quickly show what artists, performers, engineers, studios, etc. are involved in what. This can apply from the work, the recording, etc. all the way to the store it was sold on.

I don’t know if you already heard about it, but Jaxsta launched its beta today. From what I have seen so far, it’s really pro-oriented but also sort of MB-like.

It is unfortunate that they are for profit and charge for subscriptions. That looks like a great source of data for a potential partnership and sharing of data.


Another article I found about the metadata problem, “Why Proper Metadata In Music Is So Important?” by Karl Fowlkes:


Still about data but mostly the streaming side with Spotify and Apple Music for Artists:


Looks like another one will try:


The only thing I know about blockchain is that it requires an ever-expanding number of machines to live on. Which is an ecological aberration, accelerating our race to doom.
So it’s all very good and welcome when a blockchain-based system goes out.


And another article about licensing and metadata. Always the same problems without real action from industry players:


via @rob on IRC:

A goal for the future is for music metadata services–including the open source MusicBrainz database as well as proprietary databases like Jaxsta’s–to make their data accessible in MEAD format through standard query protocols.

Standard for the Communication of Media Enrichment and Description Information (MEAD)


Still on more or less the same subject, Pandora (a music streaming service only available in the USA) adds a new feature this week:

And at the same time, I discovered that the Recording Academy launched its “Behind the Record” campaign yesterday:


FWIW, and since someone mentioned FRBR earlier in the thread, there are relationships that could be used to indicate the remastering process: [MB track 1] (a manifestation in FRBR) could link back to [MB recording 1] (expression) with [ISRC 1], and [MB tracks 2, 3, 4] could link back to [MB recording 2] with [ISRC 2], but then [MB recording 2] could link back to [MB recording 1] with the relationship “revision”, to indicate that the second expression is a version of the first.
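A small sketch of those links, using the labels from the example above as plain strings. The dictionaries and the `original_recording` helper are my own illustration, not how MB stores relationships; "revision" is simply the relationship name suggested above.

```python
from typing import Dict

# Manifestation -> expression: which recording each track points back to.
track_to_recording: Dict[str, str] = {
    "MB track 1": "MB recording 1",
    "MB track 2": "MB recording 2",
    "MB track 3": "MB recording 2",
    "MB track 4": "MB recording 2",
}

# Each recording (expression) carries its own ISRC.
recording_isrc: Dict[str, str] = {
    "MB recording 1": "ISRC 1",
    "MB recording 2": "ISRC 2",
}

# "revision" relationship: the key is a version of the value.
revision_of: Dict[str, str] = {"MB recording 2": "MB recording 1"}


def original_recording(recording: str) -> str:
    """Follow "revision" links back to the earliest recording."""
    seen = set()
    while recording in revision_of:
        if recording in seen:
            raise ValueError("revision links form a cycle")
        seen.add(recording)
        recording = revision_of[recording]
    return recording


remaster = track_to_recording["MB track 3"]
print(remaster, recording_isrc[remaster])
print(original_recording(remaster), recording_isrc[original_recording(remaster)])
```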