Metadata is the biggest little problem plaguing the music industry

Fabe56 · May 30, 2019, 5:03am

Hello,

I wanted to share this article from The Verge, an interesting read about how important metadata is in the music industry.

The_King · May 30, 2019, 6:43am

This is why it drives me crazy seeing different ISRCs being merged every day and remasters being merged with older masters. That mass merge tool is the most abused thing on MusicBrainz. I stopped even paying attention to recording merges that use it because it is so out of hand.

Zas · May 30, 2019, 8:23am

Thanks for sharing this interesting article!

Too bad MusicBrainz isn’t cited… If someone has a Verge account, please point to MusicBrainz in the comments or something, thanks!

Merging recordings usually requires a vote, and it is very possible different ISRCs match the same recording.
The definition of a remaster is rather fuzzy in the music industry.

You can read:

mfmeulenbelt · May 30, 2019, 8:24am

The music industry has brought this mess upon itself and they should be the one to fix it. Why does a platform like Spotify have to pay people who write lyrics for different artists? Why don’t the artists pay the people who work for them? If every copyright holder to a recording made sure the people involved in the recording were paid according to their contribution and contract, the situation would be a lot clearer. Platforms would only have to pay the copyright holder and there would be a single entity (the copyright holder) to hold accountable if royalties aren’t paid.

But of course the industry isn’t going to fix it, because it benefits massively: all those unclaimed royalties end up in the pockets of the major players in the industry, and not the artists, songwriters, engineers etc.

jesus2099 · May 30, 2019, 8:30am

The royalties are paid to work authors; recording merges usually bring work relationships to releases that had none.

But then SACEM, JASRAC, etc. have their own lists of royalty recipients, which are not necessarily always identical to the authors, and that’s normal (band credits, public domain, etc.).

Recording merges should not be done for different recordings like remixes. We merge remasters because otherwise we would almost have one full set of recordings per release.

And anyway the royalties do not concern the mastering engineers.

jesus2099 · May 30, 2019, 8:43am

Because Spotify knows how many times it’s played.
You’re describing an alternate system that is not less complex, IMO.

thwaller · May 30, 2019, 8:44am

This is one of the main reasons I am not really active here anymore. To me (and my music collection) the ISRC is important and is a primary identifier. I hear the responses from others and I have to say, 100%, my collection is far cleaner and more correct keeping ISRCs separate. To each their own I guess. I get resistance too on mastering and whether it is a duplicate recording or a separate one.

There is no 100% perfect way to do any of this. In the CD era, the argument to not isolate ISRCs was somewhat relevant. In the digital era, I would NEVER combine ISRCs in my collection.

mfmeulenbelt · May 30, 2019, 8:51am

I don’t think you understand what I suggest. Spotify knows how many times a recording is played, which should result in a single amount of money earned. That amount of money could be paid to the copyright holder, who should know who contributed to the recording because it has the contracts with the contributors. Spotify doesn’t have these contracts, and metadata isn’t going to deliver this information reliably, as the article from the Verge explains.

It’s a simpler chain of ownership: music platform > copyright holder > contributors. Each of the links in the chain knows who to sue if it suspects it doesn’t get paid enough.
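As a rough illustration of the chain described above, here is a minimal sketch of a platform paying a single copyright holder, who then pays contributors according to its own contracts. Everything in it is an assumption made for the example: the function name `split_payout`, the per-play rate, and the contract shares are hypothetical and do not reflect any real platform's rates or any real contract.

```python
from fractions import Fraction


def split_payout(plays, rate_per_play, contract_shares):
    """Split one platform payment along the chain platform > holder > contributors.

    plays           -- number of plays reported by the platform
    rate_per_play   -- amount the platform pays the holder per play (hypothetical)
    contract_shares -- contributor name -> share of the total, taken from the
                       holder's own contracts (hypothetical values)

    Returns (total paid to the holder, payouts per contributor, amount the holder keeps).
    """
    if sum(contract_shares.values()) > 1:
        raise ValueError("contract shares add up to more than 100%")

    # Step 1: the platform only needs the play count to pay the copyright holder.
    total = plays * rate_per_play

    # Step 2: the holder, who has the contracts, pays each contributor.
    payouts = {name: total * share for name, share in contract_shares.items()}

    # Whatever is not contracted away stays with the holder.
    holder_keeps = total - sum(payouts.values())
    return total, payouts, holder_keeps


if __name__ == "__main__":
    total, payouts, keeps = split_payout(
        plays=1000,
        rate_per_play=Fraction(1, 250),  # made-up rate: 0.004 per play
        contract_shares={"songwriter": Fraction(1, 4), "engineer": Fraction(1, 20)},
    )
    print(total, payouts, keeps)
```

The platform never needs to know who the songwriter or engineer is in this model; that knowledge lives only in `contract_shares`, which belongs to the copyright holder. Exact fractions are used instead of floats so the split always adds back up to the total.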

thwaller · May 30, 2019, 8:58am

I think you might be missing a few things. The copyright (c) holder can change depending on where / how / when a recording is released. Do you maybe mean the ( p ) holder? It is still a copyright, so I ask for clarification there.

RE: “It’s a simpler chain of ownership: music platform > copyright holder > contributors.” – maybe I misunderstand, but I disagree with this completely. Plus, again, need distinction between (c) and ( p ).

mfmeulenbelt · May 30, 2019, 9:08am

Well, I was speaking of the copyright holder of the recording, so phonographic copyright. My chain of ownership was just an example though, the point is that each link only communicates with the next link it has a contract with. The music platform doesn’t have a contract with the studio engineer or the songwriter, and metadata is not reliable, as the article shows.

If you disagree completely, please say why. It’s not much of a discussion otherwise.

thwaller · May 30, 2019, 9:23am

Well, there are a few points. I would like to start at mastering. MB tends to disregard mastering at the recording level and apply it to the release. This is very problematic. Mastering has many levels … you can master a recording, then you can master a release (like iTunes, CD, etc.) and there is also a remaster which could involve the recording level, the release level or even both.

Second, I would like to touch on the platform portion. There are items like iTunes. You have iTunes original, iTunes Plus and MFiT (Mastered for iTunes). The list of contributors will likely be different if a recording is released on more than one of the standards. Regarding digital platforms, there is a whole can of worms here. If one wants to consider a Spotify stream different from a Google Music stream for example, then we must also consider all digital download containers and methods of encoding, including bitrate, codec, sample rate, etc., to be different as well.

MB is missing a level of distinction between the recording and release. That is where the ISRC really fits in, at least in my opinion. Then you can have numerous “releases” of streaming when in reality they are all the same release, if that makes sense. Kind of a distinction between the release and the delivery of the release.

mfmeulenbelt · May 30, 2019, 9:57am

Please note that my idea has nothing to do with MusicBrainz. Metadata isn’t going to solve the original problem, it’s a red herring. The only way to solve this mess is through clear contracts.

Each of these mastering jobs is done (or should be done) on a contract. The only legal obligation is between the mastering engineer or company and the copyright holder of the mastered release or recording. The only payments should be between those two parties. Streaming platforms and stores are not in the picture here (unless they are the ones contracting out the mastering).

Streaming platforms shouldn’t just be ripping CDs and putting them online for streaming, and I really doubt any of them do that. They are in contact with a copyright holder to add recordings or releases to their platform. They agree on a contract, and pay the copyright holder and only the copyright holder. The copyright holder should take care of its own contracts further down the chain.

And this is where it goes wrong in the current system. The music platform is supposed to pay not only the copyright holder they have a contract with, but also a number of contributors like songwriters. But it doesn’t really know these songwriters. There are no contracts between the streaming platform and the songwriters. There is supposed to be metadata, but we all know that is not reliable and never will be. So many of these contributors just don’t get paid and all the companies between the consumer and the songwriter, engineer, producer etc. become extra rich.

It does make sense, but there is a trade-off I think. Adding this extra level also adds complexity for users, and MusicBrainz’ database schema already is very complex. Would the benefits of this extra information outweigh the added difficulty for users? A database with millions of different users and many different use cases is always going to be a compromise.

jesus2099 · May 30, 2019, 10:33am

I don’t know if all copyright holders are trustworthy or reliable…
I think artists feel more at ease with current royalty collectors/distributors such as JASRAC, SACEM, GEMA, etc. I think it’s better with third parties, with no “direct interest” in this or that song.

thwaller · May 30, 2019, 10:37am

Yes, I can see now that I did go in a different direction than you intended.

To the last portion on the trade-off, all I can say is the article posted did say the following:
“ATTEMPTS TO CREATE A GLOBAL CENTRALIZED DATABASE FOR SONG METADATA HAVE ALWAYS ENDED IN FAILURE”
My thought would be how to change that, but there is the problem. It takes effort and detailed work, and no one has stepped up to the plate as of yet.

reosarevok · May 30, 2019, 11:54am

To be fair, that is absolutely intentional: our guidelines call for merging remasters. It’s not bad editing.

psychoadept · May 30, 2019, 11:58am

As witnessed often, most recently: New editors: a guided, staged, self-training approach - LONG - #15 by rvb

mfmeulenbelt · May 30, 2019, 12:09pm

And nobody will, so why keep looking for a solution there?

rob · May 30, 2019, 1:01pm

Hi all!

A lot of good points have been raised here and the intractability of this problem is quite clear. I’ve spent years thinking about how to make an impact or otherwise improve this mess. It really feels that the incumbent industry doesn’t want any change; they don’t care if artists get paid or not.

I care.

Could something like this allow us to build a parallel-universe music industry with proper values and proper rewards for people who do real work?

I’m not convinced it could work, but I am interested in starting a conversation.

Thoughts?

thwaller · May 30, 2019, 9:20pm

I think we may still be on different angles here, so please understand that I speak with MB as my primary concern. This is a long post, so if you have no interest, don’t read it. I make an effort to explain my thoughts vs just listing them with no explanation.

First off, I think the best MB can ever do is to represent a release as accurately as possible. To say no one will ever get a good database so why try is not how things work, or you would not be posting to a forum on the internet, as no one would have bothered to invent it. But as you have mentioned, this comes at a price, and one must consider the trade-off as diminishing returns come into play. The music industry has its own issues dealing with royalties and how they are paid, but I will add that I believe other high-profile systems (like MB) do not help. Using iTunes as an example, and I do this because of the digital sales that I am aware of, iTunes is at the top, being the general winner in the area of volume, consistency of metadata, and correctness of metadata. Please note that I use correctness cautiously.

The ISRC, in my opinion, is one of the most important pieces of metadata for digital releases. Let me explain. With vinyl, cassette, CD, etc, you have things like a barcode which is visible on the release. That is a MAJOR identifier of a release. Please note that I am aware that nothing is ever perfect and even same barcodes can and do appear on different releases. Some CDs do embed ISRC, but that metadata is not as easy to extract and is not always even there. Additionally, you have a physical media. There is printing and well, the look of the media as a whole. If that is all captured, you can with good confidence identify a release, thus also being able to follow the chain to who did and/or gets paid for what.

Now we live in different times. Rather than royalties coming from radio and media sales, streaming is a major portion of that revenue and physical media sales continue to diminish. This complicates things because if a song is listened to, how do we know how to properly identify it? As I see it, at a radio station, this is easy. The station is provided the music directly with the identifiers, so when they play a recording they should know exactly what they are playing. When it comes to streaming sites, I sort of agree on your angle and your points. As it relates to MB, that is where I disagree. You are correct, I believe, that with a source like Spotify it is easy to trace the play back to the royalty. Spotify was given the recording, and the organization that provided them that recording and metadata should know what they provided. The same applies to Apple Music on streaming. Where this gets muddy for MB is that we do not know what we do or do not know. What? I mean that Spotify and Apple disclose to us a set of metadata that can be extracted from the sites, for example as JSON. We know that, but what we do not know is if there is more data that is not disclosed to us users that can identify those recordings. Now realistically, we know that not all is disclosed to us.

Please stay with me, I am trying to explain in enough detail to create the conversation, as I agree it can be helpful… Now we get to digital download sources, like iTunes, Amazon, etc. We have the same issue here. The releases all contain metadata put in place by the vendor/distributor. That data has the same issue as with streaming: what is it and what is it not? You may have one barcode on the iTunes file and a different barcode on the Amazon file and in reality they could be the same product, meaning that it is all from the same source. We simply do not know. Although we cannot be sure there, we can use the metadata and attributes like container, encoder, encoder settings, etc. to identify it. That is the same logic as a physical release… we use all we can to identify something. On a CD, the color of the case matters in MB but an MP3 is considered no different from an M4A, and that makes no sense to me at all given the scope of physical releases.

It is my opinion that in order to become a leader in music metadata, you need to be more accurate in representing a release. I speak not of physical media, as that is fairly well done. But over the last 20 years or so, it no longer applies to current releases as it once did, and no one is adapting. What I do is represent a release to start with. So I have a vendor (what MB sometimes does and sometimes does not consider to be a release label), ISRC, store ID(s), etc. If any of those attributes change, I do not consider it a duplicate recording. So if I have release ABC from iTunes, Amazon, and other companies, I absolutely consider those different releases and different recordings. This is where that layer between the release and recording is missing in MB. As a user, those are certainly different. But as it relates to the source / master, they may be all the same. Oftentimes, we really do not know. That also opens up again the point I had on mastering and where it does and does not apply.
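The identification rule described above (vendor, ISRC and store ID together, with any change meaning "not a duplicate") can be sketched roughly as follows. This is only an illustration under assumptions: the class `TrackIdentity`, the helper `group_by_identity`, the file names, the store IDs and the ISRC value are all made up for the example, and this is not how MusicBrainz models recordings.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackIdentity:
    """Hypothetical identity key: every attribute must match to be a duplicate."""
    vendor: str    # e.g. the store or distributor the file came from
    isrc: str      # ISRC as delivered in the file's metadata
    store_id: str  # vendor-specific track or product ID


def group_by_identity(files):
    """Group (file_name, TrackIdentity) pairs; only exact matches share a group."""
    groups = defaultdict(list)
    for file_name, identity in files:
        groups[identity].append(file_name)
    return dict(groups)


if __name__ == "__main__":
    # Made-up values: the same ISRC delivered by two different vendors.
    from_itunes = TrackIdentity("iTunes", "ZZAAA1900001", "111")
    from_amazon = TrackIdentity("Amazon", "ZZAAA1900001", "222")

    library = [
        ("song.m4a", from_itunes),
        ("song.mp3", from_amazon),
        ("song_copy.m4a", from_itunes),
    ]
    for identity, names in group_by_identity(library).items():
        print(identity, names)
```

Because the dataclass is frozen, equality and hashing use all three fields, so the iTunes and Amazon files stay in separate groups even though they share an ISRC, while the two iTunes copies collapse into one.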

How can I relate this vs just posting a ton of words? Let’s say that I am a “personal Spotify” and I broadcast from my music collection. Now, I need to pay royalties for each recording I stream. I know who to add a play to because I can identify what I played. I might have song 123 and play it, but I also know what release it came from, what vendor/distributor it came from, etc. From there, it is up to that source to do the same and know where their product comes from. On my end, the best I can do is accurately represent what I have. Does this mean that I end up with a lot of duplicate recordings? Sure. As an end user with 10+ terabytes of music, am I OK with those duplicates? Absolutely. I am happy to go into why, but that is not so relevant here, so I will not go into that. So for me, to use MB to identify and index my music is a downgrade. If I were to tag my files in MB, I would lose data and accuracy. This makes MB useless to me as a tagging source. I am able to use MB though to get data on the release side, but other than that, I can only use it for things like lookups (what bands did this artist perform with, what releases include this recording, etc). But for further detail, MB drops the ball and my metadata as provided by the vendor is far more accurate to thoroughly identify what the recording actually is.

So to my concern, which is MB: it cannot be targeted to me as a tagging source. It cannot be marketed to record labels to identify their royalties and all that mess. I cannot use it directly to identify a recording, Shazam-style. It cannot properly identify remasters, in the sense that one CD can have a piss-poor master and another a great master, but MB marks them as a duplicate recording (which they most certainly are not from the perspective of the listener), etc. What is MB trying to accomplish in the modern day? It is amazing from a historical perspective, for cassettes, vinyl, CDs, etc., that is for sure. So my angle on this is that MB does nothing to contribute to the modern era of music. This is also then indirectly contributing to the problem of royalties. Although the music industry has its own issues with it, MB does not even accurately account for what they do have. I hope that explains my thoughts vs just tossing my opinion out there on this. It is one of the main reasons I do not spend my time on edits anymore, as I feel I just dry the top of the tire as it continues to roll through the water. A lot of work and nothing real gets accomplished. I do take the time to discuss here though, as I believe there is good to be done, and if I can help shape that for the better, then it is time well spent.

thwaller · May 30, 2019, 9:26pm

@rob - I did not see your post until now. I care in the sense that a few years back I looked for a quality metadata source, joined MB, and although I am not a top contributor, I have contributed, in my opinion, more than just a casual user and with reasonable detail.

I am happy to provide input, if desired, to the cause.