Metadata is the biggest little problem plaguing the music industry


Release is an aggregate and stands beside the other two. On the atomic level MB goes like this: work -> recording -> track (which not coincidentally mirrors FRBR’s/LRM’s work -> expression -> manifestation, a model that NGS was to a certain extent based on). But ok, I’m arguing semantics here.

Now that’s something I can’t agree with at all! Although I’ve been using MB for my personal collection all these years, from an idealistic point of view I still genuinely believe in MB’s striving to be the perfect/ultimate music database (and eventually a cultural one, indexing other art forms too). Which means it shouldn’t repeat the industry’s cataloguing mistakes.

However, I do agree with you that, ideally, it should fully map the industry’s standards.

Multiple ISRCs jumbled into one MB recording is simply lost data. So, if our own definition of a recording doesn’t match the industry’s (for whatever reason: remasters, erroneous code assignment by different registrants, etc.), then we need to go down to a more granular level (recording -> track) to reflect all the exceptions.

Now I don’t think I’m in a position to discuss the technical side since that’s not my cup of tea. I can only present the idea for a specific case:

  • ISRCs remain attached to MB recordings
  • Two separate masters of a recording are represented as different MB tracks
  • each ISRC can be optionally mapped to different sets of tracks

I actually encountered a situation like this not long ago, and it has already been discussed in another thread:

  • The exact same recording, Godmother by Holly Herndon and Jlin, was first released as a single in 2018 and assigned the code GBAFL1800302.
  • It was reissued as part of the 2019 album, spanning multiple digital, CD and vinyl issues, with a new ISRC, GBAFL1800329.

I don’t think there’s any need to question or ponder on this decision, only to reflect it properly. That is, the [MB recording 1] contains two ISRC codes, where [ISRC 1] maps to [MB track 1] and [ISRC 2] maps to [MB track 2], [MB track 3] and [MB track 4].
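The mapping described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the MusicBrainz schema: the `Recording` class, its `isrc_tracks` field and the track labels are hypothetical names, while the two ISRCs are the ones from the Godmother example.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class Recording:
    """Hypothetical stand-in for an MB recording (not the real schema)."""
    title: str
    # ISRC -> IDs of the tracks that carry this code.
    # An empty set means the ISRC is attached but not mapped to tracks yet.
    isrc_tracks: Dict[str, Set[str]] = field(default_factory=dict)

    def add_isrc(self, isrc: str, tracks: Iterable[str] = ()) -> None:
        """Attach an ISRC to the recording, optionally mapping it to tracks."""
        self.isrc_tracks.setdefault(isrc, set()).update(tracks)

    def isrcs_for_track(self, track_id: str) -> List[str]:
        """Return the ISRCs mapped to one specific track."""
        return sorted(i for i, t in self.isrc_tracks.items() if track_id in t)


godmother = Recording("Godmother")
godmother.add_isrc("GBAFL1800302", ["track 1"])  # 2018 single
godmother.add_isrc("GBAFL1800329", ["track 2", "track 3", "track 4"])  # 2019 album issues
```

Both codes stay on the one recording, and `godmother.isrcs_for_track("track 3")` only returns the album code, so neither assignment is lost.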

6 Likes

This is an interesting approach. Not wanting to argue over semantics as you have also stated, the object that represents the recording could hold multiple ISRC codes that would then be a key in the relationship to the album or release object. My thought was to make the recording object unique, but this may very well be a better approach.

I wanted to clarify this. What I mean is that the problems created within the music industry are not something anyone but the industry can fix. I think it is still important to capture that data as-is. However, MB can add some intelligence to that and help close the gap, so to speak.

Thanks for the example. I am sure such cases do exist, and your idea seems to be a solid way to address them. For me, using ISRC is ideal as I can query metadata for a code and get what I am looking for. Nothing else allows me to accomplish the same. I can then create releases in a more virtual sense, so if the same ISRC is used on 4 releases, those releases can all use the same recording, generally speaking.

For me, MB is not really usable. I tried and contributed and learned while doing so. What I am sad to say is that I have replaced literally all of the files that I used MB to tag, as I wanted the original metadata back. I believe that there should be a different methodology to Picard, but that is a different topic. Please do not misunderstand me, I do appreciate MB and what it is trying to do or I would not take the time to type this. MB is king when it comes to physical releases, the best I have used. Thinking of myself as just a guy who listens to music, and given that my music is almost all digital now, MB does nothing to help me. I believe that MB has built a solid brand and positioned that brand as a leader. Even if my thoughts are all disregarded, I want to see MB transition to the digital era.

1 Like

Actually a very good idea IMHO.

In practice, the industry actually assigns ISRCs to tracks rather than recordings, and this is why we end up with multiple ISRCs for one recording.
Also, definitions of remaster/edit (= whether to create a new ISRC or not) vary a lot in the industry; they will never match MB’s definition perfectly anyway.

@yvanzo @Bitmap: What’s your (technical) opinion on this idea? Is it feasible? Any drawbacks?

5 Likes

Drawbacks are mostly on the UI side: we would need to figure out a way to do this assigning. The schema is always easy, how to make it clear and understandable is almost always hard :slight_smile:

I don’t think it’s impossible, though. There’s also been talk of just assigning ISRCs to tracks only, which is also a possibility I guess. Either of them would require fairly big changes to the UI but might still be doable in a sensible way.

3 Likes

Being able to add ISRCs only from the release level (to add them to tracks) would be a great improvement for the quality of additions, IMO. :slight_smile:

9 Likes

@Zas - I find it hard to directly answer your question on making definition changes right off the start here. There seems to be conversation on where attributes should be placed, and it seems like it is in a positive direction. The definitions, I would think, will depend on which objects hold which attributes.

Looking at past discussions, I found 2 elements to add.

  1. A digital release could be all-inclusive in the sense that references could carry attributes of their own, as their own object tied to the release. In this, you can store the store IDs, store barcodes, etc. This would greatly reduce the digital duplicates, which seem to be a major issue for many, and would also allow the details offered by each store to be captured, like a sub-release, if you will, of the main one. Not sure if that is technically possible given the current database, but it is a thought.
  2. The performers could also be an object of their own tied to the recording. The reason for this is, let’s say you have a recording and a remastered recording that, for whatever reason, was made a new recording due to distinct enough differences. Instead of having to add all the performers again to the new recording, and thus to the entire new release, you could simply attach the performers object to it. The same could apply to recordings that have a normal set of performers and sometimes a guest musician, as in remixes and such. In these cases, you could simply add the normal performer object and add the individual extra performers as needed.
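As a rough sketch of the second point, a shared performers object could look something like the Python below. Everything here is an assumption made for illustration: `PerformerGroup`, `Recording`, their fields and the performer names are hypothetical, and none of it reflects how MusicBrainz stores performer relationships today.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class PerformerGroup:
    """Hypothetical reusable set of performers."""
    name: str
    members: Tuple[str, ...]


@dataclass
class Recording:
    """Hypothetical recording that points at shared performer groups."""
    title: str
    performer_groups: List[PerformerGroup] = field(default_factory=list)
    extra_performers: List[str] = field(default_factory=list)

    def performers(self) -> List[str]:
        names: List[str] = []
        for group in self.performer_groups:
            names.extend(group.members)
        names.extend(self.extra_performers)
        # Drop duplicates while keeping first-seen order.
        return list(dict.fromkeys(names))


# Placeholder names, entered once and reused.
band = PerformerGroup("usual line-up", ("Performer A", "Performer B"))
original = Recording("Song", [band])
remaster = Recording("Song (remaster)", [band])
remix = Recording("Song (remix)", [band], ["Guest C"])
```

The remaster reuses the same `band` object instead of re-entering each performer, and the remix adds only the one guest on top of it.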

I do think that Skeebadoo’s suggestion for ISRC handling is good, and would even say it is probably better than the ideas previously discussed. Someone had told me a while back that such additional levels of data are just not doable given the database structure, but since such things are being discussed now, maybe that has changed. Regardless, it seems like a good idea.

1 Like

Thinking about this idea, is it really best to have the ISRC at the release level vs the recording level as an attribute with a one-to-many relationship to tracks on a release? I mean this so as not to duplicate ISRC entry on all the releases where an ISRC might be used.

For example, maybe as you enter the tracks to a release, it can provide a pulldown to select an existing ISRC or enter a new one. Alternatively, we would need to provide/allow for the user to say unknown and not default to anything.

I just meant that ISRC edits, like most edits, are more trustworthy, IMO, when done per release, because it usually means you know what you are talking about in context, with more proof.

2 Likes

I agree. I was referring to the location table of the data, but yes, it is for sure applied at the release level.

1 Like

Sadly, a note with the source of the ISRCs is made by only a handful of editors.

A mapping of tracks-recordings would be really nice.


Recently I had a case of an artist who changed his digital distributor and got new ISRCs by mistake. He also got new UPCs. Nothing else changed, he said.

1 Like

That’s right.
I do add edit notes, but it’s really inconvenient because I have to go back to my edit history to add the edit note after the edit is made.
It would be more convenient to write the edit note at the time of the edit, and for the ISRC submit tool to fill in its technical details there:

3 Likes

Oh look, there is even a mockup:

2 Likes

I posted this on the wrong thread, so I am reposting here, as appropriate:

Yes, as a collective, changes have been discussed in numerous places. Hence the issue I had with a direct question … redefine this. I believe what @Zas was proposing is that I offer a suggestion of sorts. I have many ideas, but I would like what I contribute to be a community effort. If I were to put together some form of data flow diagram and structure, is there a format that works for all here?

I am not a dev or maintainer of the database here … but I believe the next step is a proposed structure that shows a different inheritance and attribute placement. IMO, it is best to start with the current chart. Is this available? Either way, I can start, but as a community, we can put together a UML class diagram, DFD, etc. on a new way for a release to be built.

Rather than rambling, let’s do it.

1 Like

I am (was) one of those editors. I was making an effort at this, but I found the fruits of my labor to be rotten.

EDIT: Meaning, I would thusly start again to add all of the ISRC references I have. As some know, ISRC is important to me, so this is a long list.

1 Like

No, @chaban meant it was a pity only a few editors were referencing their ISRC edits with edit notes. :stuck_out_tongue_winking_eye:

2 Likes

@thwaller - thanks for your thoughts!

I will let the rest of my team (thanks @zas, @reosarevok) run with the discussion about the releases.

As for your points about the music industry, I agree – it is overly complex and MB does have a hard time getting beyond what I call public metadata (the data that is freely available – like on the back of a CD). ISRCs are a great example of this – the centralised ISRC database is missing and MB end users do not clearly understand what an ISRC is for.

But really, this is just the tip of the iceberg when it comes to problems in the industry. Lack of good data, a lack of a fair/level playing field for new artists, opaque accounting systems, cronyism and many more issues make the current industry an absolute minefield for anyone trying to enter the space. A perfect nightmare, really.

I’ve seen many companies come in with the best of intentions to try and “fix” the problems in the industry. Most recently a wave of companies all wielding blockchains had all the answers and one by one they all died out – mostly kept out by the industry itself. An industry resistant to any sort of change.

So my question is this: How do we build a parallel universe music industry? Rather than change the existing system, can we imagine a better system with fewer intermediaries who are taking money without adding real value? Can we pay artists and curators fairly? Can we make it more interesting for new entrants into the market, whether they are artists, curators or computer geeks?

If you have thoughts towards this goal, I would love to hear them!

12 Likes

@rob - sorry for writing a book.

IMO, this project, if it is to succeed, will require close attention, so I would suggest we start really basic. Given a release that I, a consumer of music, purchase: what attributes are important, and not only that, but why are they important? This can help develop relationships within the data. I think MB is quite good at physical media, but it is worth looking to see if maybe something could be tweaked. Digital releases are a true nightmare to attempt to catalog in combined form, meaning that we do not simply duplicate each store’s separate catalog as separate releases. Looking at those releases, I think we can start with what they have in common:

  1. Artist for release
  2. Title of release
  3. The release event (as MB terms it)
  4. The core songs included + extras – meaning that the core is the portion common to all (if it applies) and the extras are bonus tracks or extra tracks that make them deluxe. The order does not matter, just the content in common, as sometimes the deluxe edition does not order the core in the same way as the standard.
  5. Identification…

Identification, I believe, is where the release object needs to be a child of release events by store. We can have a release (made of artist, tracks, title, etc.) released by any number of stores, but the release itself on a descriptive level is the same. Once we have a child object, which can be applied to releases of all types and mediums (CD, digital, etc.), we then define the specifics of the release, which for digital is commonly the store’s release. I see it as similar to the BMG releases where the barcode is/was replaced by the BMG number, thus making it a different release, but the same in almost all other ways.

So I might have a release object created, then go here to the parents:

  1. iTunes
    1a. iTunes release ID
    1b. iTunes store ID
    1c. etc…
  2. Amazon
    2a. ASIN
    2b. Store ID
    2c. etc…

We can then include on those parent release objects the details of the release, like the bitrate, sample rate, container or anything else that describes what that store released under its ID. When adding those attributes, I believe it wise to have the option (set true by default) to apply them to all recordings or not. This will normally be the case, but sometimes a release is not that way. There might be things like bonus tracks, videos, etc. that have different attributes than the rest of the bunch.
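To make the "apply to all recordings by default" option concrete, here is a minimal Python sketch. It assumes a hypothetical `StoreRelease` object with made-up field names, and the store name, ID and attribute values are placeholders rather than real data.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class StoreRelease:
    """Hypothetical store-specific object attached to a release."""
    store: str
    identifiers: Dict[str, str] = field(default_factory=dict)
    # Attributes that apply to every track unless overridden.
    defaults: Dict[str, Any] = field(default_factory=dict)
    # Track position -> attributes that differ for that track only.
    overrides: Dict[int, Dict[str, Any]] = field(default_factory=dict)

    def set_attribute(self, name: str, value: Any,
                      apply_to_all: bool = True,
                      position: Optional[int] = None) -> None:
        if apply_to_all:
            self.defaults[name] = value
            return
        if position is None:
            raise ValueError("position is required when apply_to_all is False")
        self.overrides.setdefault(position, {})[name] = value

    def attributes_for(self, position: int) -> Dict[str, Any]:
        """Defaults for the release, with any per-track override on top."""
        return {**self.defaults, **self.overrides.get(position, {})}


# Placeholder values for illustration only.
store_release = StoreRelease("Example Store", {"store_release_id": "PLACEHOLDER-1"})
store_release.set_attribute("container", "FLAC")
store_release.set_attribute("sample_rate_hz", 44100)
# A bonus video at position 11 differs from the rest of the release.
store_release.set_attribute("container", "MP4", apply_to_all=False, position=11)
```

With this shape, the common case needs one entry per attribute, and only the odd bonus track or video needs its own override.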

Trying to keep this short (not really working), the same concepts apply for recordings. Forgive my likely misuse of terms as per MB, but follow the idea. We start with a work. We then get a performance of that work, which is performed by a group of artists. We then use that performance as a recording to apply to releases, and when this is done, the ISRC applied is the proper ISRC used for that recording on that release (this was the idea mentioned prior). So on and so on…

That is by no means a full plan for anything, just starting the idea. I also keep in mind complexity. As it stands, editors in MB do a lot of things wrong, and I mean that without judgment. When I first made edits they were a mess. Compare them to now (after many auto editors knocked me around :)), they are better. But still, my edits are not as complete as they could be, and I am sure there are still mistakes made, sometimes inadvertently and sometimes out of ignorance that I am actually making a mistake. I also look at databases like Wikipedia and I am amazed at how much editor time is spent reversing other editors’ edits. That is why I think a more object-oriented approach could be better. If someone wants to just add a list of recordings, they can do so. I can add release A, from artist B, containing these 10 recordings. I need not specify more; someone else can always create parents from it to show the attributes of the specific release.

Then you could have a layer of moderation, which I believe is loosely there right now. You could have editors that watch for releases added with no parents and add one or many, etc., and even have editors that watch specifically for things like store releases to glance over the data entered. There is one auto editor who comes to mind for me that does/did this with release labels. I would have a good feeling knowing that an editor like that was watching over edits being made, including my own. This does not mean that one verifies each and every thing, just that the obvious is looked for and everything gets an even, random once-over.

The last item I wanted to add before completing the novel is references. References are an important part of a release and a very difficult one at that. Is Discogs a valid reference, is Wikipedia a valid reference, etc.? Well, it all depends on what their reference is. I could literally go to Discogs and say that Drake’s real name is mine, then go to MB and use that reference to say the same. Also, I believe that references should have the ability to be applied to specific aspects. For example, I might use a digital reference for a CD release. An example of this might be to list performers. I then mark this as a reference for performers, so others can ignore that it is the wrong release medium. References are sometimes hard to find, but I think that all references (assuming we speak of valid ones) are valid.

EDIT: I failed to address making sure credits are applied appropriately. With the creation of the more detailed objects, MB can quickly show what artists, performers, engineers, studios, etc. are involved in what. This can apply from the work, the recording, etc. all the way to the store it was sold on.

I don’t know if you have already heard about it, but Jaxsta launched its beta today. From what I have seen so far, it’s really pro-oriented but also sort of MB-like.

It is unfortunate that they are for profit and charge for subscriptions. That looks like a great source of data for a potential partnership and sharing of data.

2 Likes

Another article I found about the metadata problem, “Why Proper Metadata In Music Is So Important?” by Karl Fowlkes:

https://link.medium.com/89WSQyNW1Y

4 Likes