Metadata is the biggest little problem plaguing the music industry

@rob - I did not see your post until now. I care in the sense that a few years back I looked for a quality metadata source and joined MB. Although I am not a top contributor, I have, in my opinion, contributed more than a casual user would, and in reasonable detail.

I am happy to provide input, if desired, to the cause.

2 Likes

Thanks for this long and constructive post, it raises some interesting questions.
I’m not sure I understand what you are suggesting to improve the situation though.

Can you elaborate on this missing layer ? How would it relate to releases/tracks/recordings exactly ? Would you attach ISRCs to it instead of recordings ?

2 Likes

It seems to me that both of the questions you present are related. I will attempt to explain that layer:
We have a recording; there is an original master and a remaster. They are not the same, although generally speaking, MB considers them as such. Now, at the release level, those recordings can be reused. This differs from MB as it puts the mastering back on the recording.

Trying to keep this short… in today’s messed up era, the ISRC plays a critical role as it relates to the digital aspect of songs. My one suggestion is that two ISRCs should never be combined into the same recording. It is like when I started here and I was told no recording can have more than one ASIN, which is actually false, but this is the same premise as it relates to the origins and royalties.

The missing layer, to be more direct, is the mastering of the recording. See, the original recording is what it is. Now, you do in fact have a second layer of mastering … MFiT, Amazon, FLAC distributors, CDs, etc. That layer is missing. MFiT is a sore point with me, and I wish to avoid that debate here.

In short, MFiT only means that the master supplied to produce “another master” is fine-tuned with the intended second master as its focus. See, with a release, you will master an already mastered recording to the release; please note that I use the term master/mastering in a certain way. On one side, you master when the different mics are combined into a recording; this is a mastering process. You then take those and master them to a release. In older days, that was all the same. In modern days, MFiT is proof it now differs. I can recall proof of CDs being sold as nothing but MP3s pressed onto CD, I did not say burned, but passed to CD. We have the same today with “fake FLACs”.

Digital releases NEED specificity. I personally have MP3 releases and a “duplicate” M4A release. Are they the same? NO. They serve different purposes and are like having a release on CD and cassette. The same can be said on FLAC files of 16 bit and 24 bit, but again, not going there. I mean just to raise the point, not to debate it.

I feel as if I am rambling … A recording that is listed as a recording on more than one release should sound identical on all releases it is used on, have the same meta attributes, etc. If it is unknown, it should be a unique recording, assuming we still have no intermediate layer.

1 Like

I wanted to address this separately. The ISRC is applied to the recording master, not the medium master. The same royalties can apply to 10 CDs all with different barcodes as an example. What is critical is to know what recording, or better what rendition of a recording, is used where.

So you would attach exactly one ISRC to one “master of a recording” entity, right?
And “original recording” would have metadata like today (but not ISRC), and linked to one or more “master of a recording” ?

Different compression algorithms or even audio formats settings = different “master of a recording” ?
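For what it is worth, the structure being asked about here could be sketched roughly as below. This is only an illustration in Python, not the MB schema: the class names `OriginalRecording` and `Master`, the `add_master` helper, and the placeholder ISRC strings are all made up for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Master:
    """Hypothetical 'master of a recording' entity; carries at most one ISRC."""
    description: str
    isrc: Optional[str] = None


@dataclass
class OriginalRecording:
    """The 'original recording': metadata as today, but no ISRC of its own."""
    title: str
    masters: List[Master] = field(default_factory=list)

    def add_master(self, description: str, isrc: Optional[str] = None) -> Master:
        # Assumed rule from the discussion: one ISRC identifies exactly one master.
        if isrc is not None and any(m.isrc == isrc for m in self.masters):
            raise ValueError(f"ISRC {isrc} is already attached to another master")
        master = Master(description, isrc)
        self.masters.append(master)
        return master

    def isrcs(self) -> List[str]:
        # ISRCs are reached through the masters, never stored on the recording.
        return [m.isrc for m in self.masters if m.isrc is not None]


# Placeholder values, not real ISRCs.
recording = OriginalRecording("Example Song")
recording.add_master("original master", "XXAAA0000001")
recording.add_master("later remaster", "XXAAA1900001")
print(recording.isrcs())
```

Whether a different compression algorithm or format setting should create a new `Master` is exactly the open question; the sketch takes no position on it.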

Current definition of a recording is:

A recording is an entity in MusicBrainz which can be linked to tracks on releases. Each track must always be associated with a single recording, but a recording can be linked to any number of tracks.

A recording represents distinct audio that has been used to produce at least one released track through copying or mastering. A recording itself is never produced solely through copying or mastering.

Generally, the audio represented by a recording corresponds to the audio at a stage in the production process before any final mastering but after any editing or mixing.

What would you change in this definition and how would you define “master of a recording” entity ?

Sorry for asking too many questions, but i’m trying to understand which changes would be required and how they would fit in current MB database structure.

3 Likes

Let’s get some official ISRC documents in here.

4.1.3 No re-use
[…]
A new ISRC should be assigned whenever a recording has been re-issued in a revised or fully remastered form. Also see Sections 4.9.1 Re-mixes/ Edits / Session Takes and 4.9.10 Re-mastering

4.1.4 Format Independence
A single ISRC is used for each unchanged recording regardless of the format in which it is released.

4.9.10 Re-mastering
When a track is re-mastered for the purpose of reproduction on a new carrier without restoration of
sound quality (also see Section 4.9.1 Re-mixes/ Edits / Takes), then no new ISRC is required.
It is nevertheless the Registrant’s responsibility to decide where to draw the line between sound
restoration (full re-mastering) and simple re-mastering.

FAQ:

7. Our company uses in-house code for identifying our sound and music video recordings. We then use this in the designation code of the ISRC. Sometimes an in-house code may apply to two versions of the same recording because we have remastered some of our backstock for re-issue. Can we use the same ISRC for the new remastered version?

No. Re-use of an ISRC that has already been allocated to another recording or to
another version of a recording is not permitted. This is in order to guarantee the unique
and unambiguous identification provided by an ISRC.

A new ISRC should be assigned whenever a recording has been re-issued in a revised or re-mastered form, even if both items have the same in-house code.
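Since the FAQ mentions the designation code, here is a small sketch of how an ISRC splits into its parts. It assumes the usual 12-character layout (two-letter country/prefix code, three-character registrant code, two-digit year of reference, five-digit designation code); the function and type names are my own, and it only checks the shape of the code, not whether the code was ever actually issued.

```python
import re
from typing import NamedTuple

# Assumed layout: CC + XXX + YY + NNNNN (hyphens are optional in display form).
ISRC_PATTERN = re.compile(r"([A-Z]{2})([A-Z0-9]{3})(\d{2})(\d{5})")


class IsrcParts(NamedTuple):
    country: str
    registrant: str
    year: str
    designation: str


def parse_isrc(text: str) -> IsrcParts:
    """Split an ISRC into its four parts; raise ValueError if the shape is wrong."""
    compact = text.replace("-", "").strip().upper()
    match = ISRC_PATTERN.fullmatch(compact)
    if match is None:
        raise ValueError(f"not a well-formed ISRC: {text!r}")
    return IsrcParts(*match.groups())


print(parse_isrc("GB-AFL-18-00302"))
```

Two codes from the same registrant and year differ only in the designation code, which is why re-using an in-house code for a remaster would collide.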

More stuff:
ISRC bulletin archive

GRid (Release identification, BTW has anyone seen these in the wild?)

What a bummer:

The primary function of the GRid is to support machine-to-machine communication
through system-to-system messaging. It is therefore intended to be largely invisible
in use. However, there may be circumstances in which GRid is displayed to a human
user, in which case some rules for presentation are in place.

8 Likes

That’s all nice and fine, but the specs say how it should be. Doesn’t change the fact that the exact same recording often gets assigned new ISRCs in reality. I have seen plenty of examples for this. And I bet there are enough cases where the same ISRC was used for different recordings. I don’t know any myself, but one of you can probably provide an example.

In the end MB needs to deal with the often messy reality.

4 Likes

I think we are doing fine with our definition. Just wanted to give all of this a bit more context. :slight_smile:

4 Likes

@Zas - give me a bit to put this together. This was brought up in the past, so I want to collect all the data on this before just posting info. The concept remained with the three tiers (work, recording, release) but changed where and what data is stored in which object, and the creation of a new object off of the recording with a one-to-many relationship to the recording.

1 Like

@outsidecontext and @chaban-
Thanks for posting that, it is helpful for all to see the specification. Although there are times that a new ISRC is assigned to the same recording, I would like to see some data supporting the frequency of this. I believe (meaning that it is my educated guess) that this is more an issue in the past. That said, I could be wrong on that, I can only speak from my own personal experience.

I believe this issue is a wash with the barcode issue, which is much the same. A barcode is sometimes used on two different releases; a new ISRC is sometimes assigned to the same recording. The barcode problem is a bit more destructive, since having a duplicate recording does not really produce bad or conflicting data, whereas a barcode on two different releases can cause some brain malfunction while you try to figure it out.

Although I am 100% on the side of a reform here, I will also be the first to admit that nothing will ever be perfect. Even if MB could design a system that itself is perfect, junk in = junk out. So any errors from the music industry itself, or any other source for that matter, are simply out of the control of MB. Although I hate saying it from a data standpoint, it is just something we need to accept, and we should focus on matching the metadata, not trying to change it to what we think it should be. There is a different place for stuff like that, as it is also important data.

Taking iTunes music files via download as an example … there is a difference in which data is considered important compared to MB. This is also the case, but a bit different, with other stores like Amazon, but I have a large base of samples for iTunes, so I can speak to that with facts in hand. One great identifier is the “vendor” atom. Currently this is formatted as :isrc:. The ISRC is the ISRC, simple. The vendor is the label (and sometimes not a label) that is tied to the ISRC. There is also a “copyright” atom which supplies the c and/or p holders, depending. That is one big conflict I see when trying to locate the proper label to set in the MB label field, as it is not really disclosed to us in the same way as with physical media, where you match logo/picture to name and you have it.

As I mentioned before, I believe our job should be first to properly identify music as it is and can be identified, not what we think should be there. Second is to make some sense of the crap and make it usable.

1 Like

Release is an aggregate and stands beside the other two. On the atomic level MB goes like this: work → recording → track (which not coincidentally mirrors FRBR’s/LRM’s work → expression → manifestation, a model that NGS was to a certain extent based on). But ok, I’m arguing semantics here.

Now that’s something I can’t agree with at all! While all those years I’ve been using MB for my personal collection, still from some idealistic point of view I genuinely believe in MB’s striving to be the perfect/ultimate music database (and eventually a cultural one, indexing other art forms too). Which means it shouldn’t repeat the industry’s cataloguing mistakes.

However I do agree with you that, ideally, it should fully map industry’s standards.

Multiple ISRCs jumbled into one MB recording is simply lost data. So, if our own definition of recording doesn’t match the industry’s one (for whatever reasons - remasters, erroneous code assignment by different registrants etc.), then we need to go down to a more granular level (recording → track) to reflect all the exceptions.

Now I don’t think I’m in a position to discuss the technical side since that’s not my cup of tea. I can only present the idea for a specific case:

  • ISRCs remain attached to MB recordings
  • Two separate masters of a recording are represented as different MB tracks
  • each ISRC can be optionally mapped to different sets of tracks

I have actually encountered a situation like this not a long time ago and it’s already been discussed in another thread:

  • The exact same recording, Godmother by Holly Herndon and Jlin, was first released as a single in 2018 and assigned the code GBAFL1800302.
  • It was reissued as a part of the 2019 album spanning multiple digital, cd and vinyl issues with a new ISRC, GBAFL1800329.

I don’t think there’s any need to question or ponder on this decision, only to reflect it properly. That is, the [MB recording 1] contains two ISRC codes, where [ISRC 1] maps to [MB track 1] and [ISRC 2] maps to [MB track 2], [MB track 3] and [MB track 4].
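To make the mapping above concrete, here is a rough Python sketch. It is only an illustration of the idea, not a proposal for the actual MB schema: the `RecordingIsrcMap` class and its method names are hypothetical, and the track labels are stand-ins for real MB tracks.

```python
from typing import Dict, List, Set


class RecordingIsrcMap:
    """Hypothetical helper: ISRCs stay on the recording, each optionally mapped to tracks."""

    def __init__(self, recording: str) -> None:
        self.recording = recording
        self._tracks_by_isrc: Dict[str, Set[str]] = {}

    def attach_isrc(self, isrc: str) -> None:
        # An ISRC may sit on the recording without being mapped to any track yet.
        self._tracks_by_isrc.setdefault(isrc, set())

    def map_track(self, isrc: str, track: str) -> None:
        # Mapping implies attachment, so an unknown ISRC is attached on the fly.
        self._tracks_by_isrc.setdefault(isrc, set()).add(track)

    def tracks_for(self, isrc: str) -> List[str]:
        return sorted(self._tracks_by_isrc.get(isrc, set()))

    def isrcs_for(self, track: str) -> List[str]:
        return sorted(i for i, tracks in self._tracks_by_isrc.items() if track in tracks)


godmother = RecordingIsrcMap("Godmother")
godmother.map_track("GBAFL1800302", "MB track 1")  # 2018 single
for track in ("MB track 2", "MB track 3", "MB track 4"):  # 2019 album issues
    godmother.map_track("GBAFL1800329", track)

print(godmother.tracks_for("GBAFL1800329"))
print(godmother.isrcs_for("MB track 1"))
```

Nothing is lost this way: the recording still lists both codes, and a lookup by ISRC returns only the tracks that actually carried it.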

7 Likes

This is an interesting approach. Not wanting to argue over semantics as you have also stated, the object that represents the recording could hold multiple ISRC codes that would then be a key in the relationship to the album or release object. My thought was to make the recording object unique, but this may very well be a better approach.

I wanted to clarify this. What I mean is that the problems created within the music industry are not something anyone can fix but them. I think it is still important to capture that data as-is. However, MB can add some intelligence to that and help bridge the gap, so to speak.

Thanks for the example. I am sure they do exist, and your idea seems to be a solid way to address it. For me, using ISRC is ideal as I can query metadata for a code and get what I am looking for. Nothing else allows me to accomplish the same. I can then create releases in a more virtual sense, so if the same ISRC is used on 4 releases, those releases can all use the same recording, generally speaking.

For me, MB is not really usable. I tried and contributed and learned while doing so. What I am sad to say is that I have replaced literally all of the files that I used MB to tag, as I wanted the original metadata back. I believe that there should be a different methodology to Picard, but that is a different topic. Please do not misunderstand me, I do appreciate MB and what it is trying to do or I would not take the time to type this. MB is king when it comes to physical releases, the best I have used. Thinking of myself as just a guy who listens to music, and given that my music is mostly all digital now, MB does nothing to help me. I believe that MB has built a solid brand and positioned that brand as a leader. Even if my thoughts are all disregarded, I want to see MB transition to the digital era.

1 Like

Actually a very good idea IMHO.

In practice, the industry is actually attributing ISRCs to tracks rather than recordings, and this is why we end up with multiple ISRCs for one recording.
Also, definitions of remaster/edit (= whether to create a new ISRC or not) vary a lot in the industry; they will never match the MB definition perfectly anyway.

@yvanzo @Bitmap : What’s your (technical) opinion on this idea ? Is it feasible ? Any drawback ?

5 Likes

Drawbacks are mostly on the UI side: we would need to figure out a way to do this assigning. The schema is always easy, how to make it clear and understandable is almost always hard :slight_smile:

I don’t think it’s impossible, though. There’s also been talk of just assigning ISRCs to tracks only, which is also a possibility I guess. Either of them would require fairly big changes to the UI but might still be doable in a sensible way.

4 Likes

Being able to add ISRC only from release level (to add it to tracks), would be a great improvement for quality of additions, IMO. :slight_smile:

9 Likes

@Zas - I find it hard to directly answer your question on making definition changes right off the start here. There seems to be conversation on where attributes should be placed, and it seems like it is in a positive direction. The definitions, I would think, will depend on which objects hold which attributes.

Looking at past discussions, I found 2 elements to add.

  1. A digital release could be all-inclusive in the sense that references could carry attributes of their own, as their own object that is tied to the release. In this, you can store the store IDs, store barcodes, etc. This would greatly reduce the digital duplicates, which seem to be a major issue for many, and would also allow the details offered by each store to be captured, like a sub-release, if you will, of the main one. Not sure if that is technically possible given the current database, but it is a thought.
  2. The performers could also be an object of their own tied to the recording. The reason is, let’s say you have a recording and a remastered recording which, for whatever reason, was made a new recording due to distinct enough differences. Instead of having to add all the performers again to the new recordings, and thus to the entire new release, you could simply attach the performers object to it. The same could apply to recordings that have a normal set of performers and sometimes a guest musician, like in remixes and such. In these cases, you could simply add the normal performer object, and add the individual extra performers as needed.
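As a rough illustration of the second point, a reusable performers object might look something like the following. This is just a sketch under my own assumptions: `PerformerGroup`, `CreditedRecording`, and the performer names are invented for the example and do not correspond to anything in the current MB database.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PerformerGroup:
    """Hypothetical reusable set of performers (for example a band's usual line-up)."""
    name: str
    members: List[str] = field(default_factory=list)


@dataclass
class CreditedRecording:
    """Hypothetical recording that references shared groups plus one-off extras."""
    title: str
    groups: List[PerformerGroup] = field(default_factory=list)
    extra_performers: List[str] = field(default_factory=list)

    def performers(self) -> List[str]:
        names: List[str] = []
        for group in self.groups:
            names.extend(group.members)
        names.extend(self.extra_performers)
        # Drop duplicates while keeping the first-seen order.
        return list(dict.fromkeys(names))


lineup = PerformerGroup("usual line-up", ["Performer A", "Performer B"])
original = CreditedRecording("Song", [lineup])
remaster = CreditedRecording("Song (remaster)", [lineup])
remix = CreditedRecording("Song (remix)", [lineup], ["Guest C"])

# A correction made once on the shared object shows up on every recording using it.
lineup.members.append("Performer D")
print(remaster.performers())
print(remix.performers())
```

The point of sharing the object rather than copying the names is that the remaster never needs its credits re-entered, and the guest on the remix is added on top without touching the shared line-up.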

I do think that Skeebadoo’s suggestion for ISRC handling is good, and I would even say it is probably better than the ideas previously discussed. Someone had told me a while back that such additional levels of data are just not doable given the database structure, but since such things are being discussed now, maybe that has changed. Regardless, it seems like a good idea.

2 Likes

Thinking about this idea, is it really best to have the ISRC at the release level vs the recording level as an attribute with a one-to-many with tracks on a release? I mean this as to not duplicate ISRC entry on all the releases an ISRC might be used.

For example, maybe as you enter the tracks to a release, it can provide a pulldown to select an existing ISRC or enter a new one. Alternatively, we would need to provide/allow for the user to say unknown and not default to anything.

I just meant that ISRC edits, like most edits are more trustworthy, IMO, when done per release because usually it means you know what you are talking about in context with more proof.

4 Likes

I agree. I was referring to the location table of the data, but yes, it is for sure applied at the release level.

1 Like

Sadly, a note with the source of the ISRCs is added by only a handful of editors.

A mapping of tracks-recordings would be really nice.


Recently I had a case of an artist who changed his digital distributor and got new ISRCs by mistake. He also got new UPCs. Nothing else changed, he said.

2 Likes