Presence or absence of mould SID enough to have separate releases?

spitzwegerich · February 25, 2018, 1:53pm

Recently, editor Flexx has added this release. As far as I understand from his edit and disambiguation comments, the only difference to this release is that the second one has a mould SID, but the first one does not.

My feeling is that the presence or absence of a mould SID should not be enough to have 2 separate releases. IIRC, different mould SIDs are not considered enough.

What do you think?

Kid_Devine · February 25, 2018, 4:12pm

I don’t know if there’s any consensus on how to treat such situations, but I wouldn’t create separate releases myself.
Presumably, once MBS-8393 is implemented this question will be answered, that is, we’ll either be able to enter a single mould SID per release or multiple.

Freso · February 25, 2018, 5:21pm

Except that the odd release exists that has more than one mould SID, so even if we want to distinguish on this (and I think we should), we will still need multiple mould SIDs per release, even if 99.9% of releases with a mould SID will only have one.

This is essentially (a continuation of) the same discussion that’s been had earlier:

I agree with @aerozol’s comment there:

Flexx · February 26, 2018, 4:10pm

Since I am the “reason” for this post, let me add my thoughts to this: First, it might be helpful to know that I tag my files using both MusicBrainz (via Picard) and Discogs (via the foobar2000 plugin) data. I am therefore somewhat invested in having correct Discogs links on the releases I care for. In other words: If an MBID points to a Discogs release contradicting it in some of its data, I’ll act, typically by correcting the link to one that is consistent with either the claimed features or the existing artwork. Or, if needed, I will create a new release on MusicBrainz and/or Discogs.

In the particular case, the different European represses are all well known, and well sorted out. It’s especially known that the original issue did not (could not!) have SIDs due to it being released in 1992. Now, some will argue that there is no difference at all in the artwork uploaded to MB, and that may be true at this time, but in very many cases it is not. I sometimes did not create a new release if the only difference was in the hub area and there was no artwork showing a detailed scan of the hub area, or the matrix codes weren’t visible on the medium image, typically due to the scan quality simply not showing them or some opaque coating covering them. In such cases I have added a second Discogs link to the “Similar on MusicBrainz” MBID.

In this case though, the existing European release entry could not be used, because there is a scan of the hub that clearly shows the mould SID. Now, I simply don’t have this medium. Any edit I make on this release carries a risk that the claim “ain’t different” simply doesn’t hold – the SID release was repressed at least 2 years later, and the booklets, tray inlay, etc. will probably have been reprinted. It may have been from the same sources, it may have been printed in a different place, and I can’t really be sure, because I simply do not possess a copy of that later product.

BTW, I would love to see a distinction of “PRODUCT” as an entity, which is a subset of RELEASE (i. e. PRODUCTS are children of RELEASE just like RELEASES are children of RELEASE_GROUP). A product, then, is a set of physical items that has every detail, including SID and matrix codes, the same. Products share similar artwork; releases do not necessarily, down to every minute detail. A release, in my book, would be similar if it has the same tracks and recordings, in the same order, etc. A bit like all the European repressings of “Céline Dion”. We’d all say we have the 14 track European release of it, but we might have 6 different products in hand, actually even 30 or 40 if you consider all SID variants. It’s incorrect, BTW, that a release (i. e. CDs that hit stores on the same day in France) will have only one mould SID. Production runs certainly allowed for the glass master being used in different presses for high volume releases. Technically that’d make them different products, but not different releases. They’d be paired, for example, with booklets and artwork from the same print run. That’s just the way it is in reality. And trust me, I am a software architect / developer professionally, I know this much: Failing to correctly and accurately model real-world facts will hurt you badly – especially in your data model. It may save you time and effort early on, and you can fix some kinds of modelling errors in code, but if you made basic errors, you’re in deep trouble. You’ll have bad data quality, and the code and “organizational remedies” (i. e. policies humans must follow) to cope with it will be hard to maintain, ugly and buggy/incomplete.
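To make the proposed hierarchy concrete, here is a minimal sketch in Python. All class names, field names and example values are hypothetical illustrations and are not part of the actual MusicBrainz schema; the point is only that products hang off a release the same way releases hang off a release group.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Product:
    # One set of physically identical items: same matrix and SID codes.
    matrix: str
    mould_sids: List[str] = field(default_factory=list)
    mastering_sid: Optional[str] = None


@dataclass
class Release:
    # Same tracks and recordings, in the same order.
    title: str
    country: str
    products: List[Product] = field(default_factory=list)


@dataclass
class ReleaseGroup:
    title: str
    releases: List[Release] = field(default_factory=list)


# Made-up example values, for illustration only.
original = Product(matrix="MATRIX-A")  # early pressing, no SID codes
repress = Product(matrix="MATRIX-B", mould_sids=["IFPI 0000"])
european = Release(title="Céline Dion", country="Europe",
                   products=[original, repress])
group = ReleaseGroup(title="Céline Dion", releases=[european])
```

In this shape, an editor who only cares about the release level never has to touch the `products` list, while an editor with a copy in hand can add exactly the product they can verify.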

Now the fun thing is that I can prove (on my own) that a particular product exists (because I have it in hand) without doubt. So in principle perfect data quality is available for my particular product. But I cannot ever hope to know all of them, and I mustn’t ever assume I do, or be led into assuming I do by the user interface. One of the most common errors in this kind of application is “erroneous condensation” of data (e. g. failure to require explicitly entering the absence of a SID in an optional SID field, making it impossible for a future editor to discern whether the original author didn’t care to fill the optional field or meant that there is no SID, a typical complication on Discogs. MB typically is better in that regard – particularly the “this release has no barcode” field. Other examples are erroneous combination of releases that in fact do have different artwork, like distribution codes, or even the same information printed differently – I’ve seen cases of the same barcode, data-wise, but printed differently – numbers all on one line vs. raised check digit, end up in the same release). That is the fundamental problem in crowdsourced databases like MB or Discogs, and neither got it quite right. MB is a lot better due to the use of AcoustID and by having a notion of “recordings” and “works”. The composer of a work will remain the same, and the bassist on a particular recording will remain the same on whatever compilation said recording appears. It’s unfortunate that we don’t have a song hash (e. g. SHA-512) verification of recording equality for lossless tracks via Picard; it would help mitigate recording duplication automatically with near-perfect confidence (i. e. when importing CDs via Picard from lossless sources, we could “lock” the recording to a known/existing one, in the track editor).
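The “erroneous condensation” problem can be illustrated with a small sketch. This is an assumption about how such a field could be modelled, not how MusicBrainz or Discogs store it; `Presence`, `MouldSidField` and `contradicts` are hypothetical names. The idea is that “not filled in” and “checked, there is none” are stored as different states.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Presence(Enum):
    UNKNOWN = "unknown"   # editor did not say either way
    ABSENT = "absent"     # editor checked: the item has no such code
    PRESENT = "present"   # editor checked: the code is there


@dataclass(frozen=True)
class MouldSidField:
    presence: Presence = Presence.UNKNOWN
    value: Optional[str] = None

    def __post_init__(self):
        # A value only makes sense when the code is stated to be present.
        if (self.presence is Presence.PRESENT) != (self.value is not None):
            raise ValueError("value must be set if and only if presence is PRESENT")


def contradicts(a: MouldSidField, b: MouldSidField) -> bool:
    """True if two observations cannot describe the same product."""
    if Presence.UNKNOWN in (a.presence, b.presence):
        return False  # nothing was claimed, so nothing can conflict
    return a != b
```

With this, `contradicts(MouldSidField(Presence.ABSENT), MouldSidField(Presence.PRESENT, "IFPI 0000"))` is true, while an untouched field conflicts with nothing, so a later editor can tell a deliberate “no SID” from a field nobody filled in.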

But I am digressing. As I said, in the particular case, the artwork simply disambiguates these MBIDs (and Discogs submissions), and I had to create another release record to link them accordingly. Also, nobody can really say if the artwork is the same without having both releases in hand, unless there is a full set of scans (which would be great to be able to “tag” as complete in some way) for both releases, so I can prove/disprove similarity to the level we’d like to track on MB while having only one product in hand, and the other merely in the form of scans.

If we somehow manage to have a system (beginning with an entity relationship model) where individual editors cannot fail under the premise of them having “their” copy in hand, and exercising due diligence, we’d make a huge step forward in database accuracy. The main user error/neglect, if you ask me, is to not fill the disambiguation and annotation field with well-known “disambiguators” like format codes, matrix codes, French price / distribution codes.

(BTW, whichever term you prefer, contrary to claims, there is no actual definition of either – I prefer distribution code, because it doesn’t contradict it being used to set price, but doesn’t require that use. As far as I am concerned we might also call it “that other code we find on European releases, with two characters and three digits”, because it doesn’t matter what it means, but that it’s there, and what its value is. Let those who think they are experts use the “meaning” they wish for. We’re not shops, we don’t need to care if it’s a price code or distribution code or whatever – but I sure want the images I embed to accurately represent the very product from which I ripped my FLAC files, even if I have no idea what “WE 833” actually meant at the time I bought the album.)

Often a well-meaning editor has no chance to know if another release needs to be created or not. Sometimes artwork must be consulted (which is a lot more work than “scratching off” release list entries by seeing distribution, matrix codes, “made in the UK”), sometimes there were no scans available, so the only actual way to tell what the original editor(s) may have had in hand is via the linked Discogs entry. Often this leads to “release hijacking”, followed by either resignation and acceptance of the invalid data, or a tedious, manual process that typically still ends up with incorrect data on either side. Both in Discogs and MB we have merge support, but literally no split support at all. Incorrect merges and/or release hijacks are the main cause of data problems on either database.

Again, my primary concern is that products are identifiable, and entering contradicting information about them is hard by design of the UI. Make product entry easy and fast, and you do a great service to those who want that level of correctness in their tags. Others, let them tag/edit to the release level, without it even interfering with product information, and you’ll have a far better database than Discogs can ever hope to have due to their cardinal error of never even considering works and recordings.

That’s why I use MB to tag performers and the like, but need to use Discogs (and local files / manual tags) to keep track of my collection to the detail of what actual product I have.

Flexx · February 26, 2018, 4:44pm

In accordance with my post above, it’s important to point out that the only known difference is the absence/presence of SID codes. I cannot verify the accuracy of the artwork attached to the release I don’t have. But I can tell that it has SIDs, and was linked to the Discogs entry I cannot link “my” release to.

I can also tell that it’s absolutely impossible the “release event” on that release was 1992. That’s just not possible, nobody could have bought this product in a store in 1992. And it should really be fixed (removed) there. It’s likely impossible to determine the actual date this MBID should have, as is often the case with represses/reissues.

I also like that MB does not have a separate “reissue” or “repress” flag, as it’s virtually impossible for anybody not working at the label or pressing plant (or both) at the time to know what it was. In such pre/post 1994 cases, and some cases based on dates showing in matrix codes, we can know that it was some form of “later product” and by deriving an “original year” tag from the release group, we’re doing all we can to provide data we can trust, instead of “inventing” data and data labels as is common on Discogs.

spitzwegerich · March 9, 2018, 7:53pm

So you want to link your releases against Discogs and MusicBrainz, and you expect that there must be a 1:1 assignment. What I don’t like is that apparently you take it for granted that this always must be possible.

In the discussion Freso pointed us to, you can see that there is no broad agreement on when two releases should be separated in MusicBrainz.

Actually, I think there should be some limit on which minor differences warrant creating a new release. Adding a new release for each minor difference in the mould SID might result in an insane number of releases and massive data duplication (which always indicates a bad database design).

But I recognize that there is some interest in the differentiation of those minor production details. I think the best way would be to introduce a sub-release entity “production variant” (or whatever) for keeping track of those. The Discogs link could then be set on release level or on variant level, depending on the data state on the Discogs side.
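A rough sketch of how that could look, in Python. `Variant`, `Release` and `discogs_link` are hypothetical names made up for this illustration, the URLs are placeholders, and nothing here reflects an existing MusicBrainz feature. A variant carries its own Discogs link only when Discogs actually distinguishes it; otherwise the release-level link applies.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Variant:
    description: str
    # Set only if Discogs lists this exact variant as its own entry.
    discogs_url: Optional[str] = None


@dataclass
class Release:
    title: str
    # Release-level link, used when Discogs does not split the variants.
    discogs_url: Optional[str] = None
    variants: List[Variant] = field(default_factory=list)


def discogs_link(release: Release, variant: Variant) -> Optional[str]:
    """Prefer the variant-level link, fall back to the release-level one."""
    return variant.discogs_url or release.discogs_url


# Placeholder URLs, for illustration only.
no_sid = Variant("no mould SID")
with_sid = Variant("with mould SID", discogs_url="https://example.invalid/variant")
release = Release("Example album", discogs_url="https://example.invalid/release",
                  variants=[no_sid, with_sid])
```

Shared data such as the tracklist stays on the release, so adding a variant duplicates nothing except the details that really differ.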

Kid_Devine · March 10, 2018, 7:41am

Coincidentally, release “variants” were discussed at the most recent meeting.

Notes from #MetaBrainz meeting 2018-03-05

IRC Logs for #metabrainz | MetaBrainz Chatlogs

zas:

In Digital releases we discussed about handling of digital formats

the current policy on MusicBrainz is we don’t care actual digital format, we manage “digital media” support releases

during this discussion, i had the feeling we were missing a tool to manage this

So i had the idea of release “variants”

Basically that’s a release based on the data on another release.

It would fit things like: colored vinyl releases, digital formats releases otherwise identical (mp3 vs flac vs ogg …)

A fair amount of discussion ensued; comparing “variants” to alternative tracklists, some questions about current practices, some discussion about when a new entry would be a Release vs. a “Variant” of another Release.

Ultimately, too many unanswered questions and concerns, so @zas agreed to try and make a more “formal”/in-depth specification of it or similar and post that on the forums in its own topic for further discussion before bringing it back up at the meeting.

spitzwegerich · March 12, 2018, 8:28pm

Interesting! Hopefully that “variant” thing will not fall asleep again.

agatzk · June 2, 2021, 4:54pm

Chiming in here because I add physical releases and am intentional about the mastering SIDs and mould SIDs. My two cents is that we should allow for multiple SIDs per release, whether or not that’s accomplished with the “variants” idea proposed here. I’ll note that the “variants” section on Discogs releases is often incomplete and could potentially never practically be complete (for many releases).

I’ve encountered many releases lately with differing SIDs (e.g. Edit #79818764 - Remove relationship) that should probably be handled in a better way than adding a new release for each SID.

Furthermore, I’ve encountered CDs which date/time stamp the matrix/runout (e.g. https://www.discogs.com/Aphex-Twin-Syro/release/6112678) which would generally imply that every individual CD deserves its own release (clearly impractical).

I wish I had more answers than questions, but at this stage I’m still working to read through and understand the historical perspective and conversation so as not to fall into the same traps we’ve already hammered out.

edit: here’s some other conversations: