High data quality requirements


#1

In my opinion, for high data quality we should also demand a full set of high-quality cover art in the Cover Art Archive (CAA), including a scan of the medium. Otherwise, we cannot check the integrity of the data. Furthermore, telling apart releases that differ only slightly is almost impossible without cover art.


Community Cleanup #4: Hyperion
#2

I don’t see full scans as necessary if the same or even more data is available from a reliable source. The booklet isn’t always the most reliable source (it can contain conflicting data, and recording dates often have typos). With these Hyperion releases, some older releases are missing recording locations, but their website might still list them.

Because of potential copyright violations and potential future collaborations with labels, I wouldn’t ask anyone to store full sets of scanned art (including every page of the booklet). That isn’t necessary for identifying the correct release. I typically include only front, back, spine and medium scans.


#3

You never know if that reliable source will still be online tomorrow.

That’s true. Wrong info on the cover art should always be documented in the annotation.

Of course, we cannot strictly request that. But still, it is better to have the full artwork than not. If some label does not allow it to be online … well … then their releases won’t be “high data quality” on MusicBrainz.

As already mentioned elsewhere, I’m in favor of adding an additional quality level “very high”. Then “high” could be “all available data added” and “very high” should be “all we can dream of” (including full cover art).


#4

This is the only good solution I can see if we want to require full artwork for the highest level.


#5

As an aside, isn’t there a basic problem with the ‘data quality’ field, in that it is always relative to the database schema at the time it was set? If a new field were added (for example, “number of pages of booklet”), would that suddenly make all ‘high quality’ releases no longer high quality?
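
To make the question concrete, here is a minimal sketch in Python. The field names, the two schema versions and the rule "complete means every schema field is filled" are all assumptions for illustration; this is not how MusicBrainz actually stores or evaluates data quality.

```python
# Hypothetical schema versions: each one lists the fields a release is expected to carry.
SCHEMA_V1 = {"title", "artist", "label", "barcode"}
SCHEMA_V2 = SCHEMA_V1 | {"booklet_pages"}  # the hypothetical new field from the example


def missing_fields(release: dict, schema: set) -> set:
    """Return the schema fields that the release lacks or leaves empty."""
    return {name for name in schema if release.get(name) in (None, "")}


def is_complete(release: dict, schema: set) -> bool:
    """A release counts as complete only relative to a given schema."""
    return not missing_fields(release, schema)


# Placeholder release data, fully filled in under the older schema.
release = {
    "title": "Example Album",
    "artist": "Example Artist",
    "label": "Example Label",
    "barcode": "0000000000000",
}

complete_under_v1 = is_complete(release, SCHEMA_V1)
complete_under_v2 = is_complete(release, SCHEMA_V2)
missing_under_v2 = missing_fields(release, SCHEMA_V2)
```

Under these assumptions the same unchanged release is complete against the old schema and incomplete against the new one, which is the effect I am asking about.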


#6

To some degree, yes 🙂 That’s why I think “high” should mean “all the info we can add”: if a tiny thing is missing but there’s a lot of detail, it can probably still be High. Of course, that’s probably also why @spitzwegerich wants a level where all the sources are also included and always available, because then you can add any new things from the images we already have.