Improving Picard's lookup - by catalogue number?

Use Search for Similar Albums and replace the default search criteria with the catalogue number.

When I said “CDs” I meant “Digital Files”.

I too ripped 500+ CDs with EAC, and would then pass the resulting digital files to Picard to get the exact edition. Normally using that Other Versions pop-out menu. Sometimes needing to click the Lookup in Browser button to make a more exact selection. Pressing the Green TAGGER button to load that version back to Picard.

(Wish I knew about that “Search for Similar Albums” @Sophist has just pointed out when I was doing my main tagging :smiley: )

Did you make EAC rip logs? If so, no need to ever get the CD out of storage. This link is useful: https://eac-log-lookup.blogspot.com

You can paste the TOC from your logs into that page and it will look up the CD in the database. It is not perfect, as common discs are sometimes attached to the wrong edition, but it helps narrow things down in an alternate way.

Well, after your response, I went to find the actual CD! :slight_smile: I tried it out in Picard. It matches to the AU version, my guess being that it is the only one that has a matching barcode number: 093624657729 (with a leading zero removed) vs. the catalog number I have from EAC of 0093624657729?

I went off and read the definitions of Catalog Number and Barcode number in the MusicBrainz documentation. I don’t understand why they are defined the way they are. My physical CD has a barcode on the back of the CD, with 09362-46577-2 printed underneath. The catalog number of 0093624657729 is what EAC read and put in the CUE file.

If I look at all the entries that are shown in the screenshot, then none of them seem to make sense to me when I look at the CD in my hand. The US, GB and AU ones have a number that matches to the Catalog Number EAC read and put in the CUE file. It’s just that it’s called Barcode.
It might be interesting to see how many of the EAC catalog numbers match to Barcodes in the database…
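One way to try that comparison is to search MusicBrainz for releases whose barcode equals the number from the CUE file, both with and without the padding zero. The sketch below only builds the search URLs and sends no request. The `/ws/2/release/` endpoint and the `barcode` search field are assumptions based on the MusicBrainz search documentation and should be checked there; the helper names are made up for illustration.

```python
from urllib.parse import urlencode

# Assumed endpoint of the MusicBrainz release search (check the current docs).
SEARCH_ENDPOINT = "https://musicbrainz.org/ws/2/release/"


def barcode_candidates(mcn: str) -> list[str]:
    """Return the MCN as read, plus the 12-digit form if it has a padding zero."""
    candidates = [mcn]
    if len(mcn) == 13 and mcn.startswith("0"):
        candidates.append(mcn[1:])
    return candidates


def release_search_url(barcode: str) -> str:
    """Build (but do not fetch) a search URL for releases with this barcode."""
    params = {"query": f"barcode:{barcode}", "fmt": "json"}
    return SEARCH_ENDPOINT + "?" + urlencode(params)


for candidate in barcode_candidates("0093624657729"):
    print(release_search_url(candidate))
```

Trying both forms matters here because the database entry in this thread holds the 12-digit value while the CUE file holds the 13-digit one.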

Thanks for the link to the lookup site. Yes, I do have my logs. This one was done in 2007! with a 99.9% quality and overall confidence. The TOC matches back to the same set of choices. I also have the freedb discid in the CUE file = DF11BC0E. Unfortunately, not found in Musicbrainz.
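As an aside, the freedb disc ID is not an opaque key. In the commonly documented freedb scheme, its eight hex digits pack a checksum byte, the disc length in seconds, and the track count. A minimal sketch of unpacking it, assuming that layout (the function name is hypothetical):

```python
def decode_freedb_id(disc_id: str) -> dict:
    """Split an 8-hex-digit freedb disc ID into its three packed fields."""
    value = int(disc_id, 16)
    return {
        "checksum": value >> 24,           # top byte: checksum over track start times
        "seconds": (value >> 8) & 0xFFFF,  # middle two bytes: disc length in seconds
        "tracks": value & 0xFF,            # low byte: number of tracks
    }


info = decode_freedb_id("DF11BC0E")
print(info["tracks"], "tracks,", info["seconds"], "seconds")
```

For DF11BC0E this gives 14 tracks and 4540 seconds (roughly 75 minutes 40 seconds), which is a quick sanity check against the rip log.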

Where is EAC going to read a catalogue number from? That will just have been a lookup in an online database (best guess). Having the physical media in hand will be your main way to match the exact edition.

It is quite common you find a barcode used as a catalogue number. I would trust the media in hand more than EAC.

With physical CD case in hand you can look at any attached artwork and spot the little differences. Especially on the rear of the case. If there is no artwork, look for the discogs links and check that artwork too.

If it is Australian, then the small print under the barcode will mention Australia: Release “Pilgrim” by Eric Clapton - Cover art - MusicBrainz

A CD from a popular artist like Eric Clapton can be trickier to get an exact match on as so many editions will have been printed over the years. Discogs lists 17 variations of editions with that barcode

My understanding is that EAC reads it from the physical CD disk, not a database!
It is put on the disk as part of the production / mastering process of creating a CD. It is defined in the CUE file that is used to write the disk, the same as the ISRC numbers for each track. This view is based on these docs:

Note the description on https://wiki.hydrogenaud.io/index.php?title=Cue_sheet:

  • CATALOG – A 13-digit UPC/EAN code, also referred to as the Media Catalog Number (MCN). 12-digit UPC codes should be prefixed with a “0”.

I just ripped the Pilgrim CD using CueRipper - part of Cuetools and got a CATALOG number of 0093624657729. The same value as EAC. If you google the number, you’ll see it showing up as the EAN in lots of places.
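For anyone who wants to pull that value out of their own CUE files, here is a minimal sketch. It assumes the `CATALOG` line carries exactly 13 digits, as in the cue sheet description quoted above, and it treats an all-zero value as "not present", since some rippers write zeros when the disc has no MCN. The sample cue text is illustrative, assembled from values mentioned in this thread, and the function name is hypothetical.

```python
import re
from typing import Optional

# Matches a line such as: CATALOG 0093624657729
CATALOG_RE = re.compile(r'^\s*CATALOG\s+"?(\d{13})"?\s*$', re.MULTILINE)


def read_catalog(cue_text: str) -> Optional[str]:
    """Return the 13-digit CATALOG value from cue sheet text, or None."""
    match = CATALOG_RE.search(cue_text)
    if match is None:
        return None
    value = match.group(1)
    # An all-zero MCN means the ripper found nothing usable on the disc.
    return None if set(value) == {"0"} else value


sample_cue = """REM DISCID DF11BC0E
CATALOG 0093624657729
PERFORMER "Eric Clapton"
TITLE "Pilgrim"
"""

print(read_catalog(sample_cue))
```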

So, if my understanding is correct, the catalog id from the disk is critical information. We should have a database of these numbers that match to the actual releases. I don’t understand why musicbrainz doesn’t have this mapping. It is what should be in the catalog number field.

You are looking for the barcode field:

See also

Well, I learnt something new. I always assumed EAC was just a perfect audio ripper. I didn’t realise there was much other data coming from the CD. But as @chaban points out, you have a Barcode there with extra zero padding. The Catalogue number in MB language is the Record Catalogue number as used by the Record Label and found printed on the spine of the CD case.

You still end up with a similar problem - the EAN\UPC\Barcode gets you close, but you’ll need to check the artwork in hand to be exact. It depends how detailed you want to get. :slight_smile:

Well, I learnt something new. I always assumed EAC was just a perfect audio ripper. I didn’t realise there was much other data coming from the CD

Yes, You can see in the CUE file, places to put the Title, Performer, Track Name and Track performer.
The early CD players could actually read this info off the CD and display it in a LED display. There were two problems: 1) CD players were very expensive - over $500 and 2) enough CDs were made without the information on the disk that caused a problem of people thinking the player was broken. To get the price down and to stop all the returns, they took the display off the players.

The Catalogue number in MB language is the Record Catalogue number as used by the Record Label and found printed on the spine of the CD case.

It’s confusing as all the tagging systems use catalog number to mean the value off the disk. At a minimum, I think MB should update its documentation to make it clear: “This is not the catalog number you are looking for”. How does this get done?

the EAN\UPC\Barcode gets you close, but you’ll need to check the artwork in hand to be exact

Yes, and the plot thickens:

  1. the number under the printed bar code on the Pilgrim CD is 09362-46577-2
  2. if I scan the actual bar code, the number returned is 0093624657729. This is a 13-digit number and is a UPC/EAN number based on the description on https://wiki.hydrogenaud.io/index.php?title=Cue_sheet:
  • CATALOG – A 13-digit UPC/EAN code, also referred to as the Media Catalog Number (MCN). 12-digit UPC codes should be prefixed with a “0”.

@chaban Thanks for the link to the MCN.

So there’s confusion about what is in the barcode field:

  1. is it the plain number 0093624657729 that a bar code scanner reads, which is 13 digits and therefore is a UPC/EAN number OR
  2. is it the number printed underneath “09362-46577-29” (a variation of the 12 digit number) OR
  3. a 12 digit MCN? eg 093624657729
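The three candidates in that list are the same number written three ways, which a small normalisation step makes visible. This is a sketch of one possible normalisation (strip everything that is not a digit, then left-pad with zeros to 13 digits); it is not a description of what MusicBrainz or Picard actually does, and the function name is made up.

```python
import re


def normalize_barcode(text: str) -> str:
    """Drop hyphens and spaces, then left-pad with zeros to 13 digits."""
    digits = re.sub(r"\D", "", text)
    return digits.zfill(13)


forms = ["0093624657729", "09362-46577-29", "093624657729"]
for form in forms:
    print(form, "->", normalize_barcode(form))
```

Note that this only works when the check digit is present. A printed form that omits the last digit would need the check digit recomputed first.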

(I also checked my copy of Pink Floyd’s Dark Side of the Moon. There is no catalog number on the actual disk. The barcode printed on the back of the case is 7777-46001-2; when I read it with a barcode scanner, it reads as 0077774600125, which almost matches the barcode field in the database for the specific version of the CD that I have, if you add a leading 0. Someone must have read the barcode with a scanner, as the check digit isn’t printed on the case.)

Yes, EAC can read the UPC/EAN from the CD-Text. But.
This number retrieved from the CD is oftentimes not the same as on the printed artwork: represses, different plants and labels, etc. And even more often the catalog number data is not there at all; some software, including EAC, then just returns a string of 0s.

And even more often the catalog # data is not there at all, some software including EAC then just returns a string of 0s.

Yes, and it depends on your drive. Some don’t support it.
But if it is there, I would like to use the information.

I think the confusion is more that MB is not a tagging database. It is first a database for documenting different releases. Second it is also a useful tagging database. So once the translation is in your head for catalogue equals Barcode it is easy. :slight_smile:

And you have only just started looking down the Rabbit Hole of Madness that is different versions of CDs! A single barcode\EAN\UPC will cover many different variations of a release.

One thing to notice is that barcodes in MB always drop the dashes when typed in. The entry nearly always skips the extra zero, but the form will check the maths of that check digit on the end.

Older disks, older standards. Bet there isn’t a SID code printed in the matrix either.

MB is more about the visual. What can be seen on the paperwork and external packaging. Those discIDs are calculated just from the TOCs, but this does take into account that different editions will have different edits of the tracks. Dark Side of the Moon being a good example.

I think the confusion is more that MB is not a tagging database. It is first a database for documenting different releases. Second it is also a useful tagging database. So once the translation is in your head for catalogue equals Barcode it is easy.

Surely being able to tag well is the main motivation?
What’s recommended instead?

The main motivation here is a big catalogue of all released editions of albums, music in all forms, but IMHO there is no better source of data than MB. If anything there is TOO MUCH data here when all you need to do is tag some digital tracks.

If it was only for tagging, there would be no point in listing all the vinyl, 8-tracks and everything else.

Seems like I’ve no alternative but to do the manual matching to specific CDs despite all the info in the database. :sweat_smile:

The database will get you close with that barcode, but you then have some manual work to get the last few steps. And beware - the more you realise the little details the more fussy you will get at an exact match. :crazy_face:

Before joining MB to tag some albums I didn’t even know where UDEN was!!

If you use cluster and then lookup it should match to the release group in Picard. Then you are a single right click away from checking that it’s the right version.

It’s not as automated as we would all like, but let me stress that even if Picard did some extra wizardry with cat no’s from cue sheets I would still always check the versions for every album when tagging. There is no automated system or program that will be able to deal with the duplication of this data on popular albums, mistakes in printing or manufacturing, or errors by users who’ve entered the data into the database.

But anyway, that one click isn’t so bad, and as IvanDobsky said it definitely becomes part of the fun!

I don’t know enough about CUE sheet data etc, but it sounds like you have a pretty good grasp on it? Unlike on other platforms, here you are welcome to create a ticket outlining your proposal, or even code/get it coded yourself. It sounds interesting. Not sure if it’s off-topic, but I always thought pre-gap info would be interesting to store somewhere :thinking: Maybe there’s a sensible place/way to store your info/strings in the DB (and these can then be used by Picard). That said, I have to assume that if any of this is the killer app that would solve all tagging issues then it would be implemented already. Maybe there’s an existing ticket? The bug tracker is https://tickets.metabrainz.org
(I think this is where you would report documentation issues/suggestions as well?)

Take a bit to familiarize yourself first though. When you enter or edit a MB release the barcode field won’t let you enter dashes. It also automatically checks and recommends a checksum if you haven’t put that digit/s in (though you can click ‘my release doesn’t have this printed’ instead). I think that was an unanswered question of yours.

I now know where UDEN is

One of the reasons for jokingly mentioning Uden is that even with a catalogue number \ barcode match, and after narrowing down that the text on the back of the case says England and not Australia, you then find that large releases are pressed in multiple plants across Europe over many years. And now you are focusing on the unreadable bits of scratched text on the inner ring of the CD.

Matching the catalogue number is only the start. I came here to just tag some files, and have ended up knowing so much more detail about the production of CDs than any sane person should need to know.

Just to add some information to this, because I can see the confusion:

The CD itself can contain the so called Media Catalog Number (MCN). This field is supposed to be filled with the EAN or UPC, which is an article number that usually is also present on the packaging in machine readable form as a barcode. This is what is supposed to go into the barcode field on MusicBrainz.

Not all CDs do contain the MCN. I don’t know any exact dates, but if you have CDs from the 1990s they pretty likely do not have this. Current CDs usually do have the MCN (if they have a barcode at all).

In your example the actual EAN is 0093624657729, that is what is encoded by the barcode and also what is present on the disc itself. Usually the number is also printed in human readable form below the actual barcode. Often it adds hyphens for readability, sometimes the last digit is omitted, as in your example with 09362-46577-2. The last digit is a checksum that can be calculated from the other digits and is used to verify whether the EAN / UPC has been read correctly and is valid.
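The checksum arithmetic is short enough to show. In the standard EAN-13 scheme the first twelve digits are weighted 1, 3, 1, 3, … from the left, and the check digit is whatever brings the weighted sum up to a multiple of 10. The sketch below applies that to the printed form with the last digit omitted; the function names are hypothetical, and padding a short printed number with leading zeros is an assumption that happens to fit the two examples in this thread.

```python
def ean13_check_digit(first12: str) -> int:
    """Compute the EAN-13 check digit for the first twelve digits."""
    if len(first12) != 12 or not first12.isdigit():
        raise ValueError("expected exactly 12 digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def complete_printed_barcode(printed: str) -> str:
    """Turn a printed barcode lacking its check digit into a full EAN-13."""
    digits = "".join(ch for ch in printed if ch.isdigit())
    first12 = digits.zfill(12)  # assumption: short printed forms are zero-padded
    return first12 + str(ean13_check_digit(first12))


print(complete_printed_barcode("09362-46577-2"))
print(complete_printed_barcode("7777-46001-2"))
```

For 09362-46577-2 the weighted sum is 91, so the check digit is 9 and the full EAN is 0093624657729. The Dark Side of the Moon example, 7777-46001-2, completes to 0077774600125, matching what the scanner read.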

MusicBrainz currently only stores the normalized version of the barcode, e.g. 0093624657729. I think there is an open ticket to allow to also store it as printed.

Ideally one would of course expect that the EAN / UPC that is read from the CD is identical to the one on the packaging. But that does not need to be so. E.g. sometimes the very same disc is made available in different packaging, e.g. there might be the standard version in a jewel case and a limited version which comes in a digipack with a bonus disc. But because the main CD is otherwise identical, they do a single pressing with a single EAN on the disc, but each edition gets a different EAN on the packaging. For MusicBrainz the barcode field should be filled with what is on the packaging (and this is also what the label and stores care about, because they need to differentiate the products).

As for the catalogue number field, that is something separate. This is supposed to hold the catalogue number as issued by the label or publisher. Like an EAN it is also a way to identify a specific product for stock keeping and sales, but it is internal to one company. Hence sometimes there are multiple catalogue numbers when multiple labels are involved. Usually the catalogue number is printed on the spine. Today labels deal differently with this:

  • Many use a number derived in some way from the EAN, e.g. this release has the catalogue number “9981592” and the EAN barcode 5051099815926 or this release has the catalogue number “NB 3797-2” and barcode 727361379728
  • Some have their own unique numbering scheme, e.g. this release has the catalogue number “CDMFN 152”
  • Some just use the EAN or actually don’t use a separate catalogue number at all anymore.
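The first pattern in that list, a catalogue number derived from the EAN, can be spotted mechanically: the digits of the catalogue number appear as a run inside the barcode. A rough sketch of that check follows; the helper names and the minimum length of four digits (to avoid trivial matches) are arbitrary choices for illustration, not a MusicBrainz rule.

```python
def only_digits(text: str) -> str:
    """Keep only the digit characters of a catalogue number or barcode."""
    return "".join(ch for ch in text if ch.isdigit())


def catno_embedded_in_barcode(catno: str, barcode: str, min_digits: int = 4) -> bool:
    """True if the catalogue number's digits occur as a run inside the barcode."""
    digits = only_digits(catno)
    if len(digits) < min_digits:
        return False  # too short to be a meaningful match
    return digits in only_digits(barcode)


print(catno_embedded_in_barcode("9981592", "5051099815926"))
print(catno_embedded_in_barcode("NB 3797-2", "727361379728"))
```

Both examples from the list match: 9981592 sits inside 5051099815926, and the digits 37972 of “NB 3797-2” sit inside 727361379728.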

EAN/UPC have only been around since sometime in the 1970s, so earlier releases won’t have a barcode. But they usually do have a catalogue number, because the companies needed to do their stock keeping somehow anyway.

A CD from a popular artist like Eric Clapton can be trickier to get an exact match on as so many editions will have been printed over the years. Discogs lists 17 variations of editions with that barcode

Understood, but even after I’ve picked the exact match released by Reprise Records, it is listed with Duck Records as well. I manually edit it…