Understanding how Picard uses existing metadata in files to match to the database

dpr · June 8, 2018, 9:41pm

Is there any way to understand how Picard is using the metadata in files to match to the database?

I have the files from my Abbey Rd CD with some good metadata: Album Name, Artist, Date, etc. It ought to be enough to match to an entry in the database for the 1987 CD, but Picard keeps wanting to match to the entry for a vinyl record from 1980 (https://musicbrainz.org/release/27f534f7-12c5-4c56-8dda-39fbdad10c17?tport=8000).

I have set the matching threshold to 99% for ‘Minimal similarity for file lookups’
(Options->Advanced->Matching) but it still matches to the wrong entry. I’d like to fix this!
Is there any description of matching rules or even better a log of the matching rules at work?
Are there any other ways to control the matching?

Thanks

InvisibleMan78 · June 8, 2018, 9:51pm

Options -> Metadata -> Preferred Releases
In the lower right part, move the “Preferred release formats” you want from the left side to the right side,
in the order you prefer, e.g.
“CD”
“Vinyl”
Help: https://picard.musicbrainz.org/docs/options/#preferred_releases

dpr · June 8, 2018, 9:59pm

Thanks. I also updated ‘Minimal similarity for matching files to tracks’ to 99%, but it still ignores the date in the metadata and goes to the wrong entry…

aerozol · June 9, 2018, 7:54am

I haven’t used that slider before so this is probably super dumb… but maybe putting it to 1% is what you want? The slider descriptor is a bit confusing.

dpr · June 9, 2018, 8:03am

From the help and Configuration — MusicBrainz Picard v2.11.0rc1 documentation

it seems pretty clear to me. The higher the %, the more similar it must be to match… a higher % is closer to 100… maybe it's not implemented properly?
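To make the two readings of the slider concrete, here is a minimal sketch of how a "minimal similarity" threshold behaves in general. It is not Picard's code: the function name `candidates_above_threshold` and the scores are made up for illustration.

```python
def candidates_above_threshold(scores, minimal_similarity_percent):
    """Return candidate names whose score meets the threshold, best first.

    `scores` maps a candidate name to a similarity in the range 0.0-1.0.
    Hypothetical helper, not part of Picard.
    """
    threshold = minimal_similarity_percent / 100.0
    eligible = [(score, name) for name, score in scores.items() if score >= threshold]
    eligible.sort(reverse=True)
    return [name for _, name in eligible]


# Made-up scores: the unwanted release happens to score slightly higher.
example_scores = {"1987 CD": 0.97, "1980 vinyl": 0.995}

strict = candidates_above_threshold(example_scores, 99)
lenient = candidates_above_threshold(example_scores, 1)
```

With these made-up numbers, a 99% threshold leaves only the vinyl entry, and a 1% threshold lets both through with the vinyl entry still ranked first. In this sketch the threshold only filters candidates; it does not change which one scores highest.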

InvisibleMan78 · June 9, 2018, 8:13am

If I’m not completely wrong, the date is not one of the fields to be used for similarity:

comparison_weights = {
    "title": 13,
    "artist": 4,
    "album": 5,
    "length": 10,
    "totaltracks": 4,
    "releasetype": 20,
    "releasecountry": 2,
    "format": 2,
}

from https://github.com/metabrainz/picard/blob/8478b3150abdb233b5e88c9425e3818e70b9776a/picard/file.py
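A rough sketch of why a missing date weight matters, assuming the score is a weighted average of per-field similarities. That assumption, the helper names `text_similarity` and `weighted_similarity`, and the use of `difflib` are mine; only the weights come from the file linked above.

```python
from difflib import SequenceMatcher

# Weights as quoted above; note there is no "date" entry.
COMPARISON_WEIGHTS = {
    "title": 13,
    "artist": 4,
    "album": 5,
    "length": 10,
    "totaltracks": 4,
    "releasetype": 20,
    "releasecountry": 2,
    "format": 2,
}


def text_similarity(a, b):
    """Case-insensitive string similarity in the range 0.0-1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def weighted_similarity(file_tags, candidate, weights=COMPARISON_WEIGHTS):
    """Weighted average over the fields present on both sides.

    Simplification: every field is compared as text, and fields
    without a weight (such as "date") are never looked at.
    """
    total = 0.0
    weight_sum = 0
    for field, weight in weights.items():
        if field in file_tags and field in candidate:
            total += weight * text_similarity(str(file_tags[field]), str(candidate[field]))
            weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


file_tags = {"title": "Come Together", "artist": "The Beatles",
             "album": "Abbey Road", "date": "1987"}
cd_1987 = dict(file_tags, date="1987")
vinyl_1980 = dict(file_tags, date="1980")

print(weighted_similarity(file_tags, cd_1987))     # 1.0
print(weighted_similarity(file_tags, vinyl_1980))  # 1.0
```

Under these assumptions both candidates score the same, because the only field that differs carries no weight.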

dpr · June 9, 2018, 10:47am

InvisibleMan78:

If I’m not completely wrong, the date is not one of the fields to be used for similarity:
comparison_weights = {
    "title": 13,
    "artist": 4,
    "album": 5,
    "length": 10,
    "totaltracks": 4,
    "releasetype": 20,
    "releasecountry": 2,
    "format": 2,
}

Thanks for the pointer to the code. The weights seem strange… if it's by highest number, releasetype comes first, followed by length, both ahead of album and artist…

As you point out, no date…

outsidecontext · June 9, 2018, 6:12pm

The releasetype needs some special mention: The value is intentionally high, but usually not applied this way. The first difference is that the releasetype is not matched against metadata in the file but against your settings of preferred releasetype. And the high value of 20 is modified by the slider you set in those preferences. Depending on how you set the slider a certain releasetype has either no or a very high influence on the matching.
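One way to picture this, purely as an illustration and not Picard's actual formula: treat the slider as a factor between 0.0 and 1.0 that scales the base weight of 20. The function name `effective_releasetype_weight` and the linear scaling are assumptions.

```python
BASE_RELEASETYPE_WEIGHT = 20  # value from comparison_weights


def effective_releasetype_weight(slider):
    """Scale the base weight by a preference slider in the range 0.0-1.0.

    Hypothetical: linear scaling is assumed here only to show how a
    slider can turn the release type's influence off or up to full.
    """
    if not 0.0 <= slider <= 1.0:
        raise ValueError("slider must be between 0.0 and 1.0")
    return BASE_RELEASETYPE_WEIGHT * slider


for position in (0.0, 0.5, 1.0):
    print(position, effective_releasetype_weight(position))
```

At 0.0 the release type contributes nothing in this sketch; at 1.0 it outweighs every other field in the table.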

The date IMHO really is missing and would improve your situation. Not sure if it was intentionally left out.

aerozol · June 9, 2018, 10:05pm

“more similar” seems to me quite different from “minimal similarity”, which is what the slider description says. That’s why I thought I’d mention it

dpr · June 11, 2018, 6:37pm

Here’s the help:

Minimal similarity for file lookups: The higher the %, the more similar an individual file’s metadata must be to MusicBrainz’s metadata for it to be moved/matched to a release on the right-hand side.

So a higher % means the metadata must be closer to MB to be matched…
And from

InvisibleMan78:

If I’m not completely wrong, the date is not one of the fields to be used for similarity:
comparison_weights = {
    "title": 13,
    "artist": 4,
    "album": 5,
    "length": 10,
    "totaltracks": 4,
    "releasetype": 20,
    "releasecountry": 2,
    "format": 2,
}
from picard/picard/file.py at 8478b3150abdb233b5e88c9425e3818e70b9776a · metabrainz/picard · GitHub

we know which metadata fields are compared… Am I misunderstanding?