How does Picard pick a recording?

Alex1 · February 3, 2022, 1:52pm

How does Picard pick a recording when there are several recordings linked to an Acoustid?

Hello there! I spent some time here in 2020 and have returned to continue where I left off. There are still many tracks in my iTunes library with very little metadata on them. Sometimes I can only determine the artist and song name but not the album. This means I want to rely on the AcoustID properties to narrow down the choice of which album tags to apply.

In Picard, I scan the file. Then I right-click on the AcoustID value to look up the AcoustID details in the web browser. As an example, see this AcoustID of Careless Whisper. From the recordings listed there, Picard suggests this release from this recording. How?

I have some “preferred release countries” and some “preferred release formats” set in Picard preferences. Other than that, what criteria does Picard use to pick a recording – and then based on that recording, a release?

Alex1 · February 3, 2022, 4:14pm

One criterion I imagine could be used during the lookup process is the track length. In my case, the length of the Careless Whisper track is 5:06. I imagine that MusicBrainz looks for other fingerprints with the same length and then for linked recordings that mostly use those fingerprints.

In this case, there is none. Picard actually picked a release on which the track length is 4:59. That is on the opposite extreme end of the range of all track lengths listed for the AcoustID. That’s confounding.

In an ideal world, a person submitting fingerprints would do so only after making sure that the specs of the recording correspond to the information available on or for the file. In this case, we could then assume that there are 20 recordings of the same audio (counting only the linked recordings without a strikethrough).

As I see it, MusicBrainz links the AcoustIDs, not the fingerprints, to recordings. It should therefore be impossible for Picard to produce a “best match”. Only a human being with more data could do that.

I am left to choose among more than 50 possibilities, right?

I just want to have an opinion to be sure that the album returned by Picard is really just an arbitrary suggestion among more than 50 possibilities.

To stay with Careless Whisper, the recording is linked to more than one AcoustID. But a recording should only have one AcoustID, correct?

For some reason, others (= 19 sources) submitted fingerprints that (technically) belong to AcoustID 68ad…, yet it is linked to the same recording. Why? Human error?

outsidecontext · February 3, 2022, 7:23pm

When there is more than one recording to choose from, Picard makes a metadata comparison between the existing metadata and the recordings. This is very similar to how it compares the metadata when you do a recording lookup.

The criteria include track title, artist, album, length, and date, but also the preferred release type and country.
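A weighted comparison of this kind could be sketched roughly as follows. This is only an illustration: the field names, the weights, and the helper functions are hypothetical and not taken from Picard's source. Fields missing on either side are skipped, which is why a file with few tags gives the comparison little to work with.

```python
from difflib import SequenceMatcher

# Hypothetical weights, chosen only for illustration (not Picard's real values).
WEIGHTS = {"title": 13, "artist": 4, "album": 5, "length": 10}


def text_similarity(a, b):
    """Return a 0..1 similarity of two strings, or None if either is missing."""
    if not a or not b:
        return None
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def length_similarity(a_ms, b_ms, max_diff_ms=30000):
    """1.0 for equal lengths, falling linearly to 0.0 at max_diff_ms."""
    if not a_ms or not b_ms:
        return None
    return max(0.0, 1.0 - abs(a_ms - b_ms) / max_diff_ms)


def compare(file_tags, candidate):
    """Weighted average over the fields that are present on both sides."""
    total = 0.0
    weight_sum = 0.0
    for field, weight in WEIGHTS.items():
        if field == "length":
            sim = length_similarity(file_tags.get(field), candidate.get(field))
        else:
            sim = text_similarity(file_tags.get(field), candidate.get(field))
        if sim is None:
            continue  # nothing to compare for this field
        total += sim * weight
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0
```

With only title, artist and length set on the file, a candidate of 5:06 (306000 ms) would score higher here than one of 4:59 (299000 ms), all else being equal.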

Mostly yes, because an AcoustID and a MB recording are conceptually very similar. But there are some differences. E.g. tracks with and without a long period of silence at the start or end would still be considered the same MB recording, but would get different AcoustIDs due to the length difference (if it is above 30 seconds).

There are some recent discussions about this topic:

https://community.metabrainz.org/t/obscure-recordings-that-have-many-acoustids-linked-to-them-with-no-duration-unusual-title/562814/7

Alex1 · February 4, 2022, 12:14pm

As I said, there is no metadata (besides artist name and song name). Therefore, there is nothing Picard can use to produce a “best match” if all it does is check whether the fingerprint already exists.

The question is still: How does Picard pick a recording?

The Acoustid pages would be more useful if the fingerprints were grouped under the recordings. That way, if you know the fingerprint, you could see which recording other users with the same fingerprint chose to match with.

Speaking of the fingerprint, can the new acoustid_fingerprint tag be used to figure out which 8-digit Acoustid fingerprint it represents?

outsidecontext · February 4, 2022, 1:08pm

If the tags are not set at all they can of course not be used for comparison. This leaves Picard with length and your settings for preferred releases.

This tag is supposed to hold the text representation of the actual acoustic fingerprint as reported by fpcalc. It is not only 8 digits, it is much longer.

The tag is actually not that new; what is new is that Picard can write the fingerprint there (before that it only read the tag). Doing so is probably not that useful in most cases. The fingerprint can be calculated at any time from the audio again. You’d only store this if you have a specific need for it, e.g. doing AcoustID lookups later without the need for running fpcalc again on the file.
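If you want to recalculate the fingerprint yourself, something like the following should work. It is a minimal sketch that assumes a Chromaprint version whose fpcalc supports the `-json` option and is on your PATH; the function names are made up for this example.

```python
import json
import subprocess


def parse_fpcalc_json(output):
    """Extract (duration_in_seconds, fingerprint) from fpcalc's JSON output."""
    data = json.loads(output)
    return float(data["duration"]), data["fingerprint"]


def fingerprint_file(path, fpcalc="fpcalc"):
    """Run fpcalc on an audio file; requires fpcalc to be installed."""
    result = subprocess.run(
        [fpcalc, "-json", path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if fpcalc fails
    )
    return parse_fpcalc_json(result.stdout)
```

The returned fingerprint string is the long text value that would go into the acoustid_fingerprint tag, and the duration is the length that gets sent along with it on a lookup.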

Alex1 · February 4, 2022, 1:48pm

Do you know if the data sent from Picard compares the length of the fingerprint against the (median) lengths of the linked recordings? Or does the lookup check which recordings have previously been used to match to files with that particular fingerprint?

Alex1 · February 4, 2022, 2:15pm

I was referring to the 8 digits as in https://acoustid.org/fingerprint/21427805

I would like to know which such fingerprint applies to the file to validate with my own eyes that Picard took a fingerprint length equal to the value of the length shown in iTunes, Picard or another tag reader. I thought that the acoustid_fingerprint tag could maybe be decoded to reveal which fingerprint that is.

outsidecontext · February 6, 2022, 4:27pm

The fingerprint is calculated locally from the audio using the fpcalc utility. The fingerprint is based on the first 120 seconds of audio. As an additional identifier, the total length of the audio is used (again as returned by fpcalc).

For a lookup, the length and the fingerprint are submitted to the AcoustID lookup web service. AcoustID searches for matching fingerprints by similarity. The length is used as an additional check: fingerprints with a length difference of more than 30 seconds are not considered. The web service then returns a list of AcoustIDs with similarity score and matching MB recording metadata. An AcoustID (internally in the AcoustID code called “track”) is essentially a grouping of fingerprints that are all considered to represent the same recording.
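To make the request side concrete, here is a rough sketch of what gets sent. The endpoint and parameter names are given as I understand the public AcoustID v2 web service, so treat them as assumptions and check the AcoustID documentation before relying on them. The length check shown is only an illustration of the 30-second rule; the real check happens on the server.

```python
from urllib.parse import urlencode

# Assumed endpoint of the AcoustID lookup web service.
LOOKUP_URL = "https://api.acoustid.org/v2/lookup"


def build_lookup_url(client_key, duration_seconds, fingerprint):
    """Build a lookup URL from the values reported by fpcalc."""
    params = {
        "client": client_key,  # your application API key
        "duration": int(round(duration_seconds)),
        "fingerprint": fingerprint,
        "meta": "recordings releases",  # ask for MB metadata as well
    }
    return LOOKUP_URL + "?" + urlencode(params)


def within_length_tolerance(query_seconds, candidate_seconds, tolerance=30):
    """True if two lengths differ by no more than `tolerance` seconds."""
    return abs(query_seconds - candidate_seconds) <= tolerance
```

Under this rule a 5:06 file and a 4:59 fingerprint are still compared with each other, since they differ by only 7 seconds.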

Picard then retrieves this list of AcoustIDs (most often 1, sometimes more), along with the similarity score and the MB metadata. In order to select a recording and release, Picard then uses the AcoustID similarity score + the metadata as described above.
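For a file without usable tags, the selection could be pictured roughly like this. It is a simplified sketch, not Picard's actual code: the result layout mirrors the kind of list the lookup returns (AcoustID, score, recordings), while the way score and length are combined here (a simple product) and the neutral value for unknown lengths are assumptions made for illustration.

```python
def pick_recording(results, file_length_s, max_diff_s=30):
    """Return (combined_score, acoustid, recording) for the best candidate,
    or None if there are no candidates at all."""
    best = None
    for result in results:
        for rec in result.get("recordings", []):
            rec_len = rec.get("duration")
            if rec_len is None:
                length_sim = 0.5  # unknown length: neutral value (assumption)
            else:
                diff = abs(file_length_s - rec_len)
                length_sim = max(0.0, 1.0 - diff / max_diff_s)
            combined = result["score"] * length_sim
            # Strict ">" means the first of several equal candidates wins.
            if best is None or combined > best[0]:
                best = (combined, result["id"], rec)
    return best
```

Note that in this sketch, candidates with the same length under the same AcoustID end up with identical scores, so which one wins depends only on their order. That is where release preferences and any existing tags would have to break the tie.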

Alex1 · February 6, 2022, 5:57pm

First of all, thanks for sharing what you know!

So, the length data goes into the similarity score. I wish the fingerprints were linked to the recordings, because the linking is done by humans and is what gives the AcoustIDs any meaning. Information is lost, because the AcoustID service is just focused on clustering fingerprints. The information lost is that you can’t tell which recording your particular fingerprint has previously been linked to by other volunteers.

I am tempted to say that I would redesign this whole process from the ground up.

When I scan a file without metadata, AcoustID is great to identify the song, based on the recordings already linked. But it is impossible for Picard to select among the recordings.

Someone using Picard may think that it magically returns the correct album, but it doesn’t. It has to be a random choice within the bounds of the preferences set in Picard.

In my workflow, I therefore do not match (or select only a few tags to save) in Picard, if and when I can’t be sure about the album.

In my workflow, I keep such a track in iTunes with:

album artist = Unknown ([name of artist])
album = empty
artwork = placeholder.gif “Unknown Album”
track number = empty
disc number = empty

In the comments of a track I note any number of tags to document the research and matching success with MusicBrainz – e.g. mb-au, mb-m7.

outsidecontext · February 6, 2022, 7:59pm

Would be a possible model. But you’d still end up with the same decision issue: Your search for a fingerprint would give you a list of fingerprints with varying similarity, and each potentially linked to multiple MB recordings. And since the linking to MB recordings is a human process just like it is now it involves the possibility that a fingerprint is linked to the wrong recording.

What you’d lose is the knowledge that there is a set of fingerprints so similar that they very likely describe the same recording.