Some possibly interesting and likely useless findings

This is sort of a follow-up to some of the stuff discussed in my previous thread: What can be done with super polluted AcoustIDs?

I was submitting new fingerprints for an album (Heart of the Ocean), as most of its tracks still didn’t have one since I first created it. I noticed that most tracks did the usual thing of grouping the “main” and “instrumental” versions under one AcoustID with different fingerprints. Some did not, however, so I tried to find some sort of correlation that would explain why some do this and some don’t.

TLDR: I have no idea what makes some recordings group together under one AcoustID and some not.

So first off these three recordings are grouped under one AcoustID (3715ab0a-4cbd-4b0f-8ce2-1bf488e2c044):
The Omen King > 95335987
The Omen King (Choral) > 95335982
The Omen King (instrumental) > 95335986

These two look quite distinct to me and the fingerprint strings generated by Picard match in the first ~20.81%:

Compare fingerprints #95335986 and #95335987
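For anyone who wants to reproduce the "match in the first ~X%" numbers, here is a minimal sketch of that kind of comparison. It assumes the percentage is the length of the shared leading prefix divided by the length of the longer string; the post does not say exactly how the figure was computed, so treat both the function name and the denominator as assumptions.

```python
import os


def prefix_match_percent(fp_a: str, fp_b: str) -> float:
    """Share of the longer fingerprint string covered by the common leading prefix."""
    if not fp_a or not fp_b:
        return 0.0
    # os.path.commonprefix compares character by character, which is what we want here.
    shared = len(os.path.commonprefix([fp_a, fp_b]))
    return 100.0 * shared / max(len(fp_a), len(fp_b))


# Made-up strings, not real fingerprints: the first 5 of 10 characters agree.
print(prefix_match_percent("AQADtEmUaE", "AQADtXyzzz"))  # 50.0
```

Keep in mind that this only measures how far two encoded strings agree from the start, which is a rough proxy at best for how similar the audio is.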

These two are indeed quite similar and the fingerprint strings generated by Picard match in the first ~21.03%:

Compare fingerprints #95335986 and #95335982

These two look decently distinct but not as much as the first two and the fingerprint strings generated by Picard match in the first ~20.89%:

Compare fingerprints #95335987 and #95335982

Then we have these two recordings which do not share an AcoustID:
Heart of the Ocean > c23a47e1-1994-414f-8ba6-3324d7ab27e8 > 95351573 (this one has two nearly identical fingerprints, no idea why it even has two, but mine matched the linked one)
Heart of the Ocean (instrumental) > fc0b2b4e-adfd-4e50-b921-5b87c079eebe > 95335978

I would say these differ about as much as fingerprints #95335986 and #95335987 do, yet those two got combined under one AcoustID while these did not. The fingerprint strings generated by Picard match in the first ~27.61% (much higher than for any of “The Omen King” recordings):

Compare fingerprints #95351573 and #95335978

Finally, we have these other two recordings which also do not share an AcoustID:
Sunlight > 4f8dbf03-62bc-4cf9-a08f-e39c7240caef > 95335985
Sunlight (instrumental) > 65ad9c41-b3ac-4baa-87c6-59d37282c237 > 95335983

Once again, these are quite distinct, and the fingerprint strings generated by Picard match in only the first ~2.73%:

Compare fingerprints #95335985 and #95335983

I was hoping to find some sort of correlation to understand why some recordings group under the same AcoustID and some don’t, but after all my playing around I found nothing.

I even played around with the Chromaprint “fpcalc” program, which turns out to have 5 algorithms to choose from (#2 is the default, which Picard uses). Even with the other algorithms, the string match percentages changed by at most 1% compared to the default algorithm. I did not submit those fingerprints to the server, though, as I didn’t trust that I wouldn’t generate busted data or something.
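A rough sketch of how the algorithm comparison could be scripted, assuming fpcalc is on the PATH and that your build supports the `-algorithm` and `-json` options (check `fpcalc -help` for your version). The helper names and the file name are made up for illustration.

```python
import json
import subprocess


def fpcalc_command(audio_path: str, algorithm: int = 2) -> list[str]:
    """Build the fpcalc command line for one file and one algorithm number."""
    return ["fpcalc", "-json", "-algorithm", str(algorithm), audio_path]


def fingerprint(audio_path: str, algorithm: int = 2) -> str:
    """Run fpcalc and return the fingerprint string from its JSON output."""
    result = subprocess.run(
        fpcalc_command(audio_path, algorithm),
        capture_output=True,
        text=True,
        check=True,  # raise if fpcalc exits with an error
    )
    return json.loads(result.stdout)["fingerprint"]


if __name__ == "__main__":
    # "track.flac" is a placeholder path; algorithms 1-5 as described in the post.
    for algo in range(1, 6):
        print(algo, fingerprint("track.flac", algo)[:40])
```

This only generates fingerprints locally and submits nothing to the server.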

There you have my possibly interesting findings, which probably mean absolutely nothing to anyone.


I still would have expected the first 3 fingerprints to go to different acoustIDs too, but it looks like they are similar enough. :slight_smile:

Another factor is the length of the submitted files. Even very similar fingerprints create a new acoustID if the track lengths of the submitted files are too different. (The files that submitted the first three fingerprints in your example probably have the exact same length.)

Files with different playback speed may also add to the same acoustID, but only to a certain extent.

They sure do, but the same goes for the others.


Yes, but the other fingerprints are not similar enough and go to separate acoustIDs, despite their identical length :slight_smile:
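To make the two factors from this exchange (fingerprint similarity and file length) concrete, here is a hedged sketch that works on raw fingerprints, i.e. the lists of 32-bit integers that `fpcalc -raw` prints. It counts differing bits position by position and checks the duration gap. This is not the AcoustID server's actual matching logic, and both thresholds are hypothetical parameters that the caller has to supply; no real server values are implied.

```python
def bit_error_rate(raw_a: list[int], raw_b: list[int]) -> float:
    """Fraction of differing bits over the overlapping part of two raw fingerprints."""
    n = min(len(raw_a), len(raw_b))
    if n == 0:
        raise ValueError("both fingerprints must contain at least one value")
    # XOR marks the differing bits; the mask keeps the count correct for signed input.
    differing = sum(
        bin((a ^ b) & 0xFFFFFFFF).count("1") for a, b in zip(raw_a, raw_b)
    )
    return differing / (32 * n)


def might_group(raw_a, duration_a, raw_b, duration_b, *, max_duration_gap_s, max_ber):
    """Hypothetical check: similar enough AND close enough in length."""
    if abs(duration_a - duration_b) > max_duration_gap_s:
        return False
    return bit_error_rate(raw_a, raw_b) <= max_ber


# Toy values, not real fingerprints: 4 of 32 bits differ.
print(bit_error_rate([0b1111], [0b0000]))  # 0.125
```

A naive comparison like this does not try different alignments between the two fingerprints, so it will overstate the difference for files with a small offset at the start.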