acoustID 2-minute mark

spitzwegerich · November 17, 2017, 10:39am

In the AcoustID guide I found “False matches: […] Recordings which only diverge after the 2-minute mark where the acoustID fingerprint ends.”

So does AcoustID really cover only the first two minutes? Why? AcoustID is a really clever approach, but ignoring everything after 2 minutes does not seem so clever.

spitzwegerich · November 18, 2017, 12:01pm

I did some comparisons and my impression is that, fortunately, fingerprints do not end at 2:00. So where does that “2-minute mark” in the acoustID guide come from? Is it just wrong info?

lukz · November 19, 2017, 5:39pm

How did you find out they do not end at 2:00? Picard explicitly calls the AcoustID fingerprinting tool with fpcalc -length 120 to limit the audio to 2 minutes. That’s also the default value for fpcalc.

The reason behind that decision was that in order to identify a file, you don’t need more than that, so running the algorithm on the whole file is a waste of time. The fingerprinting algorithms MusicBrainz used before only used 60 seconds (MusicIP) and 30 seconds (TRM, if I remember correctly).
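
For readers who want to reproduce the call described above, here is a minimal sketch in Python. It assumes `fpcalc` is installed and on the PATH, and that it prints `KEY=VALUE` lines such as `DURATION=...` and `FINGERPRINT=...`; the function names are made up for illustration and this is not how Picard itself is implemented.

```python
import subprocess


def build_fpcalc_command(path, length_seconds=120):
    """Build the argument list for an fpcalc call limited to the first N seconds."""
    return ["fpcalc", "-length", str(length_seconds), path]


def parse_fpcalc_output(text):
    """Parse KEY=VALUE lines (assumed output format) into a dict of strings."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that contain no '='
            result[key.strip()] = value.strip()
    return result


def fingerprint_file(path, length_seconds=120):
    """Run fpcalc on a file and return the parsed fields (requires fpcalc on PATH)."""
    completed = subprocess.run(
        build_fpcalc_command(path, length_seconds),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_fpcalc_output(completed.stdout)
```

Calling `fingerprint_file("track.flac")` with a hypothetical file name would return the parsed fields for the first 120 seconds of audio; passing a larger `length_seconds` would make the tool analyse more of the file at the cost of more CPU time.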

spitzwegerich · November 19, 2017, 9:09pm

Fingerprint of a ~2 minute recording: Fingerprint #19986863 | AcoustID.
Fingerprint of a ~5 minute recording: Fingerprint #45607656 | AcoustID

The second fingerprint is about 2.5 times the length of the first one.

jesus2099 · November 20, 2017, 7:02am

The first one is an old AcoustID; those were indeed shorter. If you rescan the 2-minute recording now, it will get a long AcoustID as well.
But I also think it would be nice if the whole recording were scanned instead of only the first two minutes, if it is only a matter of CPU.
TRM < PUID < AcoustID < larger and larger, but still, why not complete, as we are almost there?

lukz · November 20, 2017, 7:19am

It’s mostly just CPU. Back when I was designing AcoustID, I really wanted it to be super fast. For Picard-style lookups, even the 2 minutes was too much, but I thought it was a nice compromise.

I didn’t realize AcoustIDs would get used for other purposes, like checking which recordings are the same; it was only meant as a search tool. I’d like to upgrade the database to full fingerprints, but the server code currently doesn’t handle that, so I first need to make it possible to replace those short fingerprints with full ones. It will eventually happen, but it will take time.
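
To illustrate why "checking which recordings are the same" depends on how much audio the fingerprint covers, here is a rough sketch of one common way to compare two fingerprints: the fraction of differing bits. It assumes each fingerprint is available as a list of 32-bit integers (for example from `fpcalc -raw`), compares only the overlapping part, and does no offset alignment. It is not the comparison the AcoustID server actually performs.

```python
def bit_error_rate(fp_a, fp_b):
    """Fraction of differing bits over the overlapping part of two raw fingerprints.

    fp_a and fp_b are lists of 32-bit integers (assumed raw fingerprint form).
    Returns a value between 0.0 (identical) and 1.0 (every bit differs).
    """
    overlap = min(len(fp_a), len(fp_b))
    if overlap == 0:
        raise ValueError("both fingerprints must contain at least one item")
    differing_bits = sum(
        bin((a ^ b) & 0xFFFFFFFF).count("1")  # popcount of the XOR
        for a, b in zip(fp_a, fp_b)           # zip stops at the shorter list
    )
    return differing_bits / (32 * overlap)
```

If both fingerprints were computed from only the first two minutes, two recordings that differ only later in the track would produce a low error rate here, which is the kind of false match the guide mentions.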

spitzwegerich · November 20, 2017, 9:48am

So both fingerprints only cover 2:00, but the second appears to be longer because it comes from a newer fingerprinting version which produces more data. Is this what you are saying?

Also, I noticed that for the first fingerprint, a length of exactly 2:00 minutes is reported while the associated recording is of length 2:02. This might mean that it is cut at 2:00 (or it might come from the fact that a single MB recording may be linked to a variety of tracks with differing lengths). However, for the second fingerprint, a length of 5:05 is reported for the fingerprint as well as for the recording. So I thought that this fingerprint covers the full length and is not cut.

Could you clarify, please? I really would like to understand this.

spitzwegerich · November 20, 2017, 9:54am

Thank you for this information!
Yes, it would be great to remove the 2:00 limitation and start collecting full-length fingerprints, sooner rather than later. The CPU problem should self-heal as processors get faster and faster over the years.

ProfChris · November 27, 2017, 10:54am

I’m not sure whether this is to do with the 2 mins mark but sometimes I find false matches with insufficient information to resolve. Take https://acoustid.org/track/dcd88e4a-3de8-4332-9151-2f5898adda64 for example. This is associated with 2 recordings of the same work, by different performers. Both releases seem to have been correctly tagged with their performers according to the cover art. How might these be teased apart?

a23bed · January 23, 2018, 12:18pm

I’ve read about the 2-minute issue before and knew there were some limitations, but AcoustID usually does a good job at matching the recordings.

But now I’ve noticed some fingerprints I’ve submitted get the same AcoustID.
It seems to confuse the instrumental and vocal versions, even though they diverge before the 2-minute mark.

Just added these a few hours ago.

The vocals begin around the 0:53 mark.
https://acoustid.org/track/ff198376-3081-4a7c-aa56-8e720459f628

Again, here the main vocals begin around the 1:37 mark.
https://acoustid.org/track/66e2acb6-934d-462d-bb4c-fb4bc3e42bcd

I first noticed this behaviour with "Into You (instrumental) by The Cinematic Orchestra"
The track was initially merged with the vocal version; after separating and resubmitting the fingerprint, it got the same AcoustID.
https://acoustid.org/track/19cfda46-2f31-4a3b-9686-7d876a0d7b31

When I scan the tracks in Picard, with all the metadata cleaned out and even the file names changed to something ambiguous, it always prefers the vocal version,
although the vocals clearly begin around the 1:06 mark. I thought maybe it had some kind of “memory” because they were previously merged, so I didn’t dwell on it too much.

But now with the ones I’ve added above, I’m beginning to think there’s something else going on.
I’ve tried submitting with the fingerprint app (Windows) and Picard, with the same results.

jesus2099 · January 24, 2018, 11:25am

Hello @a23bed, just to make sure, the term instrumental is often used for two kinds of versions:

instrumental versions: a lead solo instrument or an orchestration replaces the sung part
karaoke versions: the vocal track(s) is/are removed from the mix

Which one describes your cases the best?

a23bed · January 25, 2018, 3:30am

Would have to say the latter.
But I wouldn’t use the term “karaoke” for these particular recordings.