I may be wrong, but I vaguely seem to recall that the AcoustID is based on the first X (30?) seconds of the recording, so would not show them as different if one had applause or other additional audio at the end.
I saw that, but I took it to mean that it couldn’t be used to identify a file based on a snippet from the middle of the file. Also, there appears to be something in the code to ignore periods of silence. Perhaps that’s it?
They are nearly identical, even if there are small differences. So why are those not assigned to a single AcoustId? Likely because of the length difference. As mentioned above, AcoustId fingerprints are based on the first ~30 seconds~ 2 minutes* of audio, but that’s not all the information the AcoustId server uses. In addition to the fingerprint, the total length of the recording is considered.
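To illustrate the idea of "fingerprint plus total length", here is a toy sketch. It is not the AcoustId server's actual matching logic: `toy_fingerprint`, `would_share_acoustid` and the `MAX_LENGTH_DIFF_S` tolerance are made up for illustration, and the real threshold is not something I know.

```python
# Toy model only. The names and the length tolerance below are assumptions,
# not the real AcoustId server rules.
FINGERPRINT_WINDOW_S = 120  # fingerprint covers at most the first 2 minutes
MAX_LENGTH_DIFF_S = 7       # hypothetical tolerance; the real value is unknown


def toy_fingerprint(samples):
    """Stand-in for a real fingerprint: only the first 2 minutes count."""
    return tuple(samples[:FINGERPRINT_WINDOW_S])


def would_share_acoustid(a, b):
    """Two files match if their starts agree AND their lengths are close."""
    same_start = toy_fingerprint(a) == toy_fingerprint(b)
    similar_length = abs(len(a) - len(b)) <= MAX_LENGTH_DIFF_S
    return same_start and similar_length


song = list(range(200))            # 200 "seconds" of audio, one item each
slightly_longer = song + [-1] * 3  # same audio plus 3 s of trailing noise
with_applause = song + [-1] * 30   # same audio plus 30 s of applause

print(would_share_acoustid(song, slightly_longer))  # True
print(would_share_acoustid(song, with_applause))    # False
```

In this toy model both longer files have the same fingerprint as the original, but only the one with a small length difference would end up under the same ID.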
Now why are both AcoustIds linked to the same recording? Basically because someone decided to do so. The connection between recordings and AcoustIds is essentially a manual process (done by submitting the fingerprint together with a recording ID, using a tool like Picard).
Whether it is correct that both AcoustIds are linked to the same recording depends on whether we consider both the shorter and the longer version the same recording on MB. If they are considered the same, obviously both AcoustIds should be linked to it. If the longer version is considered a separate recording, the corresponding AcoustId should probably also be removed from the shorter recording.
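The linking described above can be pictured as a simple many-to-many mapping. This is only a sketch with hypothetical identifiers and helper names (`submit`, `unlink`, `acoustids_for`); it does not reflect how the AcoustId or MusicBrainz databases are actually implemented.

```python
from collections import defaultdict

# Hypothetical identifiers for illustration; real AcoustIds and
# recording IDs are UUIDs.
links = defaultdict(set)  # AcoustId -> set of linked recording IDs


def submit(acoustid, recording_id):
    """Someone submits a fingerprint together with a recording ID."""
    links[acoustid].add(recording_id)


def unlink(acoustid, recording_id):
    """Remove a link that is considered wrong."""
    links[acoustid].discard(recording_id)


def acoustids_for(recording_id):
    """All AcoustIds currently linked to the given recording."""
    return sorted(a for a, recs in links.items() if recording_id in recs)


# Two different AcoustIds (short and long file) submitted for one recording.
submit("acoustid-short", "recording-1")
submit("acoustid-long", "recording-1")
print(acoustids_for("recording-1"))  # ['acoustid-long', 'acoustid-short']

# If the longer version is treated as a separate recording, drop its link.
unlink("acoustid-long", "recording-1")
print(acoustids_for("recording-1"))  # ['acoustid-short']
```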
EDIT: The fingerprint is based on up to 2 minutes of audio, not 30 seconds as I originally wrote. I got confused by the previous discussion.
So what you’re saying is the acoustid with the longer time means someone attached it to this shorter recording, but their source was a longer recording? Because this is the only length in the database for this recording. (And therefore we can infer that there is a longer recording with the same first ~30 seconds, somewhere out in the wild?)
This is true for audio with the first 2 minutes identical and roughly identical length. If there is a notable length difference there should be a separate AcoustId, see my comment above. I’m not entirely sure where the threshold is, though.
It’s 2 minutes, sorry for adding to the confusion. But basically yes, at least the fingerprint got submitted with a different length. This does not necessarily mean this recording exists somewhere officially. It could also be just a file with some random radio host talk at the end, or the beginning of the next song, or silence. It could also be a submission error where the software doing the submission reported the wrong length.