AcoustID editing policy



I was looking for some AcoustID editing policy or discussion somewhere.
We could be tempted at times to clean up AcoustID by unlinking minority recordings or the like.

  • I think it is important to stress that we should only edit (link/unlink) AcoustIDs for recordings that we physically own on CD and have scanned ourselves.
  • I think it’s also important not to merge MB recordings just because they share one AcoustID.

Unfortunately we don’t see edit notes in AcoustID pages, only basic info.

cc. @texke @culinko


We have and and I think that’s about it of semi-/official policy/discussion. I recall there being some discussion as well, either on old forums or mailing lists…

Indeed! I have a feeling AcoustIDs will often get submitted for “close enough” matches (even if it’s not a match at all), because not everybody is that particular about confirming their data. (This would also require them to know exactly what data it is they have to begin with!)


For linking an AcoustID, I agree. A CD or other original digital source is really the only reliable source for an AcoustID. Analog sources aren’t trustworthy, and I avoid submitting AcoustIDs from an analog source, e.g. from digitizing an LP. I do make an exception for albums I have that are rare and out of print, where an original digital source is often simply unavailable. Picard usually does find matches for my less obscure digitized vinyl tracks, so I am fairly confident that my equipment is functioning well enough for this to be useful.

Regarding unlinking based on “statistics”, it seems the best way to clean up some of the out-of-control AcoustIDs on popular tracks. Pick any popular track, e.g. Hotel California by Eagles, and you’re bound to find dozens of AcoustIDs. Since an AcoustID should be unique for a given piece of audio, this shouldn’t happen, but it does. In my experience the vast majority of those AcoustIDs have only a single source, and only one will have a significant number of sources. For this recording, it appears to be 8815564, which was sourced for our track 1305 times. What are we to make of the track by The Beagles (a UK tribute band), sourced 5 times? If it’s also valid, then we cannot rely on AcoustID to identify unique audio at all. I say unlink that one; it can’t also be right.

Some due diligence is required, of course. For example, here is an AcoustID for Make It With You by Bread. Again, a very popular track, with dozens of AcoustIDs on the main recording. The AcoustID also shows a link to a recording by David Gates. It seems like a similar situation, until you discover that David Gates was the singer for Bread, and the track is from David Gates Songbook. So, it could be a later solo recording, but it could also be the same Bread track, misattributed to David Gates instead of his former group. Best left alone, I think, until some more research or a sound check can be done.

I’d like to understand better why errors like this are so common. There seem to be some serious issues in the process by which editors attach AcoustIDs to recordings.


Unlinking can be done without having the audio source, as SheamusPatt said.

You need to use common sense and not blindly unlink tracks, but it can be done. My general approach is this: if I have an AcoustID with two MBIDs, one with many sources and another with few, and the one with few has a lot of sources on another AcoustID, I feel safe unlinking it from the first AcoustID. I usually don’t unlink something that does not have any other AcoustID linked to it, unless it’s really obviously wrong. I consider a match against a close, but not identical, track better than no match at all.
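To make that rule of thumb concrete, here is a rough Python sketch of it. Everything in it is an assumption for illustration: the `Link` record, the `unlink_candidates` function, the two thresholds and the sample IDs are all made up (only the 1305 and 5 source counts echo the numbers quoted earlier in the thread), and it does not call the AcoustID service at all. It only flags pairs for a human to review; it is not meant as a tool to unlink anything automatically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One AcoustID-to-recording link with its source count (hypothetical record)."""
    acoustid: str
    mbid: str
    sources: int


def unlink_candidates(links, dominance_ratio=10, min_elsewhere=10):
    """Return (acoustid, mbid) pairs worth reviewing for unlinking.

    A link is flagged only when both conditions hold:
      1. on this AcoustID it has at most 1/dominance_ratio of the sources
         of the best-sourced recording, and
      2. the same recording has at least min_elsewhere sources on some
         other AcoustID (so unlinking does not leave it without a match).
    Both thresholds are arbitrary assumptions, not community policy.
    """
    by_acoustid = {}
    for link in links:
        by_acoustid.setdefault(link.acoustid, []).append(link)

    candidates = []
    for acoustid, group in by_acoustid.items():
        if len(group) < 2:
            continue  # a single recording on this AcoustID: nothing to compare
        top = max(l.sources for l in group)
        for link in group:
            if link.sources * dominance_ratio > top:
                continue  # not a small minority on this AcoustID
            # Best source count this recording has on any other AcoustID.
            elsewhere = max(
                (o.sources for o in links
                 if o.mbid == link.mbid and o.acoustid != acoustid),
                default=0,
            )
            if elsewhere >= min_elsewhere and elsewhere > link.sources:
                candidates.append((acoustid, link.mbid))
    return candidates


# Made-up sample data; the IDs are placeholders, not real AcoustIDs or MBIDs.
sample = [
    Link("acoustid-A", "recording-1", 1305),
    Link("acoustid-A", "recording-2", 5),
    Link("acoustid-B", "recording-2", 240),
    Link("acoustid-A", "recording-3", 2),  # no other AcoustID: left alone
]
flagged = unlink_candidates(sample)
print(flagged)
```

Note that `recording-3` is not flagged even though it has very few sources, because it has no other AcoustID to fall back on. That matches the preference above for a close match over no match at all.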

Merging based on AcoustID without more information should not be done. AcoustIDs can be a hint, but not much else.


About unlinking by statistics: sometimes the statistics are wrong, as in the original post and in this other example: