Report showing AcoustIDs likely to be bad links to MusicBrainz recordings

I keep tripping over stuff you have already checked yourself, but I would ask about this one: https://acoustid.org/track/9496661

Why are there 5min and 10min tracks still on the page with an 8 min track? I punched the “burn the artist” button and expected all of those to have vanished as outside the margin.

The good thing is I don’t think I have seen anything wrong yet. In fact, it is more a case of seeing stuff your script has missed. Or multiple examples like the one above. Or versions of that where different tracks by different artists are involved.

Yeah - strangely missing stuff. https://acoustid.org/track/167979c6-691b-491b-b5a0-4b04f0d8489e That had two 4min tracks that should have both been picked up.

-=-=-

I think it may be worth re-running your detection scripts to see if they pick stuff up again after the first blast of deletes. Something makes them skip matches.

Hi, okay, these ones are being missed because the AcoustID only links to a single MB recording. The original reports looked for cases where an AcoustID had both a bad match and a good match, and hence were only interested in AcoustIDs that matched multiple MB recordings. The later reports rely on an interim table used by the earlier reports, so they do not include any AcoustIDs that link to only one MB recording. I will modify the script and rerun.
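To make the gap concrete, here is a minimal sketch of that kind of filter in Python. It is not the actual report script: the pair layout, the `candidates` function and the `include_single_links` flag are all made up for illustration.

```python
from collections import defaultdict


def group_by_acoustid(pairs):
    """Map each AcoustID to the set of MusicBrainz recording ids it links to."""
    links = defaultdict(set)
    for acoustid, mbid in pairs:
        links[acoustid].add(mbid)
    return links


def candidates(pairs, include_single_links=False):
    """Return the AcoustIDs a report would consider.

    With include_single_links=False only AcoustIDs linked to two or more
    recordings are kept, which is how single-link AcoustIDs get skipped.
    """
    links = group_by_acoustid(pairs)
    min_links = 1 if include_single_links else 2
    return {acoustid for acoustid, mbids in links.items() if len(mbids) >= min_links}


# Hypothetical data: ac-1 links to two recordings, ac-2 to only one.
pairs = [("ac-1", "mb-a"), ("ac-1", "mb-b"), ("ac-2", "mb-c")]
print(sorted(candidates(pairs)))
print(sorted(candidates(pairs, include_single_links=True)))
```

With the flag off, `ac-2` never reaches any later check, however wrong its single link may be.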

You misunderstand how the button works. The reports are listed in artist order, so if you click Unlink by Artist it will unlink all the records for the same artist shown on the Albunack report, but it will not unlink any other pairings shown on that AcoustID page if they are not listed in the report you selected Unlink by Artist for. The reasoning is that there is often enough information on the report to work out whether all pairings for an artist are invalid without needing to check each individual AcoustID page.
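A rough sketch of that behaviour, assuming a report row is just an (AcoustID, MBID, artist) triple. The `Pairing` class and `unlink_by_artist` function are hypothetical names for illustration, not the real Albunack code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pairing:
    acoustid: str
    mbid: str
    artist: str


def unlink_by_artist(report_rows, page_pairings, artist):
    """Unlink only the pairings listed in the report for this artist.

    Pairings that appear on the AcoustID page but not in the report are
    left alone, even if they are by the same artist.
    """
    to_unlink = {p for p in report_rows if p.artist == artist}
    remaining = [p for p in page_pairings if p not in to_unlink]
    return to_unlink, remaining


# Hypothetical example: the report lists one pairing, the page shows three.
report_rows = [Pairing("ac-1", "mb-listed", "Artist X")]
page_pairings = [
    Pairing("ac-1", "mb-listed", "Artist X"),
    Pairing("ac-1", "mb-not-listed-1", "Artist X"),
    Pairing("ac-1", "mb-not-listed-2", "Artist X"),
]
unlinked, remaining = unlink_by_artist(report_rows, page_pairings, "Artist X")
print([p.mbid for p in remaining])
```

The two pairings that were never in the report stay on the page, which matches what was seen on the track above.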

Oh blimey, I have just rerun it, and the report that shows fingerprints that were 30% or greater out, which was nearly clear, has gained 250,000 records.

Deleting these records would mean that these fingerprints no longer link to any recordings, so views please.

Ah - thanks for solving my inquisitive questions :slight_smile:

Thanks, makes sense. I am looking in too much of a human way. The positive from this shows your scripts are erring on the side of “these are definitely wrong” :slight_smile:

You have had me looking at so many of these variations now I hit them like a machine cleaning up. :smiley: I can also see how often so many of the problems come from just one or two bad uploads from a single user. Picard (and other apps) need to hide the Submit AcoustID buttons deeper so only geeks who know what they are use them

Now that does not surprise me. I thought it looked a little small. I’ll be back another day this week to punch through a bit more…

Okay, what I have done is split the over-30% difference report into pairings with a single submission and pairings with multiple submissions, so the 250,000 is split into 194,000 and 58,000. Because extra reports have been created, the report numbering has changed; all are accessible from JThink.
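For anyone wondering what that split amounts to, here is a minimal sketch. It assumes the difference is measured relative to the fingerprint length and that each row carries a submission count; the field names (`fingerprint_secs`, `recording_secs`, `submissions`) are invented for the example and are not the real report columns.

```python
def length_difference(fingerprint_secs, recording_secs):
    """Relative difference between recording length and fingerprint length."""
    return abs(recording_secs - fingerprint_secs) / fingerprint_secs


def split_by_submissions(rows, threshold=0.30):
    """Keep rows at or over the threshold, split into single vs multiple submissions."""
    single, multiple = [], []
    for row in rows:
        diff = length_difference(row["fingerprint_secs"], row["recording_secs"])
        if diff < threshold:
            continue  # close enough in length, not part of this report
        if row["submissions"] == 1:
            single.append(row)
        else:
            multiple.append(row)
    return single, multiple


# Hypothetical rows, lengths in seconds.
rows = [
    {"acoustid": "ac-1", "fingerprint_secs": 240, "recording_secs": 150, "submissions": 1},
    {"acoustid": "ac-2", "fingerprint_secs": 240, "recording_secs": 250, "submissions": 1},
    {"acoustid": "ac-3", "fingerprint_secs": 200, "recording_secs": 300, "submissions": 3},
]
single, multiple = split_by_submissions(rows)
print([r["acoustid"] for r in single], [r["acoustid"] for r in multiple])
```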

My own view is that the 194,000 can be auto-deleted because they are way out on fingerprint length and each one has only ever been submitted by one user, but I’ll wait for you to take a look later in the week and see what you think.

The ones with multiple sources are more problematic. What I’m not sure about is whether the same user can submit the same pair multiple times or whether the submissions have to be from different users.

Also, to differentiate between the new records and the old, I added a “No of other MBIDs this AcoustID is linked to” column; this will be zero for these new records. But I also realized these tracks haven’t been checked for pairs that are actually already disabled but not marked as such in the database (because of a problem with the AcoustID data feed), so I need to query AcoustID for all of these to find out if they are still active. May take some time…

Is there a way to completely remove some fingerprints from the AcoustID database?

No, except if the AcoustID developer does it. But why would you want to do that? Fingerprints themselves are not wrong, just what they may link to on MusicBrainz.

There is an instrumental track wrongly linked with the fingerprint of the main song (with vocals) (or it could be vice versa??). I want the fingerprint gone so that a new fingerprint can be created for the instrumental.

You are using confusing terminology. An AcoustID is a group of fingerprints. There’s no way to make a fingerprint go away.

Perhaps see Guides/AcoustID - MusicBrainz Wiki

We may not be the best people to help. We are just removing links between Fingerprints and Musicbrainz Recordings. Looking for bad data. We are not the AcoustIDs Hamsters.

What happens when you submit the data from Picard? If the instrumental is different enough it should make separate fingerprints. But this is one of those areas where multiple recordings can link to a common pool of Fingerprints if the Instrumental looks close enough to the “with vocals” version. Don’t think there is much that can be done about it. The maths will just see it as “the same”.

I’ve seen it happen with concerts too when some tracks can be performed in a very similar manner which starts to cross-pollute the fingerprints due to too many similarities.
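As a toy illustration of why “close enough” audio ends up in the same pool, here is a sketch that compares two fingerprints bit by bit and groups them when they pass a threshold. This is emphatically not the AcoustID server’s actual matching code: the `bit_similarity` and `assign_to_group` functions, the 0.9 threshold and the tiny example fingerprints are all assumptions. The only thing taken as given is that a Chromaprint fingerprint is a sequence of 32-bit integers.

```python
def bit_similarity(fp_a, fp_b):
    """Fraction of matching bits over the overlapping part of two fingerprints."""
    n = min(len(fp_a), len(fp_b))
    if n == 0:
        return 0.0
    differing = sum(bin((a ^ b) & 0xFFFFFFFF).count("1") for a, b in zip(fp_a, fp_b))
    return 1.0 - differing / (32 * n)


def assign_to_group(groups, fingerprint, threshold=0.9):
    """Add the fingerprint to the first group it resembles, else start a new group.

    Returns the index of the group the fingerprint ended up in.
    """
    for index, members in enumerate(groups):
        if bit_similarity(members[0], fingerprint) >= threshold:
            members.append(fingerprint)
            return index
    groups.append([fingerprint])
    return len(groups) - 1


# Made-up fingerprints: the instrumental differs from the vocal version by one bit.
vocal = [0xFFFF0000, 0x0F0F0F0F]
instrumental = [0xFFFF0001, 0x0F0F0F0F]
other_song = [0x0000FFFF, 0xF0F0F0F0]

groups = []
print(assign_to_group(groups, vocal))
print(assign_to_group(groups, instrumental))
print(assign_to_group(groups, other_song))
```

In this toy model the instrumental lands in the same group as the vocal version because the maths sees them as nearly identical, while the unrelated song gets a group of its own.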

That’s what happened. When I submitted the instrumental, it linked to the “with vocals” track. The algorithm(?) didn’t create a separate AcoustID for it.

I think you just mean the fingerprint has incorrectly been linked to an MB recording that is the vocal version. That doesn’t mean the fingerprint itself is the same as the fingerprint would be if you had the vocal version of the track as well.

So you can unlink this; you just need to log in to AcoustID first (you can use your MB account to do this).

Then what you want to do is link the fingerprint to the correct MusicBrainz ID. Are you using Picard or something else?

Do you have the vocal version as well? Then you can check that it does in fact have a different fingerprint.

It would help if you could post the AcoustID.

https://musicbrainz.org/release/83756407-0125-4032-bdff-0799cdb1b921

Tracks 1 and 3 Track "403e553d-82ba-4237-9907-956896c13235" | AcoustID.
Tracks 4 and 6 Track "f09a26fd-6c62-45fe-890b-81d139279247" | AcoustID.

have you fixed it now then?

I think yes, but only tracks (recordings) 4 and 6 are fixed. Both recordings have separate fingerprints.

Tracks 1 and 3 are still merged into one fingerprint.

No feedback, so I’m going to start deleting these.

Just not had any time free this week. But no one else seems to care anyway, so load up the napalm.

So I have been napalming that report, but it is not quite as simple as pressing one button. Instead I have a private version of the reports that splits the pages into 5000 records each (larger pages seem to be too large/slow), then I can press the button once for each page. However, sometimes it only disables the first record and then stops (and I couldn’t get it to work at all for a week, any ideas?), but if it does work it will continue for the whole page. So it’s been a bit of a battle, but I am now at the last 10,000 for the first >30% report.
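The paging itself is simple enough to sketch. This is only an illustration of the idea, not the private report code: `pages`, `disable_page` and the `disable_one` callback are hypothetical names, and the callback stands in for whatever actually disables a pairing on AcoustID.

```python
def pages(records, page_size=5000):
    """Yield consecutive slices of at most page_size records."""
    for start in range(0, len(records), page_size):
        yield records[start:start + page_size]


def disable_page(page, disable_one):
    """Disable pairings in order and stop at the first failure.

    Returns how many were disabled, so a stalled page can be resumed
    from that position instead of being restarted.
    """
    done = 0
    for pairing in page:
        if not disable_one(pairing):
            break
        done += 1
    return done


# Hypothetical run: 12,001 records become three pages.
records = list(range(12001))
print([len(page) for page in pages(records)])

# A callback that fails on the second record, like the stall described above.
print(disable_page([10, 11, 12], lambda pairing: pairing != 11))
```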

Also, for the earlier reports, the only records left had already been checked: they were either clearly valid, or it was not possible to know whether they were valid or not. So I have now moved these records into an internal acoustid_mbid_checked table so they are no longer shown in reports.
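One way to get that effect, sketched with SQLite from Python. Only the `acoustid_mbid_checked` table name comes from the post above; the `acoustid_mbid_report` table, the column names and the choice of an anti-join (rather than physically moving rows) are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE acoustid_mbid_report (acoustid TEXT, mbid TEXT);
    CREATE TABLE acoustid_mbid_checked (
        acoustid TEXT,
        mbid TEXT,
        PRIMARY KEY (acoustid, mbid)
    );
    INSERT INTO acoustid_mbid_report VALUES
        ('ac-1', 'mb-a'), ('ac-1', 'mb-b'), ('ac-2', 'mb-c');
""")


def mark_checked(conn, acoustid, mbid):
    """Record a pairing as manually checked; repeating the call is harmless."""
    conn.execute(
        "INSERT OR IGNORE INTO acoustid_mbid_checked VALUES (?, ?)",
        (acoustid, mbid),
    )


def report_rows(conn):
    """Return report pairings that have not been marked as checked."""
    return conn.execute("""
        SELECT r.acoustid, r.mbid
        FROM acoustid_mbid_report AS r
        LEFT JOIN acoustid_mbid_checked AS c
            ON c.acoustid = r.acoustid AND c.mbid = r.mbid
        WHERE c.acoustid IS NULL
        ORDER BY r.acoustid, r.mbid
    """).fetchall()


mark_checked(conn, "ac-1", "mb-a")
print(report_rows(conn))
```

Filtering against the checked table, instead of deleting from the report table, means a checked pairing should stay hidden even after the report data is reloaded from fresh dumps.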

Next week I will update my database with the latest AcoustID data and the latest MusicBrainz data; it will be interesting to see how many potential invalid matches come up in these first reports.

Okay, fixed my problem by using a GreaseMonkey script instead. Now just napalming the AcoustID links to songs that vary by at least 30% from fingerprint length, have multiple submissions, and are not covered in an earlier report.

I checked some records in this report, and when there are multiple submissions but no other MBIDs linked to the same AcoustID there is rarely any user-submitted metadata, so I think the same user is just resubmitting the same wrong link in Picard.

e.g Track "3c53454e-8612-4e57-863d-1ce56d43c3e7" | AcoustID

When the AcoustID is linked to other MBIDs, it is usually the case that it is matching to the right song but the wrong version of the song (wrong length), even when there is one with a matching length available.

e.g Track "e023a522-1e31-4e9c-b3de-32d71b510263" | AcoustID
