The title wasn’t changed; Freso moved this part of the discussion out into a completely new thread.
I just realized I can edit the title! So I have done so.
Why not? That’s the aim. What variables would capture this set of submissions?
The simplest scenario: ID X submitted 100 Stephen King AcoustIDs 10 years ago. ID X has never been active again; they submitted nothing else. What percentage of the AcoustIDs from that session would you want to manually unlink before the system decides to auto-unlink the rest?
Complex scenario: ID Y submitted their whole library of 10k files in one go 3 months ago, without checking anything. 10% were bad, including their hastily tagged Stephen King collection. Since then they have continued adding AcoustIDs for new additions, which are tagged correctly. Is there a threshold where we remove the 10k hurried submissions (anything submitted by that ID on that date, for instance) to save you X hours of cleaning up Stephen King entries, accepting that some good ones will be removed along with them*?
Your answer may well be “no”, which is totally understandable. Personally, I think the damage done by bad submissions outweighs the value of a lot of good ones. I am not talking about permanently banning or besmirching key/ID 255267’s good name, btw. Just questioning how we might use the info we already have to make the DB more reliable overall.
Note: this more nuanced session-based approach relies on AcoustID storing submission timestamps as well as submitter IDs. Otherwise it would have to be a cruder, purely percentage-based approach, which might not be acceptable.
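To make the idea concrete, here is a minimal sketch of what that session-based heuristic could look like. Everything in it is hypothetical: the `Submission` fields, the 25% threshold, and the definition of a “session” as submitter ID plus submission date are all assumptions, not AcoustID’s actual schema or API.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date

@dataclass
class Submission:
    acoustid: str
    submitter_id: int
    submitted_on: date        # only possible if AcoustID stores timestamps
    manually_unlinked: bool   # an editor has already unlinked this pairing

def sessions_to_auto_unlink(submissions, bad_fraction=0.25):
    """Group submissions into (submitter, day) sessions and flag any session
    where editors have already unlinked at least bad_fraction of its entries."""
    sessions = defaultdict(list)
    for s in submissions:
        sessions[(s.submitter_id, s.submitted_on)].append(s)

    flagged = []
    for key, subs in sessions.items():
        unlinked = sum(1 for s in subs if s.manually_unlinked)
        if unlinked / len(subs) >= bad_fraction:
            flagged.append(key)  # auto-unlink the rest of this session
    return flagged
```

In the complex scenario above, ID Y’s hurried 3-months-ago session would cross the threshold once enough of its links had been removed by hand, while their later, correctly tagged sessions would stay untouched.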
*Even the laziest Picard clicker is going to struggle to mistag an entire collection completely! I would be very impressed; 10% is huge, tbh.