How would you even automatically determine that a date is incorrect? Or what the correct date would be?
Philipp, you don’t understand, no offense.
The routine can check 5 sources with release dates and then pick the date most of them agree on, e.g. 3 to 2 or 4 to 1.
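To make the voting idea concrete, here is a rough sketch. The function name `majority_date` and the `min_margin` parameter are made up for illustration, and the dates are invented sample values. Note that it simply counts votes: it treats every source as independent and has no way of checking that.

```python
from collections import Counter


def majority_date(dates, min_margin=1):
    """Return the date most sources agree on, or None if there is no clear winner.

    `dates` holds one date string per source; empty or missing values are ignored.
    `min_margin` (hypothetical parameter) is the minimum lead over the runner-up.
    """
    counts = Counter(d for d in dates if d)
    if not counts:
        return None
    ranked = counts.most_common(2)
    top_date, top_votes = ranked[0]
    runner_up_votes = ranked[1][1] if len(ranked) > 1 else 0
    if top_votes - runner_up_votes < min_margin:
        return None  # tie or too close: leave it for a human
    return top_date


# Invented example: five sources, three agree.
votes = ["1989-03-01", "1989-03-01", "1989-03-01", "1989-04-10", "1989-04-10"]
print(majority_date(votes))             # 1989-03-01 (wins 3 to 2)
print(majority_date(["1989", "1990"]))  # None (tie)
```

Raising `min_margin` makes the routine more cautious, e.g. `min_margin=3` would accept 4 to 1 but reject 3 to 2.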
And when I say automatic, I mean that the corrections should be made by a computer, not a human, because we have few editors.
But I’m just a dreamer…
And how would that automatic routine know that these are 5 independent sources and not all merely copying off one another?
Do Discogs and AllMusic do this?
Computers make errors unless humans check them.
Discogs uses humans. Humans make fewer errors. Pretty sure AllMusic is all human data entry too.
Is it currently possible to determine which portal (MB, AM, D) has the fewest errors?
What you say are truisms.
The idea is, of course, that a human should write the best algorithm possible, and the machine will do all the work.
But that can’t work with music data like this, when it is impossible for a machine to cross-check the original sources.
It is an idea which fails from the start. Machines are experts at doing as they are told. They are not experts in accuracy with historic data like this. Humans need to make a judgement call as to which source to trust most.
There is this current madness of wanting “AI” to do everything. This leads to good examples of a machine failing dramatically when given the wrong tasks. Here is a funny story from yesterday, where an AI accused a writer of the murders he had written about: “Microsoft Bing Copilot blames reporter for crimes he covered” (The Register). Another example of the algorithm being wrong and still needing checking by humans.
Can something like this be done?
Create a list of suspicious data, save it even in a TXT file, then distribute it in batches of, e.g., 1000 entries to each editor.
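Something along these lines would do the splitting. This is only a sketch: `write_batches`, the `suspicious_NNN.txt` file names and the idea of one identifier per line are all assumptions for illustration, not anything MusicBrainz provides.

```python
from pathlib import Path


def write_batches(items, out_dir, batch_size=1000):
    """Split a list of suspicious entries into TXT files of `batch_size` lines each.

    `items` is a list of strings (e.g. one identifier or URL per entry).
    Returns the list of files written, one per editor.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        # Hypothetical naming scheme: suspicious_001.txt, suspicious_002.txt, ...
        path = out / f"suspicious_{start // batch_size + 1:03d}.txt"
        path.write_text("\n".join(batch) + "\n", encoding="utf-8")
        paths.append(path)
    return paths
```

With 2500 entries and the default batch size this would write three files: two with 1000 lines and one with 500.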
There are plenty of these kinds of lists already. Created from database searches. Give me a moment and I’ll update this post with some of them…
They are called “Reports” and are found on the main “Editing” menu.
There is plenty of other messed-up data that is having human eyes run over it for manual correction. If you are bored, there are a few reports you may be able to dig into yourself.
Fixing bad data still comes down to the normal editors keeping their eyes open in areas they know well. A common one I see is people adding Digital Media releases from Spotify etc. and using original release years like 1989… Now THAT could be picked out with a database search.
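As a rough illustration of that kind of search, here it is as a filter over plain records rather than a real database query. The `Release` class, its field names and the cutoff year are all assumptions made up for the example (this is not the MusicBrainz schema), and the titles are invented. The result is only a list of candidates for a human to look at.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Release:
    # Hypothetical, simplified record; not the actual MusicBrainz schema.
    title: str
    medium_format: str
    year: Optional[int]


def suspicious_digital_releases(releases, earliest_plausible_year=2000):
    """Flag Digital Media releases dated before `earliest_plausible_year`.

    The default cutoff is an arbitrary placeholder; pick whatever year
    editors agree is the earliest believable one for a digital release.
    """
    return [
        r for r in releases
        if r.medium_format == "Digital Media"
        and r.year is not None
        and r.year < earliest_plausible_year
    ]


# Invented sample data.
releases = [
    Release("Example Album A", "Digital Media", 1989),
    Release("Example Album A", "CD", 1989),
    Release("Example Album B", "Digital Media", 2015),
    Release("Example Album C", "Digital Media", None),
]
for r in suspicious_digital_releases(releases):
    print(r.title, r.year)  # Example Album A 1989
```

Releases with no year at all are skipped here, since there is nothing to compare against.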
Everything is in MusicBrainz. This is a well-written program.