MusicBrainz editor bot: FresoBot

Note that «Yes-voting on bot edits is discouraged unless the voter can 100% confirm they’re correct, since it helps them to go through with less eyes on them. If a bot edit gets rejected, a non-bot user can always re-enter it if he feels it’s correct - reverting the edit is much more difficult, especially for removals and merges.» Of course, in these cases, it is fairly easy to 100% confirm that they are correct, so maybe it’s okay to get voting on them and clear the queue so it doesn’t hit the “max. 2000 open edits at a time” limit (and so the changes go through sooner). :slight_smile:


I’ve made and run a new script, exit_url_cleanup.py, which made around 200 edits, I’d guess. I double-, triple-, and quadruple-checked the output, including a handful of the edits actually created on test and on the live database when I let it loose there. Many of the URLs need further cleaning beyond what this does, but it’s a step closer to being usable URLs. (There are also lots of relationships that need fixing, which I also hope to be able to make the bot do.) Anyway, just a heads up. :slight_smile:


I still haven’t gotten any complaints about any edits the bot has made, so I’ve been thinking about upping this to 200/day starting next Monday. If no one has any objections, I’ll Make It So™.


Could FresoBot also think about the Amazon Art issue? Would seem a useful task for a bot to handle.

When a CAA cover is missing and an ASIN link is in place, could the bot be the one to download the image from Amazon and then upload it to CAA?

It could be really handy as it could mark it as “FresoBot Automatically Uploaded Art from Amazon”

It could, but I wouldn’t. Artwork from Amazon is in many instances not correct, and I have no way to verify every edit made by the bot, and I don’t want to make an automated instance upload wrong data.


If the ASIN link is already wrong, the displayed cover is actually also wrong.
It doesn’t get “wronger” if you upload this cover to CAA.
The big advantage would be that such a cover is available “forever” on CAA, not only for the time remaining until Amazon starts to charge MB and others for this service.
With a comment like @IvanDobsky’s “FresoBot Automatically Uploaded Art from Amazon”, everyone knows that the quality and matching of this cover aren’t 100% guaranteed.


Even if the ASIN is right, the cover may be wrong (and, by extension, the wrong ASIN may actually provide the right cover art!). Saying that “this Release corresponds to this ASIN” is making a statement that the Amazon Standard Identification Number corresponds to our MusicBrainz Identifier. Uploading the cover art to CAA is making a statement that that cover art is the correct cover art for the release. Those are two very different statements that are independent of each other.

Either way, I’m not going to do it for FresoBot.


But if below the cover you see “provided by Amazon” you already expect it to be wrong. You’ll think: Ah, that’s about what the cover of this album looks like, but nobody uploaded the correct cover for this very release yet.


I don’t believe that the majority of MB users really think this way.
They assume: What I get/see here is basically correct. It’s an encyclopedia. So every link to an external source is accurate.
Or do you really assume that every link to other external sources like Wikipedia, Facebook, Soundcloud, iTunes, Google Play etc is just a “looks like” or “maybe true”?

No, you’re right. Most users probably don’t think that way, and when they e.g. download the cover art via Picard they expect it to be correct. Then again, most of them probably don’t care much if the cover art is a bit different.
But I meant the editors and not the users. If an editor sees uploaded cover art, they probably expect it to be correct, but if they see cover art provided by Amazon, they might feel the need to find and upload the correct cover art.


« the … cover is … wrong … such a cover is available “forever” on CAA »

This is even wronger.


Not really, because you can always switch it to the matching release .-) :wink:
The risk that CAA closes down is (IMHO) smaller than the risk that Amazon starts charging for, or prevents, direct linking to artwork on their servers.

I still have 60,000 MB Artist to Discogs links (see http://reports.albunack.net/mbartist_discogsartist_report2.html and the topic “Is there any kind of project to improve holes in MusicBrainz coverage”) that I would like to add, but I have no mechanism to do so. Since you mention Discogs links, perhaps you could add these? I can make the list available as a CSV file.


It is wronger to have wrong data.
Having no data is good, having wrong data is no good. Here is why:

Because when data is already there (wrong or not), the chances that a human will edit it are lower than when there is no data.
Automatically setting wrong data will prevent human edits.

Human edits are either at the same low level (using the linked cover without checking) or at a high level (checking by eye that it’s the correct version of the cover).

Bot edits will never eye‐check as thoroughly.

How do you do that? :thinking:


Waiting for a solution on https://tickets.metabrainz.org/browse/CAA-73? :innocent:


I’ve started having FresoBot do 200 Spotify URL edits/day now (as of yesterday), by the way. 3850 URLs left, so at the new rate, it should have gone through them all in 20 days. :slight_smile:

Edit: Found some more URLs(/URL patterns), so now there are 3893 URLs left. :scream: … But that should still be done within 20 days though. :sweat_smile:
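
For anyone who wants to check the estimate, here is a minimal sketch of the arithmetic. It is not part of FresoBot; the function name `days_to_clear` is made up for illustration, and it assumes the bot makes exactly one edit per URL and always uses its full daily allowance.

```python
import math


def days_to_clear(urls_left: int, edits_per_day: int) -> int:
    """Days needed to work through a backlog, rounding up any partial last day."""
    return math.ceil(urls_left / edits_per_day)


# Backlog sizes and the 200/day rate are the figures quoted in the posts above.
print(days_to_clear(3850, 200))  # 19.25 days of work, so 20 calendar days
print(days_to_clear(3893, 200))  # 19.465 days of work, still 20 calendar days
```

So the extra 43 URLs fit inside the last, partially used day, which is why the 20-day estimate does not change.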


And after 76 edits today, it’s chomped through them all now! :astonished: There are probably some stray Spotify URLs, but the vast majority of them should be “clean” now (or in a week anyway, when @FresoBot’s 1276 open edits have gone through). Now I need to actually make some code so it can start toiling away at something else. :slight_smile:


Hi Freso! Is your bot for hire? :slight_smile: Monstercat finally changed their music.monstercat.com links from http to https a couple of days ago, and I would like to fix all of the existing URLs that we have in MB. I expect around 1k unique URLs (estimated from 1099 releases). The releases usually share the same URL for download and streaming. It would be great if your bot could change all of them automatically so I don’t have to do it manually :wink: The following are examples of what the fix should look like: this or this. Thank you in advance!


Just found you/your bot. Good news.

secondhandsongs:

If you are looking for a nice one: https://tickets.metabrainz.org/browse/MBS-9082
strip www and replace all http with https…
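
A rough sketch of what “strip www and replace http with https” could look like, using only Python’s standard library. This is not FresoBot’s actual code, and the function name `normalise_url` is made up for illustration. It also assumes the site serves the same page without `www.` and over https, which would need checking per site before a bot applies it.

```python
from urllib.parse import urlsplit, urlunsplit


def normalise_url(url: str) -> str:
    """Return url with an https scheme and without a leading 'www.' on the host."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return url  # leave anything that is not a web URL untouched
    host = parts.netloc
    if host.lower().startswith("www."):
        host = host[len("www."):]
    # Path, query string, and fragment are carried over unchanged.
    return urlunsplit(("https", host, parts.path, parts.query, parts.fragment))


# example.com is a placeholder host, not one of the sites discussed here.
print(normalise_url("http://www.example.com/a?b=1#c"))
```

A URL that is already clean comes back unchanged, so the bot could compare input and output and only enter an edit when they differ.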
