Statistics: What happened on May 14, 2019?


I was just browsing the MusicBrainz country statistics by chance when I came across this fascinating anomaly:

  • There was a sudden rise in the number of US artists from 128k to 170k! The same happened for many other countries I checked, especially the top-ranked ones (just click the new timeline icons).
  • The total number of artists did not change significantly.
  • For some French overseas departments I observed the opposite anomaly, e.g. the number of Réunion artists fell from 156 to 1 and stayed there until December 2020 before it went back to the previous level! This might somehow explain the increase in French artists, but what about all the other countries?

There was an MBS schema change on May 13, but I cannot find anything that seems related in the changelog:

I am really curious whether someone has an idea or an explanation of what could have happened :face_with_monocle:


Database had a hiccup and added duplicates? Just seen the same size spike on GB.

Lots of extra artists found down the back of a sofa?

Next search - now to do a search for Edits on that date for new artists added to the database? (heads off to poke at search boxes)

I guess ¯\_(ツ)_/¯


Only 472 artists added on the 14th… random dips on random dates show that 400ish is about average each day. (There are some weird things added to this database)

@chaban’s discovery there makes more sense. If I understand that edit correctly, previous to that “London” would not be GB. After that “London” appears as part of the GB total?

But why is the graph not calculated on demand? Weird. Shouldn’t a change like this be backdated so the stats make consistent sense?

I think retroactively updating the historic dates is near impossible. You’d need to check the edit history to replicate the state of the database at each given point in time, including all added or removed artists and all changes to any artist’s location.


Isn’t that just a case of making a search request for a date and counting the data?

I guess it is rare something comes along that can so dramatically skew the data like this.

I don’t do web stats so didn’t realise they were generated daily like that. Thanks for the info.

Well, first you need the state of the data at that date. You can of course query the database easily to get the count of artists in a specific area right now. But for any earlier date you basically need to replay the entire editing history, either forwards or backwards. E.g. if you know there are 1000 artists in a certain area today and want to get the count for yesterday, you need to check how many artists were added with this area, how many were removed, and how many got their area changed to or from this area since yesterday. Then, when you have the number for yesterday, you can repeat this for the day before, etc.
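To make the backwards replay a bit more concrete, here is a minimal Python sketch. It is not how MusicBrainz stores or processes edits: the `changes` list, its tuple layout and the function `count_before` are all made up for illustration, and the numbers are invented. It only shows the bookkeeping of undoing one day of changes for one area.

```python
from datetime import date

# Hypothetical change log, NOT the real MusicBrainz edit format:
# (day, artist_id, old_area, new_area)
# old_area is None when an artist was added, new_area is None when removed.
changes = [
    (date(2019, 5, 14), 1, None, "GB"),   # added with area GB
    (date(2019, 5, 14), 2, None, "GB"),   # added with area GB
    (date(2019, 5, 14), 3, "GB", None),   # removed
    (date(2019, 5, 14), 4, "GB", "US"),   # area changed away from GB
    (date(2019, 5, 14), 5, "US", "GB"),   # area changed to GB
    (date(2019, 5, 13), 6, None, "GB"),   # added the day before
]


def count_before(count_after, changes, area, day):
    """Undo all changes made on `day` to get the count at the end of the previous day."""
    count = count_after
    for change_day, _artist_id, old_area, new_area in changes:
        if change_day != day:
            continue
        if new_area == area:
            count -= 1  # the artist only entered this area on `day`
        if old_area == area:
            count += 1  # the artist was still in this area before `day`
    return count


# Start from a known count and step back one day at a time.
count_after_may_14 = 1000
count_after_may_13 = count_before(count_after_may_14, changes, "GB", date(2019, 5, 14))
count_after_may_12 = count_before(count_after_may_13, changes, "GB", date(2019, 5, 13))
print(count_after_may_13, count_after_may_12)
```

With the invented data above, the step back over May 14 undoes three arrivals and two departures, and the step over May 13 undoes one more arrival. The catch is that this only works if every add, removal and area change is actually recorded somewhere for the whole period.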

I guess with the data available at MB that likely would be possible, but I doubt it is worth the effort. Also it depends on whether the edit history contains the needed information for the complete history of MB.