I noticed that the sitemap for musicbrainz.org hasn’t been updated
since 2016-11-29. This means that many (probably all) pages for new
artists, new releases, etc. created since that date cannot be, and have
not been, indexed by the Google search engine (and probably by all others).
By grepping through all 29 sitemaps of the form
“https://musicbrainz.org/sitemap-artist-NUM.xml.gz”, I found that the
URL of the “Old Artist” appears in “sitemap-artist-9.xml.gz”, but the
URL of the “New Artist” cannot be found in any of them.
Finally, I checked whether sitemap-artist-30.xml.gz exists, and it
does not.
Based on these findings, I suspect that the sitemaps haven’t been
updated for a long time, most likely since 2016-11-29.
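For anyone who wants to repeat the check described above, here is a minimal Python sketch of the same grep-style search. It assumes the sitemaps are numbered 1 through 29 (the post only says there are 29 and that number 30 is missing), and the function names `sitemap_contains` and `find_in_sitemaps` are made up for illustration. It does a plain substring match on the decompressed XML, just like grep would.

```python
import gzip
import urllib.request

# URL pattern quoted in the post; NUM is replaced by the sitemap number.
SITEMAP_URL = "https://musicbrainz.org/sitemap-artist-{num}.xml.gz"


def sitemap_contains(gz_bytes: bytes, needle: str) -> bool:
    """Return True if the decompressed sitemap text contains `needle`."""
    text = gzip.decompress(gz_bytes).decode("utf-8")
    return needle in text


def find_in_sitemaps(needle, numbers, fetch=None):
    """Return the sitemap numbers whose contents mention `needle`.

    `fetch` maps a sitemap number to the raw .gz bytes; by default it
    downloads the file from musicbrainz.org.
    """
    if fetch is None:
        def fetch(num):
            with urllib.request.urlopen(SITEMAP_URL.format(num=num)) as resp:
                return resp.read()
    return [n for n in numbers if sitemap_contains(fetch(n), needle)]
```

Calling `find_in_sitemaps(artist_url, range(1, 30))` with the URL of an artist page returns the list of sitemap numbers that mention it; an empty list means the URL is in none of them. The numbering range is an assumption and may need adjusting.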
Okay, there was indeed an issue here. We’ve been generating sitemaps correctly all this time, but we’ve not been rsyncing them to the correct server afterward due to an unset environment variable. This failed silently without logging any error.
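To illustrate the kind of silent failure described above, here is a minimal Python sketch of a sync step that fails loudly instead. This is not the actual MusicBrainz script: the variable name `SITEMAP_RSYNC_TARGET` and both function names are hypothetical, and the only real tool assumed is `rsync` with its standard `-a` (archive) flag.

```python
import os
import subprocess


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name, "")
    if not value:
        raise RuntimeError(f"required environment variable {name} is not set")
    return value


def sync_sitemaps(source_dir: str) -> None:
    """Copy generated sitemaps to the target host, raising on any failure."""
    # Hypothetical variable name, e.g. "user@host:/path/to/sitemaps/".
    target = require_env("SITEMAP_RSYNC_TARGET")
    # check=True raises CalledProcessError if rsync exits with a non-zero status.
    subprocess.run(["rsync", "-a", source_dir, target], check=True)
```

With this shape, an unset variable stops the job with an explicit error rather than letting it appear to succeed.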
I’m glad we have users paying more careful attention to our services than we’ve apparently been; sorry for not noticing this sooner and thanks for the report, @l984a.
As for whether it’s related to MBS-9449, I’m honestly not sure, @bsammon. My understanding is that they weren’t even indexing pages much older than 2016-11-29 (ones which should have existed in that sitemap). But their indexing process is mysterious to me, so I can’t rule out some relation.
Anyway, I’ve opened MBS-9994 where you can track updates on this issue.