Sitemap on musicbrainz not updated since 2016-11-29

Hi!

I noticed that the sitemap for musicbrainz.org hasn’t been updated since
2016-11-29. This means that many (probably all) pages for new
artists, new releases, etc. created since that date cannot be and have not
been indexed by Google’s search engine (and probably all others).

Can it be fixed?
Thank you!

It looks like https://musicbrainz.org/sitemap-index.xml.gz was last generated today AFAICT, but all its entries do indeed seem to have their lastmod date set to 2016-11-29.

@zas, @bitmap, @yvanzo?

Hi!

I checked the status of sitemaps again. Here are the results.

From what I see, “sitemap-index.xml.gz” does not contain a single
“lastmod” newer than 2016-11-29.
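For anyone who wants to repeat this check, here is a minimal sketch of how the “lastmod” values can be read out of a sitemap index. It assumes the file follows the standard sitemaps.org schema; the function name and the example.org sample data are made up for illustration and are not taken from the MusicBrainz server.

```python
import gzip
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def index_lastmods(data: bytes) -> dict:
    """Map each child sitemap URL in a sitemap index to its lastmod string."""
    # Accept both plain XML and .xml.gz payloads (gzip magic bytes 1f 8b).
    if data[:2] == b"\x1f\x8b":
        data = gzip.decompress(data)
    root = ET.fromstring(data)
    result = {}
    for entry in root.findall("sm:sitemap", NS):
        loc = entry.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = entry.findtext("sm:lastmod", default="", namespaces=NS).strip()
        result[loc] = lastmod
    return result


# Hypothetical sample index, standing in for a downloaded sitemap-index.xml.gz.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.org/sitemap-artist-1.xml.gz</loc>
    <lastmod>2016-11-29</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.org/sitemap-artist-2.xml.gz</loc>
    <lastmod>2016-11-28</lastmod>
  </sitemap>
</sitemapindex>
"""

lastmods = index_lastmods(SAMPLE)
# ISO 8601 dates in the same format sort correctly as plain strings.
newest = max(lastmods.values())
```

If `newest` is still an old date after the index file itself was regenerated, the child sitemaps are probably stale.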

To verify whether new artists are actually missing from the artist sitemaps, I
took two artists as references, one “new” and one “old”:

“Old Artist”

“New Artist”

By grepping through all 29 sitemaps of the form
“https://musicbrainz.org/sitemap-artist-NUM.xml.gz”, I found that the URL
of the “Old Artist” can be found in “sitemap-artist-9.xml.gz”, but the URL
of the “New Artist” cannot be found at all.

Finally, I checked whether a sitemap-artist-30.xml.gz exists, and I didn’t
find one.
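The search described above can be sketched roughly like this. It is only an illustration under stated assumptions: the sitemaps are assumed to follow the standard sitemaps.org `urlset` schema, and `sitemap_urls` and `find_in_sitemaps` are hypothetical helper names, not anything provided by MusicBrainz.

```python
import gzip
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(data: bytes) -> set:
    """Return the set of <loc> URLs listed in one (optionally gzipped) sitemap."""
    if data[:2] == b"\x1f\x8b":  # gzip magic bytes
        data = gzip.decompress(data)
    root = ET.fromstring(data)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", NS)
        if loc.text
    }


def find_in_sitemaps(target_url: str, sitemaps: dict) -> list:
    """Return the names of the sitemaps (name -> raw bytes) that list target_url."""
    return sorted(
        name for name, data in sitemaps.items()
        if target_url in sitemap_urls(data)
    )
```

The raw bytes for each file could come from something like `urllib.request.urlopen(url).read()`. An empty result for a recently created artist, together with a hit for an old one, matches what I saw.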

Based on these findings, I suspect that the sitemaps haven’t really been
updated for a long time, most likely since 2016-11-29.

Thank you!

Thanks for bringing attention to this! I’ll investigate more tomorrow and let you know what I find.

Hmm… this could explain:

https://tickets.metabrainz.org/browse/MBS-9449

https://community.metabrainz.org/t/musicbrainz-not-fully-indexed-by-google/329115/5

Okay, there was indeed an issue here. We’ve been generating sitemaps correctly all this time, but we’ve not been rsyncing them to the correct server afterward due to an unset environment variable. This failed silently without logging any error.
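As a rough illustration of how this kind of silent failure can be guarded against (this is not the actual MusicBrainz script), here is a minimal sketch that refuses to continue when the target variable is missing. The variable name `SITEMAP_RSYNC_TARGET`, the function names, and the paths are all hypothetical.

```python
import os
import subprocess

# Hypothetical name; the real variable used by the server is not given here.
TARGET_VAR = "SITEMAP_RSYNC_TARGET"


def build_rsync_command(source_dir: str, env=None) -> list:
    """Build the rsync command, failing loudly if the target is not configured."""
    env = os.environ if env is None else env
    target = env.get(TARGET_VAR, "").strip()
    if not target:
        # Raise instead of skipping the sync so the problem shows up in logs.
        raise RuntimeError(
            f"{TARGET_VAR} is not set; refusing to skip the sitemap sync silently"
        )
    # A trailing slash on the source makes rsync copy the directory contents.
    return ["rsync", "-a", source_dir.rstrip("/") + "/", target]


def sync_sitemaps(source_dir: str) -> None:
    """Run rsync; check=True raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_rsync_command(source_dir), check=True)
```

With a check like this, an unset variable stops the job with an explicit error instead of leaving stale files on the serving host.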

I’m glad we have users paying more careful attention to our services than we’ve apparently been; sorry for not noticing this sooner and thanks for the report, @l984a.

As for whether it’s related to MBS-9449, I’m honestly not sure, @bsammon. My understanding is that they weren’t indexing even pages much older than 2016-11-29 (ones which should have existed in that sitemap). But their indexing process is mysterious to me, so I can’t rule out some relation.

Anyway, I’ve opened MBS-9994 where you can track updates on this issue.

Hi!

Thank you very much for helping so quickly! I see that the sitemaps have already been updated. I am glad to see this problem solved.

Best regards!
