The Cover Art Archive is currently experiencing difficulties

coverartarchive
technicalissues

#1

Is it just me, or is there an ongoing problem? Every time I have tried to load cover art recently, I have got the message “Warning: The Cover Art Archive is currently experiencing difficulties. Adding images for this release is unlikely to work at the moment.”
Also, no existing cover art is displayed. e.g. “Front cover image failed to load correctly.”


#2

Sorry that nobody replied here before. If you’re still seeing these issues, can you link to some examples? As of right now our indexer seems to be working, and the IA’s S3 stats look okay too. Recent cover art additions all load for me except, interestingly, a few that are blocked by uBlock Origin ("||ia601507.us.archive.org^ Found in: Malware domains").


#3

I’ve experienced that too. I wasn’t sure whether to report it to the IA or not.


#4

I emailed RiskAnalytics about it, the people behind https://www.malwaredomains.com/ (where that IA domain is included). We’ll see if they remove it.

Edit: They responded:

Thank you for bringing this to our attention. We will remove this false positive from our list in its next update. We typically update every 24-48 hours. If you have any questions, please let us know.


#5

It seems to be an ongoing issue for me. Sometimes it works, but more often it doesn’t.
This is what I get as a home page:

And one from my collection:

The above are on a W10 PC using Slimjet, but I get the same problem with Safari on an iPad.


#6

Would you be able to find out how to open Slimjet’s network tools/console and see if it shows why those images aren’t loading? Some kind of response code (404, 503, etc.) and which server/URL returned the code would help a bit in debugging this. Or, if it shows something like ERR_CONNECTION_REFUSED or ERR_CONNECTION_CLOSED (I think Slimjet is based on Google Chrome), the request may be blocked by your browser or network in some way.
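If the browser's network tools are hard to find, roughly the same information can be collected with a short script. The following is only a sketch: `probe` is a made-up helper name, and the URL you pass in is whatever image or page fails to load for you. It reports either the HTTP status code and the URL that answered, or the connection-level error if no HTTP response arrived at all.

```python
import urllib.error
import urllib.request


def probe(url, opener=urllib.request.urlopen, timeout=10):
    """Return a short description of what happened when fetching `url`.

    `opener` is replaceable only so the function can be tried without
    network access; by default it is the standard urlopen.
    """
    try:
        with opener(url, timeout=timeout) as response:
            # geturl() is the final URL after any redirects.
            return f"HTTP {response.status} from {response.geturl()}"
    except urllib.error.HTTPError as err:
        # A server answered, but with an error status (404, 503, ...).
        return f"HTTP {err.code} from {err.url}"
    except urllib.error.URLError as err:
        # No HTTP answer at all: DNS failure, refused or closed connection, ...
        return f"connection problem: {err.reason}"
    except OSError as err:
        # For example, a connection reset while the response was being read.
        return f"connection problem: {err}"
```

For example, `print(probe("https://coverartarchive.org/release/<MBID>/front"))`, with `<MBID>` replaced by the MusicBrainz ID of a release that has a front cover. A "connection problem" result would point at the browser, the network, or something in between rather than at an error returned by the server.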


#7

Thanks - it shows ERR_CONNECTION_CLOSED
As I said, it happens with Safari on the iPad too, so maybe it is a network issue?
How do I track that down?


#8

Aha. Some progress … of sorts…
As I have a rubbish landline (BT’s best wet string) with no prospect of improvement, I have added a 4G connection with an external aerial and a load-balancing router. If I use the ADSL connection, I get images, but if I use the LTE connection, I don’t. I only kept the ADSL for FTP purposes (static IP with LTE is really expensive…).
I have put in place a work-round to enforce the ADSL connection for the MusicBrainz IP address (138.201.227.205) and this fixes it!
But that means I am stuck with a slow connection to MB and I don’t know why it fails on 4G.

EDIT: Actually, it’s the archive IP address that needs to be routed via the ADSL - I’ve done this for 228.241.224.1 - 228.241.228.255, which I hope covers it - and means that the main MB address is on the fast line.
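For anyone setting up a similar routing rule, here is a small sketch of how to check whether a given address falls inside a start-to-end range, and how to turn that range into CIDR blocks (which many routers want instead). The range is copied exactly as quoted above; whether it matches the Internet Archive's real allocation is not checked here, and the helper names `in_range` and `range_to_networks` are made up for this example.

```python
import ipaddress

# Range exactly as quoted in the post above (not independently verified).
FIRST = "228.241.224.1"
LAST = "228.241.228.255"


def in_range(address, first=FIRST, last=LAST):
    """True if `address` lies between `first` and `last`, inclusive."""
    ip = ipaddress.ip_address(address)
    return ipaddress.ip_address(first) <= ip <= ipaddress.ip_address(last)


def range_to_networks(first=FIRST, last=LAST):
    """Express the range as a list of CIDR networks."""
    return list(
        ipaddress.summarize_address_range(
            ipaddress.ip_address(first), ipaddress.ip_address(last)
        )
    )


if __name__ == "__main__":
    for network in range_to_networks():
        print(network)
    # The MusicBrainz address mentioned earlier is outside the range.
    print(in_range("138.201.227.205"))
```

Because the range starts at .1 rather than .0, it breaks down into quite a few small blocks. If the router allows it, starting the range at .0 would give a much shorter list.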


#9

About six months back I saw a few sub-domains of archive.org getting onto Eset’s block lists for a few days. This would block a few of the images for me, but others would still load.

Then on other random days I’d get a full set of non-loading images.

Similar random hiccups on uploading artwork too. Some days it just feels like a painfully busy / underfunded server to me. Works most of the time but suffers the odd hiccup.

Certainly been a lot better for me in the past few months.

I don’t know how archive.org controls what is on their network but I do know I have seen dubious items hosted on archive.org addresses. Stuff clearly not aimed at being hosted in an “archive”. A classic being KODI repos. Those add-ons that allow access to dubious film and movie streams. It almost looks like people hiding these items on there to avoid the ACE lawyers.

So what else is being hidden on their servers? Is this all then suffering pointless excess load? Or DDoS hits? I know when I hosted a (perfectly legal) repo it eventually got DDoS’d to death. So having KODI repos knocking around on the server could be causing odd load issues. And what else is being hidden on there?


#10

Someone needs to feed the hamsters at CAA.

Totally borked at the moment. Can’t upload anything now for a couple of days.

Have been trying to upload, and it fails. Though to be fair there is a big banner on the page telling me it won’t work.

I notice too that the same Eset pop-up is back when I go to the MB homepage complaining about images being on a dodgy site.

Is there something serious happening over there? Are they having their servers hijacked? I am a KODI user and I have seen that some of the dodgy repos seem to get hidden on the Archive.org servers. Is this causing them to get onto block lists?


#11

Managed to get arts uploaded today. So happy hamsters again.


#12

Afaik ia801508.us.archive.org and ia601508.us.archive.org (or was it ia501508.us.archive.org?) are blocked by ESET at the moment. You can send them an e-mail about a false positive at https://support.eset.com/kb141/#SubmitWebsite if you think they should not be blocked. However, some interesting things turn up if you google some of these domains. For example, when I google ia801508.us.archive.org, the 12th or 13th result links to https://www.reddit.com/domain/ia801508.us.archive.org/, which lists some very sketchy stuff at first glance.

Anyway, for the time being I went into NOD Antivirus’ options (Advanced Setup/Web Access Protection/URL Address Management/Address List) and added “*.us.archive.org” to “List of Allowed Websites” so I could see the cover art that was being blocked. Use this at your own risk.
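To get a feel for how much a wildcard like that lets through, here is a rough sketch using Python's shell-style matching. This is an assumption for illustration only: ESET's own pattern rules may differ from `fnmatch`, and `is_allowed` is a made-up helper.

```python
from fnmatch import fnmatchcase

# Pattern as added to the allow list in the post above.
ALLOW_PATTERN = "*.us.archive.org"


def is_allowed(hostname, pattern=ALLOW_PATTERN):
    """True if `hostname` matches the shell-style wildcard `pattern`."""
    return fnmatchcase(hostname.lower(), pattern)


if __name__ == "__main__":
    for host in (
        "ia801508.us.archive.org",
        "ia601508.us.archive.org",
        "archive.org",
    ):
        print(host, is_allowed(host))
```

Under these matching rules, every host ending in `.us.archive.org` is allowed, not only the ones serving cover art, which is why the "own risk" warning applies.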


#13

I’m actually an official reseller of Eset products so could kick a bit higher into the system if it was a false positive.

But from those Reddit links you have found I do wonder what else is on that server. Is it just open season? Anything goes? Really weird way to run a server.

I have never looked deep into who runs the CAA, but why share a server like that which clearly has no rules in place? That is only asking for trouble. When I spotted some dubious KODI repos were sitting on the WaybackMachine it made me realise quite how much that server must be getting abused.

I know I once hosted a perfectly legal KODI repo. And was stunned one weekend when I was being hammered with some kinda DDoS pulling multi GBs per minute from my server. The repo was only a few KB in size, but was being downloaded in a mad quota-burning loop. I expect similar is happening to archive.org.

I wonder if CAA could request to move into a “cleaner” corner of the servers away from the dubious stuff.


#14

All of Internet Archive’s collections are, AFAIK, stored together, including the Wayback Machine, which archives pretty much anything on the World Wide Web it’s pointed at, dubious or not. I don’t think they have any backend code that would allow splitting certain collections so that they are only (or never) available on certain sub-servers. Doing so might in turn leave a much smaller pool of servers to choose from for uploading/downloading, which could lead to increased downtime for the CAA (posting and/or getting images). @Bitmap might have more concrete information though and would be able to ask the IA people whether something like you propose could be done.


#15

@Freso that does make sense, but it is clearly open to abuse.

The MB artwork is being directly uploaded to CAA. So doesn’t seem to be using the WayBackMachine.

And the dodgy stuff I am talking about with KODI is also directly being uploaded somehow. These were up to date files, but in those cases there were “waybackmachine” headers on the web pages. I never wrote the examples down so I can’t go look.

That set of Reddit links @culinko found are very worrying. They were all blocked to me, but just looking at the names of the items they look well sketchy.

So somehow the archive servers are being filled with other live data.


#16

What I’m saying is that both Cover Art Archive and the Wayback Machine are “just” two of multiple “collections” with the Internet Archive (Wayback Machine actually seems to be stored as a series of different collections, as far as I can discern). They currently have 380,331 collections, all sharing the same storage servers and other infrastructure. CAA is of particular interest to us, but is “just” “one of many” to the IA.