Rate limiter has changed?

jesus2099 · March 29, 2016, 1:22pm

Continuing the discussion from Public Server Status Page available?:

I feel that recently, over the last few months, the rate limiter behaviour has changed.
My user scripts (those that use the Web Service, not those that load normal web pages) began to get 503 errors, although they carefully respect the limit of 1 request per minute.

OK, while I was looking for references for this post, I eventually found out, thanks to the foo_musicbrainz forum, that:

XML Web Service / Rate Limiting / Global

We allow through 300 requests each second (on average), and decline (http 503) the rest.
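To make the quoted rule concrete, here is a rough sketch of how a global limit of this kind could behave, written as a token bucket. It is only an illustration under assumptions: the class name, the burst size of 300 and the explicit `now` argument are invented here and are not taken from the actual MusicBrainz rate limiter.

```python
class GlobalRateLimiter:
    """Token bucket: refills at `rate` tokens per second, capped at `burst`.

    Hypothetical illustration only; not the real server-side implementation.
    """

    def __init__(self, rate=300.0, burst=300.0, now=0.0):
        self.rate = rate      # average requests allowed per second
        self.burst = burst    # assumed maximum number of stored tokens
        self.tokens = burst
        self.last = now

    def status_for(self, now):
        """Return the HTTP status a request arriving at time `now` would get."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return 200
        return 503  # over the global limit: declined


limiter = GlobalRateLimiter()
# 400 requests arriving in the same instant, from any mix of clients.
statuses = [limiter.status_for(0.0) for _ in range(400)]
print(statuses.count(200), statuses.count(503))
```

The point of the sketch is that the bucket is shared: a client that keeps to its own per-client limit can still receive a 503 when everyone's combined traffic empties the bucket.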

Do you know if this is new or if it is because our global amount of WS requests recently crossed the 300 per second line?

I will certainly keep auto‐retry and/or on‐demand loading in mind for my future code…
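A minimal sketch of what such an auto-retry could look like, assuming exponential backoff with jitter. The function name, its parameters and the `fetch` callable are hypothetical; `fetch` stands in for whatever real HTTP request a script makes and is expected to return a `(status, body)` pair.

```python
import random
import time


def fetch_with_retry(fetch, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns a status other than 503.

    `fetch` is a placeholder callable returning (status, body).
    Gives up after `max_attempts` tries and returns the last response.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = fetch()
        if status != 503:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff with jitter: roughly 1s, 2s, 4s, ...
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
    return status, body


# Usage with a fake service that declines twice, then answers.
responses = iter([(503, ""), (503, ""), (200, "<metadata/>")])
delays = []
result = fetch_with_retry(lambda: next(responses), sleep=delays.append)
print(result, len(delays))
```

Backing off with growing, randomised delays matters here: retrying immediately would only add to the traffic that triggers the 503s in the first place.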

Freso · March 29, 2016, 1:37pm

We actually discussed the increase of 5XX responses during last night’s #MetaBrainz meeting:
http://chatlogs.metabrainz.org/brainzbot/metabrainz/msg/3551329/

rob · March 29, 2016, 1:39pm

The rate limiter has not changed, but our overall traffic has, and this causes more 503 errors to be thrown. At peak times like last night we’re hitting our 30 Mbit/s committed traffic limit. We could buy more bandwidth, but then we would likely need to buy more servers to handle the capacity.

We’re currently trying to get away from operating our own servers and instead rent servers from some other provider. In order to do that we need to finish establishing our EU presence (a few weeks more, sadly) and then we need to select a new provider and negotiate a new arrangement with them.

Then we can begin the process of migrating across the pond, which is also going to be an incredible amount of work. Fortunately @zas has been making a lot of progress on this. For now, we just need to bear with the current situation.

jesus2099 · March 29, 2016, 2:12pm

Thanks for your answers. So it’s the number of requests that has been progressively going up and has recently crossed our limit.
That makes sense.
With each update I make, I will make sure all my scripts that use the WS behave as gently as possible.
The number of 503s I get is not that bothersome; it’s just that I had recently noticed them for the first time and wanted to know more about them.
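One way a script can be gentle is to space its own requests out. The following is a small sketch of a client-side throttle, not code from any of the scripts discussed here; the class name and the `min_interval` value are assumptions, and the interval should be set to whatever the service documents.

```python
import time


class Throttle:
    """Block so that calls to wait() are at least `min_interval` seconds apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock      # injectable for testing
        self.sleep = sleep      # injectable for testing
        self.next_allowed = None

    def wait(self):
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            # Too early: sleep until the next permitted slot.
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval


# Usage: call wait() before every web service request.
throttle = Throttle(min_interval=1.0)  # assumed interval, adjust as needed
throttle.wait()  # first call returns immediately
```

A throttle like this only controls one client's pace; it cannot prevent 503s caused by the global limit.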

Cheezmo · April 8, 2016, 4:34pm

Something needs to be done. The Picard tagger has become unusable for me, with maybe 30% of releases not loading and requiring me to try over and over again.

With Picard not working for me, my interest in editing is declining.

I am getting to where I can no longer recommend Picard or MusicBrainz to others. It would be embarrassing given how broken it appears to be right now.

jesus2099 · April 8, 2016, 6:29pm

I thought Picard did include a retry on error but it seems not. Let’s add this feature!

chirlu · April 9, 2016, 9:16am

Could you point out which part of @rob’s post gave you the impression that nothing will be done, so that he can improve the wording?

Cheezmo · April 9, 2016, 12:37pm

This…

I agree I was a little harsh and I apologize, but I had just spent over 2 hours trying to check my tags with Picard (something that used to take 5-10 minutes), so I was a little frustrated.

But then jesus2099 came up with a perfect solution thanks to me expressing my frustration, so I’m glad I did.

jesus2099 · April 10, 2016, 9:29am

I did not code the solution, I do not really know Perl or Python.
But I can give it a try someday.
I have created a ticket (PICARD-807).

Cheezmo · May 5, 2016, 10:35pm

So, is Picard usable for anyone? With over 50% of lookups failing on my last attempt to tag my library, I just gave up. I’m not interested in retrying releases over and over and over again. I’m so frustrated I’m about to give up on MusicBrainz. It no longer works for me, so why should I contribute and edit?

There is an open ticket to look into the problem. https://tickets.musicbrainz.org/browse/PICARD-807

Anyone have any workarounds, etc, that would let me use Picard for tagging again?

sbontrager · May 5, 2016, 11:01pm

It’s not just Picard, I’m getting a lot of failures as well. I’m glad MB is becoming a well-known and well-used tool. But I’d like to be able to use it too… anyone running an open mirror we can use?

psychoadept · May 6, 2016, 12:36am

Have you read all this? State of things, Q2 2016

I hope an interim solution can be found, but I would rather be patient and wait for a real fix than cause the real fix to be delayed by a lot of effort going into a band-aid solution.

Cheezmo · May 6, 2016, 1:49am

I had not read that, and it is really what I was looking for, some insight into what is going on with the whole system. Sounds like much patience is in order.

zag2me · May 6, 2016, 8:12am

Yep, I’ve been bringing this up in IRC a few times recently. It’s hard not to come across as ungrateful when bringing up problems with a free API server, so I wrote this little blog with my thoughts:

I’ve switched to http://musicbrainz-mirror.eu:5000/ for now on my personal sites, but for Kodi we can’t do that as the user base is so huge (in fact I’d say we are probably responsible for the overload): our user base is growing exponentially and anyone scraping music will hit the MB web service thousands of times. We tried running a mirror, but without any Linux server admins it got out of sync after a while and we took it down. I see there is a new VM, which motivates me to look at this again.

It would be great if someone ran a pay mirror or something that open source projects could use at a free or discounted rate. Having it all centralized on a single server run by the MetaBrainz Foundation seems not the best design to me. The server I linked to above has 2 servers load balanced due to demand. Leveraging the cloud for this is also a good idea.

aerozol · May 6, 2016, 9:22pm

Super interesting, thanks @zag2me!
(even for a non-tech guy like me)

One of the things that might be causing a lot of load for the server is the ‘changes to your library’ email notification/subscription/link. It almost never loads for me, and seems to be running a crazy amount of queries (assuming that’s what’s happening when it’s trying to load).

Perhaps it would be worth shutting down a few complicated operations for a short while if it means things are going to be useable again in the interim.

Zas · May 7, 2016, 11:17am

@zag2me: MetaBrainz is in the process of improving those things, and many of the recommendations in the blog post will be implemented.

Here are our plans:

improve the way we currently rate limit: we are experimenting with a solution based on OpenResty at the moment, and this may be in place very soon
increase bandwidth and use faster hardware: this is the NewHost move we’re initiating, and it should be done within the next 3 months
use a much faster database system, with better hardware (SSD) and read-only slaves
split web service and web sites: this is part of the move to NewHost
web service version 3 is planned, with API keys and JSON; it will come later

About the web service: we are using all the bandwidth we can afford for now, and a good part of it is wasted by web service abusers. The new rate limiter software will greatly improve the situation.

About Picard: I’m lacking the time to work on it, so if some Python guys are around… feel free to pick up a few issues in JIRA and push a few PRs.

zag2me · May 11, 2016, 7:33pm

Sadly I don’t think the latest changes have helped much…

I loaded 70 albums into Picard and 17 of them failed. That’s a failure rate of roughly 24% (17/70).

psychoadept · May 12, 2016, 3:21am

It looks like the Picard updates are in progress (Ticket), but haven’t been released yet (Daily Build: Recent Changes).

zag2me · May 12, 2016, 7:55am

While that may fix the problem in Picard, it’s just going to add to the API calls.

If the problem is bandwidth, why not leverage cloud providers like Cloudflare? They have made bandwidth a non-issue for most big sites these days. I’m on the free tier and it saves about 3 TB a month.

reosarevok · May 12, 2016, 9:14am

The API calls it will add should be negligible compared with the hammering we’re taking from external sources unrelated to Picard, so I don’t think that should be too problematic (not saying Cloudflare wouldn’t help; I have no idea about that stuff, but thankfully I’m not in charge of understanding it either).