What do the X-RateLimit headers mean and how should they be used?


Sorry if this is documented somewhere, I can't find it and I have searched the forum.
I found this page:

But that has no detail about what these fields are defined as and how to use them.

So I ran a few queries, a few seconds or minutes apart, and got the following results:
X-RateLimit-Limit: 800\r\n X-RateLimit-Remaining: 654\r\n X-RateLimit-Reset: 1493872380
X-RateLimit-Limit: 800\r\n X-RateLimit-Remaining: 23\r\n X-RateLimit-Reset: 1493873035
X-RateLimit-Limit: 800\r\n X-RateLimit-Remaining: 302\r\n X-RateLimit-Reset: 1493873161

From this I am failing to work out what they mean and how they should be used.

Of course this mainly has to do with the rash of 503 codes returned at certain times, so I want to learn how best to limit my requests.

But I feel like I'm missing something. Does the body of a non-200 response have any information, and if so, what is the schema?
Any help?


It isn’t well documented because:

  • it is a bit hacky, the new rate limiter was set up to work around issues we had with the previous system
  • it is temporary, we want to move to something much sturdier
  • it may change at any moment (non-stable)

That said, I can clarify the meaning of those headers, as it may help people understand what is happening:

X-RateLimit-Limit: the rate limit ceiling
X-RateLimit-Remaining: the number of requests left in the current 2-second window
X-RateLimit-Reset: the time at which the current window ends and the rate limit resets, in UTC epoch seconds
X-RateLimit-Zone: the zone applied for this rate limit
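
For anyone who wants to read these in code, here is a minimal Python sketch, not official code. It assumes the response headers are available as a plain mapping with exactly the names listed above; `RateLimitInfo` and `parse_rate_limit` are made-up names for illustration. Since the limiter is described as non-stable, missing or malformed headers are treated as "no information".

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: int                # X-RateLimit-Limit: the ceiling
    remaining: int            # X-RateLimit-Remaining: requests left in the window
    reset_at: datetime        # X-RateLimit-Reset, converted from epoch seconds (UTC)
    zone: Optional[str]       # X-RateLimit-Zone, if the server sent it


def parse_rate_limit(headers: Mapping[str, str]) -> Optional[RateLimitInfo]:
    """Return the parsed headers, or None if they are absent or malformed."""
    try:
        limit = int(headers["X-RateLimit-Limit"])
        remaining = int(headers["X-RateLimit-Remaining"])
        reset_epoch = int(headers["X-RateLimit-Reset"])
    except (KeyError, ValueError):
        return None
    return RateLimitInfo(
        limit=limit,
        remaining=remaining,
        reset_at=datetime.fromtimestamp(reset_epoch, tz=timezone.utc),
        zone=headers.get("X-RateLimit-Zone"),
    )


# Values taken from the first sample response posted earlier in this thread.
sample = {
    "X-RateLimit-Limit": "800",
    "X-RateLimit-Remaining": "654",
    "X-RateLimit-Reset": "1493872380",
}
info = parse_rate_limit(sample)
print(info)
```

Note that a plain dict is case-sensitive, while HTTP header names are not; most HTTP client libraries hand you a case-insensitive mapping, which works here as well.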

Currently we have a global zone. If you get 503s with X-RateLimit-Zone: global, it basically means our servers are overloaded, and reducing your rate will help us but not necessarily you.
If the X-RateLimit-Zone value is per-ip, it means we are getting too many requests from this IP.

Retrying your request before the X-RateLimit-Reset time is very likely to fail.
X-RateLimit-Remaining indicates how many requests are still allowed during this 2-second window. If the value is near X-RateLimit-Limit


OK, I believe I understand what you are describing, but unfortunately I don't see how I (or anyone, really) can use it to be a good consumer…
I was looking to do my best NOT to contribute to the overload problems…

I could have sworn I had seen some kind of information somewhere in a response saying to retry after a certain amount of time?


Well, when you hit the global rate limit it basically means the servers are overloaded, so apart from delaying your requests even more to alleviate the load, there is not much to do.
We are working to improve the situation, and I expect we'll be able to increase the rate limit in the near future.
If you need to do a lot of queries without being limited, the best way is to deploy your own mbs server and use replication mechanisms.


Right now I’m just a dude in his bedroom and never expect much traffic at all, just would like to find the best way to get successful queries, and be a good citizen user…

In another thread I asked what the error codes currently in use mean… 503 for rate limiting… etc…
If there is any information there that would help users be better users, well, it's hard to find any info on that as well.


Is it OK to do this as a free consumer? The situation in my case is as follows:

At the moment I don't make any money with my upcoming app, which uses MusicBrainz, so I cannot contribute any money yet. On the other hand, I definitely need a stable API. So is it OK in my case to do such a replication?


Hi @The_Unknown!

I answered your question via email–if not let me know and I’ll respond once more!


@Zas Is there a document with best practices on how to respect the limit of 1 request/second? At the moment I have a user-independent HTTP request cache, with entries cached for 1 day. But what can I do if I simply get more than 1 uncached request per second? Some kind of queue? Some info on that would be great.

As @Quesito mentioned via email, the live update feed is not an option for me, as I am currently non-profit.


Picard adds some delay to a request based on the time of the last request. So e.g. for musicbrainz.org there is the 1 request per second limit. If the last request was 0.4 seconds ago it will add a delay of 0.6 seconds before starting the next one. For this Picard remembers the time of the most recent request made.
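
To make that calculation concrete, here is a minimal Python sketch of the same idea. It is not Picard's actual code; the `Throttle` class and its parameter names are made up for illustration, and it assumes a single host and a single thread.

```python
import time


class Throttle:
    """Space requests at least `min_interval` seconds apart."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock        # injectable so the logic can be tested
        self._sleep = sleep
        self._last_request = None  # time of the most recent request, if any

    def delay_needed(self):
        """Seconds still to wait before the next request may start."""
        if self._last_request is None:
            return 0.0
        elapsed = self._clock() - self._last_request
        return max(0.0, self._min_interval - elapsed)

    def wait(self):
        """Sleep for the remaining delay, then record the request time."""
        delay = self.delay_needed()
        if delay > 0:
            self._sleep(delay)
        self._last_request = self._clock()
        return delay


if __name__ == "__main__":
    throttle = Throttle(min_interval=1.0)
    for i in range(3):
        waited = throttle.wait()
        print(f"request {i}: waited {waited:.2f}s")  # send the request here
```

If the last request was 0.4 seconds ago, `delay_needed()` comes out at roughly 0.6 seconds, matching the example above. Calling `wait()` before every uncached request also gives you the "queue" behaviour asked about, as long as all requests go through the same `Throttle` object.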

The code for this in Picard got a bit more complex over time, since Picard supports different request limits per host, but the core of the calculation is at https://github.com/metabrainz/picard/blob/master/picard/webservice.py#L356


Thanks, this is the kind of concrete example I'm always glad to see.
Only my understanding of Python is nil :grimacing:


Hi All,

Are there any updates on this topic?

Is the X-RateLimit-Limit value the number of requests, system wide, allowable in the 2-second window?

What does the X-RateLimit-Reset value mean?

Thanks in advance


That’s the time at which the rate limit is reset (in UTC epoch seconds).

T+0 we start to count incoming requests
T+2 we reset that count
If the number of requests reaches the limit between T+0 and T+2, subsequent requests are refused until we reach T+2; then we start counting (and allowing) again, rinse and repeat.
That’s for the global limit. At the moment we have enough resources, and this limit shouldn’t be an issue.
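
For readers who prefer code, the counting scheme described above can be sketched in a few lines of Python. This is only an illustration of a fixed-window counter, not the server's implementation; the class name is made up, and it assumes a new window opens at the first request that arrives after the previous window has expired.

```python
class FixedWindowLimiter:
    """Allow at most `limit` requests per `window` seconds, then reset."""

    def __init__(self, limit, window=2.0):
        self.limit = limit
        self.window = window
        self.window_start = None  # T+0 of the current window
        self.count = 0            # requests counted since T+0

    def allow(self, now):
        """Return True if a request arriving at time `now` is accepted."""
        # Open a new window on the first request, or once T+window is reached.
        if self.window_start is None or now >= self.window_start + self.window:
            self.window_start = now
            self.count = 0
        if self.count >= self.limit:
            return False          # limit reached: refused until the reset
        self.count += 1
        return True


limiter = FixedWindowLimiter(limit=3, window=2.0)
for t in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(t, limiter.allow(t))
```

With a limit of 3, the requests at 0.0, 0.5 and 1.0 are accepted, the one at 1.5 is refused, and the one at 2.0 is accepted again because the count has been reset.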

We enforce limits per IP & User Agent too. If the recommended rate is respected, a fair user shouldn’t hit this limit either.

A proper user agent string allowing the application owner to be contacted is required.

But, in any case, it is strongly recommended to properly handle error codes in your application, as limits may be hit due to high traffic, or because we lowered them temporarily (for example if we have issues with backend load). In case of a 503, pause and retry. One may also use those X-RateLimit-* headers to make it smarter.
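
As a rough illustration of "pause and retry", here is a Python sketch. Everything in it is an assumption rather than an official client: `send` is a hypothetical callable that performs one request and returns `(status, headers, body)`, and the default pause, the cap on the pause and the number of attempts are arbitrary values picked for the example.

```python
import time


def seconds_until_reset(headers, now, default=1.0, max_wait=30.0):
    """How long to pause after a 503, based on X-RateLimit-Reset if present."""
    try:
        reset = float(headers["X-RateLimit-Reset"])  # epoch seconds, UTC
    except (KeyError, ValueError):
        return default            # header missing or unreadable
    wait = reset - now
    if wait <= 0:
        return default            # reset time already passed: short pause
    return min(wait, max_wait)    # never sleep longer than max_wait


def fetch_with_retry(send, max_attempts=5, sleep=time.sleep, now=time.time):
    """Call send() until it returns something other than 503, or give up."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 503:
            return status, body
        if attempt < max_attempts - 1:
            sleep(seconds_until_reset(headers, now()))
    return status, body           # still 503 after max_attempts


if __name__ == "__main__":
    # Simulated server: one 503 that resets in 2 seconds, then a success.
    replies = [
        (503, {"X-RateLimit-Reset": str(time.time() + 2)}, ""),
        (200, {}, "ok"),
    ]
    print(fetch_with_retry(lambda: replies.pop(0)))
```

Comparing the header against the local clock assumes that clock is reasonably accurate, which is why the pause is capped. Combining this with a fixed 1 request per second delay between normal requests should keep a small client well within the recommended rate.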

Unfair use is not recommended (at all).


Thank you very much for that info @Zas. :slight_smile:

I am working on a mechanism that respects the MB resource.