Public Server Status Page available?


#1

Is there a public server status page available? Over the last few days (since the last server update) we have noticed some timeouts when accessing the web service interface. We are not sure whether the issue is on our side or yours.
A status page would help.

Thanks
Beat


#2

Would be nice to have something like https://www.statuspage.io. Wonder if they offer something for open source projects or non-profits.


#3

We have http://stats.musicbrainz.org/ - @Zas would be the one who can answer questions about it.


#4

We are not aware of any new issues. Are those real timeouts (no response at all) or 503 server responses?
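
One way to tell the two cases apart on the client side is to label each request outcome explicitly. This is only a minimal sketch using the Python standard library; `classify` and `probe` are hypothetical helper names, and the 28 s default is simply the cutoff mentioned in this thread, not a recommended value.

```python
import socket
import urllib.error
import urllib.request


def classify(status=None, error=None):
    """Map an HTTP status code or an exception to a coarse outcome label."""
    if error is not None:
        # URLError wraps the underlying socket error in .reason;
        # a bare socket error has no .reason, so fall back to the error itself.
        reason = getattr(error, "reason", error)
        if isinstance(reason, (socket.timeout, TimeoutError)):
            return "timeout"
        return "network-error"
    if status == 503:
        return "503"
    if status is not None and 200 <= status < 400:
        return "ok"
    return "http-error"


def probe(url, timeout_s=28.0):
    """Fetch url once and return 'ok', '503', 'http-error', 'timeout' or 'network-error'."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return classify(status=resp.status)
    except urllib.error.HTTPError as exc:
        # The server answered, just not with a success status.
        return classify(status=exc.code)
    except (urllib.error.URLError, OSError) as exc:
        # No usable answer: timeout, refused connection, DNS failure, ...
        return classify(error=exc)


# Example (performs a real request, so it is left commented out):
# print(probe("https://musicbrainz.org/ws/2/release?label=47e718e1-7ee4-460c-b1cc-1192a841c6e5"))
```

Logging the label together with a timestamp for each request would show whether the failures are "timeout" (no response at all) or "503" (the server answered).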


#5

These are real timeouts (we cancel the requests after 28 s).
They occur sporadically throughout the day.

806.084409013 10.100.19.86 -> 72.29.166.157 TCP 74 45617 > http [SYN] Seq=0 Win=14600 Len=0 MSS=1460 SACK_PERM=1 TSval=3259526143 TSecr=0 WS=128
807.084090149 10.100.19.86 -> 72.29.166.157 TCP 74 [TCP Retransmission] 45617 > http [SYN] Seq=0 Win=14600 Len=0 MSS=1460 SACK_PERM=1 TSval=3259527143 TSecr=0 WS=128
809.084089475 10.100.19.86 -> 72.29.166.157 TCP 74 [TCP Retransmission] 45617 > http [SYN] Seq=0 Win=14600 Len=0 MSS=1460 SACK_PERM=1 TSval=3259529143 TSecr=0 WS=128
813.084090246 10.100.19.86 -> 72.29.166.157 TCP 74 [TCP Retransmission] 45617 > http [SYN] Seq=0 Win=14600 Len=0 MSS=1460 SACK_PERM=1 TSval=3259533143 TSecr=0 WS=128
821.084096977 10.100.19.86 -> 72.29.166.157 TCP 74 [TCP Retransmission] 45617 > http [SYN] Seq=0 Win=14600 Len=0 MSS=1460 SACK_PERM=1 TSval=3259541143 TSecr=0 WS=128
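
The gaps between the SYNs above (1 s, 2 s, 4 s, 8 s) look like ordinary exponential backoff of a connection attempt that never gets an answer. As a rough check, here is a minimal sketch that rebuilds that schedule. It assumes an initial retransmission timeout of 1 s that doubles after every unanswered SYN, which is inferred from the timestamps only; `syn_send_offsets` is a hypothetical helper name.

```python
# Sketch: reconstruct the SYN retransmission schedule seen in the capture.

INITIAL_RTO_S = 1.0      # assumed, from the gap between the first two SYNs
CLIENT_CUTOFF_S = 28.0   # the client cancels requests after 28 s


def syn_send_offsets(initial_rto, cutoff):
    """Return offsets (seconds after the first SYN) of every SYN sent before cutoff."""
    offsets = [0.0]
    rto = initial_rto
    while offsets[-1] + rto < cutoff:
        offsets.append(offsets[-1] + rto)
        rto *= 2  # back off: wait twice as long before the next retry
    return offsets


print(syn_send_offsets(INITIAL_RTO_S, CLIENT_CUTOFF_S))
# [0.0, 1.0, 3.0, 7.0, 15.0]
```

Under these assumptions the next SYN would go out 31 s after the first one, which is past the 28 s cutoff. That matches the five SYNs in the capture (806, 807, 809, 813 and 821 s).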

(We also sometimes see 503 responses due to rate limiting.)

Manual check with wget:

--2016-03-23 20:43:37--  https://musicbrainz.org/ws/2/release?label=47e718e1-7ee4-460c-b1cc-1192a841c6e5
Resolving musicbrainz.org... 72.29.166.157
Connecting to musicbrainz.org|72.29.166.157|:443... failed: Connection timed out.
Retrying.

--2016-03-23 20:44:41--  (try: 2)  https://musicbrainz.org/ws/2/release?label=47e718e1-7ee4-460c-b1cc-1192a841c6e5
Connecting to musicbrainz.org|72.29.166.157|:443... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Date: Wed, 23 Mar 2016 19:44:42 GMT
  Content-Type: application/xml; charset=utf-8
  Content-Length: 17180
  Connection: keep-alive
  Keep-Alive: timeout=15
  Server: nginx/1.4.6 (Ubuntu)
  ETag: "9d22e4e8deb1549003811c72b7e3fd57"
  Access-Control-Allow-Origin: *
Length: 17180 (17K) [application/xml]

Has the rate limiter changed?


#6

Would it make sense to provide a status.meb.o page for high-level views of project availability?


#7

Who will update it when things break?


#8

The idea would be to update it automatically.


#9

Based on Nagios reports?

I’m not sure that would be particularly useful; essentially, it would only tell users what they already know (“service X is currently down”).


#10

They don’t necessarily know whether it’s down on our end or somewhere between our machines and theirs.