We hit a "503 Max restarts limit reached" error in production while running some batch operations and editing some entities. In both cases the server was slow to respond because large queries were running that took more than 20 seconds (yes, they should be optimized, probably by adding indexes to some fields). With Fastly in front of our server, the requests timed out. Nothing showed up in our logs, so we searched the whole environment before finding that Fastly was the culprit.
.between_bytes_timeout = 30s;
.connect_timeout = 5s;
.first_byte_timeout = 60s;
The Fastly config for the hosts we use as backends (load balancers in our case) had timeouts that were set too low. between_bytes_timeout was at 10 seconds, which meant nothing was sent until first_byte_timeout was reached, and then the error was returned. The first fix is to raise between_bytes_timeout to 30 seconds, as in the settings above. If that is still not enough, you really need to look at your queries and what they are doing.
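
For context, these three settings belong inside a backend definition in VCL. The block below is only a minimal sketch of where they go: the backend name F_load_balancer, the host lb.example.com and the port are hypothetical placeholders, not our real config. Only the three timeout values come from the settings above.

```vcl
# Hypothetical backend definition; name, host and port are placeholders.
backend F_load_balancer {
    .host = "lb.example.com";        # assumption: replace with your load balancer
    .port = "443";                   # assumption: replace with your backend port
    .connect_timeout = 5s;           # time allowed to open the connection to the backend
    .first_byte_timeout = 60s;       # time allowed to wait for the first byte of the response
    .between_bytes_timeout = 30s;    # time allowed between bytes once the response has started (was 10s)
}
```

If you manage backends through the Fastly web interface instead of custom VCL, the same three timeouts can be set per host there.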