These days there are some extremely aggressive web crawlers out there that launch multiple concurrent VM instances (e.g. DigitalOcean droplets) and rapid-fire requests at websites, including fora, which can lead to an (effectively distributed) denial of service for regular users. Why? Because these crawlers are designed to first index the URLs that are valid/available by spidering over pages using HEAD requests.
For PHP (and any other server-side scripting), HEAD requests are disastrous: all they do is make the script processor do the full work of generating the page, while the result of the request is never sent or used.
Recently, this very forum came under extremely heavy load because of this bad practice: a bot called crazywebcrawler launched so many requests that the server load skyrocketed to 70+ (in case you don't know, that figure is roughly the number of fully loaded cores that would be needed to process the pending tasks without delay). After blocking DigitalOcean as a whole to stop the DDoS, I looked into the cause, found this behavior, and put the following in place to disable HEAD requests in nginx (inside the .php block):
Code:
location ~ ^(.+\.php)(.*)$ {
    if ($request_method = HEAD) {
        # Refuse HEAD requests since they cause PHP processing without benefit.
        return 405;
    }
    # *** any other fastcgi_* directives you need to hand PHP requests to the processor ***
}
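Using if together with return like this is one of the documented safe uses of if in nginx, so there is no "if is evil" concern here. One refinement you may want: RFC 7231 requires a 405 response to carry an Allow header listing the methods that are permitted, and a bare return 405 does not add one. A minimal sketch of the same if block with the header set (the method list is an assumption, adjust it to what your site actually accepts):

Code:
if ($request_method = HEAD) {
    # Refuse HEAD, but advertise the methods we do accept, per RFC 7231.
    add_header Allow "GET, POST" always;
    return 405;
}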
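As for blocking DigitalOcean as a whole: that was done with plain deny rules at the server level. A minimal sketch, using placeholder CIDR ranges only; DigitalOcean publishes its actual ranges and they change over time, so look up the current list yourself:

Code:
# In the server (or http) block: drop traffic from the crawler-hosting ranges.
# NOTE: 192.0.2.0/24 and 198.51.100.0/24 are documentation placeholders,
# not DigitalOcean's actual ranges.
deny 192.0.2.0/24;
deny 198.51.100.0/24;
# Everyone else is still allowed.
allow all;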