Web Crawler & User Agent Blocking Techniques


This is a simple script that allows hackers to block specific crawlers based on the user-agent sent with each website request. This is useful when they want to prevent certain traffic from loading certain content, usually a phishing page or a malicious download.

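if (preg_match('/bot|crawler|spider|facebook|alexa|twitter|curl/i', $_SERVER['HTTP_USER_AGENT'])) {
    // logger() is defined elsewhere in the script and is not shown in this excerpt
    logger("[BOT] {$_SERVER['REQUEST_URI']} - 500");
    header('HTTP/1.1 500 Internal Server Error');
    exit; // assumed: the excerpt is cut off after the header() call
}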

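Using preg_match with the case-insensitive i flag, the script looks for certain known crawler strings in the user-agent. When one matches, it logs the request and answers with a 500 Internal Server Error instead of the content.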
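To see which visitors a check like this would turn away, the pattern can be run against a few sample user-agent strings. The following is a minimal sketch, not part of the original script: the looksLikeCrawler() helper and the sample strings are hypothetical and only there for illustration.

```php
<?php
// Pattern copied from the snippet above; the trailing "i" makes it case-insensitive.
const CRAWLER_PATTERN = '/bot|crawler|spider|facebook|alexa|twitter|curl/i';

// Hypothetical helper (not from the original script): returns true when the
// user-agent contains any of the crawler substrings.
function looksLikeCrawler(string $userAgent): bool
{
    return preg_match(CRAWLER_PATTERN, $userAgent) === 1;
}

// Illustrative user-agent strings; real values vary by client and version.
$samples = [
    'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
    'facebookexternalhit/1.1',
    'curl/7.68.0',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0',
];

foreach ($samples as $ua) {
    echo (looksLikeCrawler($ua) ? 'blocked ' : 'served  ') . $ua . PHP_EOL;
}
```

Because the match is a plain substring test, any client whose user-agent contains "bot" or "curl" gets the 500 response, while a scanner that sends an ordinary browser user-agent is served the content.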

Continue reading Web Crawler & User Agent Blocking Techniques at Sucuri Blog.




Written by Luke Leal
