The Baiduspider and .htaccess
I HATE bots/spiders and other nefarious automated creepy crawlies.
I can't seem to stop this thing from hitting my site... Logs show
Code:
Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)
Code:
RewriteCond %{HTTP_REFERER} ^http(s)?://(www\.)?baidu.com.*$ [OR]
Code:
wget robots=on -U "Baiduspider/2.0" bournetoraiseshell.com
robots.txt
Code:
User-agent: *
but it just keeps coming! What am I missing? Thanks!
NOTE: Server, Security, and Software all seemed appropriate...
http://www.baidu.com/search/robots_english.html
It seems you set up your robots.txt correctly. There are also other examples of blocking via .htaccess. Why not block it at the firewall with something like iptables? Here are Baidu's IP ranges if you want to categorically block their entire network.
Code:
iptables -A INPUT -s 119.63.193.0/24 -j DROP
SAM
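On the .htaccess route: the RewriteCond quoted in the first post tests HTTP_REFERER, which is the page a visitor came from, while the log line identifies the crawler by its user agent string, so that condition would likely never match the spider's own requests. A minimal sketch of a user-agent match, assuming mod_rewrite is enabled and .htaccess overrides are allowed on the host (neither is confirmed in this thread):

```apache
# Sketch only: assumes mod_rewrite is loaded and AllowOverride permits rewrite rules.
RewriteEngine On
# Match the crawler's User-Agent header (case-insensitive), not the Referer.
RewriteCond %{HTTP_USER_AGENT} Baiduspider [NC]
# Answer any matching request with 403 Forbidden; "-" means no URL substitution.
RewriteRule ^ - [F]
```

This only catches requests that actually send "Baiduspider" in the user agent. Anything using a browser-like string would still need an IP-level block like the iptables rule above.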
Thanks, Sam.
I added those CIDR addresses to my Cloudflare block list. We'll see what happens in the next few days.
John