Old 06-27-2019, 11:04 AM   #1
newbie14 (Member)
Nginx setting to block robots and proxy not working


I have a web server running nginx. In /var/www/html I first created an empty
Code:
robots.txt
file. Then in nginx.conf I added this:

Code:
location = /robots.txt {
    add_header Content-Type text/plain;
    return 200 "User-agent: *\nDisallow: /\n";
}
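(Side note: as I understand it, the return 200 line above answers straight from the configuration, so the empty robots.txt in /var/www/html is never actually read. A rough equivalent that serves the file from disk instead would be something like this, assuming /var/www/html is the document root:)
Code:
location = /robots.txt {
    # serve the on-disk file instead of returning an inline response
    root /var/www/html;
}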
In my daily logwatch report I still notice this:

Code:
A total of 1 ROBOTS were logged
In addition I also notice this:
Code:
Connection attempts using mod_proxy:
    95.213.177.125 -> check.proxyradar.com:80: 1 Time(s)

 A total of 1 sites probed the server 
    112.66.70.223
How can I further block all these attempts?
 
Old 06-28-2019, 04:48 AM   #2
bathory (LQ Guru)
Hi,

Mind that robots.txt is read by web crawlers when they index your site, so IMO it's useful for it to be present and accessible to anyone.
If you still want to block access to it, you can use:
Code:
location /robots.txt {
    deny all;
    return 403;
}

Also if you don't want your server to be probed for proxying, you can try the following:
Code:
if ($request ~* ^[A-Z]+\ http ) {
    return 404;
}
Regards
 
Old 06-29-2019, 01:07 AM   #3
newbie14 (Member, Original Poster)
Hi Bathory,
Yes, I know about the indexing role of robots.txt, but this is a test platform and I don't want any indexing on this site. OK, I have implemented your solution for the robots.txt location. Also, in my robots.txt I added this text:
Code:
User-agent: *
Disallow: /
Is this necessary?

I also added the ($request ~* ^[A-Z]+\ http ) check. What does this actually do? I know there is a regex in it. In addition I have done this:
Code:
include /etc/nginx/blockuseragents.rules;
Code:
map $http_user_agent $blockedagent {
        default         0;
        ~*malicious     1;
        ~*bot           1;
        ~*backdoor      1;
        ~*crawler       1;
        ~*bandit        1;
        ~*profound        1;
        ~*scrapyproject   1;
        ~*netcrawler      1;
        ~*nmap            1;
        ~*sqlmap          1;
        ~*slowhttptest    1;
        ~*nikto           1;
        ~*jersey          1;
        ~*brandwatch      1;
        ~*magpie-crawler  1;
        ~*mechanize       1;
        ~*python-requests 1;
        ~*redback         1;
        ~*curl            1;
        ~*wget            1;
        ~*libwww-perl     1;

}
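(For what it's worth, a map like this only sets the $blockedagent variable; it is usually paired with a check inside the server block, roughly like the sketch below, which returns 403 whenever the User-Agent matched:)
Code:
# inside the server { } block that should reject the listed agents
if ($blockedagent) {
    return 403;
}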
Is there anything else I should do to further harden nginx?
 
Old 06-29-2019, 03:58 AM   #4
bathory (LQ Guru)
Quote:
OK, I have implemented your solution for the robots.txt location. Also, in my robots.txt I added this text:

User-agent: *
Disallow: /

Is this necessary?
IMO it's better to use the robots.txt you've posted above and not forbid access to it, so that at least the benign bots, which obey robots.txt, will not index your site. Bad bots will always try to harvest your site because they don't care about robots.txt files.


Quote:
I also added the ($request ~* ^[A-Z]+\ http ) check. What does this actually do?
It rejects requests whose request line starts with a string of capital letters (the method) followed by http(s), like "GET http://...", "POST http://...", "CONNECT https://..", etc., so your server cannot be used as a proxy.


Quote:
Is there anything else I should do to further harden nginx?
Watch server logs and use something like fail2ban to block hacking attempts.


Regards
 
Old 06-29-2019, 08:56 AM   #5
newbie14 (Member, Original Poster)
Hi Bathory,
I am a bit confused here. So you say just add this
Code:
User-agent: *
Disallow: /
into robots.txt. So you don't suggest implementing this:

Code:
location /robots.txt {
       deny all;
       return 403;
       }
Is that what you are suggesting? And how do I block the bad bots, any idea? When you say watch my logs: I have added logwatch, but are there any other tools I should look at to view and analyse my logs? I have implemented fail2ban for ssh; are there any specific jails for nginx? In addition, I have also configured ModSecurity with my nginx too.
 
Old 06-29-2019, 11:05 AM   #6
bathory (LQ Guru)
Quote:
Originally Posted by newbie14 View Post
Hi Bathory,
I am a bit confused here. So you say just add this
Code:
User-agent: *
Disallow: /
into robots.txt. So you don't suggest implementing this:

Code:
location /robots.txt {
       deny all;
       return 403;
       }
Is that what you are suggesting? And how do I block the bad bots, any idea? When you say watch my logs: I have added logwatch, but are there any other tools I should look at to view and analyse my logs? I have implemented fail2ban for ssh; are there any specific jails for nginx? In addition, I have also configured ModSecurity with my nginx too.
Yes, I think that legitimate bots will honor robots.txt and will not index your site, while bad bots will not.
You can adapt this .htaccess (for Apache) to increase your list of bad bots.

Re. fail2ban, there are 2 or 3 nginx filters you can activate, in addition to the security measures you've already taken.

Regards
 
Old 06-29-2019, 12:41 PM   #7
newbie14 (Member, Original Poster)
Hi Bathory,
How do you block the bad bots then? Can this help?

Code:
location /robots.txt {
       deny all;
       return 403;
       }
I am not on Apache, I am on nginx, so how can I adapt the bad bots list to improve my security? I did a Google search on nginx filters; is this what you are referring to, e.g. https://alonganon.info/2016/06/22/ng...ters-fail2ban/ ?
 
Old 06-30-2019, 02:38 AM   #8
ondoho (LQ Addict)
You can't block the bad bots through robots.txt; honoring it is not compulsory.
Anything that is visible to your clients, bots can also crawl if they want to.
You need to take other measures to block them, based on IP, User-Agent, etc. I think fail2ban is a good start.
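(As an illustration of the IP-based side, nginx itself can refuse individual addresses through its access module; a minimal sketch using the addresses from the logwatch output above:)
Code:
# inside the relevant server { } or location { } block
deny  95.213.177.125;   # mod_proxy probe seen in logwatch
deny  112.66.70.223;    # "sites probed the server" entry
allow all;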
 
Old 06-30-2019, 02:41 AM   #9
bathory (LQ Guru)
Quote:
How do you block the bad bots then? Can this help?

location /robots.txt {
deny all;
return 403;
}
No! This just forbids access to robots.txt.
Once again, I suggest you use a robots.txt like the one posted above, which disallows indexing of / for every bot.


Quote:
I am not on Apache, I am on nginx, so how can I adapt the bad bots list to improve my security?
You can add those bot names to the $blockedagent map you're already using.
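(Extra entries just follow the same pattern as the map you already have; the names below are only illustrative placeholders, take the real ones from whichever bad-bot list you adapt:)
Code:
# appended inside the existing "map $http_user_agent $blockedagent" block
~*masscan          1;   # example: match "masscan" anywhere in the User-Agent
~*examplebadbot    1;   # placeholder name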


Quote:
I did a Google search on nginx filters; is this what you are referring to, e.g. https://alonganon.info/2016/06/22/ng...ters-fail2ban/ ?
Yes, these are the filters that come with a stock fail2ban installation.
You may also try this

As a last note, keep in mind that a User-Agent can easily be spoofed, and don't overdo it with countermeasures or your server will slow down.
 
Old 06-30-2019, 10:21 PM   #10
newbie14 (Member, Original Poster)
Hi Ondoho,
How do I deal with the user agent and fail2ban, any good links to follow? Is there any good tool to analyse the logs and react?

Quote:
Originally Posted by ondoho View Post
You can't block the bad bots through robots.txt; honoring it is not compulsory.
Anything that is visible to your clients, bots can also crawl if they want to.
You need to take other measures to block them, based on IP, User-Agent, etc. I think fail2ban is a good start.
 
Old 06-30-2019, 10:23 PM   #11
newbie14 (Member, Original Poster)
Hi Bathory,
OK, on the bots I will go with your suggestion. What do you mean by overdo it? I don't quite get you. I notice that for fail2ban many links suggest different solutions. So in your opinion, what is the most hardened mechanism?
 
Old 07-01-2019, 03:54 AM   #12
bathory (LQ Guru)
Quote:
Originally Posted by newbie14 View Post
Hi Bathory,
OK, on the bots I will go with your suggestion. What do you mean by overdo it? I don't quite get you. I notice that for fail2ban many links suggest different solutions. So in your opinion, what is the most hardened mechanism?
As I told you, the User-Agent is very easily spoofed, so if someone wants to harvest your site, they can easily do it. So there is no reason to add hundreds of bot names/IPs to lists for fail2ban to block.

IMO, besides fail2ban, you should watch the logs for suspicious activity and act accordingly. There are many resources about hardening a web server, and nginx in particular, so use your favourite search engine and start reading.
Also make sure that your server is always updated with the latest security fixes.

Regards
 
Old 07-01-2019, 12:54 PM   #13
newbie14 (Member, Original Poster)
Hi Bathory,
OK, I will follow your suggestion on the bot names; true, because people could create new bot names and just try to get in anyway. I am trying to find a good tool to watch the logs for suspicious activity, e.g. one that could alert. Secondly, in today's logwatch I notice this:

Code:
Connection attempts using mod_proxy:
    95.213.177.126 -> check.proxyradar.com:80: 1 Time(s)
even though I have implemented this:

Code:
if ($request ~* ^[A-Z]+\ http ) {
    return 404;
}
I also saw this:

Code:
 403 Forbidden
       /: 25 Time(s)
       /robots.txt: 1 Time(s)
and also this:

Code:
A total of 1 ROBOTS were logged 
    - 1 Time(s)
 
Old 08-05-2019, 04:06 AM   #14
ondoho (LQ Addict)
Quote:
Originally Posted by newbie14 View Post
Hi Ondoho,
How do I deal with the user agent and fail2ban, any good links to follow? Is there any good tool to analyse the logs and react?
I guess fail2ban is a good starting point.
https://wiki.archlinux.org/index.php/Fail2ban
 
  

