LinuxQuestions.org
11-14-2008, 11:07 AM   #1
haydenyoung
LQ Newbie
 
Registered: Nov 2004
Location: Perth, WA, Australia
Distribution: Ubuntu Dapper Drake, CentOS4
Posts: 13

Rep: Reputation: 0
Blocking Web crawlers, bots, spiders, proxies, etc from private site areas


Hi

I have a web site that runs a number of different web applications: Joomla, Bugzilla, FireStats, Nagios, etc.

While I'm happy for my public site's content to be spidered and cached, I would prefer that certain apps such as Bugzilla not be. I would like them to be accessible via the web for employees, customers, etc., but not publicly advertised or searchable.

Is this something I should be worrying about, and, if so, how do I reduce the ability of, say, spiders to crawl this content?

Any help much appreciated.
 
11-15-2008, 04:23 AM   #2
acid_kewpie
Moderator
 
Registered: Jun 2001
Location: UK
Distribution: Gentoo, RHEL, Fedora, Centos
Posts: 43,417

Rep: Reputation: 1985
formally you would use the robots.txt file.
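
As a rough illustration of what such a file could look like and how a well-behaved crawler would read it, here is a minimal sketch using Python's standard `urllib.robotparser`. The paths `/bugzilla/`, `/nagios/` and `/firestats/` and the agent name `ExampleBot` are assumptions for the example, not the poster's real layout.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are assumed, not taken from the real site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /bugzilla/
Disallow: /nagios/
Disallow: /firestats/
"""


def is_allowed(path, user_agent="ExampleBot"):
    """Return True if a crawler that honours robots.txt may fetch `path`."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    print(is_allowed("/index.html"))
    print(is_allowed("/bugzilla/show_bug.cgi?id=1"))
```

With these rules the public page is allowed and the Bugzilla URL is not. This only describes what a compliant crawler will do; nothing in robots.txt enforces it.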
 
11-15-2008, 05:36 AM   #3
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3600
Quote:
Originally Posted by haydenyoung View Post
Is this something I should be worrying about
If you find that people shouldn't have unrestricted access to some information, for whatever reason, then the answer is "yes".


Quote:
Originally Posted by acid_kewpie View Post
formally you would use the robots.txt file.
IMHO a robots.txt should be one part of a set of measures such as a DMZ, a firewall, access restrictions and authentication configured per application or in the web server, HTTPS, reverse proxies, tunneling and whatnot. The best way to choose what to implement is to look at what your info is worth.
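
To make the "access restrictions" part a bit more concrete, here is a small sketch of the kind of check a web server or application layer might apply: private paths are only served to clients on trusted networks. It is an illustration only. The network ranges and path prefixes are assumed values, and in practice you would normally express this in the web server's own configuration and combine it with authentication.

```python
import ipaddress

# Hypothetical values for illustration; substitute your own networks and paths.
TRUSTED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("10.0.0.0/8"),
]
PRIVATE_PREFIXES = ("/bugzilla/", "/nagios/")


def is_request_allowed(client_ip, path):
    """Allow public paths for everyone, private paths only for trusted networks."""
    if not path.startswith(PRIVATE_PREFIXES):
        return True
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in TRUSTED_NETWORKS)
```

Unlike robots.txt, a check like this does not rely on the client behaving well, which is why it belongs alongside it rather than instead of it.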
 
11-15-2008, 11:03 AM   #4
haydenyoung
LQ Newbie
 
Registered: Nov 2004
Location: Perth, WA, Australia
Distribution: Ubuntu Dapper Drake, CentOS4
Posts: 13

Original Poster
Rep: Reputation: 0
Hi

Thanks for your replies.

First of all I should say that password protection should stop robots from accessing valuable data, but I guess I want to be proactive and don't want web searches turning up links to my private app area.

Next, I have configured a robots.txt and expect those robots that respect it to not index the parts of my site that are off limits. However, I'm assuming that not every bot is a good bot (like Google, Yahoo, curl, etc.) and that there will be bots that either a) ignore my robots.txt, or worse, b) use robots.txt to carry out malicious attacks, for example by reading it as a list of paths worth probing.
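
One common way to spot both kinds of bad bot is a trap path: a URL that is disallowed in robots.txt and never linked for humans, so any client requesting it has either ignored the file or used it as a map. The sketch below is not something described earlier in this thread. It assumes a hypothetical trap prefix `/bot-trap/` and that the access log has already been reduced to `(client_ip, path)` pairs.

```python
# Hypothetical trap prefix; it would also be listed under Disallow in robots.txt.
TRAP_PREFIX = "/bot-trap/"


def find_trap_hits(requests):
    """Return the sorted, de-duplicated client IPs that requested the trap path.

    `requests` is an iterable of (client_ip, path) tuples, e.g. extracted
    from a web server access log.
    """
    return sorted({ip for ip, path in requests if path.startswith(TRAP_PREFIX)})


if __name__ == "__main__":
    sample = [
        ("198.51.100.4", "/index.html"),
        ("203.0.113.9", "/bot-trap/secret.html"),
        ("203.0.113.9", "/bugzilla/"),
        ("198.51.100.77", "/bot-trap/"),
    ]
    for ip in find_trap_hits(sample):
        print(ip)
```

The resulting addresses could then be fed into a firewall or web server deny list, but that step is deliberately left out here because it depends on the setup.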
 