LinuxQuestions.org
Old 09-02-2006, 01:17 PM   #1
oadvantage
LQ Newbie
 
Registered: Mar 2006
Posts: 12

Rep: Reputation: 0
Spiders Cause Load to run over 200!!!


Hi,

I have a couple of dedicated servers, and the load is well over 200 at times, which makes my websites non-functional.

This problem is due to too many spiders visiting sites.

I only have about 133 sites on one and 233 on the other. Is there an easy way to take care of this?

We blocked Google temporarily, but the sites are still getting attacked. What can I do to block all of the spiders completely, or perhaps allow only Yahoo and Google? Or should I just block all of them until I get this resolved?

Thank you for your help.

W
 
Old 09-02-2006, 02:55 PM   #2
btmiller
Senior Member
 
Registered: May 2004
Location: In the DC 'burbs
Distribution: Arch, Scientific Linux, Debian, Ubuntu
Posts: 4,159

Rep: Reputation: 328
I'd suggest just using robots.txt to make the spiders go away, and using mod_rewrite or similar to block any spiders that do not respect robots.txt (all the "big ones" should respect it).
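
A minimal sketch of the "allow only Google and Yahoo" policy asked about in the first post, checked with Python's standard `urllib.robotparser`. The robots.txt text and the `ExampleBot` name are illustrative assumptions, not taken from the poster's servers; `Slurp` is the token used by Yahoo's crawler.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: Googlebot and Yahoo's Slurp may crawl everything,
# every other crawler is asked to stay away from the whole site.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: Slurp
Disallow:

User-agent: *
Disallow: /
"""


def is_allowed(user_agent, path="/"):
    """Return True if the assumed robots.txt lets user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # "ExampleBot" is a hypothetical crawler name used only for illustration.
    for agent in ("Googlebot", "Slurp", "ExampleBot"):
        print(agent, is_allowed(agent))
```

Keep in mind that robots.txt is only a request. Crawlers that ignore it would still have to be blocked at the web server.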

How many spiders are visiting at a time? If your server hardware can't handle several concurrent visits, maybe it's time to think about upgrading?
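
One rough way to answer the "how many spiders" question is to count requests per crawler in the web server's access log. The sketch below assumes the Apache combined log format (user agent as the last quoted field); the list of crawler tokens and the sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Substrings identifying a few well-known crawlers (assumed list; extend as needed).
CRAWLER_TOKENS = ("googlebot", "slurp", "msnbot")

# In the combined log format the user agent is the last double-quoted field.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def count_crawlers(lines):
    """Count requests per crawler token; all other agents go under 'other'."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if not match:
            continue  # skip lines that do not end in a quoted field
        agent = match.group(1).lower()
        label = next((t for t in CRAWLER_TOKENS if t in agent), "other")
        counts[label] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample entries; in practice, iterate over the real log file.
    sample = [
        '192.0.2.1 - - [02/Sep/2006:13:17:00 -0500] "GET / HTTP/1.1" 200 512 '
        '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
        '192.0.2.2 - - [02/Sep/2006:13:17:01 -0500] "GET /a HTTP/1.1" 200 512 '
        '"-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"',
        '192.0.2.3 - - [02/Sep/2006:13:17:02 -0500] "GET /b HTTP/1.1" 200 512 '
        '"-" "Mozilla/5.0 (X11; Linux i686) Firefox/1.5"',
    ]
    for label, hits in count_crawlers(sample).most_common():
        print(label, hits)
```

If crawlers turn out to be a small share of the requests, the high load probably has another cause.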
 
Old 09-03-2006, 01:44 AM   #3
oadvantage
LQ Newbie
 
Registered: Mar 2006
Posts: 12

Original Poster
Rep: Reputation: 0
Googlebot is killing the server right now. I didn't have a robots.txt. I guess the challenge is striking a balance between allowing spiders to index the sites and not crashing the server.

I am pinging about 50 different servers with WordPress, and there are only about 133 websites on the dedicated server. I never had this problem before.
 
Old 09-07-2006, 12:42 AM   #4
Matir
Moderator
 
Registered: Nov 2004
Location: San Jose, CA
Distribution: Ubuntu
Posts: 8,507

Rep: Reputation: 118
Google's bots have been designed not to create huge amounts of load on servers (they have delays inserted). What is your load without the googlebot hitting you?
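
To compare the load with and without crawlers, it may help to read the load average as a per-CPU figure. This is a small sketch using Python's standard `os.getloadavg()` (Unix only); the helper name `load_per_cpu` is hypothetical. On Linux the load average also counts tasks waiting on disk I/O, so a high figure does not necessarily mean the CPUs are busy.

```python
import os


def load_per_cpu(load_average, cpu_count):
    """Return the load average divided by the number of CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


if __name__ == "__main__":
    one_min, five_min, fifteen_min = os.getloadavg()  # Unix only
    cpus = os.cpu_count() or 1  # cpu_count() may return None
    print("1-minute load per CPU:", load_per_cpu(one_min, cpus))
    print("15-minute load per CPU:", load_per_cpu(fifteen_min, cpus))
```

Sampling this once with the spiders blocked and once without would show whether they are really the source of the load.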
 
Old 09-08-2006, 09:08 AM   #5
oadvantage
LQ Newbie
 
Registered: Mar 2006
Posts: 12

Original Poster
Rep: Reputation: 0
I think you're right. For some reason the load is very, very high.

I only have 233 websites with a small PHP script. I have some of these on another server, and the load there is around 0.01 almost all the time.

I have set up a robots.txt, but after looking I don't think it is the spiders.

I know it is not memory-related or Yahoo-related. I am perplexed.

Thank you for your help!
 
  


All times are GMT -5. The time now is 02:32 PM.
