Old 09-02-2006, 12:17 PM   #1
oadvantage
LQ Newbie
 
Registered: Mar 2006
Posts: 12

Rep: Reputation: 0
Spiders Cause Load to run over 200!!!


Hi,

I have a couple of dedicated servers, and the load is well over 200 at times, which makes my websites non-functional.

This problem is due to too many spiders visiting the sites.

I only have about 133 sites on one and 233 on the other. Is there an easy way to take care of this?

We blocked Google temporarily, but the sites are still getting hammered. What can I do to block all of them completely, except maybe Yahoo, or just allow Yahoo and Google? Or block them all until I get this resolved.

Thank you for your help.

W
 
Old 09-02-2006, 01:55 PM   #2
btmiller
Senior Member
 
Registered: May 2004
Location: In the DC 'burbs
Distribution: Arch, Scientific Linux, Debian, Ubuntu
Posts: 4,290

Rep: Reputation: 378
I'd suggest just using robots.txt to make the spiders go away (and using mod_rewrite or similar to block the spiders that don't respect robots.txt; all the "big ones" should). Rough examples of both are below.

How many spiders are visiting at a time? If your server hardware can't handle several concurrent visits, maybe it's time to think about upgrading?
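
For example, here's a minimal robots.txt (dropped in each site's document root) that lets Google and Yahoo in but turns away every other compliant crawler. Slurp is Yahoo's crawler, and an empty Disallow means "allow everything":

Code:
# /robots.txt -- well-behaved crawlers fetch this before anything else
User-agent: Googlebot
Disallow:

User-agent: Slurp
Disallow:

# Everyone else: keep out
User-agent: *
Disallow: /

And a rough .htaccess sketch for the bots that ignore robots.txt ("BadBot" is just a placeholder; substitute the user-agent string you actually see in your logs):

Code:
# Requires mod_rewrite; answers matching bots with 403 Forbidden
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} BadBot [NC]
RewriteRule .* - [F,L]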
 
Old 09-03-2006, 12:44 AM   #3
oadvantage
LQ Newbie
 
Registered: Mar 2006
Posts: 12

Original Poster
Rep: Reputation: 0
Googlebot is killing the server right now. I didn't have any robots.txt. I guess the trick is striking a balance between allowing spiders to index the sites and not crashing the server.

I am pinging about 50 different servers with WordPress, and there are only about 133 websites on the dedicated server. I've never had this problem before.
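
From what I've read, a Crawl-delay line in robots.txt might give me that balance; apparently Yahoo's Slurp and MSNbot honor it, though Googlebot doesn't (Google's crawl rate has to be turned down through their webmaster tools instead). Something like:

Code:
# Non-standard directive: honored by Slurp and MSNbot, ignored by Googlebot
User-agent: Slurp
Crawl-delay: 30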
 
Old 09-06-2006, 11:42 PM   #4
Matir
LQ Guru
 
Registered: Nov 2004
Location: San Jose, CA
Distribution: Debian, Arch
Posts: 8,507

Rep: Reputation: 128
Google's bots have been designed not to create huge amounts of load on servers (they insert delays between requests). What is your load without Googlebot hitting you?
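
A quick way to answer that is to compare the load average against how often Googlebot is actually hitting you. Something like this, assuming a stock Apache log location (your path may differ, e.g. /var/log/httpd/ or per-vhost logs):

Code:
# Load averages for the last 1, 5, and 15 minutes
uptime

# Count Googlebot requests in the current access log (path is a guess)
grep -c Googlebot /var/log/apache2/access_log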
 
Old 09-08-2006, 08:08 AM   #5
oadvantage
LQ Newbie
 
Registered: Mar 2006
Posts: 12

Original Poster
Rep: Reputation: 0
I think you're right; for some reason it is very, very high.

I only have 233 websites running a small PHP script. I have some of the same sites on another server and its load is around 0.01 almost all the time.

I have set up a robots.txt, but after looking I don't think it is the spiders.

I know it is not memory-related or Yahoo-related; I am perplexed.
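
In case it helps, this is roughly what I've been using to hunt for the culprit:

Code:
# Processes sorted by CPU usage, worst offenders first
ps aux --sort=-%cpu | head -n 15

# Watch for I/O wait ('wa') and blocked processes ('b'), sampled every 5 seconds
vmstat 5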

Thank you for your help!
 
  

