Apache server getting overloaded by... what?
Hi. I need some help analyzing what it is that's overloading my web site.
I'm running LAMP on Fedora 12, on an AMD 64 processor. My web site is relatively low volume; a good day is over 250 visitors, and most days it's below 200. I can't see anything there that would overload even a small box like mine. Or so you would think.

Several times a day -- perhaps 5 or 6, it's hard to say because I'm not always there -- I get flooded with requests. I run "top" in the background all the time to watch it, and what I see is the load average going through the roof -- I've seen the 1-minute figure go over 50 in about 3 minutes -- but the CPU numbers stay reasonably low, which I interpret as the system being I/O bound. The "top" display will show at least 40 or 50 http sessions in flight, with PID numbers spanning around 150 numbers slightly out of sequence, suggesting that the requests hit in close proximity but not precisely at the same time.

These episodes can last up to 40 minutes before the system clears and the load average goes back down to something sane, although I've had instances where I get a flurry of activity that lasts maybe 5 minutes, and the load average goes no higher than about 20.

The httpd log does not show any particular pattern of clients hitting my server. The requests appear to be for historical pages from the blog (e.g. I notice requests for images from older pages, not the current front page).

My best guess is that what I'm watching is Google or some other web caching service scanning my site for caching purposes. But I don't know. Maybe I pissed off some aggressive hacker (it's a political site) and he/she/it has figured out a way to periodically cause me grief from masked sites.

I have two questions:

1) Does anybody recognize this pattern? Can you tell me what it is?
2) How can I streamline MySQL and Apache so these incidents don't cripple me for half an hour?

Thanks in advance for any help. This is really messing me up.

Phil Weingart
LAMP: you've specified the Linux, Apache, and MySQL parts, but not the P. Is it Perl or Python (or something else)? Are you using a CMS? Could you use a lighter web server (Nginx, or something)?

Do these things occur at regular times? Do they come from a small subset of web addresses, and if so, where are those?

What about caching? You may have some caching somewhere (internal to a CMS, or external), but it sounds as if this pattern is causing problems because it consists of accesses to older data which may have been evicted from the cache. Could you just let that data stay in the cache for longer before it reaches its best-before date?

And the storage subsystem is something simple, isn't it (not anything like NAS, which could itself vary in performance with network load)?
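The questions about timing and source addresses can be checked directly against the access log. Here is a minimal Python sketch, assuming the log is in Apache's common or combined format; the function name `tally` and the sample lines are made up for illustration, and the addresses are documentation placeholders, not anything from the real log.

```python
import re
from collections import Counter

# Start of an Apache common/combined log line: client address, two fields we
# ignore, then a timestamp like [12/Mar/2010:03:14:07 -0500].
LINE_RE = re.compile(
    r'^(\S+) \S+ \S+ \[(\d{2}/\w{3}/\d{4}):(\d{2}):\d{2}:\d{2} [^\]]+\]'
)


def tally(lines):
    """Count requests per client address and per (date, hour) bucket."""
    by_client = Counter()
    by_hour = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip anything that is not in the expected format
        client, day, hour = m.groups()
        by_client[client] += 1
        by_hour[(day, hour)] += 1
    return by_client, by_hour


# Made-up sample lines standing in for a real access log.
SAMPLE = [
    '192.0.2.10 - - [12/Mar/2010:03:14:07 -0500] "GET /2009/05/old-post HTTP/1.1" 200 5120',
    '192.0.2.10 - - [12/Mar/2010:03:14:09 -0500] "GET /images/old.png HTTP/1.1" 200 20480',
    '198.51.100.7 - - [12/Mar/2010:03:15:30 -0500] "GET /2008/11/older-post HTTP/1.1" 200 4096',
    '203.0.113.5 - - [12/Mar/2010:14:02:11 -0500] "GET / HTTP/1.1" 200 8192',
]

if __name__ == "__main__":
    clients, hours = tally(SAMPLE)
    print("Busiest clients:", clients.most_common(3))
    print("Busiest hours:", hours.most_common(3))
```

On a real system you would pass an open file instead of `SAMPLE`, for example `tally(open("/var/log/httpd/access_log"))` if that is where your log lives (the usual Fedora location, but an assumption here). A handful of addresses dominating the busy hours would point at a crawler or a single abusive client; a flat spread would not.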
I'm pretty new to running a web server, and I'm a bit shocked to discover that web crawlers can have such a dramatic impact on server performance. I was also surprised, when I scanned my access_log, by the sheer volume of web-crawler requests; I'm wondering what percentage of my blog readership statistics are the result of automated web-crawler accesses.

In answer to your questions, the "P" is PHP, and my disk controller is your run-of-the-mill IDE controller on a PC motherboard. This is not a professional operation; I have a web server in my living room serving my own political blog.
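For the "what percentage is crawlers" question, a rough estimate can be pulled from the user-agent field of the access log. This is only a sketch: it assumes Apache's combined log format (where the user agent is the last quoted field), and the substring list in `CRAWLER_HINTS` is a heuristic I made up for illustration, not a complete or authoritative list. Crawlers that send a browser-like user agent will be missed.

```python
import re

# In the combined format the user agent is the last double-quoted field.
UA_RE = re.compile(r'"([^"]*)"\s*$')

# Heuristic substrings (an assumption, not an exhaustive list of crawlers).
CRAWLER_HINTS = ("bot", "crawl", "spider", "slurp")


def crawler_share(lines):
    """Return the percentage of requests whose user agent looks like a crawler."""
    total = 0
    crawlers = 0
    for line in lines:
        m = UA_RE.search(line)
        if not m:
            continue  # no trailing quoted field, so no user agent to inspect
        total += 1
        agent = m.group(1).lower()
        if any(hint in agent for hint in CRAWLER_HINTS):
            crawlers += 1
    return 100.0 * crawlers / total if total else 0.0


# Made-up sample lines standing in for a real access log.
SAMPLE = [
    '192.0.2.10 - - [12/Mar/2010:03:14:07 -0500] "GET /2009/05/old-post HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '198.51.100.7 - - [12/Mar/2010:03:15:30 -0500] "GET /2008/11/older-post HTTP/1.1" 200 4096 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"',
    '203.0.113.5 - - [12/Mar/2010:14:02:11 -0500] "GET / HTTP/1.1" 200 8192 "-" "Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.8) Gecko/20100216 Firefox/3.5.8"',
    '203.0.113.9 - - [12/Mar/2010:14:05:42 -0500] "GET / HTTP/1.1" 200 8192 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"',
]

if __name__ == "__main__":
    print("Crawler share: %.1f%%" % crawler_share(SAMPLE))
```

Note that this counts requests, not visitors, so a single crawler pass over the archive will weigh heavily; that is arguably the right measure for server load, but not for readership.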