System Suddenly Overloading...please help!
I have a webserver and just this morning it seems to have started to overload.
As of this morning, my server has been running at an abnormally high load due to what seems to be a whole lot of httpd connections. However, we are receiving no more traffic to our websites than normal, so I am sure it is not related to our website traffic. Also, I have not changed any settings, nor have I edited any of our website scripts, so none of these things could be causing this. We normally run at under 1% server load.

When I examine our Apache access log, I see that there are thousands of "GET /whm-server-status HTTP/1.0" 200 messages. Could this be the culprit?

Also, the /var/log/secure file has thousands of lines like this:

Jan 21 09:30:12 ns3 xinetd[2374]: START: imap pid=4648 from=127.0.0.1
Jan 21 09:38:35 ns3 xinetd[2374]: START: imap pid=6216 from=127.0.0.1
Jan 21 09:46:58 ns3 xinetd[2374]: START: imap pid=9846 from=127.0.0.1
Jan 21 09:55:21 ns3 xinetd[2374]: START: imap pid=11350 from=127.0.0.1

Here is the output from a top I just ran:

18:50:56 up 56 min, 2 users, load average: 127.47, 106.16, 66.71
913 processes: 910 sleeping, 2 running, 1 zombie, 0 stopped
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
           total   13.1%    0.0%   10.4%   0.0%     0.4%   74.2%    1.6%
           cpu00   16.9%    0.0%    6.0%   0.0%     0.8%   76.0%    0.0%
           cpu01   13.0%    0.0%   10.1%   0.0%     0.0%   72.0%    4.8%
           cpu02   10.7%    0.0%    8.9%   0.1%     0.7%   79.4%    0.0%
           cpu03   11.9%    0.0%   16.7%   0.0%     0.0%   69.6%    1.6%
Mem:  2074544k av, 1982340k used, 92204k free, 0k shrd, 25936k buff
      1612984k active, 226180k inactive
Swap: 2097136k av, 878380k used, 1218756k free, 332448k cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
17569 root      22   0  5888 1872  4888 R     2.6  0.0   0:04   3 top
 3568 root       0 -20     0    0     0 DW<   0.4  0.0   1:54   2 loop0
17060 root      15   0     0    0     0 SW    0.4  0.0   0:00   2 pdflush
 6877 nobody    16   0 17448 8280 15068 D     0.2  0.3   0:00   0 httpd
 6923 nobody    16   0 17464 8184 15068 D     0.2  0.3   0:01   0 httpd
 7728 nobody    16   0 17852 8644 15068 D     0.2  0.4   0:02   0 httpd
 9116 nobody    16   0 17448 8252 15068 D     0.2  0.3   0:00   0 httpd
17560 root      19   0  7452 3872  6180 S     0.2  0.1   0:00   2 exim
17572 root      20   0  7436 3880  6180 S     0.2  0.1   0:00   0 exim
 2966 nobody    16   0 17504 8316 15068 S     0.1  0.4   0:01   3 httpd
 6492 nobody    16   0 18148 8784 15068 S     0.1  0.4   0:01   2 httpd
 6593 nobody    15   0 17448 8284 15068 R     0.1  0.3   0:03   3 httpd
 6601 nobody    16   0 18148 8860 15068 D     0.1  0.4   0:00   3 httpd
 6687 nobody    15   0 17448 8252 15068 S     0.1  0.3   0:01   0 httpd
 6935 nobody    15   0 17472 8260 15068 S     0.1  0.3   0:02   0 httpd
 7192 nobody    16   0 18240 8828 15068 D     0.1  0.4   0:03   0 httpd
 7197 nobody    16   0 18068 8840 15068 D     0.1  0.4   0:03   0 httpd
 7223 nobody    16   0 17472 8308 15068 D     0.1  0.4   0:02   0 httpd
 7236 nobody    16   0 17908 8656 15068 D     0.1  0.4   0:01   3 httpd
 7481 nobody    16   0 18040 8768 15068 D     0.1  0.4   0:01   1 httpd
 7549 nobody    15   0 18852 8808 15068 D     0.1  0.4   0:01   0 httpd
 7631 nobody    17   0 17792 8600 15068 D     0.1  0.4   0:01   0 httpd
 7636 nobody    17   0 18052 8808 15068 D     0.1  0.4   0:02   0 httpd
 7701 nobody    16   0 17980 8756 15068 D     0.1  0.4   0:04   0 httpd
 7726 nobody    16   0 17856 8628 15068 S     0.1  0.4   0:02   1 httpd
 9880 nobody    16   0 17956 8696 15068 D     0.1  0.4   0:02   1 httpd
17586 root      25   0  7568 3908  6180 S     0.1  0.1   0:00   1 exim
    1 root      16   0  1508  456  1356 S     0.0  0.0   0:12   3 init
    2 root      RT   0     0    0     0 SW    0.0  0.0   0:00   0 migration/0
    3 root      34  19     0    0     0 SWN   0.0  0.0   0:00   0 ksoftirqd/0
    4 root      RT   0     0    0     0 SW    0.0  0.0   0:00   1 migration/1
    5 root      34  19     0    0     0 SWN   0.0  0.0   0:00   1 ksoftirqd/1
    6 root      RT   0     0    0     0 SW    0.0  0.0   0:00   2 migration/2

I am stumped and don't know what else to do. I've rebooted a couple of times and restarted the Apache server, the mysqld server, the exim server, etc. I really need to get this fixed, as there are a few high-traffic websites on this server that I need to have up.

Pleasssseeee help guys!

-farmerjoe |
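One way to summarize a listing like the one above is to count processes by their STAT letter. In top, D means uninterruptible sleep, which usually points at processes waiting on disk I/O rather than burning CPU, and most of the httpd rows above are in that state. The sketch below is only an illustration: `ROWS` is a handful of rows copied from the listing, and `state_counts` is a made-up helper name, not part of any tool.

```python
from collections import Counter

# A few rows copied from the top listing above (13 whitespace-separated fields each).
ROWS = """\
17569 root 22 0 5888 1872 4888 R 2.6 0.0 0:04 3 top
3568 root 0 -20 0 0 0 DW< 0.4 0.0 1:54 2 loop0
6877 nobody 16 0 17448 8280 15068 D 0.2 0.3 0:00 0 httpd
6923 nobody 16 0 17464 8184 15068 D 0.2 0.3 0:01 0 httpd
17560 root 19 0 7452 3872 6180 S 0.2 0.1 0:00 2 exim
2966 nobody 16 0 17504 8316 15068 S 0.1 0.4 0:01 3 httpd
6593 nobody 15 0 17448 8284 15068 R 0.1 0.3 0:03 3 httpd
7192 nobody 16 0 18240 8828 15068 D 0.1 0.4 0:03 0 httpd
"""

def state_counts(rows, command=None):
    """Count rows by the first letter of STAT, optionally for one command only."""
    counts = Counter()
    for line in rows.splitlines():
        fields = line.split()
        if len(fields) != 13:
            continue  # skip anything that is not a full process row
        if command is not None and fields[12] != command:
            continue
        counts[fields[7][0]] += 1  # first STAT letter: R, S, D, Z, ...
    return counts

print(state_counts(ROWS, command="httpd"))
```

A large share of D-state rows together with a high iowait figure is worth checking against the swap usage reported in the same output.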
Do you run anything that should check /whm-server-status?
Do the /whm-server-status queries come from one or multiple IP addresses?
What happens if you block access to /whm-server-status?
Is the box swamped in outbound traffic? To remote port TCP/25 by any chance?
Did you run a chkrootkit / Rootkit Hunter check on the system (just in case)?
Otherwise, have you checked for "weird" stuff in the system and daemon logs?
Have you also checked for "weird" stuff in /tmp, /var/tmp and any other temp dirs that apps are allowed to write to? |
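The first question above (one source address or many) can be answered by tallying the access log. What follows is a minimal sketch, assuming the log is in Apache's common log format with the client address as the first field. The sample lines, their timestamps and the 192.0.2.10 address are invented for illustration, and `count_clients` is a hypothetical helper name.

```python
from collections import Counter

def count_clients(log_lines, path="/whm-server-status"):
    """Count requests for `path` per client address."""
    counts = Counter()
    for line in log_lines:
        parts = line.split('"')
        if len(parts) < 2:
            continue  # no quoted request line on this row
        request = parts[1].split()  # e.g. ['GET', '/whm-server-status', 'HTTP/1.0']
        if len(request) >= 2 and request[1] == path:
            counts[line.split()[0]] += 1  # first field is the client address
    return counts

# Invented sample lines, for illustration only.
sample = [
    '127.0.0.1 - - [21/Jan/2000:09:30:12 -0500] "GET /whm-server-status HTTP/1.0" 200 512',
    '127.0.0.1 - - [21/Jan/2000:09:30:13 -0500] "GET /whm-server-status HTTP/1.0" 200 512',
    '192.0.2.10 - - [21/Jan/2000:09:30:14 -0500] "GET /index.html HTTP/1.1" 200 2048',
]

for address, hits in count_clients(sample).most_common():
    print(address, hits)
```

To run it against a real log, pass an open file object instead of `sample`; the log path depends on how Apache was installed.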
What does our old friend netstat show?
I'm not an Internet Service Provider guru, but I think I would:
1) Limit the exposure of all services, such as IMAP, SMTP, etc., to the WAN, either by disabling them or with a firewall.
2) Take several netstat snapshots, then sort and diff them to see if you can spot an incoming source.
3) Maybe use ethereal for a similar purpose.
4) Try the rate limiting features in the Linux firewall. |
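The snapshot comparison in step 2 can be sketched as follows. This assumes each snapshot is the saved text of `netstat -tn`, where data rows have the form `tcp recv-q send-q local:port remote:port STATE`. The two snapshots below are invented, and `remote_counts` and `growth` are hypothetical helper names.

```python
from collections import Counter

def remote_counts(netstat_text):
    """Count TCP connections per remote address in one snapshot."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            # Strip the port from "address:port" to get the remote address.
            counts[fields[4].rsplit(":", 1)[0]] += 1
    return counts

def growth(earlier, later):
    """Remote addresses whose connection count rose, largest increase first."""
    # Counter subtraction keeps only the positive differences.
    return (remote_counts(later) - remote_counts(earlier)).most_common()

# Invented snapshots, for illustration only.
before = """\
tcp 0 0 192.0.2.1:80 198.51.100.7:51234 ESTABLISHED
tcp 0 0 192.0.2.1:80 203.0.113.9:40000 ESTABLISHED
"""
after = before + """\
tcp 0 0 192.0.2.1:443 203.0.113.9:40001 ESTABLISHED
tcp 0 0 192.0.2.1:443 203.0.113.9:40002 ESTABLISHED
tcp 0 0 192.0.2.1:443 203.0.113.9:40003 TIME_WAIT
"""

print(growth(before, after))
```

An address that keeps climbing across several snapshots is a candidate for a firewall rule or rate limit.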
The WHM check is a local check. It's something cPanel does to make sure it's still working.
I think I got this problem solved....or at least temporarily solved. Most of the HTTPD requests seemed to be coming from SSL connections. I started apache without SSL and that seemed to immediately solve the load problem. We have no need for SSL, so this is a temporary solution at least. Thanks so much for your tips. I'll update this thread if the problems resume today. |
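For anyone wanting to confirm the same thing on their own box, one rough check is to tally connections by local port. This is a sketch only: it assumes saved `netstat -tn` output and that SSL is on the default port 443 with plain HTTP on port 80, which may not match every setup. The snapshot is invented and `local_port_counts` is a hypothetical helper name.

```python
from collections import Counter

def local_port_counts(netstat_text):
    """Count TCP connections per local port in saved `netstat -tn` output."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            # Local address is "address:port"; take what follows the last colon.
            _, sep, port = fields[3].rpartition(":")
            if sep:
                counts[port] += 1
    return counts

# Invented snapshot, for illustration only.
snapshot = """\
tcp 0 0 192.0.2.1:443 203.0.113.9:40001 ESTABLISHED
tcp 0 0 192.0.2.1:443 203.0.113.9:40002 ESTABLISHED
tcp 0 0 192.0.2.1:443 198.51.100.7:51000 ESTABLISHED
tcp 0 0 192.0.2.1:80 198.51.100.7:51234 ESTABLISHED
"""

print(local_port_counts(snapshot).most_common())
```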