LinuxQuestions.org
Review your favorite Linux distribution.
Home Forums Tutorials Articles Register
Go Back   LinuxQuestions.org > Forums > Linux Forums > Linux - Server
User Name
Password
Linux - Server This forum is for the discussion of Linux Software used in a server related context.

Notices


Reply
  Search this Thread
Old 04-14-2011, 07:22 AM   #1
brajan
LQ Newbie
 
Registered: Apr 2011
Posts: 2

Rep: Reputation: 0
Apache timeouts / Server load spikes


Hi,
I am having a headache with finding out what is going on my server. It keeps getting random load spikes and apache becoming irresponsible. My system is build with nginx in front and apache in background. Clean reverse proxy with a backup upstream configured on other machine. This works fine until ... yeah this is the point which i can't figure out.

System: CentoOS (64bit) as a VMWare Virtual Machine
Kernel: 2.6.18-194.32.1.el5.centos.plus
File system: XFS
CPU 4x
RAM 6GB
Server version: Apache/2.2.3 (Server built: Aug 30 2010 12:28:40)
nginx version: nginx/0.8.53
PHP 5.1.6 (cli)

There is a webshop application written in PHP (mySQL is hosted on separate server) running on it but also i got mirror on other server as well. NGinx as default servers static files from localhost and using local apache to provide dynamic content. I had to create a mirror (via nginx upstream settings) in case of local apache timeing out ... which is where the problem is.
Most of the time everything works fine but there are these weird load spikes i can't explain. Load goes up to 15-25 and localhost apache stops responding.

I started to collect a lot of stats on this server but i still can't see where the problem is. Please take a look at some stats being dumped during such moments. Maybe you can see something i don't
We gonna user two time point where problem appeared. Take a look at 16:10 and then between 20:00 - 20:30
Here we go:

SAR
Linux 2.6.18-194.32.1.el5.centos.plus (********) 2011-04-13

00:00:01 proc/s
16:00:01 1,22
16:10:01 1,43
16:20:01 1,32
20:00:01 0,84
20:10:01 1,17
20:20:01 0,93
20:30:01 0,93


00:00:01 cswch/s
16:00:01 2383,38
16:10:01 2092,80
16:20:01 1763,54
16:30:01 1725,95
20:00:01 3081,22
20:10:01 2996,59
20:20:01 2602,10
20:30:01 3037,61


00:00:01 CPU %user %nice %system %iowait %steal %idle
16:00:01 all 21,04 0,00 8,20 0,01 0,00 70,75
16:00:01 0 15,99 0,00 4,73 0,01 0,00 79,27
16:00:01 1 16,10 0,00 5,09 0,01 0,00 78,80
16:00:01 2 16,33 0,00 4,89 0,02 0,00 78,76
16:00:01 3 35,74 0,00 18,07 0,00 0,00 46,18
16:10:01 all 18,02 0,00 7,42 0,01 0,00 74,55
16:10:01 0 14,50 0,00 4,52 0,01 0,00 80,97
16:10:01 1 13,82 0,00 4,68 0,00 0,00 81,50
16:10:01 2 13,85 0,00 4,44 0,01 0,00 81,70
16:10:01 3 29,93 0,00 16,04 0,01 0,00 54,02
16:20:01 all 14,81 0,00 6,89 0,05 0,00 78,26
16:20:01 0 11,05 0,00 3,85 0,01 0,00 85,08
16:20:01 1 10,76 0,00 4,09 0,01 0,00 85,15
16:20:01 2 10,88 0,00 4,00 0,15 0,00 84,97
16:20:01 3 26,53 0,00 15,62 0,01 0,00 57,84
20:00:01 all 26,65 0,00 8,32 0,01 0,00 65,02
20:00:01 0 20,57 0,00 4,47 0,01 0,00 74,95
20:00:01 1 20,57 0,00 4,68 0,00 0,00 74,75
20:00:01 2 20,24 0,00 4,54 0,01 0,00 75,20
20:00:01 3 45,21 0,00 19,59 0,00 0,00 35,19
20:10:01 all 28,65 0,00 10,63 0,10 0,00 60,63
20:10:01 0 23,69 0,00 6,88 0,06 0,00 69,37
20:10:01 1 23,26 0,00 7,20 0,03 0,00 69,51
20:10:01 2 23,18 0,00 7,10 0,29 0,00 69,43
20:10:01 3 44,40 0,00 21,28 0,01 0,00 34,30
20:20:01 all 23,68 0,00 8,17 0,17 0,00 67,98
20:20:01 0 17,53 0,00 4,61 0,06 0,00 77,80
20:20:01 1 17,52 0,00 4,62 0,08 0,00 77,79
20:20:01 2 17,62 0,00 4,98 0,52 0,00 76,88
20:20:01 3 42,05 0,00 18,48 0,02 0,00 39,45
20:30:01 all 27,21 0,00 8,79 0,01 0,00 64,00
20:30:01 0 21,45 0,00 5,23 0,01 0,00 73,32
20:30:01 1 21,24 0,00 5,37 0,00 0,00 73,39
20:30:01 2 21,42 0,00 5,09 0,02 0,00 73,48
20:30:01 3 44,71 0,00 19,47 0,00 0,00 35,82


00:00:01 INTR intr/s
16:00:01 sum 2897,75
16:10:01 sum 2654,63
16:20:01 sum 2522,15
20:00:01 sum 3218,31
20:10:01 sum 2776,08
20:20:01 sum 2743,67
20:30:01 sum 2855,21

00:00:01 tps rtps wtps bread/s bwrtn/s
16:00:01 17,04 0,15 16,89 7,68 460,67
16:10:01 16,71 0,08 16,63 6,45 482,71
16:20:01 21,43 1,73 19,70 84,90 614,03
16:30:01 12,14 0,06 12,08 3,10 259,70
20:00:01 15,56 0,13 15,43 12,12 346,32
20:10:01 25,74 3,71 22,03 638,55 431,66
20:20:01 51,41 31,67 19,74 3809,10 371,26
20:30:01 18,05 0,18 17,87 19,48 373,91

00:00:01 frmpg/s bufpg/s campg/s
16:00:01 -96,44 0,06 9,52
16:10:01 118,87 0,04 8,44
16:20:01 -280,55 0,06 13,78
16:30:01 220,07 0,06 7,08
20:00:01 -52,40 0,06 12,92
20:10:01 19,53 0,15 13,20
20:20:01 -9,89 0,23 10,71
20:30:01 -43,41 0,05 12,48

00:00:01 IFACE rxpck/s txpck/s rxbyt/s txbyt/s rxcmp/s txcmp/s rxmcst/s
16:00:01 lo 222,22 222,22 415134,50 415134,50 0,00 0,00 0,00
16:00:01 eth0 2009,81 2284,46 847467,03 1536557,96 0,00 0,00 0,00
16:00:01 sit0 0,00 0,00 0,00 0,00 0,00 0,00 0,00
16:10:01 lo 176,01 176,01 335443,42 335443,42 0,00 0,00 0,00
16:10:01 eth0 1753,58 1886,29 734979,73 1224188,08 0,00 0,00 0,00
16:10:01 sit0 0,00 0,00 0,00 0,00 0,00 0,00 0,00
16:20:01 lo 130,54 130,54 224751,39 224751,39 0,00 0,00 0,00
16:20:01 eth0 1472,17 1657,99 558715,35 1147156,41 0,00 0,00 0,00
16:20:01 sit0 0,00 0,00 0,00 0,00 0,00 0,00 0,00
16:30:01 lo 139,64 139,64 248102,76 248102,76 0,00 0,00 0,00
16:30:01 eth0 1422,17 1674,51 584657,81 1258328,53 0,00 0,00 0,00
16:30:01 sit0 0,00 0,00 0,00 0,00 0,00 0,00 0,00
20:00:01 lo 305,68 305,68 593854,92 593854,92 0,00 0,00 0,00
20:00:01 eth0 2790,26 3340,79 1199849,69 2452070,71 0,00 0,00 0,00
20:00:01 sit0 0,00 0,00 0,00 0,00 0,00 0,00 0,00
20:10:01 lo 310,87 310,87 613079,93 613079,93 0,00 0,00 0,00
20:10:01 eth0 2882,77 3485,29 1229454,51 2618356,31 0,00 0,00 0,00
20:10:01 sit0 0,00 0,00 0,00 0,00 0,00 0,00 0,00
20:20:01 lo 272,95 272,95 521194,48 521194,48 0,00 0,00 0,00
20:20:01 eth0 2309,35 2712,83 1001607,87 1951056,64 0,00 0,00 0,00
20:20:01 sit0 0,00 0,00 0,00 0,00 0,00 0,00 0,00
20:30:01 lo 295,53 295,53 584662,85 584662,85 0,00 0,00 0,00
20:30:01 eth0 2635,89 2986,98 1143017,97 2047382,43 0,00 0,00 0,00
20:30:01 sit0 0,00 0,00 0,00 0,00 0,00 0,00 0,00


00:00:01 pgpgin/s pgpgout/s fault/s majflt/s
16:00:01 1,85 103,68 1905,90 0,00
16:10:01 1,61 110,69 2236,18 0,00
16:20:01 21,23 143,92 2134,28 0,00
16:30:01 0,78 55,49 1334,28 0,00
20:00:01 3,03 74,03 516,50 0,00
20:10:01 159,37 88,73 911,49 0,01
20:20:01 952,27 75,20 487,53 0,00
20:30:01 4,87 78,11 540,69 0,00


00:00:01 kbmemfree kbmemused %memused kbbuffers kbcached kbswpfree kbswpused %swpused kbswpcad
16:00:01 2994036 3112756 50,97 39280 2187416 6094732 108 0,00 0
16:10:01 3278072 2828720 46,32 39380 2207588 6094732 108 0,00 0
16:20:01 2607052 3499740 57,31 39524 2240536 6094732 108 0,00 0
16:30:01 3133432 2973360 48,69 39656 2257472 6094732 108 0,00 0
20:00:01 2713056 3393736 55,57 42432 2712312 6094732 108 0,00 0
20:10:01 2755240 3351552 54,88 42756 2740824 6094732 108 0,00 0
20:20:01 2732140 3374652 55,26 43296 2765860 6094732 108 0,00 0
20:30:01 2628996 3477796 56,95 43408 2795520 6094732 108 0,00 0


00:00:01 dentunusd file-sz inode-sz super-sz %super-sz dquot-sz %dquot-sz rtsig-sz %rtsig-sz
16:00:01 287680 3060 137318 0 0,00 0 0,00 0 0,00
16:10:01 287819 2550 136591 0 0,00 0 0,00 0 0,00
16:20:01 288394 2040 137309 0 0,00 0 0,00 0 0,00
16:30:01 288605 2550 136745 0 0,00 0 0,00 0 0,00
20:00:01 297651 3570 137214 0 0,00 0 0,00 0 0,00
20:10:01 298491 3570 137689 0 0,00 0 0,00 0 0,00
20:20:01 299280 2550 137937 0 0,00 0 0,00 0 0,00
20:30:01 300222 3060 138322 0 0,00 0 0,00 0 0,00

00:00:01 totsck tcpsck udpsck rawsck ip-frag
16:00:01 988 820 6 0 0
16:10:01 662 544 6 0 0
16:20:01 920 738 6 0 0
20:00:01 897 793 6 0 0
20:10:01 892 778 6 0 0
20:20:01 807 700 6 0 0
20:30:01 1064 954 6 0 0

00:00:01 runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15
16:00:01 4 230 1,46 1,52 1,33
16:10:01 3 164 1,55 1,44 1,36
16:20:01 2 227 0,60 0,97 1,20
16:30:01 5 159 1,08 0,83 0,96
20:00:01 4 161 1,39 1,74 1,63
20:10:01 6 167 7,28 5,35 3,16
20:20:01 6 159 1,18 1,92 2,34
20:30:01 4 160 1,95 1,95 2,09

This is a load dump (my little script) output which is launched whenever load goes above 5 but also this time apache gone bad:
TOP 20 PROCESESS:
apache 15423 23.0 0.5 288004 31120 ? S 20:08 0:00 \_ /usr/sbin/httpd
apache 15419 7.8 0.5 289016 32156 ? R 20:08 0:00 \_ /usr/sbin/httpd
apache 15418 7.6 0.5 289040 32068 ? R 20:08 0:00 \_ /usr/sbin/httpd
apache 15396 7.1 0.5 289052 32124 ? S 20:07 0:01 \_ /usr/sbin/httpd
apache 15412 5.6 0.5 289040 32576 ? R 20:07 0:00 \_ /usr/sbin/httpd
apache 15395 5.5 0.5 289040 33068 ? S 20:07 0:01 \_ /usr/sbin/httpd
apache 15378 5.2 0.5 344212 33120 ? S 20:07 0:01 \_ /usr/sbin/httpd
apache 15375 5.2 0.5 289044 32684 ? S 20:07 0:01 \_ /usr/sbin/httpd
apache 15373 5.2 0.5 289052 33024 ? S 20:07 0:02 \_ /usr/sbin/httpd
apache 15391 5.1 0.5 344172 32232 ? S 20:07 0:01 \_ /usr/sbin/httpd
apache 14879 5.1 0.5 345208 34116 ? S 19:57 0:32 \_ /usr/sbin/httpd
apache 14837 5.1 0.6 349332 38084 ? S 19:57 0:32 \_ /usr/sbin/httpd
apache 15309 4.9 0.5 290076 33328 ? R 20:05 0:06 \_ /usr/sbin/httpd
apache 14881 4.9 0.5 349364 34900 ? S 19:57 0:31 \_ /usr/sbin/httpd
apache 14876 4.9 0.5 345208 34204 ? S 19:57 0:31 \_ /usr/sbin/httpd
apache 14878 4.8 0.5 345208 33432 ? S 19:57 0:30 \_ /usr/sbin/httpd
apache 14861 4.8 0.6 353460 38684 ? S 19:57 0:30 \_ /usr/sbin/httpd
apache 15145 4.7 0.5 345208 33920 ? S 20:02 0:14 \_ /usr/sbin/httpd
apache 14882 4.7 0.5 344184 33312 ? S 19:57 0:29 \_ /usr/sbin/httpd
apache 14880 4.7 0.6 349304 37768 ? R 19:57 0:30 \_ /usr/sbin/httpd

IOSTAT:
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.89 2.11 0.10 1.82 3.14 15.71 19.58 0.01 4.00 0.66 0.13
sda1 0.87 0.00 0.01 0.00 0.88 0.00 186.85 0.00 6.15 2.80 0.00
sda2 0.02 2.11 0.09 1.82 2.25 15.71 18.75 0.01 3.99 0.65 0.13
sdb 19.27 0.04 0.96 0.37 46.99 28.17 113.34 0.01 7.31 1.99 0.26
sdb1 19.27 0.04 0.96 0.37 46.99 28.17 113.35 0.01 7.31 1.99 0.26
dm-0 0.00 0.00 0.11 3.93 2.25 15.71 8.90 0.15 37.62 0.31 0.13
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 8.00 0.00 6.13 0.23 0.00

NGINX:
Active connections: 743
server accepts handled requests
1716567 1716567 20129839
Reading: 0 Writing: 65 Waiting: 678

APACHE:
Parent Server Generation: 10
Server uptime: 16 hours 7 minutes 24 seconds
Total accesses: 946844 - Total Traffic: 8.2 GB
CPU Usage: u1459.93 s334.5 cu0 cs0 - 3.09% CPU load
16.3 requests/sec - 147.4 kB/second - 9.0 kB/request
38 requests currently being processed, 0 idle workers
WWWWWWWWWWWWWWCWWWWWWWWWWWW.W..W..W.W..W..W.WW.W...W............
................................................................
W...............................................................
................................................................
............................................



There is no pattern i could find when it happens. It happens at the time where there is almost no ppl browsing products but also happens when site load is medium or high. It looks very random but where is the problem?
There are no cronjobs running during these times (backup is being run late at night). Why this darn Apache stops responding?
During these apache problems server and system remains stable and nginx is working fine using backup apache server on other machine.
Apache is currently processing 16.3 requests/sec but other times it works fine with much more req/s.

Last edited by brajan; 04-14-2011 at 07:35 AM.
 
Old 04-14-2011, 06:44 PM   #2
stickman
Senior Member
 
Registered: Sep 2002
Location: Nashville, TN
Posts: 1,552

Rep: Reputation: 53
10 minute intervals from sar probably won't help much. If it's a frequent problem, you might try decreasing that interval. When you see a problem, check your stats and start looking at the requests/errors in the logs. If it happens when almost no one it hitting the server, it should be pretty quick task. You might also try grabbing the long output of server status and a few seconds of top when the problem occurs. Look for resource intensive procs in your top output and cross reference the PID over to your server status data to get the request. Apachetop might also help.
 
Old 04-15-2011, 02:46 AM   #3
brajan
LQ Newbie
 
Registered: Apr 2011
Posts: 2

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by stickman View Post
10 minute intervals from sar probably won't help much. If it's a frequent problem, you might try decreasing that interval. When you see a problem, check your stats and start looking at the requests/errors in the logs. If it happens when almost no one it hitting the server, it should be pretty quick task. You might also try grabbing the long output of server status and a few seconds of top when the problem occurs. Look for resource intensive procs in your top output and cross reference the PID over to your server status data to get the request. Apachetop might also help.
SAR interval ... ok i will change that.
Apache status, i do have full status logged. I just stripped it in my post to save some space. I was analysing these logs for almost 2-3 months but i still can't find anything. I also checked and debugged application code checking for memory and time consumption, IO activity and of course errors but nothing obvious comes up. It may work without any problems for a week or two handling 700-1000 sessions per second but it goes bad the other day having 500 sessions and doing same work all the time.
Is there any chance that it's some sort of some Apache mod or Apache it self?
 
Old 04-16-2011, 11:44 AM   #4
stickman
Senior Member
 
Registered: Sep 2002
Location: Nashville, TN
Posts: 1,552

Rep: Reputation: 53
It's possible that it could be Apache or a mod, but most likely you'll need to identify the specific requests that are being served when it locks up.
 
  


Reply



Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off



Similar Threads
Thread Thread Starter Forum Replies Last Post
apache load VERY high, disabling server lexthoonen Linux - Server 1 04-03-2011 02:58 AM
Php files don't load in my Apache Server adrianc.grigoras Linux - Software 2 03-09-2009 10:38 AM
Server load problems with Suse 9.3 and Apache 2.0.53 wessie Linux - Newbie 1 08-04-2006 05:14 AM
Server load code Apache+PHP mindfrost82 Linux - Software 1 01-03-2005 10:43 AM
CPU load spikes every ten seconds, causes problems beetlenaut Linux - Software 14 09-04-2004 07:37 AM

LinuxQuestions.org > Forums > Linux Forums > Linux - Server

All times are GMT -5. The time now is 04:02 PM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration