LinuxQuestions.org
Old 10-01-2012, 06:14 PM   #1
AchimRS
Member
 
Registered: Mar 2005
Location: Germany
Distribution: Debian Wheezy
Posts: 42

Rep: Reputation: 0
HELP: system load on server is very high


My system responds very slowly to Apache requests. As far as I can tell from top, it has something to do with high system CPU usage, but how can I analyze this further?

Here is the top output during a standard (small and simple) HTTP request. On another server this request completes within 2 s, but on the problematic one it takes minutes:

Code:
top - 01:06:10 up 3 days,  9:54,  3 users,  load average: 2.02, 1.89, 2.28
Tasks:  55 total,   2 running,  53 sleeping,   0 stopped,   0 zombie
Cpu(s): 13.6%us, 83.4%sy,  0.0%ni,  0.2%id,  2.5%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:     61540k total,    58832k used,     2708k free,        0k buffers
Swap:        0k total,        0k used,        0k free,    17632k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
28536 www-data  20   0 46956  11m 5128 R 12.6 19.2   0:18.61 apache2
31953 root      20   0  2820  752  664 S  8.7  1.2   0:00.42 wget
 9370 root      20   0  2536  692  476 R  6.4  1.1   2:32.23 top
 6412 root      20   0  8920  800  304 S  4.3  1.3   0:15.25 sshd
31951 root      20   0  1664  492  436 S  2.9  0.8   0:00.20 sh
The server is a tiny embedded single-core ARM, but this shouldn't matter. On some other servers with identical hardware, the same request runs fine.
The distribution is Debian Squeeze.

I have a gut feeling that the high system load is somehow caused by Apache, but I can't substantiate it. The Apache error.log shows nothing...
How can I find the process causing this?

Edit: The best explanation for high %sy in top I found is:
"having higher numbers here may indicate a problem with kernel configs, a driver issue, or any number of other things" here
But this doesn't help...

Thanks
Achim

Last edited by AchimRS; 10-01-2012 at 07:44 PM.
 
Old 10-02-2012, 04:02 AM   #2
henrycoffin
Member
 
Registered: Dec 2006
Distribution: RHEL Debian
Posts: 42

Rep: Reputation: 15
Is the load average consistently high?

Can you post the output of some other performance tools? vmstat, iostat, free etc
 
Old 10-02-2012, 04:24 AM   #3
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,119

Rep: Reputation: 4120
Quote:
Originally Posted by AchimRS
I have a gut feeling that the high system load is somehow caused by Apache, but I can't substantiate it. The Apache error.log shows nothing...
Unlikely. I'd be guessing a driver issue.
Very hard to track down - never looked at embedded. Have a look at /proc/interrupts for hints on what may be playing up.
vmstat might indicate abnormal context switches - also a clue that driver/interrupt is the problem.
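
A minimal sketch of the /proc/interrupts check, assuming Python is available on the box. The function names are made up for illustration. It reads the file twice and reports interrupts per second for each source, so a misbehaving driver stands out by rate rather than by its total since boot.

```python
import time


def parse_interrupts(text):
    """Return {"<irq> <name>": total count} from /proc/interrupts text."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    counts = {}
    for line in lines[1:]:
        irq, _, rest = line.partition(":")
        fields = rest.split()
        # The first ncpu fields are per-CPU counters; summary rows such as
        # "Err:" may carry fewer, so keep only the numeric ones.
        per_cpu = [int(f) for f in fields[:ncpu] if f.isdigit()]
        name = " ".join(fields[ncpu:]).lstrip("- ")
        counts[f"{irq.strip()} {name}".strip()] = sum(per_cpu)
    return counts


def interrupt_rates(before, after, seconds):
    """Interrupts per second for each source, busiest first."""
    rates = {k: (after[k] - before.get(k, 0)) / seconds for k in after}
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)


def sample(seconds=2.0, path="/proc/interrupts"):
    """Read the file twice, `seconds` apart, and return the rates."""
    with open(path) as f:
        before = parse_interrupts(f.read())
    time.sleep(seconds)
    with open(path) as f:
        after = parse_interrupts(f.read())
    return interrupt_rates(before, after, seconds)
```

Calling `sample()` once while the box is idle and once during a slow request, then comparing the top entries, should show whether one interrupt source accounts for the extra load.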
 
Old 10-03-2012, 04:46 PM   #4
AchimRS
Member
 
Registered: Mar 2005
Location: Germany
Distribution: Debian Wheezy
Posts: 42

Original Poster
Rep: Reputation: 0
Hi all,
the load is not consistently high; it alternates between 2 %sy and 90 %sy, and it seems to depend somehow on the Apache process.
After Apache is restarted, %sy stays very low for a while. After some hours of requests to Apache it climbs more and more with each request, so the responses become much slower, until the server is useless. A simple Apache restart then makes the system responsive again... so I guess it has something to do with PHP or Apache.

Thanks for the good hint with vmstat, here is some output:

The first output was taken after the Apache processes had been running for several days, during a single request that took about 100 s:
Code:
~>vmstat -a 2                                                      
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----      
 r  b   swpd   free  inact active   si   so    bi    bo   in   cs us sy id wa      
 2  1      0  11348  20692  19116    0    0     0     0   34   48 21 15 64  0      
 2  1      0  10484  20904  19748    0    0     0     0 2066 4051 24 75  0  2      
 2  1      0  11600  19856  19788    0    0     0     0 1025 1929 54 46  0  1      
 3  1      0  11420  20952  18720    0    0     0     0 2125 4429 19 80  0  0      
 3  1      0  12336  19192  19212    0    0     0     0 2692 5784 12 89  0  0      
 2  0      0  11924  20196  18936    0    0     0     0 3155 6638 13 88  0  0      
 2  2      0  12464  19136  19136    0    0     0     0 3489 7050 16 84  0  0      
 3  0      0  11412  20532  19040    0    0     0     0 3010 5988  8 89  0  3      
 3  2      0  11696  19636  19608    0    0     0     0 3328 6629 16 77  7  2      
 4  1      0  11536  19728  19472    0    0     0     0 2401 5039 11 89  0  0      
 4  1      0  10804  19908  20040    0    0     0     0 2479 5168 15 85  0  0      
 4  1      0   9112  20804  20668    0    0     0     0 2563 5601  8 92  0  0      
 6  1      0   8200  21236  21164    0    0     0     0 2589 5649 11 90  0  0      
 5  0      0   8512  22340  19872    0    0     0     0 2501 5506  7 92  0  2      
 3  1      0   8180  21696  20968    0    0     0     0 2665 5737 23 72  2  3      
 3  2      0   7980  21304  21304    0    0     0     0 2407 5054 14 85  0  0      
 3  3      0   9792  21340  19500    0    0     0     0 2674 5802 11 85  0  4      
 1  1      0   9660  21008  19968    0    0     0     0 2640 5655  9 89  0  3      
 2  1      0   8476  21100  20956    0    0     0     0 2764 5816  7 90  0  3      
 3  1      0   7660  21448  21524    0    0     0     0 2524 5506 13 85  0  3      
 2  1      0   7924  21192  21404    0    0     0     0 2461 5371 11 90  0  0      
 5  1      0   9284  20644  20776    0    0     0     0 2321 4805 22 75  0  2      
 2  1      0   9860  20492  20424    0    0     0     0 2787 5648 14 86  0  0      
 5  1      0  10984  19928  19844    0    0     0     0 2929 5948 18 83  0  0
 4  1      0  11184  19856  19832    0    0     0     0 2924 5830 15 82  0  3
 2  2      0  11356  19748  19528    0    0     0     0 2618 5492  8 88  0  4
 3  1      0  11780  19704  19424    0    0     0     0 3391 6886  5 89  0  6
 2  2      0  11620  19524  19540    0    0     0     0 4273 8572 10 87  0  3
 1  1      0  12184  19356  19096    0    0     0     0 3911 8299  8 89  0  4
 3  1      0  10060  21092  19384    0    0     0     0 2725 5856  2 96  0  2
 3  1      0   8344  22336  19932    0    0     0     0 2725 5785  6 91  0  4
 4  2      0   7652  22336  20652    0    0     0     0 2683 6072  8 91  0  1
 6  1      0   6668  22480  21480    0    0     0     0 2732 5345  8 91  0  0
 6  1      0   5960  22420  22180    0    0     0     0 2480 5363 13 87  0  0
 5  1      0   7332  21800  21500    0    0     0     0 2676 5764 11 89  0  0
 4  2      0   6548  21980  21944    0    0     0     0 2595 5609 11 88  0  2
 4  1      0   6792  21916  21708    0    0     0     0 2716 5954  4 93  0  3
 4  1      0   6624  22180  21908    0    0     0     0 2782 6115  4 95  0  2
 4  1      0   6060  22444  22188    0    0     0     0 2639 5541 13 87  0  0
 3  1      0   7584  22016  21152    0    0     0     0 2531 5328 15 86  0  0
 5  1      0   6756  21964  21820    0    0     0     0 2578 5334 14 86  0  0
 5  1      0   6072  22336  22260    0    0     0     0 2457 5209 13 87  0  0
 4  1      0   5136  22852  22732    0    0     0     0 2420 5178 15 85  0  0
 4  1      0  10992  19920  19872    0    0     0     0 2258 4723 17 84  0  0
 4  1      0   9828  20384  20392    0    0     0     0 1851 3851 32 67  0  1
 4  1      0  11388  19488  19820    0    0     0     0 2316 4901 18 83  0  0
 3  1      0  12276  19012  19440    0    0     0     0 2908 6043 16 81  0  3
 4  1      0  13388  18392  18948    0    0     0     0 2870 5809 14 86  0  1
 2  1      0  15400  17708  17748    0    0     0     0 3771 7527 11 89  0  0
 2  1      0  14328  18548  18060    0    0     0     0 2673 5669 10 90  0  1
 2  0      0  14692  18080  17944    0    0     0     0 2653 5407 18 82  0  0
The next one was taken immediately after a restart of Apache, with exactly the same request between lines 5 and 10. It took about 10 s, which is much better:
Code:
~> vmstat -a 2
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free  inact active   si   so    bi    bo   in   cs us sy id wa
 0  0      0  17648  18612  15128    0    0     0     0   39   58 21 15 64  0
 1  0      0  16660  18612  15972    0    0     0     0 1434 2755 40 19 41  0
 0  0      0  17600  18612  15136    0    0     0     0  745 1388 51 25 25  0
 0  0      0  17608  18612  15136    0    0     0     0  460  900  8  9 84  0
 1  0      0  11248  19736  20364    0    0     0     0 1679 3191 52 26 22  0
 2  0      0   4304  20768  26204    0    0     0     0 1904 3622 57 35  8  0
 2  0      0   2560  21572  27124    0    0     0     0 1656 3183 59 41  1  0
 2  0      0   1708  23952  25532    0    0     0     0  525  923 74 27  0  0
 2  0      0   3584  26312  21284    0    0     0     0 3812 7528 18 69 10  4
 0  0      0   2236  26124  22756    0    0     0     0  814 1504 49 22 29  0
 0  0      0   2248  26160  22792    0    0     0     0  837 1625  2  3 95  0
 6  1      0   3264  25412  22460    0    0     0     0 2057 3989 23 30 35 11
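
To compare the two runs by numbers rather than by eye, the vmstat samples can be averaged. This is only a minimal sketch, assuming Python is available. `summarize_vmstat` is a hypothetical helper name, and by default it skips the first row, because the first sample vmstat prints is the average since boot rather than a live measurement.

```python
def summarize_vmstat(text, skip_first=True):
    """Average the in, cs, us and sy columns of pasted `vmstat` output."""
    header = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if "cs" in fields and "sy" in fields:
            header = fields  # the column-name row
        elif (header and len(fields) == len(header)
              and all(f.isdigit() for f in fields)):
            rows.append(dict(zip(header, map(int, fields))))
    if skip_first:
        rows = rows[1:]  # first sample is the average since boot
    if not rows:
        raise ValueError("no vmstat samples found")
    return {col: sum(r[col] for r in rows) / len(rows)
            for col in ("in", "cs", "us", "sy")}
```

Feeding it the slow run and the fast run separately gives one average per column for each, which makes the difference in interrupts (in), context switches (cs) and system time (sy) easy to state.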
Now a memory analysis during a running request, which should also cover what the "free" tool would report:
Code:
~>vmstat -s
        61540 K total memory
        57356 K used memory
        21888 K active memory
        25180 K inactive memory
         4184 K free memory
            0 K buffer memory
        21096 K swap cache
            0 K total swap
            0 K used swap
            0 K free swap
      9706553 non-nice user cpu ticks
            0 nice user cpu ticks
      6785115 system cpu ticks
     29369798 idle cpu ticks
       122582 IO-wait cpu ticks
          474 IRQ cpu ticks
        40820 softirq cpu ticks
            0 stolen cpu ticks
            0 pages paged in
            0 pages paged out
            0 pages swapped in
            0 pages swapped out
    405451122 interrupts
    801755817 CPU context switches
   1348837902 boot time
      1930240 forks
A "vmstat -d" run during a request returns only zeros,
and I can't find "iostat" for my Debian distribution. It doesn't seem to be available, not even in sysstat, but hopefully it is covered by the output above.
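
The cumulative counters from "vmstat -s" can also be turned into per-second averages since boot. The following is a rough sketch, not a definitive tool: `rates_since_boot` is a hypothetical name, and it assumes the tick counters use USER_HZ = 100 and that there is a single CPU, so that total ticks divided by 100 is the uptime in seconds.

```python
def rates_since_boot(text, user_hz=100):
    """Derive average rates from pasted `vmstat -s` output.

    Assumes one CPU and user_hz ticks per second (both are assumptions).
    """
    ticks = {}
    totals = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].isdigit():
            continue
        value = int(fields[0])
        label = " ".join(fields[1:])
        if label.endswith("cpu ticks"):
            ticks[label] = value  # user, system, idle, IO-wait, ...
        elif label in ("interrupts", "CPU context switches"):
            totals[label] = value
    total_ticks = sum(ticks.values())
    seconds = total_ticks / user_hz
    return {
        "seconds": seconds,
        "system_share": ticks["system cpu ticks"] / total_ticks,
        "interrupts_per_s": totals["interrupts"] / seconds,
        "context_switches_per_s": totals["CPU context switches"] / seconds,
    }
```

Under those assumptions, the counters above work out to roughly 880 interrupts and 1,740 context switches per second averaged over about 5.3 days of uptime, with about 15% of all CPU time spent in system mode.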

The /proc/interrupts output looks OK; the high imx-i2c count is due to a sensor polled regularly via I²C:
Code:
~> cat /proc/interrupts
           CPU0
  3:  305541991           -  imx-i2c
  4:          2           -  imx-i2c
  9:         11           -  sdhci
 24:          0           -  imx-keypad
 25:          0           -  rtc-mx25
 32:          2           -  IMX-uart
 33:   44406920           -  mxc_nd
 34:          1           -  mxc-sdma
 35:          0           -  ehci_hcd:usb1
 37:          0           -  fsl-usb2-udc
 40:          2           -  IMX-uart
 45:         71           -  IMX-uart
 54:   56444439           -  i.MX Timer Tick
 57:    1060771           -  fec
164:          0           -  ESDHCI card 0 detect
168:          1           -  phy_interrupt
Err:          0
Can anybody see the reason for the problem in the output above?
What is also suspicious: I have several of these systems running, with the same hardware and the same processes, hopefully identically installed (never 100% sure, because it is done manually), but only this one is so slow, with such a high %sy load...

Thanks a lot
Achim
 
Old 10-03-2012, 09:56 PM   #5
pantdk
Member
 
Registered: Oct 2011
Location: New Delhi
Posts: 248
Blog Entries: 3

Rep: Reputation: 17
According to the top output the load is not high, but %sy utilization is very high. If you can install packages on that box, get a sar report; it will give you the essential data to track down the problem.

# iostat -kdx (disk read/write performance; check the service time and utilization)

# sar -q (gives you the load average at the interval set in cron for the sar logs)


more
http://www.linuxquestions.org/questi...ck-4175427965/


It may also be necessary to fine-tune Apache:

http://httpd.apache.org/docs/2.2/misc/perf-tuning.html

http://www.supportsages.com/blog/201...r-performance/

Last edited by pantdk; 10-03-2012 at 10:02 PM.
 
Old 10-04-2012, 08:13 AM   #6
AchimRS
Member
 
Registered: Mar 2005
Location: Germany
Distribution: Debian Wheezy
Posts: 42

Original Poster
Rep: Reputation: 0
Meanwhile I had the idea of comparing an OK system with the not-OK one, because they really have the same hardware and run the same software; only the machine differs. Here is the output of the OK system responding to exactly the same request within 6 s, seen in the log from line 4 to 6:
Code:
~> vmstat -a 2
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free  inact active   si   so    bi    bo   in   cs us sy id wa
 1  0      0   2168  24504  22700    0    0     0     0    2    1 21 14 66  0
 0  0      0   2504  24508  22428    0    0     0     0 1274 2526 14 26 61  0
 1  0      0   2360  24508  22612    0    0     0     0  301  552  9  2 89  0
 1  0      0   1596  23896  23972    0    0     0     0 2673 5244 52 45  4  0
 2  0      0   2060  24520  22736    0    0     0     0 1813 3464 50 49  1  0
 3  0      0   5132  22968  21328    0    0     0     0  473  794 67 33  0  0
 1  0      0   4232  23032  22068    0    0     0     0 1284 2399 17 41 42  0
 0  0      0   7412  21340  20764    0    0     0     0 1761 3439 23 47 30  0
 0  0      0   7404  21356  20764    0    0     0     0 3494 6750  2 21 78  0
I was also able to install iostat, but it gave no results. It seems the kernel does not maintain statistics for the flash devices, so "iostat -kdx" shows nothing, and even "iostat -kdx ALL" shows only zeros.

sar ran successfully, but I don't see the high load there. In the output below, the times colored red are during a slow access:

Code:
~> sar -q 2
Linux 2.6.31 (MyServer-1)        10/04/12        _armv5tejl_     (1 CPU)

15:04:42      runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15
15:04:44            4        66      1.66      1.45      1.28
15:04:46            1        64      1.69      1.46      1.28
15:04:48            3        64      1.69      1.46      1.28
15:04:50            3        64      1.69      1.46      1.28
15:04:52            4        64      1.87      1.50      1.29
15:04:54            2        66      1.87      1.50      1.29
15:04:56            6        64      2.28      1.60      1.32
15:04:58            3        64      2.28      1.60      1.32
15:05:00            2        64      2.28      1.60      1.32
15:05:02            5        66      2.58      1.67      1.35
15:05:04            6        70      2.58      1.67      1.35
15:05:06            3        64      2.61      1.69      1.36
15:05:08            3        64      2.61      1.69      1.36
15:05:10            2        64      2.61      1.69      1.36
15:05:12            1        64      2.73      1.73      1.37
15:05:14            3        63      2.73      1.73      1.37
15:05:16            2        64      2.75      1.75      1.38
15:05:18            0        62      2.75      1.75      1.38
15:05:20            1        64      2.75      1.75      1.38
15:05:22            1        64      2.69      1.75      1.38
15:05:24            3        63      2.69      1.75      1.38
15:05:26            1        64      2.63      1.76      1.39
15:05:28            2        64      2.63      1.76      1.39
15:05:30            1        64      2.63      1.76      1.39
15:05:32            1        63      2.42      1.73      1.38
15:05:34            1        64      2.42      1.73      1.38
15:05:36            3        64      2.47      1.75      1.39
15:05:38            0        63      2.47      1.75      1.39
15:05:40            1        64      2.47      1.75      1.39
15:05:42            0        63      2.35      1.74      1.38
15:05:44            1        63      2.35      1.74      1.38
15:05:46            1        64      2.24      1.72      1.38
15:05:48            3        64      2.24      1.72      1.38
15:05:50            1        62      2.24      1.72      1.38
15:05:52            2        65      2.14      1.71      1.38
15:05:54            3        66      2.14      1.71      1.38
15:05:56            2        67      2.14      1.71      1.38
15:05:58            2        64      2.13      1.72      1.38
15:06:00            2        64      2.13      1.72      1.38
15:06:02            4        66      2.28      1.75      1.40
15:06:04            1        66      2.28      1.75      1.40
15:06:06            1        66      2.18      1.74      1.39
So I'm still completely stuck...

Last edited by AchimRS; 10-04-2012 at 08:20 AM.
 
Old 10-04-2012, 08:17 AM   #7
henrycoffin
Member
 
Registered: Dec 2006
Distribution: RHEL Debian
Posts: 42

Rep: Reputation: 15
Not sure I would recommend running Apache with PHP on 64 MB of RAM, certainly not with no swap available!
 
Old 10-04-2012, 09:03 AM   #8
AchimRS
Member
 
Registered: Mar 2005
Location: Germany
Distribution: Debian Wheezy
Posts: 42

Original Poster
Rep: Reputation: 0
Hmmm, it runs well on some other systems with 64 MB (or at least much faster). When I look at vmstat I see plenty of inactive and free memory...

Usually there is only 1 user logged in; MaxClients is reduced from the default 150 to only 4, and StartServers is reduced to 2. Apache is what I know best, which is why I started with it on the small embedded machine too (400 MHz, 64 MB RAM). You are right, maybe it is time to switch to a more lightweight server like lighttpd.
But I have a gut feeling that there is another problem underneath, because on the other systems it runs well. Maybe with the reduced resource needs of lighttpd the problem would only be delayed by some days and I would end up in the same state as now :-(

Edit: It seems the problem is solved. By accident I saw three processes in top running with NICE=-6.
These self-made processes run regularly and often need a lot of CPU. After changing them back to -1, so that they are still slightly prioritized, everything works fine. Now %sy is down to about 10 and the system is much more responsive again.
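
For anyone who wants to check for this without stumbling over it in top, here is a minimal sketch, assuming Python is available. The function names are made up for illustration. It lists processes with a negative nice value by reading /proc/<pid>/stat, where nice is field 19 and the user and system tick counters are fields 14 and 15.

```python
import glob


def parse_stat(line):
    """Extract pid, name, nice value and CPU ticks from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ")", so split at the last closing parenthesis.
    head, _, tail = line.rpartition(")")
    pid, _, comm = head.partition(" (")
    rest = tail.split()  # rest[0] is field 3 (state) in proc(5) numbering
    return {
        "pid": int(pid),
        "comm": comm,
        "utime": int(rest[11]),  # field 14: user-mode ticks
        "stime": int(rest[12]),  # field 15: kernel-mode ticks
        "nice": int(rest[16]),   # field 19
    }


def boosted(procs):
    """Processes with a negative nice value, most favoured first."""
    return sorted((p for p in procs if p["nice"] < 0), key=lambda p: p["nice"])


def scan_proc():
    """Parse every readable /proc/<pid>/stat entry."""
    procs = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as f:
                procs.append(parse_stat(f.read()))
        except OSError:
            pass  # process exited between listing and reading
    return procs
```

`boosted(scan_proc())` will typically also list some kernel threads, which often run at a negative nice value by design, so the entries to look at are the userspace ones.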

Last edited by AchimRS; 10-06-2012 at 05:34 PM. Reason: Solved
 
  

