Old 03-05-2009, 09:18 AM   #1
permalac
Member
 
Registered: Jul 2007
Location: Barcelona
Posts: 115

Rep: Reputation: 16
High load average for no apparent reason


Hello,

I know that others have asked the same question, but I checked and it seems the issue was never solved.

I don't know why this happens, but it would be nice if somebody could help me.

Thanks.

Marc

$> cat /proc/loadavg
31.98 31.25 29.76 1/524 23947

$> uptime
16:13:02 up 189 days, 9:32, 9 users, load average: 29.93, 29.56, 27.95


$> top
Tasks: 355 total, 1 running, 353 sleeping, 0 stopped, 1 zombie
Cpu(s): 0.0%us, 0.0%sy, 0.0%ni, 67.9%id, 32.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 28875892k total, 28301096k used, 574796k free, 608492k buffers
Swap: 7807580k total, 256k used, 7807324k free, 22947048k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2534 root 10 -5 0 0 0 D 0 0.0 14:23.96 kjournald
23903 root 15 0 19212 1540 968 R 0 0.0 0:00.05 top
1 root 18 0 3964 892 624 S 0 0.0 0:03.73 init
2 root 11 -5 0 0 0 S 0 0.0 0:00.19 kthreadd
3 root RT -5 0 0 0 S 0 0.0 0:00.40 migration/0
4 root 34 19 0 0 0 S 0 0.0 0:04.01 ksoftirqd/0
5 root RT -5 0 0 0 S 0 0.0 0:00.00 watchdog/0
6 root RT -5 0 0 0 S 0 0.0 0:00.62 migration/1
7 root 34 19 0 0 0 S 0 0.0 0:03.97 ksoftirqd/1
8 root RT -5 0 0 0 S 0 0.0 0:00.00 watchdog/1
9 root RT -5 0 0 0 S 0 0.0 0:01.59 migration/2
10 root 34 19 0 0 0 S 0 0.0 0:03.87 ksoftirqd/2
11 root RT -5 0 0 0 S 0 0.0 0:00.00 watchdog/2
12 root RT -5 0 0 0 S 0 0.0 0:02.05 migration/3
13 root 34 19 0 0 0 S 0 0.0 0:03.85 ksoftirqd/3
14 root RT -5 0 0 0 S 0 0.0 0:00.00 watchdog/3
15 root RT -5 0 0 0 S 0 0.0 0:00.36 migration/4
16 root 34 19 0 0 0 S 0 0.0 0:04.40 ksoftirqd/4
17 root RT -5 0 0 0 S 0 0.0 0:00.00 watchdog/4
18 root RT -5 0 0 0 S 0 0.0 0:00.28 migration/5
19 root 34 19 0 0 0 S 0 0.0 0:04.20 ksoftirqd/5
20 root RT -5 0 0 0 S 0 0.0 0:00.00 watchdog/5
21 root RT -5 0 0 0 S 0 0.0 0:02.26 migration/6
22 root 34 19 0 0 0 S 0 0.0 0:03.71 ksoftirqd/6
23 root RT -5 0 0 0 S 0 0.0 0:00.00 watchdog/6
24 root RT -5 0 0 0 S 0 0.0 0:05.95 migration/7
25 root 34 19 0 0 0 S 0 0.0 0:03.78 ksoftirqd/7
26 root RT -5 0 0 0 S 0 0.0 0:00.00 watchdog/7
27 root 10 -5 0 0 0 S 0 0.0 0:00.08 events/0
28 root 10 -5 0 0 0 S 0 0.0 0:00.05 events/1
29 root 10 -5 0 0 0 S 0 0.0 0:00.13 events/2
30 root 10 -5 0 0 0 S 0 0.0 0:00.05 events/3
31 root 10 -5 0 0 0 S 0 0.0 0:00.13 events/4
32 root 10 -5 0 0 0 S 0 0.0 0:00.01 events/5
33 root 10 -5 0 0 0 S 0 0.0 0:00.04 events/6
34 root 10 -5 0 0 0 S 0 0.0 0:00.05 events/7
35 root 14 -5 0 0 0 S 0 0.0 0:00.00 khelper
60 root 10 -5 0 0 0 S 0 0.0 0:00.76 kblockd/0
61 root 10 -5 0 0 0 S 0 0.0 0:00.63 kblockd/1
62 root 10 -5 0 0 0 S 0 0.0 0:00.73 kblockd/2
63 root 10 -5 0 0 0 S 0 0.0 0:00.74 kblockd/3
64 root 10 -5 0 0 0 S 0 0.0 0:00.64 kblockd/4
65 root 10 -5 0 0 0 S 0 0.0 0:00.53 kblockd/5
66 root 10 -5 0 0 0 S 0 0.0 0:00.60 kblockd/6
67 root 10 -5 0 0 0 S 0 0.0 0:00.61 kblockd/7
68 root 11 -5 0 0 0 S 0 0.0 0:00.00 kacpid
69 root 11 -5 0 0 0 S 0 0.0 0:00.00 kacpi_notify
184 root 10 -5 0 0 0 S 0 0.0 0:00.00 kseriod
249 root 10 -5 0 0 0 S 0 0.0 66:12.07 kswapd0
302 root 11 -5 0 0 0 S 0 0.0 0:00.00 aio/0
303 root 12 -5 0 0 0 S 0 0.0 0:00.00 aio/1
304 root 12 -5 0 0 0 S 0 0.0 0:00.00 aio/2


$> vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 3 256 574348 608492 22947824 0 0 133 46 0 0 2 0 97 1

$> cat /proc/ioports
0000-001f : dma1
0020-0021 : pic1
0040-0043 : timer0
0050-0053 : timer1
0060-006f : keyboard
0070-0077 : rtc
0080-008f : dma page reg
00a0-00a1 : pic2
00c0-00df : dma2
00f0-00ff : fpu
0170-0177 : 0000:00:1f.1
0170-0177 : libata
01f0-01f7 : 0000:00:1f.1
01f0-01f7 : libata
02f8-02ff : serial
0376-0376 : 0000:00:1f.1
0376-0376 : libata
03c0-03df : vga+
03f6-03f6 : 0000:00:1f.1
03f6-03f6 : libata
03f8-03ff : serial
0800-087f : pnp 00:08
0800-0803 : ACPI PM1a_EVT_BLK
0804-0805 : ACPI PM1a_CNT_BLK
0808-080b : ACPI PM_TMR
0814-0819 : ACPI CPU throttle
0828-082f : ACPI GPE0_BLK
0880-08bf : pnp 00:08
08c0-08df : pnp 00:08
08e0-08e3 : pnp 00:08
0c00-0c7f : pnp 00:08
0ca0-0ca7 : pnp 00:08
0ca8-0ca8 : pnp 00:09
0ca9-0cab : pnp 00:08
0cac-0cac : pnp 00:09
0cad-0caf : pnp 00:08
0cf8-0cff : PCI conf1
dca0-dcbf : 0000:00:1d.2
dca0-dcbf : uhci_hcd
dcc0-dcdf : 0000:00:1d.1
dcc0-dcdf : uhci_hcd
dce0-dcff : 0000:00:1d.0
dce0-dcff : uhci_hcd
e000-efff : PCI Bus #10
ec00-ecff : 0000:10:0d.0
fc00-fc0f : 0000:00:1f.1
fc00-fc0f : libata



$> lsof (the output is too long to upload, 1.3 MB; if needed I'll look for somewhere to put it.)

Last edited by permalac; 03-05-2009 at 09:33 AM. Reason: add attach
 
Old 03-05-2009, 09:40 AM   #2
stress_junkie
Senior Member
 
Registered: Dec 2005
Location: Massachusetts, USA
Distribution: Ubuntu 10.04 and CentOS 5.5
Posts: 3,873

Rep: Reputation: 331
You can use the sar utility to track resource usage. Here is a web page with instructions. Once you know the processes that are consuming resources you can determine whether they are a problem or not.

http://www.ozzu.com/unix-linux-forum...ge-t67717.html
 
Old 03-05-2009, 01:00 PM   #3
permalac
Member
 
Registered: Jul 2007
Location: Barcelona
Posts: 115

Original Poster
Rep: Reputation: 16
Thanks. I tried sar and nothing happened, so I looked at the sysstat cron job and found it was disabled. I enabled it and ran the command that follows 'root' by hand, but sar still prints nothing except the kernel line.


$> sar -q
Linux 2.6.22-14-server (varoitus) 03/05/2009


My cron job looks as follows:

$> cat /etc/cron.d/sysstat
# Global variables:
#
# our configuration file
DEFAULT=/etc/default/sysstat
# default setting, overriden in the above file
ENABLED=true
SA1_OPTIONS=""

# Activity reports every 10 minutes everyday
5-55/10 * * * * root [ -x /usr/lib/sysstat/sa1 ] && { [ -r "$DEFAULT" ] && . "$DEFAULT" ; [ "$ENABLED" = "true" ] && exec /usr/lib/sysstat/sa1 $SA1_OPTIONS 1 1 ; }

# Additional run at 23:59 to rotate the statistics file
59 23 * * * root [ -x /usr/lib/sysstat/sa1 ] && { [ -r "$DEFAULT" ] && . "$DEFAULT" ; [ "$ENABLED" = "true" ] && exec /usr/lib/sysstat/sa1 $SA1_OPTIONS 60 2 ; }



Does anybody know what's wrong here?


thanks.
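
The comment in the cron file above says that ENABLED is overridden by /etc/default/sysstat, so the value that matters is whatever that file sets; if it is anything other than "true", the cron entries do nothing. Note also that sar reads the data files that sa1 writes, so it has nothing to report until sa1 has actually run. Below is a minimal sketch for checking the effective setting. It mirrors the shell logic of the cron entries, and the function name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of sysstat): print the value of ENABLED
# that the cron entries would see after sourcing the defaults file.
# $1: path to the defaults file, e.g. /etc/default/sysstat
sysstat_effective_enabled() {
    defaults_file=$1
    (
        # Same starting value as in /etc/cron.d/sysstat.
        ENABLED=true
        # The cron entry sources the defaults file only if it is readable.
        [ -r "$defaults_file" ] && . "$defaults_file"
        printf '%s\n' "$ENABLED"
    )
}
```

On the box in question this would be called as `sysstat_effective_enabled /etc/default/sysstat`. Anything other than `true` means the sa1 lines in the cron file are skipped.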
 
Old 03-05-2009, 01:51 PM   #4
salasi
Senior Member
 
Registered: Jul 2007
Location: Directly above centre of the earth, UK
Distribution: SuSE, plus some hopping
Posts: 3,899

Rep: Reputation: 774
Quote:
Originally Posted by permalac

$> cat /proc/loadavg
31.98 31.25 29.76 1/524 23947

$> uptime
16:13:02 up 189 days, 9:32, 9 users, load average: 29.93, 29.56, 27.95
Those load averages are not good, so you'd be expecting the box to be slow, very slow.

Quote:
$> top
Tasks: 355 total, 1 running, 353 sleeping, 0 stopped, 1 zombie
Cpu(s): 0.0%us, 0.0%sy, 0.0%ni, 67.9%id, 32.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 28875892k total, 28301096k used, 574796k free, 608492k buffers
Swap: 7807580k total, 256k used, 7807324k free, 22947048k cached
OK, so that's it then (sort of): you wouldn't expect any performance out of a box that is spending 67.9% of its time idle and 32% waiting. That's 99.9% of its time gone without doing anything for you, unless you count waiting, which isn't very productive.

Now, all you've got to do is find out why it prefers waiting/idling to doing anything helpful. My guess: the wait is almost certainly to do with IO somewhere, so finding out what IO is causing this (& what is causing the IO) would be a good start.
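
As a rough sketch of how to read the quoted numbers together: the five fields of /proc/loadavg are the 1-, 5- and 15-minute load averages, the number of currently runnable scheduling entities over the total, and the most recently created PID. On Linux the load average counts tasks in uninterruptible sleep (state D) as well as runnable ones, which is how a box can show a load near 32 while its CPUs sit idle or waiting. The helper names below are made up for illustration, and the CPU count of 8 is inferred from the migration/0 to migration/7 kernel threads in the top listing.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of any package.

# Print the 1-minute load average divided by the number of CPUs.
# $1: a line in /proc/loadavg format, $2: CPU count.
load_per_cpu() {
    printf '%s\n' "$1" | awk -v ncpu="$2" '{ printf "%.2f\n", $1 / ncpu }'
}

# Print the iowait percentage from a top "Cpu(s):" summary line in the
# format shown in this thread (a field such as "32.0%wa,").
iowait_percent() {
    printf '%s\n' "$1" | awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /%wa,?$/) { sub(/%wa,?$/, "", $i); print $i }
    }'
}

# Values copied from the first post; 8 CPUs is an assumption (see above).
load_per_cpu "31.98 31.25 29.76 1/524 23947" 8
iowait_percent "Cpu(s): 0.0%us, 0.0%sy, 0.0%ni, 67.9%id, 32.0%wa, 0.0%hi, 0.0%si, 0.0%st"
```

With the figures from the first post this works out to roughly four queued tasks per CPU alongside 32% iowait. On a live box the first helper could be fed `"$(cat /proc/loadavg)"` and `"$(grep -c ^processor /proc/cpuinfo)"`.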

PS: If you are going to post long screeds of column-formatted data, please use code tags; it makes it so much easier to read.
E.g.:
Code:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2534 root      10  -5     0    0    0 D    0  0.0  14:23.96 kjournald
23903 root      15   0 19212 1540  968 R    0  0.0   0:00.05 top
    1 root      18   0  3964  892  624 S    0  0.0   0:03.73 init
    2 root      11  -5     0    0    0 S    0  0.0   0:00.19 kthreadd
    3 root      RT  -5     0    0    0 S    0  0.0   0:00.40 migration/0
    4 root      34  19     0    0    0 S    0  0.0   0:04.01 ksoftirqd/0
    5 root      RT  -5     0    0    0 S    0  0.0   0:00.00 watchdog/0
Thanks
 
Old 03-05-2009, 01:52 PM   #5
rweaver
Senior Member
 
Registered: Dec 2008
Location: Louisville, OH
Distribution: Debian, CentOS, Slackware, RHEL, Gentoo
Posts: 1,833

Rep: Reputation: 163
Your kswapd appears to be using a lot more CPU than it should; you may be having some problems related to swapping things out to disk. Drives full or drive errors, by chance? Re-sized a file system recently?

Wait time is bad and you've got a nice chunk of it.

You might wanna run rkhunter or chkrootkit on the machine to rule out the possibility that you've been compromised and just can't see the processes that are causing problems, although in this case I don't think that's the issue.

Last edited by rweaver; 03-05-2009 at 01:56 PM.
 
Old 03-05-2009, 04:23 PM   #6
salasi
Senior Member
 
Registered: Jul 2007
Location: Directly above centre of the earth, UK
Distribution: SuSE, plus some hopping
Posts: 3,899

Rep: Reputation: 774
BTW (and I should have mentioned this earlier, but I've just accidentally reminded myself of it), I can produce numbers quite like that by running 'updatedb'; in other words causing a lot of disk accesses, which in turn cause lots of waits.

I don't suppose you have anything running that you know will cause lots of I/O?

Do you have any low level diagnostics indications, or as we humans call them, flashing lights (disk, network)?
 
Old 03-05-2009, 04:33 PM   #7
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 12,267

Rep: Reputation: 1028
This comes up continually - try the script I posted in this thread.
 
Old 03-06-2009, 03:54 AM   #8
permalac
Member
 
Registered: Jul 2007
Location: Barcelona
Posts: 115

Original Poster
Rep: Reputation: 16
The script is:
Code:
top -b -n 1 | awk '{if (NR <=7) print; else if ($8 == "D") {print; count++} } END {print "Total status D: "count}'
Code:
# top -b -n 1 | awk '{if (NR <=7) print; else if ($8 == "D") {print; count++} } END {print "Total status D: "count}'
top - 09:08:50 up 190 days,  2:28,  9 users,  load average: 153.97, 152.64, 150.88
Tasks: 1106 total,   1 running, 1104 sleeping,   0 stopped,   1 zombie
Cpu(s):  1.8%us,  0.1%sy,  0.0%ni, 96.8%id,  1.2%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:  28875892k total, 28715736k used,   160156k free,   615440k buffers
Swap:  7807580k total,      256k used,  7807324k free, 23106056k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2534 root      10  -5     0    0    0 D    0  0.0  15:13.10 kjournald
 4646 syslog    16   0 12252  760  572 D    0  0.0   1:07.90 syslogd
11401 root      18   0 71452 7432 1632 D    0  0.0   0:58.09 python
16134 root      15   0     0    0    0 D    0  0.0   0:07.84 pdflush
24580 root      18   0 32696  956  648 D    0  0.0   0:00.00 cron
24590 root      18   0 32696  956  648 D    0  0.0   0:00.00 cron
24619 root      18   0 32696  956  648 D    0  0.0   0:00.00 cron
24624 root      18   0 32696  956  648 D    0  0.0   0:00.00 cron
24639 root      18   0 32696  956  648 D    0  0.0   0:00.00 cron
24641 root      18   0 32696  956  648 D    0  0.0   0:00.00 cron
24643 root      18   0 32696  956  648 D    0  0.0   0:00.00 cron
24648 root      18   0 32696  956  648 D    0  0.0   0:00.00 cron
24655 root      18   0 32696  956  648 D    0  0.0   0:00.01 cron
24657 root      18   0 32696  956  648 D    0  0.0   0:00.00 cron
24658 root      18   0 32696  956  648 D    0  0.0   0:00.00 cron
24660 root      18   0 32696  956  648 D    0  0.0   0:00.00 cron
24675 root      18   0 32696  956  648 D    0  0.0   0:00.00 cron
24677 root      18   0 32696  956  648 D    0  0.0   0:00.00 cron
24747 root      18   0 32696  956  648 D    0  0.0   0:00.00 cron
24847 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24851 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24857 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24858 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24859 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24860 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24861 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24863 root      18   0 32696  956  648 D    0  0.0   0:00.00 cron
24865 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24871 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24873 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24876 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24877 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24881 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24883 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24915 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24928 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24931 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24932 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24934 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24935 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24940 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24946 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24947 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24948 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24949 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24954 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24958 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24959 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24960 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24961 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24962 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24963 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24965 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24968 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24969 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24971 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24972 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24974 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24975 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24979 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24982 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24983 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24984 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24995 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24996 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24998 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
24999 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25000 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25009 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25010 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25012 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25017 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25023 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25024 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25026 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25033 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25034 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25035 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25039 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25040 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25042 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25043 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25047 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25048 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25053 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25055 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25056 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25058 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25059 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25060 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25075 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25079 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25080 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25084 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25085 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25087 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25090 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25091 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25092 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25093 root      18   0 32696  956  648 D    0  0.0   0:00.00 cron
25098 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25100 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25101 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25103 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25105 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25106 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25115 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25116 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25119 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25121 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25122 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25132 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25133 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25134 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25136 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25137 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25139 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25147 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25151 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25152 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25153 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25156 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25158 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25159 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25160 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25161 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25162 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25167 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25168 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25171 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25175 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25176 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25181 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25182 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25184 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25186 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25187 root      16   0 32696  956  648 D    0  0.0   0:00.00 cron
25191 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25192 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25197 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25198 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25201 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25202 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25222 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25224 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25225 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25226 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25231 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25234 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25235 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25237 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25238 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
25239 root      15   0 32696  956  648 D    0  0.0   0:00.00 cron
Total status D: 153
Ok, looks like cron is not doing well. Just done a killall and alized it.

We just changed from one domain to another, and it is possible that the sendmail config was messed up by the domain change; I have fixed that as well.

But my box is still slow:

Quote:
# top -b -n 1 | awk '{if (NR <=7) print; else if ($8 == "D") {print; count++} } END {print "Total status D: "count}'
top - 09:53:17 up 190 days,  3:13,  9 users,  load average: 5.81, 17.75, 71.76
Tasks: 162 total,   1 running, 160 sleeping,   0 stopped,   1 zombie
Cpu(s):  1.8%us,  0.1%sy,  0.0%ni, 96.8%id,  1.2%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:  28875892k total, 28379740k used,   496152k free,   615600k buffers
Swap:  7807580k total,      256k used,  7807324k free, 23079748k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2534 root      10  -5     0    0    0 D    0  0.0  15:15.06 kjournald
 4646 syslog    16   0 12252  760  572 D    0  0.0   1:07.91 syslogd
11401 root      18   0 71452 7432 1632 D    0  0.0   0:58.22 python
16134 root      15   0     0    0    0 D    0  0.0   0:07.88 pdflush
26137 root      16   0  197m 9236 6668 D    0  0.0   0:00.05 vim
Total status D: 5

What else could it be?
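
To see at a glance which commands the blocked processes belong to, the one-liner used in this post can be extended to tally the state-D rows per command. This is only a sketch: the function name is made up, and it assumes the same `top -b -n 1` layout as in this thread (seven header lines, process state in column 8, command name in column 12).

```shell
#!/bin/sh
# Hypothetical helper: reads `top -b -n 1` output on stdin and prints
# "<count> <command>" for processes in uninterruptible sleep (state D),
# most frequent command first.
d_state_by_command() {
    awk 'NR > 7 && $8 == "D" { count[$12]++ }
         END { for (cmd in count) print count[cmd], cmd }' | sort -rn
}
```

On a live box it would be used as `top -b -n 1 | d_state_by_command`. For the first snapshot in this post, where 149 of the 153 state-D processes are cron, the cron line would come out on top.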
 
Old 03-06-2009, 06:41 AM   #9
salasi
Senior Member
 
Registered: Jul 2007
Location: Directly above centre of the earth, UK
Distribution: SuSE, plus some hopping
Posts: 3,899

Rep: Reputation: 774
You are beyond my direct ability to understand what is going on; however answering the following may be informative:

Before killing things, you had 1106 tasks in total; you killed the crons, of which there were 149, and now there are 162 tasks left. Those numbers don't add up; did you kill more than shows up in the data that you posted?

There are (were) a lot of crons; why is that? Is something, one of your tasks, re-running cron if something hasn't run on time and is this leading to a large number of crons when things go wrong? If so, is that really a sensible thing to be doing (yes, it protects against cron not running to completion, but if the consequence is that the box collapses under the weight of crons that you eventually spawn, then it hasn't helped in the way that it is currently configured...I'm not saying that it can't be reconfigured to work well, just that it isn't helping now)?

Quote:
Cpu(s): 1.8%us, 0.1%sy, 0.0%ni, 96.8%id, 1.2%wa, 0.0%hi, 0.1%si, 0.0%st
Now, the wait is down to a reasonable number and the cpu(s) is/are largely idle; this is a step forward, but why are you saying that the box is slow and it is largely idle? In other words, what exactly are you seeing that is the evidence of slowness, what exactly is the symptom? It couldn't be something else such as networking issues causing the apparent problem, could it?
 
Old 03-06-2009, 10:17 AM   #10
permalac
Member
 
Registered: Jul 2007
Location: Barcelona
Posts: 115

Original Poster
Rep: Reputation: 16
Sorry, I know I'm not easy to understand in English, it's not my best language. I'll try harder.

Let's see: this server has lots of people working on it, and maybe they just started some apps between one post and the next.

Right now the server is still performing poorly (without cron running). I say so for three reasons:

1 - vim takes many seconds to open or close (:q) files.
2 - it does not let users ssh in.
3 - the backup said "Read from remote host varoitus: Connection reset by peer
Connection to varoitus closed."



But I'm still connected to it, so I'm lucky.





One thing I would like to ask is why syslogd takes so much VIRT and RES?

Code:
# top -b -n 1 | awk '{if (NR <=7) print; else if ($8 == "D") {print; count++} } END {print "Total status D: "count}'
top - 17:14:05 up 190 days, 10:33, 10 users,  load average: 4.05, 4.39, 4.32
Tasks: 173 total,   1 running, 171 sleeping,   0 stopped,   1 zombie
Cpu(s):  1.8%us,  0.1%sy,  0.0%ni, 96.8%id,  1.2%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:  28875892k total, 14982132k used, 13893760k free,   607696k buffers
Swap:  7807580k total,      256k used,  7807324k free,  9969148k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2534 root      10  -5     0    0    0 D    0  0.0  15:30.80 kjournald
 4646 syslog    16   0 12252  760  572 D    0  0.0   1:07.93 syslogd
16134 root      15   0     0    0    0 D    0  0.0   0:08.09 pdflush
Total status D: 3


Thanks

Last edited by permalac; 03-06-2009 at 10:18 AM.
 
Old 03-06-2009, 10:25 AM   #11
rweaver
Senior Member
 
Registered: Dec 2008
Location: Louisville, OH
Distribution: Debian, CentOS, Slackware, RHEL, Gentoo
Posts: 1,833

Rep: Reputation: 163
It sounds like you might have hit a bottleneck on your disk io from what you're describing.
 
Old 03-06-2009, 05:16 PM   #12
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 12,267

Rep: Reputation: 1028
Disk I/O almost certainly was the problem - problems now may just be hangovers from that. Check that cache usage for instance. That will also have an effect (that you can't directly see) on the manner that memory is allocated.
Normally I would recommend you go through sar data, but you've already said that is no good. Anything you start now will only see what is happening now - not necessarily what the cause is.
Try vmstat, but let it run for a while with a useful delay (ignore the first line).
At 2.6.22 you should have access to taskstats - go get iotop.py, and see if that shows anything. Not likely, now that the wait percentage has dropped so low.
collectl is also good, but you'd be just as well to get sar organized.
Hard to help without hard data.

BTW, those syslogd numbers are the same as mine on this laptop that has just been re-booted. Not a problem.
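
The vmstat suggestion above could be scripted along these lines. It is only a sketch under stated assumptions: the function name is made up, the default threshold of 10% is arbitrary, and the column positions (blocked processes in column 2, "wa" in column 16) are taken from the vmstat output posted earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: reads `vmstat <delay>` output on stdin, drops the
# first sample (averages since boot), and prints later samples that show
# blocked processes (column b) or iowait at or above a threshold.
# $1: iowait threshold in percent (default 10, an arbitrary choice).
vmstat_io_pressure() {
    awk -v wa_min="${1:-10}" '
        $1 !~ /^[0-9]+$/ { next }      # skip header lines
        ++samples == 1   { next }      # skip the since-boot sample
        $2 > 0 || $16 >= wa_min        # print samples under I/O pressure
    '
}
```

For example, `vmstat 5 12 | vmstat_io_pressure 20` would take twelve samples five seconds apart and keep only those with blocked processes or at least 20% iowait.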
 
Old 03-09-2009, 11:13 AM   #13
permalac
Member
 
Registered: Jul 2007
Location: Barcelona
Posts: 115

Original Poster
Rep: Reputation: 16
Hello,

This weekend I left the box running without cron, and everything went fine.

Today I started cron again, expecting the load to grow without limit, but so far everything is fine.

I really don't get it: I don't know why cron was eating all the resources, and I don't get why it is not doing it again.

Anyway, many thanks to all, and until next time.

Thanks.
 
  

