Quote:
Originally Posted by adamwonski
That means your devices are saturated (overwhelmed with requests).
Is that software RAID? Do you see any broken drive, or is the RAID out of sync or resyncing?
|
No, it is hardware RAID. All the drives are showing OK, and the RAID seems to be working OK.
Quote:
Originally Posted by adamwonski
I think that a constant load of 10 procs for 4 CPUs is not too bad, although if most of them are also blocked all the time (as I understand from your previous post), then either you have a problem with the disks, or your applications (or one of them) use them extensively. Is your disk space shrinking fast? You can run something like this to observe:
|
I see that the /var partition in particular is growing faster than the others, but not at a very high rate.
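To put a number on "not at a very high rate", usage can be sampled twice and the difference divided by the interval. A minimal Python sketch, assuming the /var mount point mentioned above; the helper names used_bytes, growth_per_hour and sample_growth are illustrative and not from this thread:

```python
import shutil
import time


def used_bytes(path):
    """Return the bytes currently used on the filesystem holding `path`."""
    return shutil.disk_usage(path).used


def growth_per_hour(used_before, used_after, seconds):
    """Convert two usage samples taken `seconds` apart into bytes per hour."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (used_after - used_before) * 3600 / seconds


def sample_growth(path, interval):
    """Sample `path` twice, `interval` seconds apart; return bytes per hour."""
    before = used_bytes(path)
    time.sleep(interval)
    return growth_per_hour(before, used_bytes(path), interval)
```

Calling sample_growth("/var", 60) would block for a minute and return the growth rate of /var in bytes per hour; a longer interval smooths out short bursts such as log rotation.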
Quote:
Originally Posted by adamwonski
Maybe it's swapping?
What do the swap si/so columns look like?
What does the 'free' command show in the Swap line?
|
Most of the time si/so show zero, and free shows about 160 MB of swap used out of 3800 MB:
             total       used       free     shared    buffers     cached
Mem:       2060388    2030080      30308          0      29260     753480
-/+ buffers/cache:    1247340     813048
Swap:      3895720     167896    3727824
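These numbers can be cross-checked with a little arithmetic: the "-/+ buffers/cache" row is the Mem row with buffers and cached moved from "used" to "free". A short Python sketch using the values from the output above, assuming they are in kilobytes; the variable names are made up for illustration:

```python
# Values copied from the `free` output above (assumed to be in kilobytes).
mem_total, mem_used, mem_free = 2060388, 2030080, 30308
buffers, cached = 29260, 753480
swap_total, swap_used = 3895720, 167896

# Memory held by applications: "used" minus reclaimable buffers and cache.
used_by_apps = mem_used - buffers - cached
# Memory available to applications: "free" plus reclaimable buffers and cache.
free_for_apps = mem_free + buffers + cached
# Share of swap in use, as a percentage.
swap_pct = 100 * swap_used / swap_total

print(used_by_apps, free_for_apps, round(swap_pct, 1))  # 1247340 813048 4.3
```

The results match the "-/+ buffers/cache" row, and swap is only about 4% used, which fits with si/so staying at zero most of the time.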
Quote:
Originally Posted by adamwonski
Add the -p parameter to see device names that are easier to understand:
Which drives/partitions belong to the RAID? Which are most loaded? Does reading or writing prevail? What other interesting numbers can you observe?
Do you see anything particular in the logs?
When exactly did the problems begin? Did you change anything prior to that time? ANYTHING? Even something completely unrelated in your opinion?
|
As I said, the RAID is hardware, and all the hard drives (5 HDDs) are presented as one big disk of ~1.3 TB, divided into several partitions.
I think writing prevails most of the time.
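One way to check this rather than guess is to compare the sectors-read and sectors-written counters in /proc/diskstats. A minimal Python sketch, assuming the full field layout of that file (major, minor, name, then at least 11 counters); the function name read_write_sectors and the sample line are hypothetical and not taken from this server:

```python
def read_write_sectors(diskstats_text):
    """Map device name -> (sectors_read, sectors_written).

    Assumes the full /proc/diskstats layout: major, minor, name,
    then at least 11 counters. Shorter lines (some older kernels
    report fewer fields for partitions) are skipped.
    """
    result = {}
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue
        name = fields[2]
        # Counter 3 is sectors read, counter 7 is sectors written.
        result[name] = (int(fields[5]), int(fields[9]))
    return result


# Hypothetical sample line, not taken from the server in this thread.
sample = "8 0 sda 1200 30 96000 4000 5400 800 432000 9000 0 7000 13000"
stats = read_write_sectors(sample)
reads, writes = stats["sda"]
print("writes prevail" if writes > reads else "reads prevail")
```

On the real machine the text would come from reading /proc/diskstats. The counters are cumulative since boot, so the difference between two readings taken some time apart says more about the current workload than a single reading.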
I should mention that I made some changes to /etc/fstab (added noatime and nodiratime to the /var and /home partitions).
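Whether these options are actually in effect can be confirmed from /proc/mounts, which lists device, mount point, filesystem type and options on each line. A small Python sketch; mount_options is a hypothetical helper, and the device names and filesystem type in the sample text are placeholders, not this server's real mount table:

```python
def mount_options(mounts_text, mountpoint):
    """Return the set of options for `mountpoint`, or None if not listed.

    Expects text in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mountpoint:
            # Keep the last matching entry, i.e. the one mounted on top.
            found = set(fields[3].split(","))
    return found


# Placeholder sample, not the real server's mount table.
sample = (
    "/dev/sdX1 / ext3 rw 0 0\n"
    "/dev/sdX2 /var ext3 rw,noatime,nodiratime 0 0\n"
)
opts = mount_options(sample, "/var")
print("noatime" in opts, "nodiratime" in opts)
```

On Linux, noatime already covers directories as well, so listing nodiratime next to it is harmless but redundant.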
This improved performance significantly, but the problem has not gone away completely: the load still climbs to 100, though less frequently.
I also upgraded the POP3/IMAP server (dovecot), suspecting it was causing the problem; it had been logging segfault errors, and those have now gone away.
The problem began about two weeks ago. I'm sure that no change was made to the server beforehand, in configuration or anything else. The only thing I noticed was the /var and /home partitions growing (though not by much) and the load increasing, but it always stayed below 20-30, never 100.
Thanks,
Enid