Disk performance causing high Load Avg?
Hi All,
I have a mail/dns/web server that has been having some performance issues. Little things will make the load average spike, the system becomes slow, and my mail server starts erroring out and won't recover on its own. The latest thing that seems to be causing the trouble is the nightly updatedb run. CPU utilization seems fine, memory seems fine, so I'm assuming it has something to do with disk I/O. Here are some outputs. Normal load average is about 2; usually when updatedb is running I'll get a load of about 15, though for some reason it only went up to about 11 this time. Code:
top - 12:21:43 up 28 days, 5:31, 2 users, load average: 10.43, 6.02, 3.59
Running kernel 2.4.31 on Slackware, with RAID5 on a Mylex AcceleRAID 352. Code:
Got any advice on where I can start to troubleshoot the performance issues? Thanks. |
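One way to confirm the spike really lines up with the nightly updatedb run is to log the load average around the cron window and compare timestamps afterwards. A rough sketch (the log path, interval, and iteration count are just examples; stretch them out for a real overnight run):

```shell
#!/bin/sh
# Append a timestamped load average to a log so the spike can be lined
# up against the updatedb cron entry. Kept to 3 one-second samples here
# as a sketch; raise the count and sleep for an overnight trace.
LOG=/tmp/loadlog            # example path
i=0
while [ $i -lt 3 ]; do
    echo "$(date '+%H:%M:%S') $(cut -d' ' -f1-3 /proc/loadavg)" >> "$LOG"
    sleep 1
    i=$((i + 1))
done
```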
The first thing I see is that memory usage is topped out. What is running that is using 2 GB of RAM? You are dipping into swap, which is typically a system killer.
|
I thought I was doing OK memory-wise, since a lot seems to be cached. Cached memory is still available for system use, correct?
Code:
             total       used       free     shared    buffers     cached
Here is a top output sorted by % of memory; not sure how to answer Quote:
Code:
top - 13:58:08 up 28 days, 7:07, 2 users, load average: 2.52, 2.26, 2.57
Thanks, Craig |
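On the old 2.4-era free(1) layout, memory that is "really" available is roughly free + buffers + cached, since the page cache is dropped on demand. A quick check of the arithmetic (column positions and the sample numbers below assume the old procps format; newer procps rearranged these columns):

```shell
# Sum free + buffers + cached from the "Mem:" line of the old free(1)
# layout: total used free shared buffers cached.
# The numbers here are made-up sample values, not the poster's.
free_line="Mem:   2064988  2017020    47968        0   103840  1548232"
echo "$free_line" | awk '/^Mem:/ {print "really available (kB):", $4 + $6 + $7}'
# prints: really available (kB): 1700040
```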
Have you checked your system logs for disk I/O error messages? Hard errors on a disk can drop system performance without any other apparent cause.
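To make that check quick, something along these lines can scan the usual logs (DAC960 is the 2.4 driver for Mylex controllers; the log paths and patterns are just examples, adjust to your setup):

```shell
# Look for I/O errors, SCSI sense data, or Mylex/DAC960 driver
# complaints in the system logs; show only the most recent hits.
grep -iE 'i/o error|sense key|dac960' /var/log/messages /var/log/syslog 2>/dev/null | tail -n 20
```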
cheers, |
Go get sysstat, and use iostat. A quick search didn't find a Slackware package for it, but just download and install it.
Gotta ask tho', why the updatedb every night? What do you use the data for, and how often? I know it's the "done thing" with a lot of distros. |
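If the nightly run has to stay, it can at least be demoted. A possible crontab entry (on a 2.4 kernel there is no ionice, so nice is the only lever; updatedb's exclude flags also differ between slocate and GNU findutils, so check the local man page):

```shell
# /etc/crontab entry (example): run updatedb at 04:40 at the lowest CPU
# priority. This won't throttle the disk I/O itself on 2.4, but it
# keeps updatedb from also competing for CPU during the run.
40 4 * * * root nice -n 19 updatedb
```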
Hi All. Thanks for the replies.
Didn't see anything related to I/O in the logs, especially nothing that looked like an error. Quote:
I installed the sysstat package. Unfortunately, iostat isn't showing me much, I don't think. Maybe because of the kernel version... I'm not sure. Here is some output, though: Code:
Thanks. |
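iostat on 2.4 has little to work with because the extended per-device counters aren't exported, but the raw numbers are still readable from /proc. A rough sampler (2.4 puts disk counters on a "disk_io:" line in /proc/stat, while 2.6 moved them to /proc/diskstats; this logs whichever exists, and the path and sample count are arbitrary):

```shell
#!/bin/sh
# Snapshot raw disk counters a few times for later diffing. On 2.4 the
# counters live on the "disk_io:" line of /proc/stat; on 2.6+ they are
# in /proc/diskstats.
OUT=/tmp/disk-trace.log
i=0
while [ $i -lt 3 ]; do
    date '+%s' >> "$OUT"                                 # timestamp each sample
    grep '^disk_io' /proc/stat >> "$OUT" 2>/dev/null     # 2.4 counters, if present
    [ -r /proc/diskstats ] && cat /proc/diskstats >> "$OUT"  # 2.6+ counters
    sleep 1
    i=$((i + 1))
done
```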
Never really looked at a 2.4 kernel from a performance angle - even on Slack, the first thing I did was move to 2.6.
If you get the kernel sources with Slack (I don't have any Slack now so I can't check), have a look at iostats.txt, or else go look online. It has a good discussion of the fields in /proc, and there are differences between 2.4 and 2.6. Maybe you can knock up a script to pull the numbers straight from there and write them to a file to look at later. One would think your problem has to be I/O.

I had skipped over the fact that you are running RAID5 - wonder if that's getting in the way. From the Linux point of view you only have one disk, including swap, which generally isn't a good idea. So the kernel will be trying to manage I/O on that basis - merging requests and calculating swap slot locality - while your RAID card tears it all apart and sprays it across the disks. Hardly working together for optimal performance, not that that was ever a promise of RAID5.

Maybe in the bad periods you could look for tasks in I/O wait: (reverse) sort "top" on the "S" field and look for status "D". |
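That last check can be done without top as well; a one-liner along these lines lists tasks stuck in uninterruptible sleep (state "D", which almost always means blocked on disk I/O):

```shell
# Print any process whose state starts with "D" (uninterruptible
# sleep). Run this during a load spike; an empty result means nothing
# is currently blocked on I/O.
ps axo stat,pid,comm | awk '$1 ~ /^D/'
```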