Troubleshooting high load - Possible IO / RAID issue?
Hi Everybody,
This is my first post after quite a while of lurking. To LQ's credit, I can usually find what I need without posting :) I have a CentOS 5.3 server: Quote:
Quote:
The server is using software RAID1: Quote:
So I ran iostat -dx 5 Quote:
I haven't used iostat before and I'm not entirely sure how to interpret these results. However, if I'm not mistaken, sdb looks like it is taking up a lot of CPU utilisation and responding much slower than sda. Also, the disks don't seem to be carrying an equal load: Quote:
I really need to track down what is causing this high load. Could anyone give me some guidance? |
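For anyone reading the numbers for the first time: in `iostat -dx` output, `await` is the average time in milliseconds a request spends queued plus being serviced, and `%util` is the share of time the device was busy. It is a device figure, not CPU utilisation. Below is a small sketch (the function name `io_summary` is made up for illustration) that pulls just those two columns per device so sda and sdb can be compared side by side. It assumes a sysstat version that prints a single `await` column, as on CentOS 5; newer releases split it into `r_await` and `w_await`, in which case this prints nothing.

```shell
# Summarise `iostat -dx` output: one line per device with await (ms) and %util.
# Columns are located by header name because their positions vary between
# sysstat versions.
io_summary() {
  awk '
    $1 ~ /^Device/ {
      # Header row: remember where the two columns of interest are.
      a = 0; u = 0
      for (i = 1; i <= NF; i++) {
        if ($i == "await") a = i
        if ($i == "%util") u = i
      }
      next
    }
    # Data row: only printed once both columns have been found.
    a && u && NF >= a && NF >= u {
      printf "%s await=%s util=%s\n", $1, $a, $u
    }
  '
}
```

Usage would be something like `iostat -dx 5 3 | io_summary`. The first report iostat prints covers the time since boot, so the later reports are the ones that reflect current load.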
Hi.
Yup, that iostat output looks a bit suspect. Is there anything mounted on or otherwise using sdb2 or sdb3 etc? Dave |
Hi Dave,
Thanks for the response, I ran fdisk -l : Quote:
Are there any other commands I can run to check this? Thanks again, Ked |
Hi again.
I'd check /var/log/messages for any scsi or md errors if you haven't already. Check 'dmesg' too.
Might be worth (if you're willing) failing the sdb1 partition to stop md0 using it and see if the load averages drop. Risk involved, obviously, but you'd know if the disk was causing the load issue. If memory serves (don't bank on it) the command would be:
# mdadm --fail /dev/md0 /dev/sdb1
Incidentally, what does 'swapon -s' show? From what I can see from your posts you've only got sdX1 in raid, but you've got swap switched on, so one disk failure might still crash your host if the swap is on the raw partitions.
Dave |
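A sketch of the "fail the member and watch the load" test described above, written in mdadm's array-first form (`mdadm ARRAY --fail MEMBER`). The helper names `run`, `fail_member` and `readd_member` are invented for this example, and by default it only prints the commands rather than running them, since failing a mirror member on a live box leaves the array without redundancy until it has resynced.

```shell
# Print commands by default; set DRY_RUN=0 to execute them (as root).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Mark one member of an md array as failed so md stops using it.
# Usage: fail_member /dev/md0 /dev/sdb1
fail_member() {
  run cat /proc/mdstat            # array state before the change
  run mdadm "$1" --fail "$2"
  run cat /proc/mdstat            # the member should now be flagged (F)
}

# Put the member back afterwards; md then resyncs it from the good disk.
# Usage: readd_member /dev/md0 /dev/sdb1
readd_member() {
  run mdadm "$1" --remove "$2"
  run mdadm "$1" --add "$2"
}
```

With the member failed, a few minutes of `uptime` or `iostat -dx 5` should show whether the load drops. Running `readd_member` afterwards triggers a full resync, which is itself heavy on both disks.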
What have you got in /etc/fstab? As noted above, you've raided your boot/root/data partitions, but not your 2(!) swap partitions, so maybe you're only using one?
|
Hi Dave,
In dmesg I have:
Adding 2040244k swap on /dev/sdb2. Priority:-1 extents:1 across:2040244k
Adding 2040244k swap on /dev/sda2. Priority:-2 extents:1 across:2040244k
In /var/log/messages there are a couple of ata / scsi messages that I don't recognise: Quote:
Quote:
Quote:
Failing the drive sounds like it will be the ultimate test here, although I'm a little reluctant to do so until I've gathered as much info as possible. Thanks for the pointer on the mdadm command to do this, I'll check it out in a bit more detail. In your opinion, do you reckon I'm dealing with a duff sdb? Thanks for the help and advice so far, I've learnt a lot about troubleshooting IO issues here :) |
Chris / Dave,
I thought I'd separate my response into 2 distinct posts rather than one mega post... cat /etc/fstab Quote:
Although looking at swapon -s, it appears only one swap partition is being used: Quote:
Thanks for your help guys, Al |
I don't think I've seen those messages before either but I'd say they're not too healthy.
Could be something as simple as a loose cable, but if this is a production machine I'd get that disk replaced. As far as I'm concerned, if a disk does anything even slightly odd it gets replaced. Hardware support contracts are a beautiful thing. You /really/ need to raid1 those two sdX2 partitions and use that as swap instead of the two raw partitions. As it stands at the moment, your host will very probably crash if either of those disks fails (which is looking increasingly likely). Dave |
Apologies for the extra posts - I'm not bumping, it's just my brain slowly computing what's going on.
I can see what you guys mean now with regard to the swap partitions, let's see if I've got this right:
sda1 & sdb1 are mirrored to form md0
sda2 & sdb2 are not mirrored, and it would appear that sdb2 is the currently active swap partition.
From what you are saying, this is not a fault-tolerant config: if sdb goes down, the swap goes with it and crash goes the box. Should I consider creating an md1 comprised of sda2 and sdb2 and use it for the swap partition? I realise this is a separate issue from the health of the sdb drive. Sorry if this feels a bit like pulling teeth, I didn't set this server up and it's a bit of a learning process (which I'm enjoying btw). Al |
Ah, looks like I was drafting my post as you replied.
That's great, you've definitely put me on the right track, I'll get that disk replaced and those swap partitions mirrored. Thanks again! |
No worries. As you've gathered, you have to actually raid (mirror in this case) the 2 swaps explicitly. Because your system is using very little swap
Swap: 4080488k total, 120k used, 4080368k free, 2615988k cached
it's only using one disk, but that is a partition, not a disk, so it's draining i/o bandwidth from the data/program i/o on the same physical disk. In a big system, e.g. a major DB, you'd have data and swap on different disks (not partitions). In fact, they'd be on separate i/o buses as well. |
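Mirroring the two swap partitions explicitly could look roughly like the sketch below. The array name /dev/md1 follows the suggestion earlier in the thread; the helpers `run` and `mirror_swap` are made up for this example, and it prints the commands by default instead of executing them. It assumes the suspect disk has already been replaced and that there is enough free RAM to switch swap off for a moment.

```shell
# Print commands by default; set DRY_RUN=0 to execute them (as root).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Build a RAID1 array from two swap partitions and use it as swap.
# Usage: mirror_swap /dev/md1 /dev/sda2 /dev/sdb2
mirror_swap() {
  run swapoff -a                  # stop using the raw partitions
  run mdadm --create "$1" --level=1 --raid-devices=2 "$2" "$3"
  run mkswap "$1"                 # write a swap signature on the array
  run swapon "$1"
  # /etc/fstab is not touched here: replace the two raw swap entries by hand.
  echo "# fstab: $1 swap swap defaults 0 0"
}
```

Once it is running, `swapon -s` should list only /dev/md1 and `cat /proc/mdstat` should show the new array syncing.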
swap priority
a little off topic here but if you give your swap partitions equal priorities then it will stripe across the two disks.
mine is:
/dev/sda3 swap swap sw,pri=3 0 0
/dev/sdb3 swap swap sw,pri=3 0 0
you'd need to adjust yours manually. This will ensure the load on each disk remains the same. What I don't know is what happens if a drive fails and the kernel loses half its swap. |
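A quick way to check whether the active swap areas really share one priority is to look at the last column of /proc/swaps (the same table `swapon -s` prints). A minimal sketch, with the function name `swap_priorities_equal` invented for illustration:

```shell
# Succeeds (exit 0) only if at least one swap area is active and all active
# areas have the same priority. Reads /proc/swaps unless a file is given.
swap_priorities_equal() {
  awk '
    NR > 1 { seen[$NF] = 1 }      # skip the header; priority is the last column
    END {
      n = 0
      for (p in seen) n++
      exit (n == 1 ? 0 : 1)
    }
  ' "${1:-/proc/swaps}"
}
```

For example: `swap_priorities_equal && echo "swap is striped" || echo "priorities differ"`. With the priorities -1 and -2 from the dmesg output earlier in the thread, this would report that they differ.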
Quote:
Dave |