Freezing kernel 2.6.8.1 on HP DL380 G3 (high load kswapd)
Hi,
I have an HP DL380 G3 (P4 with HT) with 1GB of RAM and 3x72GB SCSI disks in RAID 5 on a Smart Array 5i. It runs a vanilla 2.6.8.1 kernel compiled with SMP, HT, IO-APIC, ACPI, NOHIGHMEM and IRQBALANCE. The server freezes twice a week, with no response to the keyboard or to ping. After three days I see high iowait (50-70%) and low system idle (50-20%). Whenever I run top, kswapd0 is in first place with a large TIME+. I also tried the boot option acpi=noidle, but with no effect. Here are a few lines from top.
I ran memtest and also badblocks, and all tests passed. It looks like a problem with interrupts, because while badblocks was running iowait was at 70-90%. Maybe the problem is SMP + HyperThreading and bad IRQ handling.
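For anyone who wants to log the iowait figure over time rather than watch top, here is a minimal Python sketch. It assumes the usual field order of the aggregate `cpu` line in /proc/stat (user, nice, system, idle, iowait, ...); the function names are made up for illustration and are not part of any tool mentioned in this thread.

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters of the aggregate 'cpu' line of /proc/stat as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        # The aggregate line is labelled exactly "cpu"; per-CPU lines are cpu0, cpu1, ...
        if fields and fields[0] == "cpu":
            return [int(x) for x in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Share of elapsed CPU time spent in iowait between two samples, in percent."""
    # Only the first eight fields (user .. steal) are summed, so that guest
    # time, which newer kernels also count inside user, is not counted twice.
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    if total <= 0 or len(deltas) < 5:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


def sample_iowait(interval=1.0, path="/proc/stat"):
    """Take two samples 'interval' seconds apart and return the iowait percentage."""
    with open(path) as f:
        before = parse_cpu_line(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_line(f.read())
    return iowait_percent(before, after)
```

Calling `sample_iowait(5.0)` from a loop and writing the result to a log would show whether iowait climbs gradually over the three days or jumps at some point.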
I do not use the default Slackware 10 kernel because I want to use a 2.6 kernel, and other people have the same problem on 2.4 kernels. The hardware seems to be OK.
Just a few thoughts. How is your swap organized? Do you have a small swap partition on each hardware disk? I would recommend arranging it like this rather than swapping to a RAID device (if you run software RAID, of course).
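To answer the swap question quickly, the active swap areas can be read from /proc/swaps. The sketch below is only an illustration and assumes the usual column layout (Filename, Type, Size, Used, Priority); `parse_swaps` is a made-up helper name.

```python
def parse_swaps(swaps_text):
    """Parse the text of /proc/swaps into one dict per active swap area."""
    areas = []
    lines = swaps_text.splitlines()
    for line in lines[1:]:  # the first line is the column header
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or unexpected lines
        areas.append({
            "device": fields[0],
            "type": fields[1],        # "partition" or "file"
            "size_kb": int(fields[2]),
            "used_kb": int(fields[3]),
            "priority": int(fields[4]),
        })
    return areas


def read_swaps(path="/proc/swaps"):
    """Read and parse the live swap table."""
    with open(path) as f:
        return parse_swaps(f.read())
```

Areas that share the same priority are used round-robin by the kernel, which is why one small partition per disk spreads the swap I/O.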
Do you get any oops or panic messages on the screen when it crashes?
Also, check that you don't have PREEMPT_KERNEL enabled. This could cause trouble as well.
There are 3x72GB SCSI disks in hardware RAID 5, and I do not have any other disk to move swap to. When the server crashes it does not respond to ping or to the keyboard (including SysRq); there is only a blank screen. I do not have PREEMPT_KERNEL enabled. In my kernel I have enabled SMP, HT and NOHIGHMEM.
After upgrading the system BIOS (version 2004.06.23) and flashing the Smart Array 5i controller (version 2.58 B) there are no more freezes, but the huge iowait is still there. I think it has something to do with the performance of the SA 5i controller. I have found some forum tips, but they require buying a new array controller or the "+" for the SA 5i.
I am getting exactly the same behaviour (well, close). I have a very similarly configured kernel: SMP, HT, IO-APIC, ACPI, HIGHMEM, IRQBALANCE, with 2 mirrored SATA disks (using Linux software RAID).
I rebooted due to a failed RAID disk, and when I hot-added the second drive the responsiveness went completely downhill. The situation is very curious...
On boot: 100% idle CPU (both CPUs), load ~ 1.00!!
Heavy disk access: ~100% idle CPU (both CPUs), load ~ 3.0-5.0 (and climbs with time!)
Heavy CPU usage (i.e. bzipping hundreds of MB): ~50% idle CPU, load < 1.00 !!!
The odd thing is that when the CPU is inactive the load is always over one, and the more hard disk access there is the more the load goes up (while the CPUs still claim to be idle), yet when you use MORE CPU the load goes down!
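One thing that may explain part of this: on Linux the load average counts tasks in uninterruptible sleep (state D, typically waiting on disk I/O) as well as runnable tasks, so the load can climb while the CPUs are idle. A rough Python sketch to count tasks per state from /proc follows. The helper names are made up, and it only looks at /proc/<pid>/stat, so it misses non-leader threads.

```python
import glob
from collections import Counter


def task_state(stat_text):
    """Extract the one-letter state from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' and take the next field.
    return stat_text.rsplit(")", 1)[1].split()[0]


def count_states(stat_texts):
    """Count tasks per state letter (R = runnable, D = uninterruptible sleep, ...)."""
    return Counter(task_state(text) for text in stat_texts)


def read_all_stats():
    """Read /proc/<pid>/stat for every process that still exists."""
    texts = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as f:
                texts.append(f.read())
        except OSError:
            continue  # the process exited between listing and reading
    return texts
```

Running `count_states(read_all_stats())` during heavy disk access and comparing the R and D counts with /proc/loadavg would show whether blocked tasks account for the load.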
The more CPU usage there is on the server, the more responsive it seems to be, too! For example, when just restoring one mirror from the other it was transferring at ~60MB/sec, and running:
time cat /proc/mdstat
returned about 25 seconds real and 0.01 sys and user!
However, when the CPU was really busy and the mirroring was forced to slow down to ~1MB/sec, cat /proc/mdstat responded instantly!
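A gap like 25 seconds real against 0.01 sys and user suggests the process spent nearly all of that time waiting rather than computing. To repeat the measurement from a script, something along these lines should do. It is a sketch only; `time_read` is a made-up name, and the figures are for the calling process, not a separate cat.

```python
import os
import time


def time_read(path):
    """Read a file once and report wall-clock, user and system time in seconds."""
    cpu_before = os.times()
    wall_before = time.perf_counter()
    with open(path, "rb") as f:
        data = f.read()
    wall_after = time.perf_counter()
    cpu_after = os.times()
    return {
        "real": wall_after - wall_before,
        "user": cpu_after.user - cpu_before.user,
        "sys": cpu_after.system - cpu_before.system,
        "bytes": len(data),
    }
```

For example, `time_read("/proc/mdstat")` called every few seconds during a resync would show whether the delay tracks the resync speed.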
I suspect that something they changed with the IO-APIC in the new kernels has made them handle hyperthreading badly. I'm going to try recompiling without IO-APIC.
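Before and after such a rebuild it may help to compare how interrupts are spread across the logical CPUs. The Python sketch below sums the per-CPU columns of /proc/interrupts. It assumes the usual layout (a header line of CPU names, then one `label: count count ... description` line per interrupt), and the function names are made up for illustration.

```python
def parse_interrupts(text):
    """Return (number_of_cpus, {irq_label: [count_per_cpu, ...]})."""
    lines = text.splitlines()
    if not lines:
        return 0, {}
    ncpu = len(lines[0].split())  # header looks like: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        counts = []
        for token in rest.split()[:ncpu]:
            if not token.isdigit():
                break
            counts.append(int(token))
        # Lines such as ERR carry a single global count; skip those.
        if len(counts) == ncpu:
            table[label.strip()] = counts
    return ncpu, table


def per_cpu_totals(text):
    """Sum the interrupt counts handled by each CPU."""
    ncpu, table = parse_interrupts(text)
    totals = [0] * ncpu
    for counts in table.values():
        for cpu, count in enumerate(counts):
            totals[cpu] += count
    return totals
```

If nearly everything lands on CPU0, that would at least be consistent with the IRQ-handling suspicion, though it does not prove it.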