Server unresponsive: high load average, minimal CPU usage
Version: Red Hat Enterprise Linux ES release 3 (Taroon Update 9)
Kernel: Linux 2.4.21-47.0.1.ELsmp i686
Following the new year, users have complained that they have been unable to make a remote connection via SSH. The connection is refused immediately, PuTTY closes, and they don't even make it to the login prompt.
Using VNC, I am able to connect to the desktop (GNOME), start a terminal, and run top. It typically shows a load average of 23 or higher (I have seen 52+) with very little CPU usage. There are four processors and they all show 97% idle or higher.
Looking at the memory, I see nothing unusual. No swap memory is being used.
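A possible next check, offered as a sketch rather than a diagnosis: on Linux the load average counts tasks in uninterruptible sleep (state D in ps/top) as well as runnable ones, so a high load with idle CPUs can mean many processes are blocked rather than busy. The snippet below assumes the output of `ps -eo pid,stat,wchan,comm` has been saved as text while the problem is occurring, and it can be run on any machine with a current Python. The function name and the sample rows are made up for illustration and are not taken from this server.

```python
from collections import Counter


def summarize_ps_states(ps_output):
    """Count processes by the first letter of STAT and list those in state D.

    Expects text captured from `ps -eo pid,stat,wchan,comm`, header included.
    Returns (counts, blocked), where blocked is a list of (pid, wchan, comm).
    """
    counts = Counter()
    blocked = []
    lines = ps_output.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue  # ignore malformed or truncated rows
        pid, stat, wchan, comm = fields
        state = stat[0]  # e.g. "Ss" -> "S", "D+" -> "D"
        counts[state] += 1
        if state == "D":
            blocked.append((int(pid), wchan, comm.strip()))
    return counts, blocked


# Invented sample rows, only to show the expected input shape.
SAMPLE = """\
  PID STAT WCHAN   COMMAND
    1 S    wchan_a init
  812 D    wchan_b cp
  934 D+   wchan_b ls
 1201 R    -       top
"""

if __name__ == "__main__":
    counts, blocked = summarize_ps_states(SAMPLE)
    print("processes per state:", dict(counts))
    for pid, wchan, comm in blocked:
        print("blocked: pid=%d wchan=%s comm=%s" % (pid, wchan, comm))
```

If most of the load turns out to be D-state processes, the WCHAN column and the command names would point at what they are waiting on.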
To get the server functioning again I have to force a restart. This fixes the problem for 24 to 48 hours, after which it recurs.
This server runs a Progress database and hosts QAD's MFG/PRO. Last year we migrated to a different ERP, so this system is being kept alive for reference only. It gets very little use.
How do I go about troubleshooting this problem? So far, I have shut down all MFG/PRO batch queues, on the theory that a running job was thrown off by the change in year. That made no difference.
Any assistance would be appreciated.