
stelar 10-19-2006 08:06 PM

Server high load, testing hardware reliability and limiting process

Recently I've been having a problem with my server (SLES 10). For some reason, the load can get so high that it is not possible to ssh into the server or even log in at the console.

The high load problem happens about 1 hour after I start postfix, amavisd, mysql, clamd, Cyrus imapd, saslauthd, heartbeat and mon. After a cold reboot, I looked through log/messages and found nothing suspicious. Everything seems to be working normally. In fact, this server was installed in exactly the same way as the other server (the one that takes over when this server fails). The other server has been able to take over without any problem for 1 month now.

I'm trying to test whether it is a hardware problem or not. Does anyone know a complete set of test software for that? The only tool I know is memtest86. But how do I test other hardware, e.g. CPU, hard disk, etc.? In fact, what is the proper full set of tests to run before putting a piece of hardware into production?

Another thing: is there a way in Linux to set up a service, let's say Cyrus IMAP, such that if it takes up too many resources, the kernel just kills it? I've heard of ulimit, but it seems to be per user. Can it be used in this situation?

leandean 10-20-2006 12:58 AM

I use the Ultimate Boot CD for most tests. For HDDs I use the manufacturer's utility disc.
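
On the ulimit question: the limits that ulimit sets are applied per process and inherited by child processes, so they can be placed on a single daemon by setting them in whatever launches it. Below is a minimal sketch in Python, assuming the service is started through a small wrapper; the helper name run_with_limits and the limit values are made up for illustration and are not part of any init script shipped with SLES. Note that the two limits behave differently: exceeding the CPU-time limit makes the kernel signal the process (SIGXCPU at the soft limit, SIGKILL at the hard limit), while exceeding the address-space limit only makes further memory allocations fail.

```python
import resource
import subprocess
import sys


def run_with_limits(argv, max_address_space_bytes, max_cpu_seconds):
    """Run argv as a child process with per-process resource limits.

    Hypothetical helper for illustration only.
    """

    def apply_limits():
        # Runs in the child after fork() and before exec(), so the limits
        # apply only to the launched program (and anything it spawns).
        resource.setrlimit(
            resource.RLIMIT_AS,
            (max_address_space_bytes, max_address_space_bytes),
        )
        resource.setrlimit(
            resource.RLIMIT_CPU,
            (max_cpu_seconds, max_cpu_seconds),
        )

    return subprocess.run(
        argv, preexec_fn=apply_limits, capture_output=True, text=True
    )


if __name__ == "__main__":
    # Demo: the child reports the CPU-time limit it inherited.
    child_code = (
        "import resource; "
        "print(resource.getrlimit(resource.RLIMIT_CPU)[0])"
    )
    result = run_with_limits(
        [sys.executable, "-c", child_code],
        max_address_space_bytes=4 * 1024**3,  # 4 GiB, arbitrary example value
        max_cpu_seconds=60,                   # arbitrary example value
    )
    print(result.stdout.strip())
```

The same effect can be had by calling ulimit in the shell script that starts the daemon, just before the daemon is launched. Keep in mind that a CPU-time limit counts total CPU seconds consumed over the life of the process, so for a long-running mail service it would eventually trigger even under normal load; it does not cap the load average.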
