LinuxQuestions.org

Vinter 07-07-2013 03:48 PM

Server keeps locking up - how to find cause?
 
Hi!

I've got a server that keeps locking up to the point of not even responding to ping, presumably due to excessive CPU usage. The problem is that I can only reach it over ssh, and the session becomes unresponsive and terminates as soon as a lockup starts. How can I find out what's causing the problem? It only has busybox, so I can't use top -b. Would renicing sshd help, for instance? Or are there any other recommended ways of logging CPU usage?

Thanks!
V.
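
One possible way to log CPU usage without top, sketched here under assumptions: the box exposes /proc/stat and /proc/loadavg, and the busybox build includes the usual head, date, sleep and sync applets. The function names, the log path and the interval are made up for illustration. Each sample is flushed with sync so that the last entries written before a lockup have a chance of surviving it.

```shell
#!/bin/sh
# Sketch for a busybox-only box: log CPU usage derived from /proc/stat.
# Function names, log path and interval are illustrative assumptions.

# Print "<total> <idle>" jiffies from an aggregate "cpu ..." line.
# "idle" here counts both the idle and iowait columns.
cpu_totals() {
    # $1: the line, e.g. "cpu  100 0 50 800 50 0 0 0"
    set -- $1                  # split the line into fields
    shift                      # drop the "cpu" label
    # user nice system idle iowait irq softirq steal
    total=$(( $1 + $2 + $3 + $4 + ${5:-0} + ${6:-0} + ${7:-0} + ${8:-0} ))
    echo "$total $(( $4 + ${5:-0} ))"
}

# Print the integer busy percentage between two "<total> <idle>" samples.
busy_percent() {
    # $1 $2: earlier total and idle; $3 $4: later total and idle
    dt=$(( $3 - $1 ))
    di=$(( $4 - $2 ))
    [ "$dt" -gt 0 ] || { echo 0; return; }
    echo $(( 100 * (dt - di) / dt ))
}

# Append one timestamped sample to the log every $2 seconds.
cpu_log_loop() {
    # $1: log file, $2: seconds between samples
    prev=$(cpu_totals "$(head -n 1 /proc/stat)")
    while sleep "$2"; do
        cur=$(cpu_totals "$(head -n 1 /proc/stat)")
        # $prev and $cur are left unquoted on purpose: each holds two numbers
        echo "$(date '+%Y-%m-%d %H:%M:%S') busy=$(busy_percent $prev $cur)% load=$(cat /proc/loadavg)" >> "$1"
        sync                   # flush so the entry survives a hard hang
        prev=$cur
    done
}

# Example invocation (the path is a placeholder; use persistent storage):
#   cpu_log_loop /path/to/cpu-usage.log 5 &
```

If the log shows the busy percentage staying low right up to each freeze, that would point away from a runaway process and towards something outside userspace.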

Ser Olmy 07-07-2013 04:12 PM

In my opinion, it's unlikely that a process in userspace is able to hog the CPU to such an extent that the IP stack stops responding.

Are you sure this is not a kernel panic or a hard lock-up? How do you get the server back up and running when this happens?

Vinter 07-07-2013 06:40 PM

I don't do anything to relieve it; it happens regularly. Last time I timed it, the lockups happened every minute and lasted about 30 seconds. There's nothing in the crontab to suggest regular activity. It's a NAS, so I'm not exactly sure what could be going on in there...

Ser Olmy 07-07-2013 07:05 PM

I would suspect some kind of hardware failure. Are there any logs on this system?

Have you looked closely at the network interface, cables and switch ports? Have you tried connecting the NIC on the unit directly to a PC?
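
As a rough sketch of what looking through the logs could involve on a busybox system: the filter below picks out kernel log lines that often accompany hardware or driver trouble. It assumes the dmesg applet is present, the function name is hypothetical, and the keyword list is only a guess, not an exhaustive set.

```shell
#!/bin/sh
# Sketch: filter kernel log text for lines that may indicate hardware or
# driver problems. The keyword list is an illustrative guess.
suspicious_lines() {
    grep -iE 'error|fail|timeout|timed out|reset|link is down|out of memory'
}

# Usage on the NAS (assumes the dmesg applet is available):
#   dmesg | suspicious_lines | tail -n 20
```

Repeated link down/up or reset messages at roughly one-minute intervals would fit the pattern described and would be worth comparing against the cable and switch port checks.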

Mousepad123 07-08-2013 09:58 AM

It might be bad memory. Have you tried running Memtest86 on it?

RootMason 07-08-2013 02:19 PM

I agree that it may be faulty hardware. I have had the same problem with a corrupt hard drive before. The system tries and tries to correct the problem, but in the end the OS can't fix it and just ends up hogging CPU time while it attempts repairs.


All times are GMT -5.