LinuxQuestions.org


rjo98 10-06-2009 07:56 AM

Are kernel panics normal at reboot when servers have been running for a long time?
 
I have a bunch of servers that haven't been restarted in over a year. So far every one I've rebooted has had a kernel panic and just sat there during the reboot process until I manually powered it down and back up, after which it started normally.

Is that to be expected when a server hasn't been restarted in a long time? Is there a way to make it restart itself gracefully when it has a kernel panic, rather than me hard powering it down?
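
For the second question, the usual knob is the kernel.panic sysctl (also settable as panic=N on the kernel command line): 0 means the kernel loops forever after a panic, a positive value means it reboots after that many seconds, and a negative value means it reboots immediately. A line such as "kernel.panic = 10" in /etc/sysctl.conf is a common way to set it. Below is a minimal Python sketch, not a definitive tool, that reports the current setting. The function names are made up for illustration, and it assumes /proc/sys/kernel/panic is readable on the box.

```python
from pathlib import Path


def describe_panic_timeout(value: int) -> str:
    """Translate the kernel.panic value into plain words."""
    if value == 0:
        return "hang: the kernel loops forever after a panic"
    if value < 0:
        return "reboot immediately after a panic"
    return f"reboot {value} seconds after a panic"


def read_panic_timeout(path: str = "/proc/sys/kernel/panic") -> int:
    """Read the current timeout; the default path is the standard procfs file."""
    return int(Path(path).read_text().strip())


if __name__ == "__main__":
    try:
        print(describe_panic_timeout(read_panic_timeout()))
    except (OSError, ValueError) as exc:
        # Not on Linux, or the file is not readable in this environment.
        print(f"could not read kernel.panic: {exc}")
```

Keep in mind that an automatic reboot also wipes the panic message off the screen, so it may be worth noting the message down once before turning this on.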

AlucardZero 10-06-2009 08:20 AM

It's not normal.

onebuck 10-06-2009 08:32 AM

Hi,

No, it's not normal. You should look at the logs in '/var/log' to see if there's anything that will point to the potential problem.
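
As a rough starting point for going through those logs, here is a small Python sketch that flags lines containing strings the kernel commonly prints when something is wrong. The keyword list is only a starting set and not exhaustive, the function names are hypothetical, and the log path in the usage comment (/var/log/messages) is an assumption that varies by distribution.

```python
import re

# Starting list of kernel message fragments worth a closer look (not exhaustive).
SUSPICIOUS = re.compile(
    r"kernel panic|oops|BUG:|call trace|machine check|I/O error|EXT[234]-fs error",
    re.IGNORECASE,
)


def find_suspicious_lines(lines):
    """Return (line_number, text) pairs for lines matching any keyword."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if SUSPICIOUS.search(line)
    ]


def scan_log(path):
    """Scan one log file; undecodable bytes are replaced rather than fatal."""
    with open(path, errors="replace") as handle:
        return find_suspicious_lines(handle)


# Usage (path is an assumption, adjust for your distribution):
#   for number, text in scan_log("/var/log/messages"):
#       print(number, text)
```

The panic itself may never reach the disk, so the lines logged shortly before the last shutdown are usually the more useful ones.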

i92guboj 10-06-2009 08:33 AM

Things that sometimes work and sometimes don't are usually a symptom of broken hardware. You could start by checking your RAM sticks with memtest86.

pixellany 10-06-2009 08:44 AM

More than once, I have "fixed" a computer by blowing out all the dust, careful vacuuming, and disconnecting and reconnecting all plugs (including the RAM sticks, but NOT the CPU).

Be sure to use static protection methods and other handling precautions. If you have never worked inside electronics equipment, consider getting some help.

onebuck 10-06-2009 09:29 AM

Hi,

I agree that regular 'PMS' (preventative maintenance schedule/services) should be performed. You should see some of the systems I repair that would have remained in service if only a good 'PMS' had been followed. Some of these systems (fileservers included) are just placed on the floor for convenience and never get touched unless a cleaning person bumps them. That is not the best place to put a system. Even placing the boxen on a small platform will reduce the dust and dirt they take in.

Cleaning a system properly requires more than just blowing out or vacuuming. Card edges, cabling or other connectors may need attention. One should always follow safety procedures whenever handling electronic devices. :hattip:

rjo98 10-06-2009 09:57 AM

Thanks guys. The servers are actually clean in the dirty sense, not dusty or anything. But what should I look for in the logs to find out what caused the panic?

pixellany 10-06-2009 02:37 PM

Quote:

Originally Posted by rjo98 (Post 3709653)
Thanks guys. The servers are actually clean in the dirty sense, not dusty or anything. But what should I look for in the logs to find out what caused the panic?

Even if everything appears clean, it is completely plausible that there are some bad connections. If there is, for example, a bad connection to a RAM stick, I'm not sure that you will find that in the logs....

De-mate, inspect, and re-mate all connectors and the RAM sticks.

Also, how about temperature? If you monitor CPU temperature and it is higher than normal, you may have a bad heat-sink interface. That happened to me just a few months ago.
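
One way to keep an eye on temperature, sketched in Python: on kernels that expose thermal zones through sysfs, each /sys/class/thermal/thermal_zone*/temp file holds a reading in millidegrees Celsius. This assumes the hardware and kernel actually provide those files (many servers report through lm-sensors or IPMI instead). The function names and the 80 C limit are arbitrary choices for illustration, not recommendations.

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {zone_name: degrees_celsius} for every readable thermal zone."""
    readings = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            millidegrees = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # zone without a usable reading, skip it
        readings[zone.name] = millidegrees / 1000.0
    return readings


def too_hot(readings, limit_c=80.0):
    """Pick out zones at or above the limit (limit is an assumed example value)."""
    return {name: temp for name, temp in readings.items() if temp >= limit_c}


if __name__ == "__main__":
    zones = read_thermal_zones()
    for name, temp in zones.items():
        print(f"{name}: {temp:.1f} C")
    for name, temp in too_hot(zones).items():
        print(f"WARNING: {name} is at {temp:.1f} C")
```

Run from cron every few minutes, something like this gives a baseline, so "higher than normal" becomes a comparison against numbers rather than a guess.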

At the age of 10 I discovered that I could repair lawnmowers by disassembly and re-assembly----but I never knew WHY.
Now it's the same with computers, but at least I know why........;)

rjo98 10-06-2009 03:18 PM

OK. I rebooted the box again and it actually restarted fine this time, so maybe everything's connected OK and clean, and this was just a one-time thing?

onebuck 10-06-2009 04:00 PM

Hi,

Possibly a one-time thing, but I would certainly not give up on this, as it may be a prelude of things to come. A 'PMS' entails more than just physical cleaning. Filesystem maintenance falls into that realm, along with physical checks on connectors and PSU rails and any head or lens cleaning, to name a few.

Most system maintenance should be done on a regular schedule. Within that schedule one should set up diagnostic and physical checks to prevent catastrophic problems.

rjo98 10-06-2009 04:25 PM

I am planning on e2fsck'ing the entire server next time I restart it, since I'm sure that hasn't been done in forever. I just need to figure out how to make that happen. Unless I'm still missing something, I don't see anything in the log that says why it panicked.
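
On many sysvinit-based distributions a check can be forced at the next boot with "shutdown -rF now" or by creating an empty /forcefsck file before rebooting, though whether the init scripts honor that depends on the distribution. To see how overdue a filesystem is, "tune2fs -l" prints the mount count, maximum mount count, and last-checked date. The following Python sketch parses that output. The helper names are made up for illustration, /dev/sda1 in the usage comment is only a placeholder device, and running tune2fs normally needs root.

```python
import subprocess


def parse_tune2fs(text):
    """Turn 'Key:   value' lines from tune2fs -l into a dictionary."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # lines without a colon (e.g. the version banner) are ignored
            fields[key.strip()] = value.strip()
    return fields


def mounts_until_forced_check(fields):
    """Mounts left before a count-based check; None if that check is disabled."""
    max_count = int(fields["Maximum mount count"])
    if max_count <= 0:
        return None
    return max(max_count - int(fields["Mount count"]), 0)


def read_tune2fs(device):
    """Run tune2fs -l on a device (placeholder name supplied by the caller)."""
    result = subprocess.run(
        ["tune2fs", "-l", device], capture_output=True, text=True, check=True
    )
    return parse_tune2fs(result.stdout)


# Usage (as root; the device name is a placeholder):
#   info = read_tune2fs("/dev/sda1")
#   print(info["Last checked"], mounts_until_forced_check(info))
```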

markush 10-06-2009 05:05 PM

Hello everyone,

What are you looking for in the log file? During a kernel panic nothing will be logged. In my experience, it can happen that a filesystem failure is detected while booting and can only be fixed by rebooting again.
To help you out in this particular case, I think it would be necessary to know the exact message shown on the screen during the kernel panic.

Markus

rjo98 10-06-2009 05:07 PM

The exact message I do not recall. I thought it would have been written to a log file somewhere like everything else in Linux seems to be. Guess I was wrong.

markush 10-06-2009 05:11 PM

I think that during a kernel panic (which is at the very beginning of the boot process) there may be no disk, and therefore no file, accessible to write a log to.

Markus

rjo98 10-06-2009 05:12 PM

OK. Well, I'll pay more attention next time to the lines above the kernel panic message then.


All times are GMT -5.