Are kernel panics normal at reboot when servers have been running for a long time?
I have a bunch of servers that haven't been restarted in over a year. So far every one I've rebooted has had a kernel panic and just sat there during the reboot process until I manually powered it down and back up, after which it started normally.
Is that to be expected when a server hasn't been restarted in a long time? Is there a way to make it restart itself gracefully when it has a kernel panic, rather than me hard powering it down?
More than once, I have "fixed" a computer by blowing out all the dust, careful vacuuming, and disconnecting and reconnecting all plugs (including the RAM sticks, but NOT the CPU).
Be sure to use static protection methods and other handling precautions. If you have never worked inside electronics equipment, consider getting some help.
I agree that regular 'PMS' (preventive maintenance schedule/services) should be performed. You should see some of the systems I repair that would have remained in service if only a good 'PMS' had been followed. Some of these systems (fileservers included) are just placed on the floor for convenience and never get touched unless the cleaning person bumps them. Not the best place to put a system. Even placing the boxen on a small platform will reduce the dust and dirt they take in.
Cleaning a system properly requires more than just blowing out or vacuuming. Card edges, cabling or other connectors may need attention. One should always follow safety procedures whenever handling electronic devices.
Thanks guys. The servers are actually physically clean, not dusty or anything. But what should I look for in the logs to find out what caused the panic?
Even if everything appears clean, it is entirely plausible that there are some bad connections. If there is, e.g., a bad connection to a RAM stick, I'm not sure that you will find that in the logs.
De-mate, inspect, and re-mate all connectors and the RAM sticks.
Also, how about temperature? If you monitor CPU temperature and it is higher than normal, you may have a bad heat-sink interface. That happened to me just a few months ago.
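For anyone who wants to check temperatures without extra tools, here is a minimal Python sketch. It assumes the kernel exposes thermal zones under /sys/class/thermal with a temp file in millidegrees Celsius, which many but not all machines do. The function name read_thermal_zones is just an illustration, and which zone corresponds to the CPU varies by hardware.

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {zone_name: degrees Celsius} for every readable thermal zone.

    Assumption: each thermal_zone*/temp file holds an integer in
    millidegrees Celsius, as the Linux sysfs thermal interface reports it.
    """
    readings = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            millideg = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            # Zone is missing, unreadable, or not reporting a number: skip it.
            continue
        readings[zone.name] = millideg / 1000.0
    return readings


if __name__ == "__main__":
    for name, celsius in read_thermal_zones().items():
        print(f"{name}: {celsius:.1f} C")
```

If nothing is printed, the box probably does not expose thermal zones this way, and a hardware monitoring package or the BIOS/IPMI readings would be the place to look instead.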
At the age of 10 I discovered that I could repair lawnmowers by disassembly and re-assembly----but I never knew WHY.
Now it's the same with computers, but at least I know why........
OK. I rebooted the box again and it actually restarted fine this time, so maybe everything's connected ok and clean and this was just a one time thing?
Possibly a one-time thing, but I would certainly not give up on this, as it may be a prelude of things to come. A 'PMS' entails more than just physical cleaning. Filesystem maintenance falls into that realm, along with physical checks on connectors and PSU rails, and any head cleaning or lens cleaning, to name a few.
Most system maintenance should be done on a regular schedule. Within that schedule one should set up diagnostic and physical checks to prevent catastrophic problems.
I am planning on e2fsck'ing the entire server next time I restart it, since I'm sure that hasn't been done in forever, I just need to figure out how to make that happen. Unless I'm still missing something, I don't see anything in the log that says why it did that.
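On making the check happen at the next restart: ext2/ext3/ext4 filesystems keep a mount counter, and the boot-time fsck does a full check once "Mount count" reaches "Maximum mount count". Here is a minimal Python sketch that reads those two fields from the text printed by `tune2fs -l` and says whether a check is due on that basis. The function name fsck_due_by_mount_count is made up for illustration, and it ignores the separate time-based check interval.

```python
def fsck_due_by_mount_count(tune2fs_output):
    """Decide from `tune2fs -l` text whether the mount-count check is due.

    Returns True if due, False if not due or disabled, None if the
    expected fields are missing from the text.
    """
    fields = {}
    for line in tune2fs_output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    try:
        count = int(fields["Mount count"])
        maximum = int(fields["Maximum mount count"])
    except (KeyError, ValueError):
        return None
    if maximum <= 0:
        # 0 or -1 means the mount-count based check is switched off.
        return False
    return count >= maximum


if __name__ == "__main__":
    sample = "Mount count:              41\nMaximum mount count:      30\n"
    print(fsck_due_by_mount_count(sample))
```

If the check turns out to be disabled, `tune2fs -c` can set a maximum mount count on the device. Creating an empty /forcefsck file before rebooting has also been a traditional way to force a check, but whether that is honored depends on the distribution and init system, so verify it for your servers first.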
What are you looking for in the log file? During a kernel panic, nothing is normally written to the log files. In my experience, a filesystem failure may be detected while booting that can only be fixed by rebooting again.
To help you in this particular case, I think it would be necessary to know the exact message shown on the screen during the kernel panic.
The exact message I do not recall. I thought it would have been written to a log file somewhere like everything else in Linux seems to be. Guess I was wrong.
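On the earlier question about restarting automatically instead of sitting at the panic screen: the kernel has a panic timeout, exposed as the kernel.panic sysctl in /proc/sys/kernel/panic. Zero means wait forever, a positive number means reboot after that many seconds, and a negative number means reboot immediately. Here is a minimal Python sketch that reads the current value and explains it. The function name describe_panic_timeout is only for illustration.

```python
from pathlib import Path


def describe_panic_timeout(path="/proc/sys/kernel/panic"):
    """Read the kernel.panic value and describe what the kernel will do."""
    seconds = int(Path(path).read_text().strip())
    if seconds == 0:
        return "kernel.panic=0: the kernel halts and waits after a panic"
    if seconds < 0:
        return f"kernel.panic={seconds}: the kernel reboots immediately after a panic"
    return f"kernel.panic={seconds}: the kernel reboots {seconds} seconds after a panic"


if __name__ == "__main__":
    print(describe_panic_timeout())
```

To change it, `sysctl -w kernel.panic=10` sets it for the running system and a `kernel.panic = 10` line in /etc/sysctl.conf makes it persistent (10 is just an example value). A panic that happens early in boot, before sysctl settings are applied, would probably need the `panic=10` kernel boot parameter instead. Keep in mind that an automatic reboot also clears the message from the screen, so write it down or photograph it first if you are still trying to diagnose the cause.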