Are kernel panics normal at reboot when a server has been running for a long time?
I am planning on e2fsck'ing the entire server next time I restart it, since I'm sure that hasn't been done in forever. I just need to figure out how to make that happen. Unless I'm still missing something, I don't see anything in the log that says why it panicked.
You can boot into single-user mode so that fsck can run on the system without problems. Either pass the 'single' parameter to the kernel or boot from the install CD/DVD. A LiveCD also works: it ensures that the disk is not mounted while you run fsck.
As others have pointed out, the logs may not contain all the information relevant to a problem if the disk or filesystem is not available to the kernel at boot time. Your observations at boot will most likely be the only way to find the error.
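As a rough sketch of how a check on the next boot is commonly requested on sysvinit-based systems: `shutdown -rF now` asks for a forced fsck at reboot, and it does so by leaving an advisory flag file named `/forcefsck` that the boot scripts can look for. This assumes your distribution's init scripts honor that flag, which is worth confirming before relying on it. The `request_fsck` helper and its optional root-directory argument are made up for illustration.

```shell
# Hypothetical helper: create the flag file that sysvinit-style boot scripts
# check in order to force a filesystem check on the next boot.
# $1 is the root directory (defaults to /); it is a parameter only so the
# helper can be tried against a scratch directory first.
request_fsck() {
    root="${1:-/}"
    touch "${root%/}/forcefsck"
}

# Usage (as root), followed by a normal reboot:
#   request_fsck
#   shutdown -r now
```

If the flag is honored, the check runs early in the boot sequence, before the filesystems are mounted read-write.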
I rebooted it again, no e2fsck'ing, and all I could see on the screen that made sense directly above the kernel panic line was a message about modprobe, then what looked like a memory dump or something after it.
Yeah, I saw the man page that says it adds and removes modules from the kernel, but I think I need a Linux to English translation haha. Why would you add or remove things from the kernel unless you're working on the source yourself? Guess I don't understand what it does at bootup.
Kernel modules are parts of the kernel that can be added at runtime when they are needed. modprobe is the program that adds kernel modules. During kernel configuration, one can choose whether a part is compiled into the kernel or built as a module.
In your case, I think a hardware driver that is needed early at boot time was built as a module.
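To see this on a running system: the kernel lists its currently loaded modules in /proc/modules, which is the same data `lsmod` prints, one module per line with the name in the first column. Below is a minimal sketch that checks whether a given module is loaded. The function name `is_module_loaded` and its optional second argument are assumptions made for the example.

```shell
# Succeed (exit status 0) if the named module appears in the module list.
# $1 = module name
# $2 = file to read, defaults to /proc/modules (hypothetical parameter,
#      there so the function can be tried on a saved copy of the list)
is_module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Usage:
#   if is_module_loaded ext3; then echo "ext3 is a loaded module"; fi
```

A driver that does not show up here may still be present, because anything compiled directly into the kernel never appears in the module list.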
I'd recommend learning to build your own kernels. Reading the kernel documentation will teach you a lot about kernel modules.
If the problem is a bogus kernel module, yes, an update could fix it. But if it's bogus hardware, no update is going to magically fix it.
You will lose nothing by installing an up-to-date kernel and trying to boot it. You don't need to replace your current one; you can have as many kernels as you wish in your boot menu. However, the complexity of the operation will depend on a number of things. First, we would need to know whether you are using a 2.4 or a 2.6 kernel. Both branches are maintained, and servers sometimes use 2.4 kernels. If that's the case, updating to 2.6 would probably not be advisable, and you should instead update to the latest 2.4.x available.
But this kind of bug in a stable kernel release is really not frequent, which makes me think that it's probably a hardware problem or a problem with a 3rd party driver. We really have no info, so we can only guess. It would be useful to know if the segfault happens before init (so the problem is definitely in the kernel or your hardware) or after init has come on the scene. Even a photo of the screen at that moment could help if there's no other way.
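To answer the 2.4 versus 2.6 question, `uname -r` prints the release string of the running kernel. Here is a small sketch that reduces that string to its branch; the `kernel_branch` name and the optional argument are invented for the example.

```shell
# Print the branch (first two dotted components) of a kernel release string.
# $1 = release string; defaults to the running kernel's, from `uname -r`.
kernel_branch() {
    release="${1:-$(uname -r)}"
    echo "$release" | cut -d. -f1,2
}

# Usage:
#   kernel_branch                 # branch of the running kernel
#   kernel_branch 2.4.21-47.EL    # prints 2.4
```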
Thanks i92. I think what I'm going to try next is to run up2date and have it update everything that is not current. The server hadn't been restarted in almost a year, so I'm guessing the updates are that far behind. If it panics again after the update, I'll try to get more detail on what's on the screen.
Hi everybody. Well, I ran up2date -u and got everything updated, but then on reboot it panicked again as it was shutting down. I didn't have a camera to take a picture of all the stuff on the screen, but looking at the memory dump more, it looked like it was trying to unload and clean up something, and it had ip_conntrack in parentheses on the top line of the detailed dump. Not sure if that helps at all.
But, since the panic appears to be happening before it shuts down, does that mean it might be logging this stuff somewhere?
I don't see anything in there though. When I do a
grep -i panic *
I don't get anything. When I do a
grep -i ip_conntrack *
I only get the following message from today:
messages:Oct 12 07:28:54 ServerName kernel: ip_conntrack version 2.1 (8192 buckets, 65536 max) - 348 bytes per conntrack
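One more search that may be worth trying: the kernel usually reports a crash as an "Oops" followed by a call trace, so the word "panic" may never appear in the log even when the crash was recorded. The sketch below looks for those markers instead. The pattern list is only my assumption about typical kernel messages, and the `find_oops_lines` name is made up. Also keep in mind that if syslog has already been stopped by the time the crash happens during shutdown, nothing will have been written to the files at all.

```shell
# Print log lines that look like part of a kernel crash report.
# Arguments: one or more log files to search.
find_oops_lines() {
    grep -i -E 'oops|kernel BUG|call trace|unable to handle kernel' "$@"
}

# Usage, from /var/log:
#   find_oops_lines messages*
```

If this finds nothing either, the text on the console at the moment of the panic is probably the only record.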