LinuxQuestions.org


rbh123 05-14-2006 02:57 AM

Debugging kernel panic
 
Hi,

I am interested in learning how to debug kernel panics, but I don't know how to start. I need help understanding how to debug a kernel panic: how do I find the code that is causing the panic? I know that the first thing to do when there is a kernel panic is to capture the Oops, but it is difficult for me to understand the Oops.

Thanks,
rbh

pmarques 05-15-2006 01:15 PM

More information
 
To help you, I'll need more information:
- What kernel version are you using?
- Does the machine hang after the panic? I.e., is this just an oops or an actual panic?

If you're using a 2.6 kernel, make sure CONFIG_KALLSYMS is set in your configuration file, so that the oops shows symbol names rather than raw addresses.
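
One way to check this without reading the whole file by eye is a small script. The following is only a sketch: the function name kallsyms_enabled is made up for illustration, and where your configuration file lives (for example a .config in the kernel source tree) is an assumption that varies between setups.

```python
def kallsyms_enabled(config_text: str) -> bool:
    """Return True if the kernel config text has an active CONFIG_KALLSYMS=y line."""
    for line in config_text.splitlines():
        # A disabled option appears as "# CONFIG_KALLSYMS is not set",
        # and related options such as CONFIG_KALLSYMS_ALL must not count.
        if line.strip() == "CONFIG_KALLSYMS=y":
            return True
    return False
```

To use it, read your configuration file into a string and pass it in, e.g. kallsyms_enabled(open(".config").read()), adjusting the path to wherever your kernel configuration is stored.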

If your machine didn't hang, then the oops is in your syslog somewhere. If this is the case, please post a copy of the trace here so that I can try to point you in the right direction...
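
If the log is long, a short script can pull out the candidate lines. This is a rough sketch, not a complete oops detector: the marker strings are ones the kernel commonly prints at the start of an oops, the helper name find_oops_reports and the 30-line context window are arbitrary choices for illustration, and the log location (often /var/log/messages) depends on your distribution.

```python
# Strings that commonly appear at the start of an oops report.
OOPS_MARKERS = ("Oops:", "Unable to handle kernel", "kernel BUG at")

def find_oops_reports(log_text, context=30):
    """Return a list of line chunks, each starting at a line containing a marker."""
    lines = log_text.splitlines()
    reports = []
    skip_until = 0  # avoid reporting the same oops twice when markers are adjacent
    for i, line in enumerate(lines):
        if i < skip_until:
            continue
        if any(marker in line for marker in OOPS_MARKERS):
            reports.append(lines[i:i + context])
            skip_until = i + context
    return reports
```

Each returned chunk is the marker line plus the lines that follow it, which is usually where the register dump and call trace are found.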

rbh123 05-25-2006 04:01 AM

Need help in understanding kernel panic messages
 
I am using a 2.6 kernel.
My question is: when there is a kernel panic (system hang or Oops), we see messages like
1. Value of EIP and other registers
2. Call Trace
3. Stack contents
4. Panic message

So how do I understand these messages, and how do I use them to find the code that is causing the kernel panic?

Thanks in advance,
-rbh

pmarques 05-25-2006 08:22 AM

Oops tracing
 
You failed to reply to this part:

"If your machine didn't hang, then the oops is in your syslog somewhere. If this is the case, please post a copy of the trace here so that I can try to point you in the right direction..."

Even if the machine did hang, a digital-camera picture of the oops text, posted somewhere, would be useful.

Anyway, if you're using a vanilla kernel, you can find all the information you need in the file "Documentation/oops-tracing.txt" that is bundled with the kernel sources.
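
As a small illustration of reading a trace: with CONFIG_KALLSYMS enabled, the EIP line and the Call Trace entries of a 2.6 oops typically name locations as symbol+offset/size, meaning the address lies offset bytes into a function that is size bytes long, optionally followed by the module name in square brackets. The sketch below pulls those fields out of the oops text. The regular expression and the helper name parse_symbols are assumptions made for illustration and may not cover every format variation.

```python
import re

# Matches entries such as "some_func+0x1a/0x120" or "some_func+0x10/0x40 [some_module]".
SYMBOL_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:[ \t]+\[(?P<module>[\w-]+)\])?"
)

def parse_symbols(oops_text):
    """Return one dict per symbol+offset/size entry found in the oops text."""
    results = []
    for match in SYMBOL_RE.finditer(oops_text):
        results.append({
            "symbol": match.group("symbol"),
            "offset": int(match.group("offset"), 16),  # bytes into the function
            "size": int(match.group("size"), 16),      # total length of the function
            "module": match.group("module"),           # None for built-in code
        })
    return results
```

The first entry, taken from the EIP line, is usually the function to look at first, and the offset gives a rough idea of how far into that function the fault occurred.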

If you're using a vendor kernel (Red Hat, SUSE, Mandriva, etc.) you should report it to your vendor. Vendor kernels have specific patches that might cause problems, and it is not the kernel developers' responsibility to track down every bug that every distro inserts into their custom kernels.

You can always try to replace your kernel with a vanilla one (the latest available) and if the problem persists, then inform the relevant kernel developers.

PTrenholme 05-25-2006 09:14 AM

Quote:

Originally Posted by rbh123
Hi,

I am interested in learning how to debug kernel panics, but I don't know how to start. I need help understanding how to debug a kernel panic: how do I find the code that is causing the panic? I know that the first thing to do when there is a kernel panic is to capture the Oops, but it is difficult for me to understand the Oops.

Thanks,
rbh

To which pmarques replied:
Quote:

To help you, I'll need more information:
- What kernel version are you using?
- Does the machine hang after the panic? I.e., is this just an oops or an actual panic? [snip]
and, later,
Quote:

Anyway, if you're using a vanilla kernel, you can find all the information you need in the file "Documentation/oops-tracing.txt" that is bundled with the kernel sources.

Clearly, what pmarques was answering was "How can I solve a specific kernel panic?", whilst rbh123 was asking "How can I learn what to do when a kernel panic (or 'oops') happens, and how to deal with the error?"

The suggestion to look at the documentation that comes with the kernel source code is a place to start, although not all sources have the oops-tracing.txt file. (Note: You need to have downloaded the kernel sources for your distribution for this to be possible.)

Other things to try:

1) Review the log files in /var/log.
2) Google the error message.
3) Look for any bugzilla reports for the distribution and specific kernel that generated the message.
4) And, simplest of all, read the error message. Most of the time the cause is fairly clear from the text of the message and any error messages that preceded it.


All times are GMT -5.