LinuxQuestions.org


browny_amiga 12-01-2015 07:17 PM

Kernel Panic on Lenovo T61 with Debian 8 (stable)
 
Hi,

I'm having the strangest of problems: Kernel panics with Debian 8 Jessie (stable). A thing that after about 10 years of using Linux has never happened to me.
This is a Lenovo T61 Laptop that I have been running Linux on for years (Debian 7 and 6 before that), without ever having a panic that I remember.
Now when I use the system, with Gnome or KDE, after a while (it feels like it happens when I have a lot of windows open), the mouse pointer freezes, the system hangs for around 10 seconds and then it just restarts. The logs show absolutely nothing, so I don't even know if this is a panic, but I would assume so.

The first thing I want to do is to stop it from restarting and actually display the error, the panic, so I can start to troubleshoot it.
Then I need to find out how to do a memory dump for submitting a bug report.

I have never done this before (or had to do this), and this is a work laptop, a productive system, which makes crashes like this totally unacceptable.
I might have to downgrade to Debian 7 if I can't fix it.

My suspicion so far is the nouveau graphics card driver.
lspci reports: NVIDIA Corporation NV5M64 [RIVA TNT2 Model 64/Model 64 Pro] (rev 15)
I used to use the proprietary driver under Debian 7 and it ran like a pig, was so terribly slow and laggy. But it was rock stable.
Now the nouveau driver under Debian 8 gives me stellar performance and finally makes me appreciate this GPU, BUT it is crashing nonstop now and I suspect that it is the driver.

TxLonghorn 12-01-2015 10:52 PM

The first things I would suspect:
The cpu could be overheating.
The power supply could be failing.
Memory modules could be going bad.

Start monitoring your cpu temperatures.
Run a memory check.
CLEAN inside the computer case.
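
For the temperature monitoring suggested above, here is a minimal Python sketch. It assumes the kernel exposes thermal zones under /sys/class/thermal; which zones actually exist on a T61 is not known from this thread, and the function name is made up for illustration.

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {zone_name: degrees_celsius} for every readable thermal zone."""
    temps = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            # The kernel reports this value in millidegrees Celsius.
            millideg = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # zone has no readable temperature: skip it
        temps[zone.name] = millideg / 1000.0
    return temps


if __name__ == "__main__":
    zones = read_thermal_zones()
    if not zones:
        print("no thermal zones found (assumed sysfs layout not present)")
    for name, celsius in zones.items():
        print(f"{name}: {celsius:.1f} C")
```

Running it in a loop while working normally would show whether temperatures climb before a freeze.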

browny_amiga 12-02-2015 05:44 PM

Quote:

Originally Posted by browny_amiga (Post 5458267)
Hi,

I'm having the strangest of problems: Kernel panics with Debian 8 Jessie (stable). A thing that after about 10 years of using Linux has never happened to me.
This is a Lenovo T61 Laptop that I have been running Linux on for years (Debian 7 and 6 before that), without ever having a panic that I remember.

Hmm, you did not seem to read my post entirely (see above). This is a laptop, so there is no power supply issue (it runs on battery, with the PSU as a backup). The CPU gets throttled down if it gets too hot, so there is no danger of it overheating. The RAM works fine and, most importantly, this laptop has been working perfectly since 2008: no panics, no problems, no crashes. I constantly loaded the RAM under Debian 7, so there are no RAM problems either. But NOW that I have installed Debian 8, I have this problem. So it is clearly a software issue, that much is certain.

Timothy Miller 12-02-2015 07:45 PM

First thing I'd suspect is bad sectors on the hard drive causing the kernel to not be able to read necessary files. Does it have the bios diagnostics?

Emerson 12-02-2015 08:03 PM

Will it act up when booted from USB, SystemrescueCD for instance? You can also run badblocks and fsck from USB.

browny_amiga 12-02-2015 08:33 PM

This system worked 100% reliably for years, every day, and I ran Debian 7 on it just last week. I work with this laptop every day and it worked, no problem. So this is not hardware related at all, I'm 100% sure of this. This is not the first time I have troubleshot hardware, I'm quite experienced with it. I ran a RAM test for 5 hours before installing the new OS and it came back clear. I also formatted the HD (an SSD) with a bad-block check as a precautionary measure, and it found nothing.
But one way or another, I need to look at the kernel panic message, what does it show? What part of the kernel caused it?
I remember Linux used to display it, like a Windows Bluescreen, but it does not anymore. How can I switch this on?
Without this data, troubleshooting doesn't have a good chance of success.
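
Whether a panicked kernel stays halted with its message or reboots is controlled by the kernel.panic sysctl: 0 means no automatic reboot, a positive value is the number of seconds before rebooting, and a negative value means reboot immediately. A minimal Python sketch to check the current setting follows. The helper names are made up for illustration, and it is only an assumption that this setting is what causes the restarts described here.

```python
from pathlib import Path


def describe_panic_timeout(value):
    """Explain a kernel.panic value (seconds before reboot after a panic)."""
    if value == 0:
        return "no automatic reboot: the machine stays halted after a panic"
    if value < 0:
        return "reboot immediately after a panic"
    return f"reboot {value} seconds after a panic"


def read_sysctl_int(name, root="/proc/sys"):
    """Read an integer sysctl such as 'kernel.panic'; None if unavailable."""
    try:
        text = (Path(root) / name.replace(".", "/")).read_text()
        return int(text.split()[0])
    except (OSError, ValueError, IndexError):
        return None


if __name__ == "__main__":
    timeout = read_sysctl_int("kernel.panic")
    if timeout is None:
        print("kernel.panic is not readable on this system")
    else:
        print("kernel.panic =", timeout, "->", describe_panic_timeout(timeout))
```

If the value is nonzero, setting it to 0 as root (for example with `sysctl -w kernel.panic=0`) should keep the machine halted after a panic. The message itself may still be hidden behind a graphical session, so reproducing the crash from a text console may be necessary to see it.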

Emerson 12-02-2015 09:46 PM

If you keep tail -f /var/log/messages running in a foreground terminal window ... maybe you can see some clues.
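
As a complement to watching the log live, here is a minimal Python sketch that scans a log file after the reboot for lines that might relate to a kernel or driver fault. The patterns are illustrative guesses, not a list of messages this system is known to produce, and the last lines before a hard hang may never reach the disk at all.

```python
import re
import sys

# Illustrative guesses at what a kernel or graphics driver fault might log.
SUSPICIOUS = re.compile(r"kernel panic|oops|BUG:|nouveau", re.IGNORECASE)


def suspicious_lines(lines):
    """Yield only the lines that match one of the SUSPICIOUS patterns."""
    for line in lines:
        if SUSPICIOUS.search(line):
            yield line.rstrip("\n")


if __name__ == "__main__":
    # Default path taken from the post above; pass another log as argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/messages"
    with open(path, errors="replace") as log:
        for hit in suspicious_lines(log):
            print(hit)
```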

browny_amiga 12-03-2015 02:25 PM

Quote:

Originally Posted by Emerson (Post 5458951)
If you keep tail -f /var/log/messages running in a foreground terminal window ... maybe you can see some clues.

Hmm, that's something to try.
Thanks for the tip.

browny_amiga 12-03-2015 02:51 PM

Is there a mechanism in Linux to note a crash like this, where the system notifies the user that there has been a kernel panic and that the system restarted? So far I have not seen a thing and the log does not show anything, not even a mention of a panic, which sounds a little criminal to me:
You might have a server that crashes and restarts automatically, you just have data loss, and nobody will ever know that it panicked. There must be at least a note in the log.
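
One kernel mechanism along these lines is pstore, which can preserve the tail of the kernel log across a reboot when a persistent backend (such as ACPI ERST, EFI variables or ramoops) is available. Whether this laptop has such a backend is unknown, so the following Python sketch only checks whether any records were saved. The function name is made up for illustration, and reading the directory typically requires root.

```python
from pathlib import Path


def list_pstore_records(base="/sys/fs/pstore"):
    """Return the names of saved crash records, or [] if none are readable."""
    directory = Path(base)
    try:
        return sorted(e.name for e in directory.iterdir() if e.is_file())
    except OSError:
        # Directory missing, not mounted, or not readable by this user.
        return []


if __name__ == "__main__":
    records = list_pstore_records()
    if records:
        print("saved crash records:")
        for name in records:
            print(" ", name)
    else:
        print("no pstore records found (none saved, unsupported, or no access)")
```

An empty result does not prove there was no panic; it may only mean nothing could be stored.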

Emerson 12-03-2015 03:18 PM

I'm not convinced you have a kernel panic. It still can be a power issue. There are power supplies on the motherboard to supply 12 V, 5 V, 3.3 V. One of these may be suffering from old age. Besides, a kernel panic usually results in a system hang, not a reboot. If you have a watchdog enabled, it will reboot, but not instantly.

bimboleum 12-03-2015 05:25 PM

Hi,
Suspect the fan ... t61's are notorious for having fan problems as they age ... fixing them is not rocket science .. google thinkpad t61 hardware manual .. there are three of them depending on your model .... fans/cooler assemblies are about $30 on ebay.

As always YMMV.

cheers
pete

browny_amiga 12-04-2015 07:30 PM

1 Attachment(s)
Not a fan problem, not a CPU overheating problem, not a GPU overheating problem. I just checked: I ran CPU burn and glxgears for 20 minutes, and nothing happened. And that was on battery. In normal use the system usually crashes long before that.
For some reason, it does not seem to crash if you just let it sit idle. Moving windows around and opening and closing programs, basically working with it, makes it crash, 100% sure.
If you don't believe me, please see the screenshot. The fans work really well on this laptop, better than on any other I have ever seen. The laptop is now almost 10 years old and you have to strain your hearing to even hear them spin; it is not one of those laptops that sounds like an airplane taking off when you work with it.

browny_amiga 12-04-2015 07:36 PM

Quote:

Originally Posted by Emerson (Post 5459359)
I'm not convinced you have a kernel panic. It still can be a power issue. There are power supplies on the motherboard to supply 12 V, 5 V, 3.3 V. One of these may be suffering from old age. Besides, a kernel panic usually results in a system hang, not a reboot. If you have a watchdog enabled, it will reboot, but not instantly.

Please, I ask you guys to read what I'm writing. I explicitly mentioned that the system hangs for a while and then restarts; it does not restart instantly.
I have noticed that people keep mentioning things that were clearly stated or ruled out in my posts.

browny_amiga 12-04-2015 07:39 PM

One thing that I find shocking is that Linux does not have any crash reporting for Kernel Panics, the System is not even aware of it happening. A server could crash 20 times in a row, always restart and nobody would even know, you check the logs, NOTHING mentioned in there. That is really pathetic. I'm a big fan of Linux and use it on most of my systems, but this is really terrible.
If stuff screws up so bad, you want to at least know about it.

Emerson 12-04-2015 08:51 PM

You never wrote that you have a watchdog enabled. Too bad people ask questions and do not read the answers.

I still think this is a hardware issue, probably the graphics part going haywire. I still do not believe you have a kernel panic.


All times are GMT -5.