LinuxQuestions.org
06-14-2012, 04:03 PM   #1
clifford227
Member
 
Registered: Dec 2009
Distribution: Slackware 14
Posts: 282

Rep: Reputation: 64
New kernel - computer freezing - boot time CPU error message


Hello,

A few days ago I was messing around with slackpkg for the first time and, along with various other packages, upgraded the kernel. Unfortunately, my computer has since started freezing when programs max out the CPU, and I'm also getting the following error (but only occasionally) in the boot messages:

CPU 0: Machine Check Exception: 004
Kernel panic - not syncing: CPU context corrupt.

I guess there's a strong possibility that this isn't just a coincidence, and that the new kernel has screwed something up.

Am I OK just using upgradepkg to go back to my previously installed kernel (an official Slackware-packaged 2.6.27.31 SMP i686 kernel)?

Last edited by clifford227; 06-14-2012 at 04:06 PM.
 
06-14-2012, 05:08 PM   #2
guanx
Senior Member
 
Registered: Dec 2008
Posts: 1,014

Rep: Reputation: 146
Do not overclock any part of your system.

Clean your heatsinks. If that doesn't work, get a new computer.
 
06-14-2012, 10:33 PM   #3
ReaperX7
Senior Member
 
Registered: Jul 2011
Distribution: LFS-SVN, FreeBSD 10.0
Posts: 3,197
Blog Entries: 15

Rep: Reputation: 826
Did you run lilo after you installed the new kernel?
 
06-15-2012, 03:58 AM   #4
ReggiePerrin
LQ Newbie
 
Registered: May 2010
Posts: 9

Rep: Reputation: 0
I would try running the OS from the DVD drive using a "live disc", as it is called.
This is (usually) the same disc as the Linux install disc. Instead of installing, just run the OS from there and try to max out the CPU as before. See if the same symptoms occur.
You won't have the same programs as in your installed system, but you should be able to max out your CPU at any rate.
This helps narrow down what has gone wrong: if the freezes also happen there, the problem is unlikely to be your installed software.
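
If the live environment has no convenient program for loading the CPU, something like the following could do it. This is only a rough sketch, assuming a Python 3 interpreter is available on the disc; the names `burn` and `load_all_cores` are made up for illustration.

```python
import multiprocessing
import time


def burn(seconds):
    """Spin in a tight counting loop for `seconds`; return the loop count."""
    deadline = time.monotonic() + seconds
    count = 0
    while time.monotonic() < deadline:
        count += 1
    return count


def load_all_cores(seconds):
    """Run one burn() worker per CPU core so that every core is busy."""
    workers = multiprocessing.cpu_count()
    with multiprocessing.Pool(workers) as pool:
        return pool.map(burn, [seconds] * workers)
```

Calling `load_all_cores(60)` keeps every core busy for about a minute (the duration is an arbitrary choice). If the machine freezes during the run on the live disc as well, that points away from the installed system.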

Can you post your system specifications here, please?
It helps to know whether your system is too low-spec to run a modern operating system, or whether there is a hardware issue (inadequate or unavailable drivers, etc.).

If it was running OK and the problems appeared right after you updated, that points to a software problem rather than a sudden, coincidental hardware failure such as a PSU or RAM fault (though it does not rule one out).
There are many things which can cause the symptoms you describe. I would definitely clean the heat sinks (do NOT use a vacuum cleaner!).
Modern OSes are usually quite able to run with the CPU maxed out. All that happens is that the system slows down somewhat, especially if you don't have much RAM.

Hope this helps.
 
06-15-2012, 09:43 AM   #5
Perromuerto
LQ Newbie
 
Registered: Oct 2009
Location: Venezuela
Distribution: Slackware
Posts: 7

Rep: Reputation: 2
A description in Wikipedia

There is a detailed explanation on Wikipedia:

http://en.wikipedia.org/wiki/Machine_Check_Exception

Essentially your hardware is toasted!
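
For reference, the number in the message from post #1 can be decoded a little further. The sketch below assumes that the value the kernel prints after "Machine Check Exception:" is the machine-check global status register (MCG_STATUS), and that the "004" quoted in the post is that value in hexadecimal with leading zeros dropped. The bit names follow Intel's documentation; `decode_mcg_status` is a made-up helper name.

```python
# Low three bits of the IA32_MCG_STATUS register (Intel naming).
MCG_STATUS_BITS = {
    0: "RIPV (restart instruction pointer valid)",
    1: "EIPV (error instruction pointer valid)",
    2: "MCIP (machine check in progress)",
}


def decode_mcg_status(value):
    """Return the names of the flags that are set in an MCG_STATUS value."""
    return [name for bit, name in sorted(MCG_STATUS_BITS.items())
            if value & (1 << bit)]


# "004" from the boot message, read as hexadecimal (an assumption).
print(decode_mcg_status(int("004", 16)))
# prints ['MCIP (machine check in progress)']
```

Under those assumptions only MCIP is set, which says that a machine check was raised but not what caused it. The cause is recorded in the per-bank status registers, which the quoted message does not include.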
 
06-15-2012, 04:30 PM   #6
Ratamahatta
LQ Newbie
 
Registered: Feb 2012
Location: Koblenz, Germany
Distribution: aptosid linux
Posts: 16

Rep: Reputation: 2
I support ReggiePerrin's post. Here are just some additions.

On my old openSUSE 10.3 I had a package installed that was called "lm_sensors". It may help you check whether this really is a heat problem.

You might want to check the syslog after this happens. /var/log/messages.X or /var/log/kern.log.X (the latter doesn't exist on openSUSE 10.3, but does on current aptosid and Ubuntu) contain logs (dmesg output and more) from previous sessions. kern.log.X often contains quite detailed information (e.g. "Null pointer dereference" and a stack trace).
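
A minimal sketch of that log check, assuming Python 3 and plain-text logs under /var/log/messages*; the search phrases and the function names are my own choices, not anything the kernel or syslog defines.

```python
import glob
import re

# Phrases worth flagging; extend the list as needed.
PATTERN = re.compile(r"machine check|kernel panic|\bmce\b", re.IGNORECASE)


def find_mce_lines(lines):
    """Return the log lines that mention a machine check or kernel panic."""
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]


def scan_logs(pattern="/var/log/messages*"):
    """Scan every log file matching `pattern`; skip files we cannot read."""
    hits = {}
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path, errors="replace") as handle:
                found = find_mce_lines(handle)
        except OSError:
            continue  # e.g. no read permission without root
        if found:
            hits[path] = found
    return hits
```

Compressed rotated logs (for example .gz files) will not match anything when read this way and would need to be unpacked first. Also keep in mind that a panic early in boot may never reach the log on disk, so an empty result does not prove that nothing happened.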

The page linked by Perromuerto mentions that it might be a software bug or wrong kernel architecture just as well. So double check that your CPU architecture matches the kernel's! If you tried the things ReggiePerrin suggested and the architectures match, I'd say do go back to the old kernel if you can.
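
One way to do the architecture check before installing a kernel package is sketched below. It only covers the 32-bit versus 64-bit case on x86, and it assumes you pass in the architecture string of the kernel package yourself (for example taken from the package file name) together with the contents of /proc/cpuinfo. Both function names are hypothetical.

```python
def cpu_supports_64bit(cpuinfo_text):
    """True if a 'flags' line in /proc/cpuinfo text lists 'lm' (long mode)."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            _, _, rest = line.partition(":")
            if "lm" in rest.split():
                return True
    return False


def kernel_matches_cpu(kernel_arch, cpuinfo_text):
    """A 64-bit kernel needs a 64-bit CPU; a 32-bit kernel runs on either."""
    kernel_is_64bit = kernel_arch in ("x86_64", "amd64")
    return (not kernel_is_64bit) or cpu_supports_64bit(cpuinfo_text)
```

For example, `kernel_matches_cpu("x86_64", open("/proc/cpuinfo").read())` tells you whether an x86_64 kernel package could run on this CPU at all. It says nothing about finer distinctions such as i486 versus i686 builds.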

Last edited by Ratamahatta; 06-15-2012 at 04:49 PM. Reason: Clarified some things.
 
06-16-2012, 09:00 AM   #7
purevw
Member
 
Registered: Jan 2007
Location: Texas
Distribution: OpenSuSE 13, Kernel 3.13.3
Posts: 88

Rep: Reputation: 43
It does sound like a possible heat problem. As for your system being too old to run a modern OS: that can make it take forever to do a job (at which point you should consider a new computer), but it should not cause an MCE. lm_sensors can give you temperature info, as Ratamahatta said, assuming that your CPU is modern enough to have temperature diodes. hddtemp can give you your hard drive temperatures, if the drives are SMART-capable. GKrellM is also offered by most flavors of Linux. It runs on your desktop and gives you a real-time view of what your system is doing: temperatures, which PIDs are using the most resources, network and hard drive throughput, RAM, etc. It is a fairly small window that can be placed out of the way. It runs on my system 24/7.
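
If none of those tools is installed, the temperatures can sometimes be read straight from sysfs. The sketch below assumes a kernel that exposes thermal zones under /sys/class/thermal and reports them in millidegrees Celsius; whether any zones show up depends on the hardware and the loaded drivers. The function names are made up for illustration.

```python
import glob


def millidegrees_to_celsius(raw):
    """Convert a sysfs temperature string (millidegrees C) to degrees C."""
    return int(raw.strip()) / 1000.0


def read_thermal_zones(pattern="/sys/class/thermal/thermal_zone*/temp"):
    """Return {path: degrees C} for every readable thermal zone."""
    temps = {}
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as handle:
                temps[path] = millidegrees_to_celsius(handle.read())
        except (OSError, ValueError):
            continue  # zone present but unreadable or not numeric
    return temps
```

Calling `read_thermal_zones()` once at idle and again while the CPU is maxed out gives a rough idea of whether the temperature climbs sharply before a freeze.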

The following advice involves you getting inside your computer and working. If you are not comfortable with that, then you should find someone who is.

The CPU cooler, as well as all other coolers and the power supply, should be cleaned at least every few months. I use an air compressor, but canned compressed air works too. Do not allow the compressed air to spin the cooling fans, as the fans can fly apart or the bearings can be damaged if spun too quickly. After blowing it out, let the computer sit for a couple of hours: if your compressor does not have a moisture trap, there is a chance that tiny amounts of water were sprayed on the components, and they will need time to dry.

One last thing to consider is the thermal compound or pad below the CPU cooler, and the cooler itself. Many manufacturers would let you think that their product will last forever, but this is never the case. I replace my thermal compound every couple of years.
 
06-22-2012, 06:46 AM   #8
H_TeXMeX_H
Guru
 
Registered: Oct 2005
Location: $RANDOM
Distribution: slackware64
Posts: 12,928
Blog Entries: 2

Rep: Reputation: 1269
The first thing to do would be to switch back to the older, known-good kernel. I would also run memtest86 to be sure it's not a RAM issue.
 
  


All times are GMT -5.
