LinuxQuestions.org
Old 03-23-2007, 07:25 PM   #1
sancho
Member
 
Registered: Sep 2003
Distribution: Ubuntu 9.04/9.10 (64-bit)
Posts: 149

Rep: Reputation: 15
FC6 Won't Boot: "Critical Temperature Reached (128 C), shutting down"


This concerns my Compaq V2000 notebook PC (AMD Turion64 3800+, 1GB RAM) running Fedora Core 6 (x86_64). Shortly after the kernel is loaded--and before the X server starts for graphical startup--I see a kernel message that says:

Code:
Critical temperature reached (128 C), shutting down.
Naturally, it follows through and does indeed shut down. The problem is that the CPU is nowhere near that temperature--in fact, I've booted into Windows XP and run some temp/fan monitoring software to check it.

This is a "sometimes but not always" problem. I have used this laptop on a daily basis for months now and noticed that this has gone from not happening at all; to happening sometimes and being annoying (i.e. if I'm persistent and try to start it 5 times, it finally works); to now, when it won't boot at all (i.e. 15+ reboots and it still won't go).

Another interesting trait: once the machine gets past init and into GDM, I will never have any problems with it until I shut it off. I can use it for 3+ hours until the battery dies and never have any such problems. So it's just a small window of time (before init?) when this occurs.

Last tidbit: This only happens in Fedora. This laptop triple boots Windows XP (i386), Fedora Core 6 (x86_64), and Ubuntu Edgy (x86_64)--and none of these other installs have had any problems with this whatsoever.

I have been able to locate a kernel bug (Bug 3584) which shows that other people are having the problem, too, and that kernel devs are aware of the problem. However, I'm wondering why this only happens in FC6.

Does anybody know of any workaround to this? Am I going to have to recompile my kernel and, if so, how can I go about doing that on a non-Fedora box (since all of my desktops are Ubuntu Edgy)?

Thanks
 
Old 03-23-2007, 08:14 PM   #2
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 556
Does the FC6 machine use lm_sensors or a similar tool for monitoring hardware sensors?

I don't know much about FC6 except what I read around here, and that seems to be a load of bugs. However, I am wondering if either the sensor-monitoring program or the kernel is misconfigured and is monitoring the wrong hardware sensor, or if the calculations in the sensors.conf file are out of whack, causing weird results?

If by some stroke of luck this happens to be along the lines of the problem, you could adjust or comment out the offending calculation or sensor in the config file as a temporary workaround.
In reality, I think it's safe to say that if a CPU reached 120 C it would be dead.
 
Old 03-23-2007, 08:32 PM   #3
inspiron_Droid
Member
 
Registered: Dec 2006
Distribution: Debian (Wheeze)
Posts: 391

Rep: Reputation: Disabled
I'd suggest that you open up your tower and do a thorough vacuuming job, including removing the heat sink and fan assembly from your processor and blowing some compressed air through the fins on the heatsink, as they could be really dusty. You will need a tube of heat transfer gel, which is available at your local RadioShack or other electronics parts store; that is also where you will find the compressed air.
 
Old 03-23-2007, 10:28 PM   #4
sancho
Member
 
Registered: Sep 2003
Distribution: Ubuntu 9.04/9.10 (64-bit)
Posts: 149

Original Poster
Rep: Reputation: 15
Funny you should mention that, GrapefruiTgirl: after skimming through the kernel bug report, that's almost exactly what is going on. Although I don't fully understand the issue (and certainly can't help to resolve it at the kernel level), lm_sensors and ACPI are conflicting on my system and causing erroneous results. One of the kernel devs doesn't seem to have good things to say about ACPI, but it also seems to be a "necessary evil" that cannot be eliminated as long as hardware is still using it.

As an interim solution, I've added the parameter "acpi=off" to the GRUB kernel line to disable ACPI. The machine boots, but I've got a feeling this isn't going to be good for battery power. Not only that, but I can't even monitor the charge any more. So I'm still looking for a better solution; but at least I can boot.

flanksteak: Thanks for the suggestion, but this is a laptop and the cooling hardware definitely is working fine. The other 2 OS installations on this machine don't have any problems at all.
 
Old 03-23-2007, 11:17 PM   #5
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 556
@ Sancho: I have done some work on my own lm_sensors setup, so I am comfortable with that, but as for ACPI, while I know 'what' it does, I don't know enough about it to make any recommendations about how to use it AND circumvent this problem. I can say I have come across many threads, websites and bug reports about devices which don't work right alongside ACPI.
Nice move booting with acpi=off. I wonder, do you think that machine could use the deprecated APM system instead of ACPI to do the same job, at least as far as allowing you to monitor the battery? I use lm_sensors on my machine, and compiled the necessary sensor device into my kernel, but I really don't know if it is in fact ACPI that allows me to read the data. Perhaps I will disable ACPI either in LILO or in my BIOS, and see if my sensors are still readable.
Perhaps this is a solution -- you will need to recompile the kernel if so. Can you do that, or is it beyond your knowledge? It's not too big a deal generally. I don't know MEPIS, but I can help with the kernel if it comes to it.
I gotta get to bed. Will check in tomorrow and see what's what.
 
Old 03-24-2007, 12:59 PM   #6
sancho
Member
 
Registered: Sep 2003
Distribution: Ubuntu 9.04/9.10 (64-bit)
Posts: 149

Original Poster
Rep: Reputation: 15
Quote:
Originally Posted by GrapefruiTgirl
I wonder, do you think that machine could use the deprecated APM system instead of ACPI to do the same job, at least as far as allowing you to monitor the battery?
I'm not sure, but given that it's less than 2 years old, I would hope it's not using anything "deprecated"! What I do know, however, is that when I pass the parameter "acpi=off" on the kernel boot line, I can find no evidence that Fedora "thinks" it's running on a laptop anymore. For example, the battery charge reports 0% even when running on batteries (as per the GNOME battery applet) and the GNOME Power Management Preferences no longer has a "Running on Battery" tab (only a "Running on AC" tab).

As for recompiling the kernel: I'm familiar with this process when using the "stock" kernel source tarball (i.e. from kernel.org). However, I've found that this gets me into nothing but trouble when using such a kernel in a modern distro. Usually I run into all sorts of problems at init, with modules not being able to load and such. Also, when it comes to using third-party kernel modules (e.g. display drivers from livna.org, or ndiswrapper), these modules expect a certain "stock" Fedora RPM version number such as "2.6.20-1.1234"; however, when I recompile the kernel, I'll end up with some number different from that, causing any such modules to be incompatible.
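To make the version-mismatch point concrete, here is a minimal sketch (not anything Fedora ships). It assumes the conventional layout where modules for a kernel live under /lib/modules/<release>, with <release> being the string that `uname -r` prints. The helper names `modules_dir` and `has_modules_for` are made up for illustration.

```python
import os
import platform


def modules_dir(release, root="/lib/modules"):
    """Path where modules built for the given kernel release are expected."""
    return os.path.join(root, release)


def has_modules_for(release, root="/lib/modules"):
    """True if a module tree exists for exactly this release string."""
    return os.path.isdir(modules_dir(release, root))


if __name__ == "__main__":
    # platform.release() returns the same string that `uname -r` prints.
    running = platform.release()
    print(running, has_modules_for(running))
```

A module packaged for "2.6.20-1.1234" sits in that release's directory, so a rebuilt kernel that reports a different release string will not find it there.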

In other words... I could recompile the kernel, but I'm afraid to.

Assuming I could circumvent the problems listed above, what are you suggesting that I do?
 
Old 03-24-2007, 01:39 PM   #7
james_jenkins
Member
 
Registered: Feb 2007
Location: Missouri - USA
Distribution: Usually Suse or SLED, usually.
Posts: 35

Rep: Reputation: 15
By chance, have you checked for BIOS updates? If you perform a few billion of them, you will notice from the change logs that "Power Management" is one of the top reasons for them. On another note, have you tried resetting the BIOS back to its defaults? I have seen many extended battles with machines caused by one jacked-up setting in the BIOS.

As far as the other two OSs not having a problem, that is a moot point. Just because an issue doesn't seem to affect them doesn't mean that you don't have it. It is good diagnostic information, but it isn't absolutely conclusive.

And yes, wrong settings in the BIOS can cause unpredictable and intermittent glitches. A logical person would think that they would not, but it happens.

On the other hand, you might just have a wrong setting in a config file. Or maybe it IS just a Fedora bug, that wouldn't shock me...

James

Last edited by james_jenkins; 03-25-2007 at 11:41 PM.
 
Old 03-24-2007, 03:33 PM   #8
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 556
LOL, well, re: the last line of the OP's last post, *IF* you were to recompile the kernel (and it may not be worth it, I don't know) I would try using the APM power management routines in the kernel, rather than the ACPI system. It's generally an either/or situation. I used APM on my machine when I first started with Linux, and it worked just as well as ACPI, with the advantage that there were more places I could use power-off and standby modes. For example, using APM, I could use the monitor standby/shutdown when configured from my screensaver. Now I am using ACPI, and those settings don't shut the monitor off; they only blank it after a while, and I have to use DPMS in xorg.conf to manage the shutdown of the monitor.
Apart from something like that, I believe the 2 systems are working to the same end (and of course I stand to be corrected by someone more knowledgeable).
The other obvious difference, while not really a 'functional' one, is that if I compile APM into the kernel, I get a warning that I am using a 'deprecated' power management function. Now, IMO, it may be 'deprecated' by someone's standard, but if that's what my particular hardware responds to and likes, then it isn't too deprecated after all.
The only reason I switched to ACPI was that I learned that, for me, it does the same thing, with the exception of the monitor shut-off, and because I don't like the deprecated message from the compiler.
And to comment to James above: LOL, I agree, while I know very little about Fedora, I am sure glad I use Slackware! I've seen more really-screwed-up-issues from users of Fedora X than any other one OS, around here.
 
Old 03-24-2007, 04:59 PM   #9
james_jenkins
Member
 
Registered: Feb 2007
Location: Missouri - USA
Distribution: Usually Suse or SLED, usually.
Posts: 35

Rep: Reputation: 15
I used RedHat up until the RedHat 9/Fedora WTF? incident. I STILL have one RH 9 server running just because it took forever to get just perfect, and it has not needed to even be LOOKED AT in YEARS. I have gone months without realizing that it was still in there running. But, when I walked, I didn't look back. One of these days it is likely to quit working, and that will be my last sad RH day.

Now, in RedHat's defense, they don't have a monopoly on the "Upgraded AND Broken"_TM department. Lately I have been noticing some things coming full circle. First we didn't have items, then we did, but not everything worked right, or at all. Next things started working much better, then they seemed to be fixed. Then there was a period of time that things were just peachy. Then a new version comes out, e.g. openSuse 10.2, and all of a sudden hardware that has worked flawlessly for years FLAT-WILL-NOT-WORK. Most noticeable to me was laptop wireless cards. OTOH, cards that had never worked, or worked well, were recognized and online in about 3 seconds, out of the box, e.g. the Linksys 54g pcmcia card.

That is the reason I went to SLED 10 on my main ThinkPads. LONG support cycles, no quickie updates just because something new has just been announced. No six month reloads. EVERYTHING just works the way it is SUPPOSED to. I just don't have time or patience for some of the STUPID problems I have seen being released. I just mentioned network issues in openSuse 10.2. Anyone remember the cluster that "Updating" was in Suse 10.1? As far as "Power Management" goes, I would almost be scared to close the lid of my laptops after a new version came out because I didn't know what was going to happen. There were some issues a while back that would "KILL" your ThinkPad if you were one of the unfortunate ones.

Anyway, enough ranting. Personally I would like to see a feature freeze on everything and spend the next 12 months fixing bugs. I think that would do more for Linux than most anything else.

Oh, and I don't like recompiling kernels either. There, now I feel better.

James
 
Old 03-26-2007, 04:13 PM   #10
sancho
Member
 
Registered: Sep 2003
Distribution: Ubuntu 9.04/9.10 (64-bit)
Posts: 149

Original Poster
Rep: Reputation: 15
Thanks for your posts.

Based on what I'm hearing here, I think that ultimately the issue boils down to something specific to the way that the Fedora packagers choose to configure their stock kernels. It wouldn't be the first time that the Fedora folks chose to go "against the grain" and use nonstandard settings--and it would also explain why neither Ubuntu nor Windows XP exhibits the same behavior.

That being said, I don't think there's going to be any way around this without recompiling the kernel. I know it's not that difficult to do, but I've never recompiled a kernel into a "distro-native" package without having at least some new annoyances. Therefore, I've decided to just switch over to Ubuntu Feisty on my laptop. I run Ubuntu on my desktops and server anyways, so it's a natural change for me.

Anyways, I know that's not much help to other people who may be having this problem, so feel free to continue the thread. Thanks again, GrapefruiTgirl and james_jenkins for your responses.
 
Old 04-15-2007, 10:14 AM   #11
jgordon
LQ Newbie
 
Registered: Apr 2007
Posts: 1

Rep: Reputation: 0
Latest Kernel Update Fixed This

I've been having this issue on and off with various kernel updates and the latest (2.6.20-1.2944.fc6) fixed it for me. I've been booting with acpi=off but now I removed it and I get my battery times back and everything. I did "yum remove lm_sensors" and probably won't add it back.

- Jeff Gordon
 
Old 04-23-2007, 06:01 AM   #12
nx5000
Senior Member
 
Registered: Sep 2005
Location: Out
Posts: 3,307

Rep: Reputation: 57
Since this morning and without changing ANYTHING (no kernel, no driver, nothing), I get this:
Quote:
Apr 23 11:39:56 debian kernel: ACPI: Critical trip point
Apr 23 11:39:56 debian kernel: Critical temperature reached (5155 C), shutting down.
Hot hot!! I will cook some eggs on my laptop now.

The laptop was off for the whole night, so I think 5155 degC is a bit excessive from Linux. Probably 51 or 55 degrees.
The only bad thing was a hard reset yesterday due to a kernel lockup coming from the video driver. I've fscked all my disks and haven't seen anything abnormal...
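A reading like the 5155 C quoted above can be picked out of the log and sanity-checked with a small script. This is only a sketch: the pattern matches the message exactly as it appears in this thread, the 0 to 110 C window is an assumed "believable" range and not anything the kernel uses, and the function names are made up.

```python
import re

# Matches the kernel message as quoted in this thread.
CRITICAL_RE = re.compile(r"Critical temperature reached \((\d+) C\)")

# Assumed bounds for a believable reading; not taken from the kernel.
PLAUSIBLE_MIN_C = 0
PLAUSIBLE_MAX_C = 110


def critical_temp(line):
    """Return the temperature in the message as an int, or None if absent."""
    match = CRITICAL_RE.search(line)
    return int(match.group(1)) if match else None


def is_plausible(temp_c):
    """True if the reading falls inside the assumed believable range."""
    return PLAUSIBLE_MIN_C <= temp_c <= PLAUSIBLE_MAX_C


if __name__ == "__main__":
    lines = [
        "Apr 23 11:39:56 debian kernel: Critical temperature reached (5155 C), shutting down.",
        "Critical temperature reached (128 C), shutting down.",
    ]
    for line in lines:
        temp = critical_temp(line)
        verdict = "plausible" if is_plausible(temp) else "implausible, likely a bad reading"
        print(temp, verdict)
```

Flagging a value as implausible only says the number is suspect; it does not prove the hardware is cool, so a physical check is still needed.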

I'm using 2.6.20.3 on Debian Unstable. I will try to disable this automatic shutdown but I need ACPI on if I want to investigate.
Currently writing on LQ with acpi=off...
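For investigating with ACPI enabled, something along these lines could dump what the ACPI thermal zones report. It is a sketch under assumptions: it expects the 2.6-era /proc/acpi/thermal_zone/<zone>/temperature files containing a line like "temperature: 51 C" (newer kernels expose /sys/class/thermal instead), and `read_thermal_zones` is a made-up name.

```python
import glob
import os
import re

# Assumed file format: "temperature:             51 C"
TEMP_RE = re.compile(r"temperature:\s*(\d+)\s*C")


def read_thermal_zones(root="/proc/acpi/thermal_zone"):
    """Map each zone name to its reported temperature in degrees C."""
    readings = {}
    for path in sorted(glob.glob(os.path.join(root, "*", "temperature"))):
        with open(path) as handle:
            match = TEMP_RE.search(handle.read())
        if match:
            zone = os.path.basename(os.path.dirname(path))
            readings[zone] = int(match.group(1))
    return readings


if __name__ == "__main__":
    for zone, temp_c in read_thermal_zones().items():
        print(zone, temp_c, "C")
```

If the directory does not exist (as would be expected when booted with acpi=off), the function returns an empty dict and the script prints nothing.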

If anybody has a link to a bug report from the kernel mailing list or any idea while I'm looking on my side, it would be much appreciated.

Regards

edit: In my case, where I am sure that the warning is wrong, after moving
/sbin/poweroff to /sbin/poweroff- the machine no longer shuts down automatically.
Code:
mv /sbin/poweroff /sbin/poweroff-
/!\ DON'T DO THIS UNLESS YOU ARE SURE THE TEMPERATURE IS NOT REALLY ABOVE LIMITS!!

edit2:
After running for 20 minutes without this alarm, and physically checking that the temperature was not increasing and the fans were working, the reported temperature went down from 5000 degrees to 49 degrees. Since then I have put back my /sbin/poweroff and rebooted, and everything is fine. Kinda strange.

Last edited by nx5000; 04-23-2007 at 06:38 AM.
 
  

