The other day my box shut down uncleanly. It was easy to diagnose, because on restarting I saw a message reporting a processor thermal trip. So of course now I'm learning all about how to monitor my hardware and how to run a cooler box.
Aside from my learning, I want to implement a means of cleanly and automatically shutting down the box if the processor overheats. While of course I should keep an eye on the temperature monitor, I also want to be able to leave the box running unattended, secure in the knowledge nothing is going to fry in my absence. Some bells and whistles would be nice -- first try throttling back the processor to avoid the need to shut down, hibernate instead of shut down, watch the hard drive temperature too, notify the user why the box is going down -- but for now I can settle for the basics.
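The escalation described above (throttle first, then hibernate or shut down) could be sketched roughly as follows. This is only an illustration in Python: the function name, the action names, and the threshold values in the usage note are hypothetical and should be replaced with limits taken from the component's data sheet.

```python
def choose_action(temp_c, throttle_at, critical_at):
    """Map one temperature reading to an escalating response.

    Returns "ok", "throttle", or "hibernate". The thresholds are supplied
    by the caller; nothing here is specific to any particular processor.
    """
    if throttle_at > critical_at:
        raise ValueError("throttle_at must not exceed critical_at")
    if temp_c >= critical_at:
        return "hibernate"   # last resort: get the box powered down cleanly
    if temp_c >= throttle_at:
        return "throttle"    # first try to reduce heat without going down
    return "ok"
```

For example, with hypothetical limits of 70 and 85 degrees, a reading of 72 would ask for throttling and a reading of 90 would ask for hibernation. What "throttle" actually does (CPU frequency scaling, for instance) is left open here.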
I've googled "+overheat +shutdown +script" but only find other users looking for the same thing I am. I must be looking in the wrong place; I can't believe that this issue hasn't been solved by someone somewhere.
I have lm_sensors and smartmontools giving me temperatures. I use Xfce, but ideally hope to find an environment-agnostic solution. I don't care if the solution is a script or a GUI, and I don't mind learning how to hack someone else's code or even roll my own.
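Since lm_sensors is already reporting temperatures, a roll-your-own approach might start by parsing the text printed by the `sensors` command. The sketch below assumes the common layout of one "label: +NN.N°C" reading per line; chip labels differ from board to board, so the regular expression and the function names are assumptions, not a documented interface.

```python
import re

# Matches lines such as "Core 0:   +52.0°C  (high = +80.0°C, ...)".
# Only the first temperature on each line (the current reading) is taken;
# fan (RPM) and voltage (V) lines do not match.
TEMP_LINE = re.compile(
    r"^[ \t]*(?P<label>[^:\n]+):[ \t]*\+?(?P<temp>-?\d+(?:\.\d+)?)[ \t]*°?[ \t]*C\b",
    re.MULTILINE,
)


def parse_temperatures(sensors_output):
    """Return {label: degrees Celsius} for each temperature line found."""
    return {
        match.group("label").strip(): float(match.group("temp"))
        for match in TEMP_LINE.finditer(sensors_output)
    }


def hottest(readings):
    """Return the (label, temperature) pair with the highest temperature."""
    return max(readings.items(), key=lambda item: item[1])
```

The text to parse could be obtained with `subprocess.run(["sensors"], capture_output=True, text=True).stdout`. Comparing the hottest reading against a chosen limit then gives an environment-agnostic trigger that does not depend on any desktop.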
I'm running Mandriva 2008.1 with kernel 2.6.26. I'd certainly consider moving up to a more modern release if needed; my box just works so well that so far I've seen no need to change.
Any suggestions as to what I might try to get this working?
Last edited by wpost; 05-28-2010 at 10:13 PM.
Reason: Mark as solved
"Overheating" really means "insufficient cooling".
Having the system shut down when it gets too hot, even gracefully, isn't the answer: you need to provide better cooling.
I see you live in Honduras, where it may be very hot and humid. If you want to run a high-powered PC there, you may need to invest in air conditioning or much more powerful fans for your PC. Otherwise, perhaps you can do something as simple as moving your PC to an area with better ventilation.
You are quite right. My original post is merely the third prong of a three-prong response I am taking: (1) improve cooling, (2) put the temperature and other data on my desktop where I can see it, and (3) do an automatic clean shutdown in an emergency. Points #1 and #2 are the important ones, but #3 will save my butt if (for example) my processor fan fails while the box is running unattended.
Per #1, I opened the case, blew out the dust bunnies, and used compressed air to take care of the embedded dust in the PSU and the processor heat sink that the vacuum couldn't suck off. I pulled the heat sink and replaced the dried out old thermal paste. I rerouted the cables in the case to improve airflow. Noticing that the case lacked an intake fan, I installed one.
The box wouldn't reboot; the root filesystem was damaged by the unclean shutdown. No problem; Mandriva Flash + xfs_repair took care of that. Gory details here; feel free to take a look and let me know if I forgot something:
Per #2 I set up lm_sensors, hddtemp, and xfce4-sensors-plugin to watch my temperatures and fans. This is a work in progress; I am still working on a desktop monitor for hard disk temperature. Again, take a look and remind me of anything I may have forgotten to do:
Watching my sensors, I notice that the processor's temperature now stays in the low 50s doing just about anything, but spikes into the 70s during video playback. My guess is that I have a video driver problem -- something I've long suspected for other reasons -- and I am looking into that.
And of course, the original subject of this thread. Once I get all this done my box should be pretty well protected.
Last edited by wpost; 05-28-2010 at 08:59 AM.
Reason: Update URL
Per #2 I set up lm_sensors, hddtemp, and xfce4-sensors-plugin to watch my temperatures and fans. This is a work in progress; I am still working on a desktop monitor for hard disk temperature.
As per your link, I use lm_sensors and hddtemp to feed data to GKrellM for display. I use Xfce but chose GKrellM instead of xfce4-sensors-plugin, simply because I was using GKrellM before migrating to Xfce. For each temperature limit, GKrellM can be configured to run a command when that limit is exceeded.
Answering my own question, but to bring closure to this thread: I have gkrellm monitoring my processor and hard disk temperatures. If a component's temperature reaches its rated maximum, gkrellm runs a simple script I wrote that mails me the output of lm_sensors and hddtemp for diagnostic purposes and then calls pm-hibernate.
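The script itself was not posted in the thread. A minimal sketch of something similar, written in Python, might look like the following. The function names, the use of the `mail` command with `-s`, and the recipient and disk path in the usage note are assumptions for illustration; `sensors`, `hddtemp`, and `pm-hibernate` are the tools named above.

```python
import subprocess


def collect_diagnostics(disks, run=subprocess.run):
    """Return the combined text output of `sensors` and `hddtemp <disk>`."""
    commands = [["sensors"]] + [["hddtemp", disk] for disk in disks]
    parts = []
    for cmd in commands:
        header = "$ " + " ".join(cmd)
        try:
            result = run(cmd, capture_output=True, text=True, check=False)
            parts.append(header + "\n" + result.stdout)
        except OSError as exc:
            # A missing tool should not stop the rest of the report.
            parts.append(header + "\nfailed: " + str(exc))
    return "\n".join(parts)


def overheat_response(recipient, disks, run=subprocess.run):
    """Mail a diagnostic report to `recipient`, then hibernate the machine."""
    report = collect_diagnostics(disks, run)
    try:
        run(["mail", "-s", "Overheat: hibernating", recipient],
            input=report, text=True, check=False)
    except OSError:
        pass  # a mail failure must never block the hibernate step
    run(["pm-hibernate"], check=False)
    return report
```

The alarm command configured in gkrellm would then invoke something like `overheat_response("root", ["/dev/sda"])`. The script needs enough privileges to hibernate, and hddtemp usually needs root to read the disk. The `run` parameter exists only so the sequence can be exercised without actually hibernating.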