Yesterday I was encoding a video with the x264 codec, using 4 threads so that it would use all 4 cores of my CPU (Intel(R) Core(TM)2 Quad CPU Q9300 @ 2.50GHz), and sure enough it was using all 4 cores. The only problem came when the CPU temperature, as reported by lm_sensors via gkrellm, went to about 74 C, which is over this CPU's maximum temperature of 71.4° C. At this point I stopped the encoding, fearing that the CPU would lock up. However, upon further investigation I found some clues in messages and syslog:
1) At the time of the supposed overheat, there was no sign of the warning that the kernel usually emits on overheat. The CPU usually sends the kernel a special signal to tell it that it is overheating and throttling down. This never happened, so it's unlikely that there really was any overheat.
2) On every boot in the syslog I get this:
Code:
Nov 15 10:37:52 demonslayer kernel: coretemp coretemp.0: Using relative temperature scale!
Nov 15 10:37:52 demonslayer kernel: coretemp coretemp.1: Using relative temperature scale!
Nov 15 10:37:52 demonslayer kernel: coretemp coretemp.2: Using relative temperature scale!
Nov 15 10:37:52 demonslayer kernel: coretemp coretemp.3: Using relative temperature scale!
What is this supposed to mean? I've searched for this issue and there are reports that coretemp may give inaccurate temperatures. Is that what it's saying?
3) I checked the BIOS, but it doesn't give the CPU temperature, only the difference between the maximum CPU temperature and the current one. Usually it is around 43, which means that if I start my computer and lm_sensors tells me the temperature is around 45 C, the maximum temperature is really 88 C, right?
Any thoughts on the likelihood of the CPU actually overheating? I have 1x 120 mm rear fan and 2x 90 mm side case fans, one near the HDD and the other near the CPU, and the CPU fan is the one that came with the processor. If it is really overheating, I'm willing to buy a new CPU fan or more case fans, but I doubt that it is. The case itself feels cool, and the air being put out by the fans is cool as well. I also changed the arrangement of the fans several times in order to get the coolest temperature possible. The best arrangement seems to be having all fans as output fans, with two input vents, one in front and one on the side.
This kinda has to do with both hardware and software, but maybe more with hardware.
Unfortunately, sensors all give different numbers, and to get the actual temperature you usually have to enter some conversion parameters for your particular hardware. I remember reading some time ago about laptops shutting down all the time because lm_sensors was claiming a high temperature, but correct configuration fixed the temperature readings (and prevented the senseless shutdowns).
Where to go from there? Who knows. I would read through the data sheet for the particular sensor chip (so I have some clue what the hell I'm doing), look through the MoBo manual for any clues, and try to find magic numbers in any sensor driver package provided by the manufacturer. For example, my machine has a w83627hf - so I'd check the *.inf file to see if it has any registry entries that give away the magic numbers.
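To illustrate the kind of correction being described, here is a minimal Python sketch. It is not how lm_sensors itself is implemented. It assumes the kernel's hwmon sysfs layout, where files named temp*_input hold millidegrees Celsius. The function name read_hwmon_temps and the offset_c parameter are made up for this example, and any real offset has to come from the chip's data sheet or the board vendor.

```python
from pathlib import Path


def read_hwmon_temps(hwmon_dir, offset_c=0.0):
    """Return {file name: degrees C} for each temp*_input file in hwmon_dir.

    offset_c is a hypothetical linear correction added to every reading.
    """
    temps = {}
    for f in sorted(Path(hwmon_dir).glob("temp*_input")):
        try:
            # hwmon temperature files hold an integer in millidegrees Celsius
            millideg = int(f.read_text().strip())
        except (OSError, ValueError):
            # Skip sensors that cannot be read or do not contain a number
            continue
        temps[f.name] = millideg / 1000.0 + offset_c
    return temps
```

On a live system you would point it at a directory such as /sys/class/hwmon/hwmon0 (on some kernels the files sit under its device/ subdirectory instead).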
Ok, thanks for the links, but I still will not be certain of the temperatures unless I buy some type of temperature gauge, maybe infrared one.
Maybe my original question was too complex. What I would like to know is 2 things:
1) Did my CPU overheat?
Probably not, unless the heat sink fins are clogged with dust and the fan is running slow because the bearings are seizing. If you want to check: while it's running this task, touch your fingertip to the heatsink. Is it warm, hot, or did you burn your finger? If you didn't burn your finger it's probably fine.
lol, great test
CPU burns finger = CPU overheat
CPU does not burn finger = CPU working normally
well, I'll try it, I guess 60 C won't burn my finger but 80 C or 90 C might.
I do clean the case of dust on a regular basis. This box is quite new, less than 6 months old, so there isn't much dust at all. The fan is new, so it should be in working order. I checked several times that it fits tightly over the CPU, because I had problems installing it: I followed all the instructions in the manual, but it took some pressure to attach the fan. Still, it is very securely fastened, as it should be.
Also, I know from past experience that the kernel emits a warning when the CPU overheats and starts to throttle itself down (on a Pentium 4). Does this happen for newer Intels too, like the Core 2? I'm assuming it should, right?
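One way to double-check for such a warning is to scan the logs instead of eyeballing them. The following is only a rough Python sketch: the exact wording the kernel uses differs between versions and drivers, so the search pattern here is an assumption and should be adjusted to whatever your kernel prints. The name find_thermal_events is made up for this example.

```python
import re

# Assumed wording: kernel thermal warnings are expected to mention
# throttling or a temperature threshold. Adjust for your kernel.
THROTTLE_PATTERN = re.compile(
    r"throttl|temperature above threshold", re.IGNORECASE
)


def find_thermal_events(lines, pattern=THROTTLE_PATTERN):
    """Return the log lines that match the thermal warning pattern."""
    return [line.rstrip("\n") for line in lines if pattern.search(line)]
```

It can be fed any iterable of lines, for example an open file handle on the syslog file. The "Using relative temperature scale!" messages from coretemp do not match this pattern, so they will not show up as false alarms.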
An external temperature gauge cannot give you core temperature; core temperature is considerably higher.
An infrared thermometer won't even give you the correct reading for external temperature unless you know the emissivity of the surface you're looking at and the temperature of the background. One trick I used to do is to heat up two pans until they're hot enough to fry an egg. One pan has very low emissivity and the other has an emissivity of >0.8. The two pans are heated to the same temperature, but on an infrared camera one looks 'cooler' than the other. If you point an IR thermometer at each pan you get the same result: one appears to be cool enough to touch (it definitely isn't) and the other one is hot hot hot. Then I crack an egg over each pan just to show that they're really both hot (in fact, both about the same temperature).
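The two-pan effect can be approximated with a few lines of Python. This is a simplified sketch that treats the pan as a gray body and the thermometer as measuring total radiated power (proportional to T^4). Real instruments only see a limited infrared band, so the numbers are indicative at best. The function name and all parameters are made up for this example.

```python
def apparent_temp_k(true_temp_k, emissivity, background_k,
                    assumed_emissivity=1.0):
    """Temperature (kelvin) an idealized IR thermometer would report.

    The surface emits emissivity * T^4 and reflects
    (1 - emissivity) * T_background^4. The Stefan-Boltzmann constant
    cancels out, so it is omitted.
    """
    received = (emissivity * true_temp_k ** 4
                + (1.0 - emissivity) * background_k ** 4)
    # The instrument undoes the same model using the emissivity it assumes
    corrected = (received
                 - (1.0 - assumed_emissivity) * background_k ** 4)
    return (corrected / assumed_emissivity) ** 0.25
```

With both pans at the same true temperature, the low-emissivity pan comes out with a much lower apparent temperature than the high-emissivity one. The reading only matches the true temperature when the assumed emissivity equals the real one.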
I did a test and I think I may understand what is going on. I went into the BIOS and left the computer idle for a while until all the temperatures equalized. I got 43 C for the processor thermal margin, which is the number of degrees left before the threshold temperature of 75 C (according to the board manufacturer), so the actual CPU temperature is 32 C. Kinda low, but it may be accurate. There is also a motherboard temperature of 41.5 C listed for the CPU voltage regulator. This may be a threshold value, I'm not sure. Then I left it idle in Linux and I got an average of 46.5 C. So I think this means the Linux temperatures are off by at least 14.5 C. This thermal margin stuff is kinda weird. Here's the best explanation I found (from Intel, the board manufacturer):
Quote:
Please keep in mind that the new Intel(R) processors, like the Intel(R) Core(TM) 2 Quad processor Q6600, are using what we call the thermal margin.
As its name states this is no longer a measurement of the temperature level the processor is running at; it is actually the total temperature left before the processor reaches its maximum recommended temperature or thermal design. The shorter the margin is the closer the processor gets to its thermal limit. If by any chance the thermal margin reaches 0 degrees Celsius the system should still not freeze but it will alarm you of overheating problems within the processor area.
For instance, if a processor thermal spec is 60 degrees Celsius and the thermal margin reports only 20 degrees Celsius, it means that the actual processor is only running at 40 degrees Celsius.
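The thermal margin arithmetic from this thread can be written out as a small Python sketch. The function name is made up for this example, and the figures are just the ones quoted in the posts above (75 C threshold, 43 C margin, 46.5 C idle reading in Linux).

```python
def temp_from_margin(limit_c, margin_c):
    """Actual temperature implied by a thermal limit and a reported margin."""
    return limit_c - margin_c


# Figures from the BIOS test: 75 C threshold, 43 C thermal margin
bios_temp_c = temp_from_margin(75.0, 43.0)

# Difference between the idle average seen in Linux and the BIOS figure
linux_offset_c = 46.5 - bios_temp_c

# Intel's example: 60 C thermal spec with a 20 C margin
intel_example_c = temp_from_margin(60.0, 20.0)

print(bios_temp_c, linux_offset_c, intel_example_c)  # 32.0 14.5 40.0
```

Note that the result is only as good as the limit you plug in. If the driver assumes a different limit than the board does, the absolute readings shift by exactly that difference.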
Tex, I just have to add that if I had a machine for which I had to monitor the processor temps, it would be more of a relief than not if the !@#%! thing just blew up.