edit: sorry the photos did not show rotated for some reason.
I've recently flashed my BIOS and noticed something strange. My CPU is an Intel i7 920 2.66GHz.
That was the frequency I had with the previous BIOS version (F2).
I downloaded the new firmware from here: http://www.gigabyte.com/products/pro...?pid=3251#bios
I upgraded it from F2 to F7. I get the .exe file, unpack it with 7z and flash the BIOS from a USB stick. The reason I upgraded to F7 is that the most recent versions, F8/F9, don't seem to work: QFlash in the BIOS complains about an incorrect file size and won't flash them.
Anyway, back to the issue: the BIOS doesn't seem very stable. Sometimes it hangs when I browse through the settings. The reason I flashed the BIOS in the first place is that the old firmware apparently had some issues with SATA3 controllers (I'm still waiting for my SSD). I managed to configure most of the BIOS settings as they were before, apart from the CPU frequency.
I ran memtest, which didn't report any errors. I also ran a CPU test from an Arch live CD and that's the output. To be honest I don't know how to interpret it.
I've been testing my 64-current with this configuration for a few days. It doesn't crash with normal desktop usage. I also tried to stress test it with an online bitcoin generator - the system always freezes after less than 30 seconds. Before it freezes the output of 'top' shows that java uses between 600 and 700% of CPU.
I also started testing it with kernel compilation. It compiles fine up to -j6. I've tried -j6 without a GUI and it completed successfully every time. When I startx and run some programs (thunderbird/firefox) during the compile, it crashes the system as well. With -j7 and -j8 it always crashes the system.
That's the error I get: http://s1092.photobucket.com/albums/...or-compile.jpg
Please note that I had been getting the "kernel hardware error no human readable mce decoding support on this cpu type" error regularly before I flashed the BIOS. Every now and then it just pops up on the console. Compiling with higher -j flags also used to crash my system before I upgraded the BIOS firmware, so the crashes might not be directly connected to the upgrade. Besides, I found the following information on the mcelog website: http://mcelog.org/faq.html#13
Whenever this error pops up, I try 'mcelog --ascii' but it doesn't show anything. So I'm not sure if the mcelog error is a real hardware fault or just a bug in newer kernels (as the mcelog page suggests). I also don't know if any of the things I've described here are related.
I'd like to sort it out before I install my ssd drive to eliminate any stability issues.
When I update a BIOS, I download the same file from different servers and check that the md5sums are the same. What I mean is that you may have downloaded a corrupted file.
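The check suggested here can be sketched in Python. This is only a minimal sketch: the two file names are hypothetical placeholders for copies of the same BIOS archive fetched from different mirrors, not the real Gigabyte file names.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def all_match(paths):
    """True if every file in paths has the same MD5 digest."""
    digests = {md5_of(p) for p in paths}
    return len(digests) == 1


if __name__ == "__main__":
    # Hypothetical names for copies downloaded from two mirrors.
    copies = [Path("bios_mirror_a.exe"), Path("bios_mirror_b.exe")]
    if all(p.exists() for p in copies):
        print("identical" if all_match(copies) else "MISMATCH: download again")
    else:
        print("put both downloads next to this script first")
```

Matching digests only show that the mirrors served the same bytes; they say nothing about whether the file suits the board or the flashing tool.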
I'll need to look for answers on some Gigabyte/OCZ forums as it seems like an issue with incorrect BIOS settings. For 2 days I've been running the system with 'safe' BIOS settings and it can generate bitcoins and compile the kernel (-j8) at the same time without any problems. Also, I haven't received any errors from mcelog yet. According to the Wikipedia entry, machine check exceptions may result from overclocking, but my previous BIOS settings had the RAM frequency BELOW the manufacturer's specs and the CPU was at exactly 2.66GHz (which is the advertised frequency for the i7 920). Mind you, I don't know anything about voltages and a great number of other settings in the BIOS, so I guess it wasn't optimally configured.
I realise it's highly unlikely but has anyone got the very same specs by any chance?
I don't think that it was due to incorrect BIOS settings.
The i7s (along with most other CPUs that have ever come out of Intel factories) should be multiplier locked, in the case of the i7 920 at x20. Your 'overclock' came from bumping the multi to x21, which shouldn't be possible (though IIRC that is how 'turbo' mode works, by bumping the multi).
I wouldn't be surprised if that was the only symptom of some more serious problem; 58C in BIOS seems *very* high.
I'd be trying to reflash the BIOS myself.
BTW, going from 2.66GHz to 2.8GHz shouldn't cause any serious problems in itself. If there was some other hidden/non-obvious problem (like the CPU voltages being too high), that could cause problems.
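The 2.66GHz and 2.8GHz figures follow from the multiplier arithmetic. A minimal sketch, assuming the nominal 133.33 MHz base clock (the thread rounds it to 133); the function name is made up for illustration:

```python
BCLK_MHZ = 400 / 3  # nominal base clock, about 133.33 MHz (assumed)


def core_clock_ghz(multiplier, bclk_mhz=BCLK_MHZ):
    """Core clock in GHz for a given CPU multiplier."""
    return multiplier * bclk_mhz / 1000


if __name__ == "__main__":
    for mult in (20, 21):
        print(f"x{mult}: {core_clock_ghz(mult):.2f} GHz")
```

The step from x20 to x21 is a 5% increase, which fits the point that the clock bump alone should not be the cause of the crashes.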
Quote:
Originally Posted by afreitascs
When I update a BIOS, I download the same file from different servers and check that the md5sums are the same. What I mean is that you may have downloaded a corrupted file.
I know, double posting is bad. I would have just edited my last post but I don't think that poster/the OP get updates if I just edit a post.
Anyway, high CPU temps in BIOS, and a multi that makes me think you are running turbo? I'd add both of them together, and that sure looks like your CPU is under load in BIOS.
I feel really silly for not getting this yesterday. I'll blame the head injury I gave myself a few days ago.
I managed to install the latest BIOS firmware (F9a). It hasn't changed much. Yes, you were right I was running it in a turbo mode. I switched it back to the standard mode and it shows as 133x20 now.
Now I'm inclined to believe that there's something wrong with my RAM. The only variable that makes some difference in BIOS is the RAM multiplier. If I set it to 14 (=1866MHz), as I used to run it for 2 years, the CPU temperature in BIOS fluctuates between 55-58C and the computer crashes when doing memory intensive tasks. When, however, I change the multiplier to 9 (1066MHz), the temperature in BIOS drops to around 46C and it doesn't crash when compiling/generating bitcoins. The DRAM voltage is set to 1.64V (the specifications for the RAM show 1.65V, but I have never been able to set it in BIOS to exactly 1.65 - I can do 1.64 or 1.66, but there's a red warning next to 1.66 so I've never tried it). I have run memtest86 and it didn't show any errors.
Two things:
1. The maximum voltage for RAM on the Intel i7 for socket 1366 is 1.65V; Intel warns that the memory controller may be damaged when going higher.
2. AFAIK, the clock speed of the memory controller on that CPU is directly tied to the clock speed of the RAM. If I remember correctly, the multiplier for the Uncore part of the CPU (which, among other things, contains the memory controller) must always be double the memory multiplier. Intel states that the maximum memory speed for that CPU is DDR3-1066. Running that memory controller with 1866 settings is a massive overclock for the memory controller, which may have led to its degradation. But it may also be that you simply forgot to adapt the Uncore multiplier and that is what causes your issues. The default value for that multiplier on an i7 920 is 16, which corresponds to 2133 MHz (16*133MHz), exactly what you need for DDR3-1066. To run your machine with DDR3-1866 you logically have to set it to 28 (3732 MHz, about 175% of the normal clock speed).
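The arithmetic in point 2 can be sketched in Python. This is a rough sketch under the assumptions stated in the post: a nominal 133.33 MHz base clock and the rule, quoted from memory there, that the Uncore multiplier must be twice the memory multiplier. The function names are made up for illustration.

```python
BCLK_MHZ = 400 / 3  # nominal base clock, about 133.33 MHz (assumed)


def memory_multiplier(ddr_rate, bclk_mhz=BCLK_MHZ):
    """Memory multiplier that yields the given DDR3 rating."""
    return round(ddr_rate / bclk_mhz)


def required_uncore(ddr_rate, bclk_mhz=BCLK_MHZ):
    """Return (Uncore multiplier, Uncore clock in MHz) under the 2x rule."""
    mult = 2 * memory_multiplier(ddr_rate, bclk_mhz)
    return mult, mult * bclk_mhz


if __name__ == "__main__":
    stock_mult, stock_clock = required_uncore(1066)
    for rate in (1066, 1866):
        mult, clock = required_uncore(rate)
        print(f"DDR3-{rate}: Uncore x{mult}, {clock:.0f} MHz "
              f"({clock / stock_clock:.0%} of stock)")
```

With these assumptions, DDR3-1066 gives a memory multiplier of 8 and an Uncore multiplier of 16, and DDR3-1866 gives 14 and 28, matching the values in the post.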
Tobi, I think you've nailed the problem. I did not think that RAM clockspeed can be limited by the CPU. It makes sense now. The uncore multiplier has always been set to auto, which as I just checked defaults to 16 (giving 2133MHz), as you said.
I hope I haven't done much damage to my hardware by overclocking it for 1.5 years.
The 'tweaky' 'overclockers' RAM has always been more fiddly than 'normal' RAM. I've worked on a number of machines owned by friends with high rated DDR2/DDR3, and when I've checked the BIOS I've found that they are running at DDR2-667/800 or DDR3-1333, not the much faster speed the RAM is rated at.
It's mainly an SPD problem.
I doubt you've done any damage to your memory controller; I know people who have pushed just as hard as that for longer periods. I'd be far more worried about the whole 'CPU under load in BIOS' thing than about overclocking the memory controller.