LinuxQuestions.org


rajivdp 12-07-2009 06:52 AM

Memory error: extended error chipkill ecc error
 
Hi All

I am getting a memory error on an AMD Opteron 248 (SMP) system running RHEL 3.
Linux kernel version: 2.4

Error Details:

kernel: CPU 0: Silent Northbridge MCE
kernel: Northbridge status 9402400021080a13
kernel: ECC syndrome bits 2104
kernel: extended error chipkill ecc error
kernel: link number 0
kernel: corrected ecc error
kernel: error address valid
kernel: error enable
kernel: previous error lost
kernel: error address 00000000f80cb060

I don't have any idea what this error means.
Please help me figure out where the problem is.

Thanks in Advance.

Regards
Rajiv

bob_day 12-07-2009 08:26 AM

It's an error in a single ECC check bit at the address in your post. It's not a data bit, but I don't think there's any way to fix it except to replace the memory module that caused it. Since it's an error in a Chipkill syndrome bit, you might temporarily "fix" it by disabling ECC checking in the BIOS, but, frankly, unless it's a dire emergency, that would be a bad idea in my view.
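
For anyone who wants to check the log by hand, here is a rough Python sketch that splits the "Northbridge status" word into the fields the kernel printed. The bit positions are my assumption, taken from the usual AMD K8 northbridge machine-check status layout and not from anything in this thread; the function name `decode_nb_status` is made up for illustration. The only sanity check is that these positions reproduce the values already shown in the log (syndrome 2104, link number 0, corrected ECC, address valid).

```python
# Sketch only: bit positions below are assumed (AMD K8 northbridge
# MCA status layout), not confirmed by the original post.

def decode_nb_status(status: int) -> dict:
    """Split a 64-bit northbridge status word into named fields."""
    high = (status >> 32) & 0xFFFFFFFF  # upper 32 bits of the register
    low = status & 0xFFFFFFFF           # lower 32 bits of the register
    return {
        "uncorrected": bool((high >> 29) & 1),      # status bit 61
        "error_enable": bool((high >> 28) & 1),     # status bit 60
        "address_valid": bool((high >> 26) & 1),    # status bit 58
        "corrected_ecc": bool((high >> 14) & 1),    # status bit 46
        "uncorrected_ecc": bool((high >> 13) & 1),  # status bit 45
        "link": (high >> 4) & 0x7,                  # status bits 38:36
        # Syndrome is split across the two halves of the register.
        "syndrome": (((low >> 24) & 0xFF) << 8) | ((high >> 15) & 0xFF),
        "extended_error": (low >> 16) & 0xF,        # status bits 19:16
        "error_code": low & 0xFFFF,                 # status bits 15:0
    }

# Status value copied from the log in the first post.
fields = decode_nb_status(0x9402400021080A13)
for name, value in fields.items():
    shown = value if isinstance(value, bool) else hex(value)
    print(f"{name}: {shown}")
```

Under those assumptions the corrected-ECC flag is set and the uncorrected flags are clear, which agrees with the "corrected ecc error" line in the log, and the extended error code comes out as 8, the value the kernel reports as "chipkill ecc error".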

-- Bob Day


All times are GMT -5.