Machine Check Exception 0000000000000004
We have a Dell PowerEdge 6650 with quad Xeon 2.8GHz processors and 2GB of DDR SDRAM. The system freezes at random with the following error:
CPU1: Machine Check Exception: 0000000000000004
Kernel panic: Unable to continue
In interrupt handler - not syncing
This happens approximately 3 to 4 times per week, at all different times of the day and night. I saw a lot of threads where people thought it may be memory or CPU related. At first we believed it to be a processor problem. Dell swapped out the processors and the motherboard, and we still had the same problem. Eventually Dell was kind enough to send us an entirely new system. We still have the same problem, leading us to believe that it cannot be hardware related. Has anyone seen this problem and been successful in fixing it? |
What are you running on this box? What kernel version?
|
I've only seen that error when the system had a bad processor. Have you tried updating the BIOS by chance? You could also try loading updated Intel microcode at boot to work around a possible processor erratum. Have you tried running the system on only 1 or 2 processors?
|
We did upgrade the BIOS about three months ago. Prior to that, Dell swapped the processors. Since then the entire machine has been swapped, which is why we don't believe it is hardware related. This is the third set of processors and we still have the same problem.
|
I see the same message, mostly when I copy large files.
When I am logged in to the X server and try to copy large files between my hard disks or between my computers (I have a local network), my computer freezes. When I am not logged in and am just sending files from another computer through FTP, I get the above error at the login prompt and the computer freezes. Does anybody have a clue what's going on? Is there a chance that these errors happen because I use a 32-bit distribution? My hardware: K8V, AMD 3200+, 1 GB memory, 2x250 GB HD, GeForce video card, Mandrake 10.1 and 10.2 32-bit |
I am getting the same error with a dual Opteron server. The machine has two Broadcom/Raidcore 8-channel RAID cards and is running Fedora Core 2 (32-bit). I'm getting the error while attempting to rsync a 1TB RAID on another RH9 server to the RAID on this server. It's got me completely stumped.
Originally, when both processors were in, I wasn't getting the error message, but the machine would spontaneously reboot or hang. Now, I've removed a processor and swapped processor 1 into the processor 0 slot and I'm seeing the error message. The problem seems to have just started happening in the past week. Prior to that, the server was rock solid. |
It's looking like my error was caused by a faulty CPU.
-Andy |
I resolved it!!!
The message I was seeing (Machine Check Exception: 0000000000000004, etc.) was caused by a faulty cooling fan.
I discovered it by accident when I took everything out of my PC and put it back in piece by piece. pbs, check your cooling fans; the problem may be there. |
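For anyone trying to read the panic line itself: the 16-digit number is, as far as I can tell from the i386 machine check handlers of that era, the raw IA32_MCG_STATUS register. That is an assumption about the kernels in this thread, since nobody posted a version. Under that assumption, a small Python sketch (the function name decode_mcg_status is made up for illustration) decodes it using the three flag bits documented in the Intel manuals:

```python
# Flag bits of IA32_MCG_STATUS as documented in the Intel SDM.
# Treating the number from the panic message as this register is an
# assumption; it depends on the kernel that printed the message.
MCG_STATUS_BITS = {
    0: "RIPV (restart IP valid)",
    1: "EIPV (error IP valid)",
    2: "MCIP (machine check in progress)",
}


def decode_mcg_status(hex_value):
    """Return the names of the flag bits set in a hex MCG_STATUS string."""
    value = int(hex_value, 16)
    return [
        name
        for bit, name in sorted(MCG_STATUS_BITS.items())
        if value & (1 << bit)
    ]


if __name__ == "__main__":
    for flag in decode_mcg_status("0000000000000004"):
        print(flag)
```

If that reading is right, 0000000000000004 means only MCIP is set and RIPV is clear, i.e. the CPU reports that it cannot safely resume, which matches "Unable to continue". The value by itself does not say which component is at fault, which fits the mix of causes reported above (a bad CPU in one case, a bad fan in another).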