LinuxQuestions.org
Old 06-13-2005, 08:22 AM   #1
pbs
LQ Newbie
 
Registered: Jun 2005
Posts: 2

Rep: Reputation: 0
Machine Check Exception 0000000000000004


We have a Dell PowerEdge 6650 with quad Xeon 2.8GHz processors and 2GB of DDR SDRAM. The system freezes at random, with the following error:

CPU1: Machine Check Exception: 0000000000000004
Kernel panic: Unable to continue In interrupt handler not syncing

This happens approximately 3 to 4 times per week, at all different times of the day/night.

I saw a lot of threads where people thought it might be memory or CPU related. At first we believed it to be a processor problem. Dell swapped out the processors and the motherboard, but we still had the same problem. Eventually Dell was kind enough to send us an entirely new system. We still have the same problem, leading us to believe that it cannot be hardware related.

Has anyone seen this problem, and been successful in fixing it?
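
For anyone decoding the number in this message: on 32-bit x86 kernels of this era, the value printed after "Machine Check Exception:" appears to be the contents of the IA32_MCG_STATUS register. Assuming that is the case here, the sketch below splits it into its three documented flag bits (RIPV, EIPV, MCIP). The name `decode_mcg_status` is made up for illustration; it is not a kernel or library API.

```python
# Flag bits of IA32_MCG_STATUS as documented in the Intel SDM.
MCG_STATUS_BITS = {
    0: "RIPV",  # restart instruction pointer valid
    1: "EIPV",  # error instruction pointer valid
    2: "MCIP",  # machine check in progress
}


def decode_mcg_status(hex_value):
    """Map each known flag name to True/False for a hex string like the one in the panic."""
    value = int(hex_value, 16)
    return {name: bool((value >> bit) & 1) for bit, name in MCG_STATUS_BITS.items()}


if __name__ == "__main__":
    # Assumption: this is the value from the panic message quoted in the post.
    for name, is_set in decode_mcg_status("0000000000000004").items():
        print(f"{name}: {'set' if is_set else 'clear'}")
```

For 0000000000000004 only MCIP is set and RIPV is clear, meaning the CPU could not reliably resume. The global status alone does not identify the failing component; that detail lives in the per-bank status registers, which are not shown in the posts here.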
 
Old 06-13-2005, 08:30 AM   #2
jtshaw
Senior Member
 
Registered: Nov 2000
Location: Seattle, WA USA
Distribution: Ubuntu @ Home, RHEL @ Work
Posts: 3,892
Blog Entries: 1

Rep: Reputation: 66
What are you running on this box? What kernel version?
 
Old 06-13-2005, 10:22 AM   #3
anonobomber
Member
 
Registered: Aug 2003
Location: Seattle
Distribution: Debian, Fedora, CentOS, FreeBSD, OpenSolaris
Posts: 138

Rep: Reputation: 16
I've only seen that error when the system had a bad processor. Have you tried updating the BIOS, by chance? You could also try loading newer Intel microcode at boot to work around a possible processor erratum. Have you tried running the system on only 1 or 2 processors?
 
Old 06-15-2005, 06:34 AM   #4
pbs
LQ Newbie
 
Registered: Jun 2005
Posts: 2

Original Poster
Rep: Reputation: 0
We did upgrade the BIOS about three months ago. Prior to that, Dell swapped the processors. Since that time the entire machine has been swapped. That's why we don't believe it is hardware related. This is the third set of processors and we still have the same problem.
 
Old 06-17-2005, 10:04 AM   #5
volman
LQ Newbie
 
Registered: Jun 2005
Posts: 2

Rep: Reputation: 0
I see the same message, mostly when I copy large files.

When I am logged in to the X server and try to copy large files between my hard disks or between my computers (I have a local network), my computer freezes.

When I am not logged in and am just sending files from another computer through FTP, I get the above error at the login prompt and the computer freezes.

Does anybody have a clue what's going on? Is there a chance that these errors happen because I use a 32-bit distribution?

My hardware:
K8V, AMD 3200+
1 GB memory
2x250 GB HD
Geforce video card
Mandrake 10.1 and 10.2 32bit
 
Old 06-19-2005, 04:57 PM   #6
jonandy
LQ Newbie
 
Registered: Jun 2005
Posts: 2

Rep: Reputation: 0
I am getting the same error with a dual Opteron server. The machine has two Broadcom/Raidcore 8-channel RAID cards and is running Fedora Core 2 (32-bit). I'm getting the error while attempting to rsync a 1TB RAID on another RH9 server to the RAID on this server. It's got me completely stumped.

Originally, when both processors were in, I wasn't getting the error message, but the machine would spontaneously reboot or hang. Now, I've removed a processor and swapped processor 1 into the processor 0 slot and I'm seeing the error message. The problem seems to have just started happening in the past week. Prior to that, the server was rock solid.
 
Old 06-19-2005, 09:59 PM   #7
jonandy
LQ Newbie
 
Registered: Jun 2005
Posts: 2

Rep: Reputation: 0
It's looking like my error was caused by a faulty CPU.

-Andy
 
Old 06-26-2005, 12:33 PM   #8
volman
LQ Newbie
 
Registered: Jun 2005
Posts: 2

Rep: Reputation: 0
I resolved it!!!

The message I was seeing (Machine Check Exception: 0000000000000004, etc.) was caused by a faulty cooling fan.

I discovered it by accident when I took everything out of my PC and put it back in piece by piece.

pbs, check your cooling fans; that may be where the problem is.
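
Since the cause here turned out to be cooling, one way to keep an eye on fans and temperatures is to read the kernel's hwmon sysfs files. This is only a sketch: it assumes a kernel that exposes sensors under /sys/class/hwmon with a sensor driver loaded for the board, and `read_hwmon` is a hypothetical helper name. In that interface, temperatures are reported in millidegrees Celsius and fan speeds in RPM.

```python
import glob
import os


def read_hwmon(base="/sys/class/hwmon"):
    """Collect fan speeds (RPM) and temperatures (deg C) from hwmon sysfs files."""
    readings = {"fans_rpm": {}, "temps_c": {}}
    for path in sorted(glob.glob(os.path.join(base, "hwmon*", "*_input"))):
        name = os.path.basename(path)
        try:
            with open(path) as f:
                raw = int(f.read().strip())
        except (OSError, ValueError):
            continue  # sensor unreadable or not numeric; skip it
        key = os.path.relpath(path, base)
        if name.startswith("fan"):
            readings["fans_rpm"][key] = raw
        elif name.startswith("temp"):
            readings["temps_c"][key] = raw / 1000.0  # millidegrees -> degrees
    return readings


if __name__ == "__main__":
    data = read_hwmon()
    for key, rpm in data["fans_rpm"].items():
        note = "  <-- stopped?" if rpm == 0 else ""
        print(f"{key}: {rpm} RPM{note}")
    for key, temp in data["temps_c"].items():
        print(f"{key}: {temp:.1f} C")
```

If nothing is printed, the machine probably has no hwmon driver loaded, and a look inside the case is still the more reliable check.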
 
  


All times are GMT -5.
