Kernel message - bad memory? bad cpu?
Slackware 14 64-bit, AMD Phenom 6-core 3200MHz CPU.
The box has a tendency to kernel panic or freeze on occasion. I was getting ready to look at it, and it was just sitting idle on my desk, when I got this in the console:

Message from syslogd@ook_Winbloze at Sun Mar 3 21:13:15 2013 ...
ook_Winbloze kernel: [ 897.903094] [Hardware Error]: CPU:2^IMC1_STATUS[-|CE|-|-|AddrV]: 0x9400000000000151

Message from syslogd@ook_Winbloze at Sun Mar 3 21:13:15 2013 ...
ook_Winbloze kernel: [ 897.905679] [Hardware Error]: ^IMC1_ADDR: 0x000000000068edd0

Message from syslogd@ook_Winbloze at Sun Mar 3 21:13:15 2013 ...
ook_Winbloze kernel: [ 897.908235] [Hardware Error]: Instruction Cache Error: Parity error during data load.

Message from syslogd@ook_Winbloze at Sun Mar 3 21:13:15 2013 ...
ook_Winbloze kernel: [ 897.910825] [Hardware Error]: cache level: L1, tx: INSN, mem-tx: IRD

Is it telling me that the cache memory on the cpu die is bad? L1 cache is on the die, and parity error in the L1 cache would basically mean bad L1 memory, throw the cpu out and get another one. Anyone care to concur or otherwise comment on what it is trying to tell me? This box has a history of instability, and this would explain it if that is what is actually wrong. |
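For anyone who wants to check the kernel's decoding by hand, here is a rough Python sketch that pulls apart the MC1_STATUS value from the log. It assumes the usual x86 machine-check status layout (valid, overflow, uncorrected, enabled, MiscV, AddrV, PCC in bits 63 down to 57) and the memory-hierarchy error code form 0000 0001 RRRR TTLL in the low 16 bits. The function name and the returned dictionary are made up for illustration, and the exact bit meanings for this particular CPU are an assumption, so treat it as a cross-check rather than a reference decoder.

```python
# Assumed bit positions of the flag fields in a machine-check status word.
STATUS_FLAGS = {
    63: "Val",    # register holds a valid error
    62: "Over",   # an earlier error was overwritten
    61: "UC",     # uncorrected error (clear means corrected, "CE")
    60: "En",     # error reporting was enabled
    59: "MiscV",  # the MISC register holds valid data
    58: "AddrV",  # the ADDR register holds a valid address
    57: "PCC",    # processor context corrupt
}

# Field names as they appear in the kernel's log line.
CACHE_LEVEL = ["RESV", "L1", "L2", "L3/GEN"]   # LL, bits 1:0
TX_TYPE = ["INSN", "DATA", "GEN", "RESV"]      # TT, bits 3:2
MEM_TX = ["GEN", "RD", "WR", "DRD", "DWR", "IRD", "PRF", "EV", "SNP"]  # RRRR, bits 7:4


def decode_mc_status(status):
    """Split a 64-bit status word into flags and, if applicable, cache fields."""
    flags = [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1]
    code = status & 0xFFFF
    result = {
        "flags": flags,
        "corrected": not ((status >> 61) & 1),
        "error_code": code,
    }
    # Memory-hierarchy errors are assumed to look like 0000 0001 RRRR TTLL.
    if (code & 0xFF00) == 0x0100:
        rrrr = (code >> 4) & 0xF
        result["mem_tx"] = MEM_TX[rrrr] if rrrr < len(MEM_TX) else "RESV"
        result["tx"] = TX_TYPE[(code >> 2) & 0x3]
        result["cache_level"] = CACHE_LEVEL[code & 0x3]
    return result


if __name__ == "__main__":
    print(decode_mc_status(0x9400000000000151))
```

Under those assumptions the value from the log comes out as a corrected error with a valid address, cache level L1, tx INSN, mem-tx IRD, which agrees with what the kernel printed.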
I've had these on my own 6 core Phenom II a couple of times, evidently caused by dust buildup in the CPU heat sink. If that looks clean (and as you say, the box has a history of instability) it might also be possible that the thermal grease wasn't applied evenly.
It could also be a defective CPU, but I'd look into the cooling issues first. Even if nothing looks wrong, a better quality heat sink might be all you need. |
I just had that problem a couple of weeks ago. I tried everything (IIRR I had another thread about it), but nothing was stopping it. Not a cooling problem, not bad RAM, nothing I or anyone else could figure out, so I contacted AMD (it was an Athlon II X2 260) through their website and did the whole warranty thing there. I got the RMA a few days ago and Monday I'm sending it to them so they can check it out to see if it *is* the cpu. If it is, they're gonna send me a new one since it was only a couple of weeks old.
It's going to keep happening to you, I guarantee it. There will be *no* pattern. You won't be able to know when it will happen at all. Sometimes it may go a whole day and you'll never get a 'warning' window, and then some days you'll get them every 2 or 3 minutes. So, that's 2 people now that this has happened to... I'm betting on just a bad batch of CPUs. |
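If you want to see for yourself whether there is any pattern, one option is to tally the status lines from the kernel log by CPU and bank and look at the gaps between them. A minimal Python sketch follows, assuming the log lines look like the ones in the first post; the regex, the function name, and the tuple layout are my own choices, not anything the kernel tools provide.

```python
import re
from collections import Counter

# Matches a status line such as:
# [ 897.903094] [Hardware Error]: CPU:2^IMC1_STATUS[-|CE|-|-|AddrV]: 0x9400000000000151
# The tab may show up as a literal "^I" (as in the console output) or as whitespace.
EVENT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[Hardware Error\]: CPU:(?P<cpu>\d+)"
    r"(?:\^I|\s)*MC(?P<bank>\d+)_STATUS\[[^\]]*\]: (?P<status>0x[0-9a-fA-F]+)"
)


def summarize_mce(log_text):
    """Return (events, count per (cpu, bank), gaps in seconds between events)."""
    events = [
        (float(m["ts"]), int(m["cpu"]), int(m["bank"]), int(m["status"], 16))
        for m in EVENT_RE.finditer(log_text)
    ]
    per_source = Counter((cpu, bank) for _, cpu, bank, _ in events)
    times = sorted(ts for ts, *_ in events)
    gaps = [round(later - earlier, 6) for earlier, later in zip(times, times[1:])]
    return events, per_source, gaps


if __name__ == "__main__":
    import sys

    # Example: dmesg | python3 this_script.py
    events, per_source, gaps = summarize_mce(sys.stdin.read())
    for (cpu, bank), count in per_source.most_common():
        print(f"CPU {cpu} bank MC{bank}: {count} event(s)")
    print("gaps between events (s):", gaps)
```

If every event lands on the same core and bank, that points more toward one bad core than toward RAM or the board, though it is a hint and not proof.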
Try running GIMPS:
http://www.mersenne.org/freesoft/#source with test option 1 to try to diagnose a CPU hardware error. Yes, the program can also be used to search for prime numbers, but it has a good CPU hardware test to make sure you actually get a good prime number. I would also run memtest86+ because it could also be bad RAM. |
This cpu is a few years old. I think I'll see if I can get a replacement, it's an am3 board, so I should be able to get one fairly inexpensively. To be continued... |
Okay, my old cpu, the one messing up like the OP's, went out in the mail yesterday. It'll take a while before I hear anything back from AMD, going by what they say in their e-mail correspondence for warranty cases like this:
<quote>We recommend using a track able carrier, such as UPS, Federal Express, DHL, etc. Be advised that packages sent through the Postal service are directed through AMD's mail sorting facility, which can cause undue or lengthy delays in processing your RMA.</quote> I couldn't afford UPS or the others, so there's no telling how long it will take before I hear back from them. Could be 3 or 4 days, could be a couple weeks <mutter>. I'll post whatever happens when it happens. |
Maybe check your capacitors on the MoBo? All of my AMDs usually eat Motherboard caps for lunch.
|
Put a new cpu in, and it's been running fine ever since. Problem solved.
|