LinuxQuestions.org


coralfang 04-30-2016 05:01 AM

Hardware error?
 
I'm using Slackware-current x86_64 here, and just had this pop up in KDE's tray notifications (see attached image).

Dmesg shows me this:
Code:

[24302.080277] mce: [Hardware Error]: Machine check events logged
[24302.080282] [Hardware Error]: Corrected error, no action required.
[24302.080286] [Hardware Error]: CPU:4 (15:2:0) MC2_STATUS[-|CE|MiscV|-|-|-|-|CECC]: 0x98484000010c0176
[24302.080288] [Hardware Error]: MC2 Error: VB Data ECC or parity error.
[24302.080290] [Hardware Error]: cache level: L2, tx: DATA, mem-tx: EV

I've never seen this before. Any ideas what could have caused it? I'm baffled. It says the "error" was corrected, but I'm curious what caused it.

55020 04-30-2016 06:18 AM

The L2 cache on your CPU caught an ECC or parity error and corrected it automatically.
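
For anyone who wants to see how that reading falls out of the raw MC2_STATUS value in the dmesg output, here is a minimal Python sketch that unpacks it. The flag bit positions follow the standard x86 MCi_STATUS layout, with CECC/UECC at the AMD-specific positions; the function name and the lookup-table labels are my own assumptions, modelled on what the kernel's decoder prints, so treat this as an illustration rather than the kernel's actual code.

```python
# Flag bits of MCi_STATUS. CECC/UECC positions are the AMD-specific ones
# (assumption: this matches the family 15h CPU seen in the log).
STATUS_BITS = {
    "VAL": 63, "OVER": 62, "UC": 61, "EN": 60,
    "MISCV": 59, "ADDRV": 58, "PCC": 57,
    "CECC": 46, "UECC": 45,
}

# Labels for the fields of a memory-hierarchy error code (low 16 bits,
# pattern 0000 0001 RRRR TTLL). The names are illustrative.
R4 = ["GEN", "RD", "WR", "DRD", "DWR", "IRD", "PRF", "EV", "SNP"]
TT = ["INSTR", "DATA", "GEN", "RESV"]
LL = ["RESV", "L1", "L2", "L3/GEN"]


def decode_status(status):
    """Split a 64-bit MCi_STATUS value into flag names and error-code fields."""
    flags = [name for name, bit in STATUS_BITS.items() if (status >> bit) & 1]
    code = status & 0xFFFF
    result = {"flags": flags, "error_code": code}
    if (code & 0xFF00) == 0x0100:  # memory-hierarchy error
        r4 = (code >> 4) & 0xF
        result["mem_tx"] = R4[r4] if r4 < len(R4) else "?"
        result["tx"] = TT[(code >> 2) & 0x3]
        result["cache_level"] = LL[code & 0x3]
    return result


if __name__ == "__main__":
    # Value taken from the dmesg output earlier in the thread.
    print(decode_status(0x98484000010C0176))
```

With the value from the log, the flags come out as VAL, EN, MISCV and CECC (a corrected ECC error, UC not set), and the error code decodes to cache level L2, tx DATA, mem-tx EV, which matches the last line of the dmesg output.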

Possible causes:

(1) your processor just had a random glitch

(2) your processor is a bit unhappy due to low power/overheating/overclocking

(3) your processor is slowly starting to die.

Possible actions:

(1) wait and see if it happens again; if it does, run the 'mcelog' command as root to see if it shows more information; if it starts happening regularly, escalate to (2) or (3) below.

(2) check your power supply voltages in the BIOS (if bad, replace the PSU, or check the mobo for bad capacitors and fix or replace it); check your temperatures (if too high, clean the fan or replace the heatsink paste); reduce any overclock, or underclock, or increase the voltage a bit, if possible.

(3) replace your processor.
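
To make "wait and see if it happens again" a bit more concrete, here is a rough Python sketch that pulls the machine-check timestamps out of kernel log text and flags clusters of events. The function names and the thresholds (3 events within an hour) are arbitrary assumptions of mine, not an established rule for when a CPU is failing.

```python
import re

# Matches the "Machine check events logged" line as it appears in dmesg,
# capturing the seconds-since-boot timestamp.
EVENT = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+mce: \[Hardware Error\]: Machine check events logged"
)


def mce_event_times(log_text):
    """Return the timestamps (seconds since boot) of logged machine checks."""
    times = []
    for line in log_text.splitlines():
        match = EVENT.match(line.strip())
        if match:
            times.append(float(match.group(1)))
    return times


def looks_regular(times, min_events=3, window_s=3600.0):
    """True if at least min_events fall within any window of window_s seconds.

    Both thresholds are hypothetical defaults; pick values that suit you.
    """
    times = sorted(times)
    for i in range(len(times) - min_events + 1):
        if times[i + min_events - 1] - times[i] <= window_s:
            return True
    return False


if __name__ == "__main__":
    sample = (
        "[24302.080277] mce: [Hardware Error]: Machine check events logged\n"
        "[24302.080282] [Hardware Error]: Corrected error, no action required.\n"
    )
    events = mce_event_times(sample)
    print(len(events), "event(s); regular:", looks_regular(events))
```

On a live system you would feed it the output of dmesg (which may need root), for example via subprocess.run(["dmesg"], capture_output=True, text=True).stdout. A single event like the one in this thread does not trip the check.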

coralfang 04-30-2016 09:24 AM

Thanks, that's helpful. I'll look into that.

volkerdi 05-02-2016 10:29 PM

It is most likely a CPU fan dust bunny. That's the signal from the kernel to clean those out.

DeMus 07-30-2017 01:21 PM

I know this is an old thread, but I just had the same error on my machine. I was rendering a movie when the whole computer suddenly stopped, and I had to use the reset button to bring it back to life.
A heat problem is out of the question: I was watching my Conky, and it showed a CPU temperature of 39 °C, a CPU load of around 50%, and memory usage (including Chrome) of 1.3 GB.
I'm using Manjaro 17.02 KDE (64-bit) on a home-built computer with an Asus M5A97 motherboard, an AMD FX-8350 8-core CPU, 16 GB of RAM and a GTX 760 video card.
Is my CPU slowly dying on me?
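
For anyone wanting to cross-check a Conky temperature reading against what the kernel itself exposes, here is a minimal Python sketch that reads the hwmon sensors from sysfs. It assumes the usual Linux layout (/sys/class/hwmon/hwmon*/temp*_input, values in millidegrees Celsius); the function name is made up for illustration, and which chips show up depends on the loaded sensor drivers.

```python
from pathlib import Path


def read_temps(root="/sys/class/hwmon"):
    """Return {"<chip>/<sensor file>": degrees Celsius} for all hwmon sensors."""
    readings = {}
    for chip in sorted(Path(root).glob("hwmon*")):
        name_file = chip / "name"
        # Fall back to the directory name if the chip has no "name" file.
        name = name_file.read_text().strip() if name_file.exists() else chip.name
        for sensor in sorted(chip.glob("temp*_input")):
            try:
                millideg = int(sensor.read_text().strip())
            except (OSError, ValueError):
                continue  # unreadable or non-numeric sensor, skip it
            readings[f"{name}/{sensor.name}"] = millideg / 1000.0
    return readings


if __name__ == "__main__":
    for label, celsius in read_temps().items():
        print(f"{label}: {celsius:.1f} C")
```

Running it a few times during a render gives a second opinion on the temperature, independent of how Conky is configured.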

volkerdi 08-01-2017 04:35 PM

I'll forgive the necropost and the fact that you aren't even using Slackware. :)

Is there a newer BIOS available for your motherboard? Once we moved to the 4.9 kernel, my machine began throwing similar errors. Flashing a newer BIOS fixed this.

