LinuxQuestions.org
12-06-2014, 12:18 PM   #1
Slacktivist
LQ Newbie
 
Registered: Oct 2014
Distribution: Slackware
Posts: 11

Rep: Reputation: 6
CMCI Storm Detected and MCE Hardware Error


Computer Specs:
Quote:
BIOS Information
Vendor: American Megatrends Inc.
Version: 1303
Release Date: 05/31/2010
BIOS Revision: 8.15
Motherboard: ASUS P7H55-M PRO
Power Supply: Seasonic 55-520GB (520 W)
Processor: Intel(R) Core(TM) i5 CPU 650 @ 3.20GHz
Communication controller: Intel Corporation 5 Series/3400 Series Chipset HECI Controller (rev 06)
PCI bridge: Intel Corporation Core Processor PCI Express x16 Root Port (rev 12)
Host bridge: Intel Corporation Core Processor DRAM Controller (rev 12)
GPU: NVIDIA GeForce GTX 460
CD/DVD/BD: iHES108 2 (Power has been disconnected to simplify troubleshooting)
HDD: Samsung 204UI
RAM: CORSAIR XMS 3 DDR3 1333MHz - 2 double-bank DIMMs at 2048MB each
Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168 PCI Express Gigabit Ethernet controller (rev 03)
Peripherals attached: USB keyboard and USB mouse. (I had a Bluetooth dongle, keyboard and mouse attached, but I have temporarily disconnected them to simplify things)
BACKGROUND
So I have a Slackware 14.1 media center and server at my in-laws (killing their internet and not mine). Two days before the Thanksgiving holiday the computer went offline during a power outage. I will admit that for a few months prior to this I had noticed that the second hard drive (Western Digital) was acting up and would intermittently freeze the computer when data was being copied.

I arrived at my in-laws for the holidays and discovered upon booting that the SMART status of the Western Digital hard drive was bad. I removed the Western Digital drive, booted off the Samsung and found that I was getting an "MCE hardware error" and a report of a "CMCI storm detected".

My /var/log/messages has an endless loop of CMCI storm messages:
Quote:
Dec 6 13:34:38 livingroomtv kernel: [ 5475.329597] CMCI storm detected: switching to poll mode
Dec 6 13:35:08 livingroomtv kernel: [ 5504.887601] CMCI storm subsided: switching to interrupt mode
The full /var/log/messages is in the following pastebin:
http://pastebin.com/SxRX0jg7

You will notice towards the end of the file that this CMCI storm is being reported several times a second.
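For anyone who wants to put numbers on how often the storm fires, here is a minimal Python sketch that counts the "detected"/"subsided" messages and measures how long each storm lasted, based on the kernel uptime stamp in square brackets. The function name `summarize_cmci` and the regular expression are my own illustration, and it assumes the log lines look like the two quoted above.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   Dec 6 13:34:38 livingroomtv kernel: [ 5475.329597] CMCI storm detected: switching to poll mode
# Group 1 is the kernel uptime in seconds, group 2 is the event type.
STORM_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+CMCI storm (detected|subsided)")


def summarize_cmci(lines):
    """Return (event counts, list of storm durations in seconds)."""
    counts = Counter()
    durations = []
    storm_start = None
    for line in lines:
        match = STORM_RE.search(line)
        if not match:
            continue
        uptime = float(match.group(1))
        event = match.group(2)
        counts[event] += 1
        if event == "detected":
            storm_start = uptime
        elif storm_start is not None:
            # "subsided" closes the storm opened by the last "detected".
            durations.append(uptime - storm_start)
            storm_start = None
    return counts, durations


if __name__ == "__main__":
    sample = [
        "Dec 6 13:34:38 livingroomtv kernel: [ 5475.329597] CMCI storm detected: switching to poll mode",
        "Dec 6 13:35:08 livingroomtv kernel: [ 5504.887601] CMCI storm subsided: switching to interrupt mode",
    ]
    counts, durations = summarize_cmci(sample)
    print(dict(counts))
    print([round(d, 2) for d in durations])
```

To run it on the real log, pass an open file instead of the sample list, for example `summarize_cmci(open("/var/log/messages"))`.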

My syslog seems to be reporting about as many of the following errors:
Quote:
hid-generic 0003:0A5C:4502.0005: can't reset device, 0000:00:1d.0-1.5.1/input0, status -32
Here is my /var/log/syslog pastebin
http://pastebin.com/8nGExtEr
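The hid-generic line packs several identifiers into one string, so a small Python sketch may help when reading it. It assumes the usual kernel HID naming of bus:vendor:product.instance in hexadecimal and treats the status as a negated Linux errno; the function name `parse_hid_reset_error` is hypothetical. I believe vendor ID 0x0A5C belongs to Broadcom, but that is worth checking against a usb.ids list.

```python
import errno
import re

# Assumed layout: "hid-generic BUS:VENDOR:PRODUCT.INSTANCE: can't reset device, PATH, status N"
HID_RE = re.compile(
    r"hid-generic ([0-9A-Fa-f]{4}):([0-9A-Fa-f]{4}):([0-9A-Fa-f]{4})\.([0-9A-Fa-f]{4}): "
    r"can't reset device, ([^,\s]+), status (-?\d+)"
)


def parse_hid_reset_error(line):
    """Split a hid-generic reset error into its fields, or return None."""
    match = HID_RE.search(line)
    if not match:
        return None
    status = int(match.group(6))
    return {
        "bus": int(match.group(1), 16),       # 0x0003 is the USB bus type
        "vendor": int(match.group(2), 16),
        "product": int(match.group(3), 16),
        "instance": int(match.group(4), 16),
        "path": match.group(5),
        "status": status,
        # Kernel status codes are negated errno values; look up the symbolic name.
        "errno_name": errno.errorcode.get(abs(status), "unknown"),
    }


if __name__ == "__main__":
    line = "hid-generic 0003:0A5C:4502.0005: can't reset device, 0000:00:1d.0-1.5.1/input0, status -32"
    print(parse_hid_reset_error(line))
```

Status -32 corresponds to EPIPE, which for USB transfers generally means the endpoint stalled.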

/var/log/dmesg has an MCE hardware error, a CMCI storm and ACPI warnings:
Quote:
[ 3.809619] mce: [Hardware Error]: Machine check events logged
[ 3.809843] mce: [Hardware Error]: Machine check events logged

[ 3.838073] devtmpfs: mounted
[ 3.839240] Freeing unused kernel memory: 1272k freed
[ 3.839605] Write protecting the kernel read-only data: 16384k
[ 3.840932] Freeing unused kernel memory: 552k freed
[ 3.842410] Freeing unused kernel memory: 792k freed
[ 3.881006] CMCI storm detected: switching to poll mode
[ 4.593463] loop: module loaded
[ 4.736720] udevd[194]: starting version 182
[ 5.441489] microcode: CPU0 sig=0x20652, pf=0x2, revision=0x9
[ 5.455270] microcode: CPU1 sig=0x20652, pf=0x2, revision=0x9
[ 5.455485] microcode: CPU2 sig=0x20652, pf=0x2, revision=0x9
[ 5.455749] microcode: CPU3 sig=0x20652, pf=0x2, revision=0x9
[ 5.456025] microcode: Microcode Update Driver: v2.00 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
[ 5.699829] ACPI Warning: 0x0000000000000828-0x000000000000082f SystemIO conflicts with Region \PMRG 1 (20130328/utaddress-251)
[ 5.700289] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver

[ 5.700649] ACPI Warning: 0x0000000000000540-0x000000000000054f SystemIO conflicts with Region \GPS1 1 (20130328/utaddress-251)
[ 5.701107] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[ 5.701466] ACPI Warning: 0x0000000000000530-0x000000000000053f SystemIO conflicts with Region \GPS1 1 (20130328/utaddress-251)
[ 5.701921] ACPI Warning: 0x0000000000000530-0x000000000000053f SystemIO conflicts with Region \GPS0 2 (20130328/utaddress-251)
[ 5.702374] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[ 5.702734] ACPI Warning: 0x0000000000000500-0x000000000000052f SystemIO conflicts with Region \GPS1 1 (20130328/utaddress-251)
[ 5.703188] ACPI Warning: 0x0000000000000500-0x000000000000052f SystemIO conflicts with Region \GPS0 2 (20130328/utaddress-251)
[ 5.703641] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
Here is the entire /var/log/dmesg pastebin:
http://pastebin.com/iGNvRfEX
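The microcode lines in the dmesg excerpt carry the CPUID signature (sig=0x20652), which can be decoded to confirm exactly which CPU family, model and stepping the kernel sees. Below is a small Python sketch of the standard x86 bit layout for that signature; the function name `decode_cpuid_signature` is my own.

```python
def decode_cpuid_signature(sig):
    """Decode an x86 CPUID signature into (family, model, stepping)."""
    stepping = sig & 0xF
    base_model = (sig >> 4) & 0xF
    base_family = (sig >> 8) & 0xF
    ext_model = (sig >> 16) & 0xF
    ext_family = (sig >> 20) & 0xFF

    # The extended family is only added when the base family is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family
    # The extended model is only used for base family 0x6 or 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return family, model, stepping


if __name__ == "__main__":
    family, model, stepping = decode_cpuid_signature(0x20652)
    print(f"family={family} model=0x{model:X} stepping={stepping}")
```

For sig=0x20652 this gives family 6, model 0x25, stepping 2. The reported microcode revision (0x9) can then be compared against what Intel has published for that signature.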



What I've Tried So Far:
I'm not all that familiar with hardware problems, but here is what I have done:
-Physically inspected the motherboard - no swollen capacitors or any other obvious problems.
-Disconnected the Bluetooth dongle and CD/DVD/BD drive to simplify troubleshooting a bit.
-Swapped out the RAM for some spare Crucial DDR3 240-pin 1333MHz RAM - this RAM is known to be functional and comes from a recent RAM upgrade of another computer.
-Ran a disk check from gparted on the current Samsung HDD without problems.
-Changed the SATA HDD cable.
-Noted that the CPU was at 150-160 °F on all 4 cores after a fresh boot, so I replaced the Arctic Silver thermal paste, and it now boots up at under 100 °F on all 4 cores.
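
Since most sensor tools and CPU datasheets report temperatures in Celsius, here is a quick Python conversion of the Fahrenheit readings above. It is plain arithmetic; the function name is just for illustration.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9


if __name__ == "__main__":
    # Readings before (150-160 F) and after (under 100 F) replacing the thermal paste.
    for temp_f in (150, 160, 100):
        print(f"{temp_f} F = {fahrenheit_to_celsius(temp_f):.1f} C")
```

That puts the original readings at roughly 66-71 °C and the new ones at under about 38 °C.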

Last edited by Slacktivist; 12-06-2014 at 01:55 PM.
 
12-06-2014, 09:59 PM   #2
Slacktivist
LQ Newbie
 
Registered: Oct 2014
Distribution: Slackware
Posts: 11

Original Poster
Rep: Reputation: 6
Well, I unplugged my front-panel LCD display/USB/audio jack ports, and the following human interface device error went away:

Quote:
hid-generic 0003:0A5C:4502.0005: can't reset device, 0000:00:1d.0-1.5.1/input0, status -32
I still have the ACPI and CMCI storm errors, however.
 
12-07-2014, 02:48 PM   #3
Slacktivist
LQ Newbie
 
Registered: Oct 2014
Distribution: Slackware
Posts: 11

Original Poster
Rep: Reputation: 6
A few additional updates:

After consulting Google and these forums, it seems the ACPI error may be a benign warning that I need not worry too much about. The CMCI storm is a different story.

I swapped in the power supply from my other computer and booted with the onboard graphics to rule out problems with the power supply and the GPU. However, the CMCI storm continues, so I guess I've narrowed it down to the motherboard or the CPU. I'll probably just end up buying a new motherboard.

Last edited by Slacktivist; 12-07-2014 at 02:51 PM.
 
  

