LinuxQuestions.org
Slackware This Forum is for the discussion of Slackware Linux.

Old 01-27-2017, 10:02 AM   #1
epg
LQ Newbie
 
Registered: Jan 2017
Posts: 23

Rep: Reputation: Disabled
[Hardware Error]: System Fatal error


Hi,

Yesterday, while my PC was starting up and before it had completed the boot process (i.e. before I got the login prompt), it rebooted on its own. I'm not sure how far it got, but the second startup completed successfully and it has been working normally since then. When I checked the syslogs, I found these error messages:

Jan 26 08:46:10 epg-hp kernel: [ 3.839827] [Hardware Error]: System Fatal error.
Jan 26 08:46:10 epg-hp kernel: [ 3.839938] [Hardware Error]: CPU:0 (15:60:1) MC4_STATUS[Over|UE|MiscV|PCC|AddrV|-|-]: 0xfe00000000070f0f
Jan 26 08:46:10 epg-hp kernel: [ 3.840214] [Hardware Error]: MC4 Error Address: 0x00000000d0d00e50
Jan 26 08:46:10 epg-hp kernel: [ 3.840314] [Hardware Error]: MC4 Error (node 0): Watchdog timeout due to lack of progress.
Jan 26 08:46:10 epg-hp kernel: [ 3.840510] [Hardware Error]: cache level: L3/GEN, mem/io: GEN, mem-tx: GEN, part-proc: GEN (timed out)

I googled a bit and found some people saying this could be RAM errors; however, after ~8 hours of running, memtest didn't find any errors.

So... What next? Any ideas of what could have caused this error??

Running Slackware64-14.2, kernel 4.4.38

Thank you!
 
Old 01-27-2017, 11:46 AM   #2
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (15 current), Slack15, Ubuntu studio, MX Linux, FreeBSD 13.1, WIn10
Posts: 10,342

Rep: Reputation: 2242
it might be one of them things that only take place when you're not looking.
I'd keep an eye on it and perhaps get a new set of RAM chips just in case it's going down. Or at least stash away some just-in-case money for it.

this may help give you a little more info on this

How to identify defective DIMM from EDAC error on Linux
 
1 members found this post helpful.
Old 01-30-2017, 09:38 AM   #3
epg
LQ Newbie
 
Registered: Jan 2017
Posts: 23

Original Poster
Rep: Reputation: Disabled
Thank you BW-userx for your reply and for the link; very useful...

I just ran memtest over the weekend (48+ hours) and still no errors were found. So I guess I can only wait and monitor if it'll come again.
 
1 members found this post helpful.
Old 01-30-2017, 09:42 AM   #4
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (15 current), Slack15, Ubuntu studio, MX Linux, FreeBSD 13.1, WIn10
Posts: 10,342

Rep: Reputation: 2242
Quote:
Originally Posted by epg View Post
Thank you BW-userx for your reply and for the link; very useful...

I just ran memtest over the weekend (48+ hours) and still no errors were found. So I guess I can only wait and monitor if it'll come again.
You're welcome, happy to have helped.
 
Old 02-01-2017, 08:40 AM   #5
Ilgar
Senior Member
 
Registered: Jan 2005
Location: Istanbul, Turkey
Distribution: Slackware64 15.0, Slackwarearm 14.2
Posts: 1,156

Rep: Reputation: 234
I'm no expert on the subject, but aren't these errors related to the CPU cache and not the RAM? I agree with BW-userx that it could be a one-time thing.
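
For reference, the status word from the log can be taken apart by hand, which helps with the cache-versus-RAM question. Below is a rough Python sketch. The bit positions and lookup tables are assumptions taken from the generic x86 machine-check status layout and the bus-error code format, not from anything posted in this thread, and `decode_status` is only an illustrative name.

```python
# Sketch: decode an MCx_STATUS value such as the one in the log above.
# Assumption: standard x86 MCA status layout (flags in bits 63..57,
# extended error code in bits 20..16, error code in bits 15..0).

STATUS_FLAGS = [
    (63, "Val"),    # register holds a valid error
    (62, "Over"),   # an earlier error was overwritten
    (61, "UE"),     # uncorrected error
    (60, "En"),     # error reporting was enabled
    (59, "MiscV"),  # MISC register is valid
    (58, "AddrV"),  # ADDR register is valid
    (57, "PCC"),    # processor context corrupt
]

# Lookup tables for the "bus error" code format (0000 1PPT RRRR IILL).
LL = ["RESV", "L1", "L2", "L3/GEN"]
II = ["MEM", "RESV", "IO", "GEN"]
PP = ["SRC", "RES", "OBS", "GEN"]
RRRR = ["GEN", "RD", "WR", "DRD", "DWR", "IRD", "PRF", "EV", "SNP"]


def decode_status(status):
    """Return the set flags, extended error code and decoded bus error."""
    flags = [name for bit, name in STATUS_FLAGS if (status >> bit) & 1]
    code = status & 0xFFFF
    result = {
        "flags": flags,
        "xec": (status >> 16) & 0x1F,
        "error_code": code,
        "bus_error": None,
    }
    # Bus errors are recognised by the pattern 0000 1xxx xxxx xxxx.
    if (code & 0xF800) == 0x0800:
        rrrr = (code >> 4) & 0xF
        result["bus_error"] = {
            "cache_level": LL[code & 0x3],
            "mem_io": II[(code >> 2) & 0x3],
            "mem_tx": RRRR[rrrr] if rrrr < len(RRRR) else "RESV",
            "part_proc": PP[(code >> 9) & 0x3],
            "timed_out": bool((code >> 8) & 1),
        }
    return result


if __name__ == "__main__":
    print(decode_status(0xFE00000000070F0F))
```

Fed the value from the log, this gives extended error code 7 and a bus error of L3/GEN, GEN, GEN, GEN with the timeout bit set, which lines up with the "Watchdog timeout" and "cache level" lines the kernel printed. The "L3/GEN" field is a generic code here, so it does not by itself prove the L3 cache is at fault, and the decode does not pinpoint a failing part.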
 
Old 02-01-2017, 02:42 PM   #6
epg
LQ Newbie
 
Registered: Jan 2017
Posts: 23

Original Poster
Rep: Reputation: Disabled
Thanks for the feedback... Yeah, you could be right. Anyway, I tried to run mcelog to capture proper logs if this issue happens again, but unfortunately AMD CPUs are not supported. :-(
 
Old 02-08-2017, 09:05 AM   #7
epg
LQ Newbie
 
Registered: Jan 2017
Posts: 23

Original Poster
Rep: Reputation: Disabled
It just happened again:

[ 3.839717] [Hardware Error]: System Fatal error.
[ 3.839828] [Hardware Error]: CPU:0 (15:60:1) MC4_STATUS[Over|UE|MiscV|PCC|AddrV|-|-]: 0xfe00000000070f0f
[ 3.840069] [Hardware Error]: MC4 Error Address: 0x00000000d0d00e50
[ 3.840208] [Hardware Error]: MC4 Error (node 0): Watchdog timeout due to lack of progress.
[ 3.840406] [Hardware Error]: cache level: L3/GEN, mem/io: GEN, mem-tx: GEN, part-proc: GEN (timed out)

Same error message, same address... Is it fair to say it's a hardware issue? Any suggestions on how to troubleshoot this further??

Thank you
 
Old 02-08-2017, 09:18 AM   #8
BW-userx
LQ Guru
 
Registered: Sep 2013
Location: Somewhere in my head.
Distribution: Slackware (15 current), Slack15, Ubuntu studio, MX Linux, FreeBSD 13.1, WIn10
Posts: 10,342

Rep: Reputation: 2242
Quote:
Originally Posted by epg View Post
It just happened again:

Same error message, same address... Is it fair to say it's a hardware issue? Any suggestions on how to troubleshoot this further??

Thank you
Go to a store that takes returns, buy some hardware, swap it in, then see if the problem persists. If it does, put the old part back, swap out another piece of hardware for a new one, and do the same.

repeat until the problem is no longer there.

take back everything that did not fix the problem and get your money back.

Last edited by BW-userx; 02-08-2017 at 09:28 AM.
 
Old 02-08-2017, 10:39 AM   #9
Skaendo
Senior Member
 
Registered: Dec 2014
Location: West Texas, USA
Distribution: Slackware64-14.2
Posts: 1,445

Rep: Reputation: Disabled
Quote:
Originally Posted by epg View Post
It just happened again:

Same error message, same address... Is it fair to say it's a hardware issue? Any suggestions on how to troubleshoot this further??

Thank you
Are you overclocking in any way?
 
Old 02-08-2017, 11:22 AM   #10
epg
LQ Newbie
 
Registered: Jan 2017
Posts: 23

Original Poster
Rep: Reputation: Disabled
Not at all, no overclocking...

And the idea of changing HW until the problem disappears, I'm afraid it's not going to work for me. First, this is a company-owned laptop so I can't/shouldn't change the parts myself. And second, it's still under warranty so I'm gonna void it if I open the laptop.

I could just call warranty and see what they're gonna say, but I wanted to be sure this is indeed a HW issue...
 
Old 02-08-2017, 11:50 AM   #11
Skaendo
Senior Member
 
Registered: Dec 2014
Location: West Texas, USA
Distribution: Slackware64-14.2
Posts: 1,445

Rep: Reputation: Disabled
Quote:
Originally Posted by epg View Post
Not at all, no overclocking...

And the idea of changing HW until the problem disappears, I'm afraid it's not going to work for me. First, this is a company-owned laptop so I can't/shouldn't change the parts myself. And second, it's still under warranty so I'm gonna void it if I open the laptop.

I could just call warranty and see what they're gonna say, but I wanted to be sure this is indeed a HW issue...
Are you using an AMD CPU?

Is input–output memory management unit (IOMMU) available in the BIOS, and is it on?
 
Old 02-08-2017, 12:00 PM   #12
epg
LQ Newbie
 
Registered: Jan 2017
Posts: 23

Original Poster
Rep: Reputation: Disabled
Yes, it's an AMD CPU on an HP 745 G3. I didn't see any iommu option in bios, don't think my pc supports that.
 
Old 02-08-2017, 04:33 PM   #13
Skaendo
Senior Member
 
Registered: Dec 2014
Location: West Texas, USA
Distribution: Slackware64-14.2
Posts: 1,445

Rep: Reputation: Disabled
Quote:
Originally Posted by epg View Post
Yes, it's an AMD CPU on an HP 745 G3. I didn't see any iommu option in bios, don't think my pc supports that.
I am at a loss. I kind of think that it's a software or driver issue.
 
Old 02-08-2017, 06:23 PM   #14
glorsplitz
Senior Member
 
Registered: Dec 2002
Distribution: slackware!
Posts: 1,304

Rep: Reputation: 368
How long have you had the company-owned laptop? How long did it work before this error started happening? Did you change anything on the system since it was installed?

Check out this LINK and the link in the answer; it seems to be a CPU problem.
 
Old 02-08-2017, 06:44 PM   #15
epg
LQ Newbie
 
Registered: Jan 2017
Posts: 23

Original Poster
Rep: Reputation: Disabled
Thank you for replying!

It's a brand new PC; I got it just a couple of months ago. The first time I noticed this error was around two weeks ago, when I started this thread. Yesterday it happened again... And no, no changes have been made since I installed Slackware.

And I had seen that link you shared, but unfortunately I couldn't run mcelog; it seems (correct me if I'm wrong) that it doesn't support AMD CPUs.
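
Without mcelog, one fallback is to pull the `[Hardware Error]` lines straight out of the kernel log and check whether the same status and address keep coming back. Here is a minimal Python sketch, assuming the log lines look like the ones posted earlier in this thread; the function names and regular expressions are made up for illustration and may need adjusting for other kernels.

```python
import re
from collections import Counter

# Patterns modelled on the log lines posted in this thread (an assumption;
# other kernel versions may format these lines differently).
STATUS_RE = re.compile(
    r"\[Hardware Error\]: CPU:(\d+) .*?(MC\d+)_STATUS\[[^\]]*\]: (0x[0-9a-fA-F]+)"
)
ADDR_RE = re.compile(
    r"\[Hardware Error\]: (MC\d+) Error Address: (0x[0-9a-fA-F]+)"
)


def collect_events(lines):
    """Build one record per MCx_STATUS line, attaching the address if present."""
    events = []
    for line in lines:
        m = STATUS_RE.search(line)
        if m:
            events.append({
                "cpu": int(m.group(1)),
                "bank": m.group(2),
                "status": m.group(3),
                "address": None,
            })
            continue
        m = ADDR_RE.search(line)
        # Attach the address to the most recent event from the same bank.
        if (m and events and events[-1]["address"] is None
                and events[-1]["bank"] == m.group(1)):
            events[-1]["address"] = m.group(2)
    return events


def summarize(events):
    """Count how often each (bank, status, address) combination occurs."""
    return Counter((e["bank"], e["status"], e["address"]) for e in events)
```

To use it, pass any iterable of log lines, for example an open file handle on the syslog file or the saved output of `dmesg` (the exact log path depends on the setup). A combination that shows up more than once, like the identical status and address in the two incidents above, is worth including in a warranty report.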

Last edited by epg; 02-08-2017 at 06:45 PM.
 
  

