LinuxQuestions.org
Slackware This Forum is for the discussion of Slackware Linux.

Old 11-18-2015, 03:22 AM   #16
moesasji
Member
 
Registered: May 2008
Distribution: Slackware Current / OpenBSD
Posts: 322

Rep: Reputation: 104

Quote:
Originally Posted by Sylvester Ink
I'm fairly sure those errors are unrelated to the problem, as they occur at every boot up, whether the freeze occurred previously or not.
Your call, but I'm fairly confident that this is connected. Keep in mind that with heavy internet activity, data needs to be written to disk, often in small chunks.
 
Old 11-18-2015, 07:00 AM   #17
bassmadrigal
LQ Guru
 
Registered: Nov 2003
Location: West Jordan, UT, USA
Distribution: Slackware
Posts: 8,792

Rep: Reputation: 6656
I agree with moesasji that this looks like a hardware failure/incompatibility. If you have a separate hard drive, it may be worth trying it in your system.
 
Old 11-21-2015, 03:36 PM   #18
metaschima
Senior Member
 
Registered: Dec 2013
Distribution: Slackware
Posts: 1,982

Rep: Reputation: 492
This:
Code:
Nov 17 07:43:12 inkwell kernel: [  967.715743] INFO: rcu_sched self-detected stall on CPU { 0}  (t=127540 jiffies g=37861 c=37860 q=4)
Please read:
https://www.kernel.org/doc/Documenta.../stallwarn.txt
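
To get a feel for how long the CPU was stuck, the t= field in that line can be divided by the kernel tick rate. A rough sketch only: stall_seconds is a made-up helper name, and the HZ value you pass in is an assumption that has to be checked against CONFIG_HZ for the kernel you are running.

```shell
# Hypothetical helper: read rcu_sched stall lines on stdin, take the t=<jiffies>
# field of the last one, and print the stall length in whole seconds.
# $1 is the kernel tick rate (HZ); this is an assumption, check CONFIG_HZ.
stall_seconds() {
    hz=$1
    jiffies=$(sed -n 's/.*stall on CPU.*(t=\([0-9][0-9]*\) jiffies.*/\1/p' | tail -n 1)
    [ -n "$jiffies" ] || return 1
    echo $(( jiffies / hz ))
}

# Usage (log path is an example):
#   grep 'rcu_sched' /var/log/syslog | stall_seconds 1000
```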
 
Old 11-28-2015, 10:42 PM   #19
Sylvester Ink
Member
 
Registered: Jun 2010
Distribution: Slackware
Posts: 112

Original Poster
Rep: Reputation: 35
I tried running from an Xubuntu 15.04 live CD, and the issue still occurs, even when no hard drives are mounted. When they are mounted, it has occurred while viewing media from my storage drive (not my Slackware install drive). The issue seems fairly reproducible by just watching a video on YouTube, although it has occurred at other times as well (including while writing this post). Fortunately, when I hit the shutdown button on my computer, it brings up XFCE's shutdown menu, from which I can cancel and continue what I was doing. This seems to confirm a hardware issue, though it's unlikely to be the hard drive.
It could be the CPU, as syslog will mention a non-responsive CPU if I wait long enough. (I'll post the message when I see it again.)
It may be the memory. My last memory check didn't reveal an error, but I can run it again to be sure.
It is possible it's the video card. My current video card is new, but I got it earlier this summer when the previous card died. Both cards have run the Catalyst driver to some degree, although I don't believe Ubuntu is using it on the live CD.
I doubt the issue is with my motherboard, as I replaced it with a new one recently as well.
Now that I'm running a live CD, I'll try swapping between the network card I'm currently using and the built-in networking on the motherboard, to see if that's the issue.

Metaschima, although that message doesn't occur for every instance of the freeze, it does seem to coincide with the error I'm getting through Xubuntu.

Last edited by Sylvester Ink; 11-28-2015 at 10:43 PM.
 
Old 11-28-2015, 11:27 PM   #20
metaschima
Senior Member
 
Registered: Dec 2013
Distribution: Slackware
Posts: 1,982

Rep: Reputation: 492
See:
http://docs.slackware.com/howtos:har...re_diagnostics

For the CPU, try:
http://www.mersenne.org/download/#source

Check the mobo voltages and see if they are within limits.
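
As a rough guide for "within limits", here is a small sketch that checks a reading against a nominal rail value. rail_ok is a made-up helper, and the 5% default is an assumption based on the usual ATX tolerance for the +3.3 V, +5 V and +12 V rails; your board or PSU documentation may say otherwise.

```shell
# Hypothetical helper: succeed if a measured voltage is within pct percent of
# the nominal value. $1 = nominal, $2 = measured, $3 = pct (assumed default 5).
rail_ok() {
    awk -v n="$1" -v m="$2" -v p="${3:-5}" \
        'BEGIN { exit !(m >= n * (1 - p / 100) && m <= n * (1 + p / 100)) }'
}

# Usage: take the readings from `sensors` (lm_sensors) or the BIOS, e.g.
#   rail_ok 12 11.9 && echo "+12V ok" || echo "+12V out of range"
```

Software readings can be badly calibrated, so a value that looks wrong here is a hint to measure with a multimeter rather than proof of a fault.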

The graphics card is nearly impossible to test, but it definitely can fail, and I had one fail with symptoms similar to yours. The only hint was a low voltage for the card, if that is a hint at all.
 
Old 11-29-2015, 11:02 AM   #21
MarcT
Member
 
Registered: Jan 2009
Location: UK
Distribution: Slackware 14.2
Posts: 125

Rep: Reputation: 51
I suspect you may have bad sector(s) on your disk. The ATA bus error can be due to a single bad 512-byte block (i.e. a "host" sector) on a drive which uses 4K native sectors. I have seen this with Western Digital 2TB and 3TB drives. The good news is that if you can find and rewrite the bad sector, the drive will carry on and still have a useful working life.

These bad sectors can be difficult to detect if NCQ (Native Command Queueing) is enabled for the disk. Try disabling NCQ with:

Code:
echo 1 > /sys/block/sdX/device/queue_depth
...and you may then get a more comprehensive error message, possibly with a sector number (LBA address).
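
If you want to check the current setting first (and put it back afterwards), something along these lines should do. It is only a sketch: ncq_status is a made-up helper name and sdX is a placeholder for your device.

```shell
# Hypothetical helper: report whether NCQ is effectively on, given the path
# to a queue_depth file. A depth of 1 means commands are not queued.
ncq_status() {
    depth=$(cat "$1") || return 1
    if [ "$depth" -gt 1 ]; then
        echo "NCQ enabled (queue_depth=$depth)"
    else
        echo "NCQ disabled (queue_depth=$depth)"
    fi
}

# Usage (as root; sdX is a placeholder):
#   f=/sys/block/sdX/device/queue_depth
#   ncq_status "$f"
#   old=$(cat "$f"); echo 1 > "$f"     # disable NCQ for the test
#   echo "$old" > "$f"                 # restore the previous depth afterwards
```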


There is a way to decode the sector address from your messages - see my notes/example below:
Code:
First Format:

[836525.355778] ata5.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x0
[836525.355782] ata5.00: irq_stat 0x48000000
[836525.355786] ata5.00: failed command: READ FPDMA QUEUED
[836525.355792] ata5.00: cmd 60/00:08:e0:3f:30/01:00:05:01:00/40 tag 1 ncq 131072 in
[836525.355792]          res 41/40:00:18:40:30/00:00:05:01:00/40 Emask 0x409 (media error) <F>
[836525.355795] ata5.00: status: { DRDY ERR }
[836525.355797] ata5.00: error: { UNC }
[836525.368219] ata5.00: configured for UDMA/133
[836525.368233] ata5: EH complete

cmd 60/00:08:e0:3f:30/01:00:05:01:00/40
             ^6 ^5 ^4       ^3 ^2 ^1

Sector = 0x 00 01 05 30 3f e0

printf "%d\n" 0x000105303fe0
4382015456




Second Format:

[837918.771224] sd 4:0:0:0: [sde] Unhandled sense code
[837918.771226] sd 4:0:0:0: [sde]  Result: hostbyte=0x00 driverbyte=0x08
[837918.771229] sd 4:0:0:0: [sde]  Sense Key : 0x3 [current] [descriptor]
[837918.771233] Descriptor sense data with sense descriptors (in hex):
[837918.771234]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 01 
[837918.771240]         05 30 40 18 
[837918.771243] sd 4:0:0:0: [sde]  ASC=0x11 ASCQ=0x4
[837918.771245] sd 4:0:0:0: [sde] CDB: cdb[0]=0x88: 88 00 00 00 00 01 05 30 40 18 00 00 00 08 00 00
[837918.771252] end_request: I/O error, dev sde, sector 4382015512
[837918.771268] ata5: EH complete

CDB: cdb[0]=0x88: 88 00 00 00 00 01 05 30 40 18 00 00 00 08 00 00
>> Sector                     ^1 ^2 ^3 ^4 ^5 ^6

printf "%d\n" 0x000105304018
4382015512
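
The first-format decoding can also be scripted. This is a minimal sketch, assuming the cmd/res string always has the twelve fields shown in the example above; decode_ata_lba is a made-up name, not part of any tool.

```shell
# Hypothetical helper: decode the 48-bit LBA from a libata cmd/res string such
# as 60/00:08:e0:3f:30/01:00:05:01:00/40. The body runs in a subshell so the
# IFS change does not leak into the caller.
decode_ata_lba() (
    set -f                 # no glob expansion during word splitting
    IFS='/:'
    set -- $1              # split into 12 fields on "/" and ":"
    [ "$#" -eq 12 ] || { echo "unexpected taskfile format" >&2; exit 1; }
    # High-order bytes are fields 11, 10, 9; low-order bytes are 6, 5, 4.
    printf '%d\n' "0x${11}${10}${9}${6}${5}${4}"
)

# Usage:
#   decode_ata_lba 60/00:08:e0:3f:30/01:00:05:01:00/40
```

Feeding it the res string instead of the cmd string from the same log gives 4382015512, which matches the sector reported in the second format.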
Once you have the host sector number, turn off NCQ and try to read that sector directly with something like:
Code:
dd if=/dev/sdX of=/dev/null bs=512 skip=Nsector count=1
...try varying the sector number to determine the range of bad sectors. If it is a drive with 4K native blocks, then you'll find the whole 4K (i.e. 8 x 512-byte blocks) unreadable.
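
To save typing, the range check can be looped. A sketch under stated assumptions: both function names are made up, and phys_sector_range assumes 8 logical sectors per physical sector with an alignment offset of 0, which may not hold for every drive.

```shell
# Hypothetical helper: print the first and last 512-byte LBA of the 4K
# physical sector containing LBA $1 (assumes 8:1 ratio, alignment offset 0).
phys_sector_range() {
    start=$(( $1 / 8 * 8 ))
    echo "$start $(( start + 7 ))"
}

# Hypothetical helper: try to read each LBA in [$2, $3] from device $1 with
# the same dd invocation as above, and report the result per sector.
probe_range() {
    lba=$2
    while [ "$lba" -le "$3" ]; do
        if dd if="$1" of=/dev/null bs=512 skip="$lba" count=1 2>/dev/null; then
            echo "$lba ok"
        else
            echo "$lba FAILED"
        fi
        lba=$(( lba + 1 ))
    done
}

# Usage (as root; /dev/sdX is a placeholder):
#   set -- $(phys_sector_range 4382015512)
#   probe_range /dev/sdX "$1" "$2"
```

Reads through the block device go via the page cache, so repeated runs may not hit the disk again; with GNU dd, adding iflag=direct should bypass the cache.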

Confirm the bad sector location(s) with the "--read-sector" flag of hdparm(8).
If you do have bad sector(s), the next step will be to work out what (if any) file or meta-data is using those sectors.
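
For ext2/3/4, one way to do that is to convert the disk sector into a filesystem block and ask debugfs about it. A minimal sketch: lba_to_fs_block is a made-up helper, the device names are placeholders, and the 4096-byte block size in the usage is an assumption you should confirm for your filesystem.

```shell
# Hypothetical helper: convert an absolute 512-byte LBA into a filesystem
# block number.
#   $1 = failing LBA, $2 = partition start sector, $3 = filesystem block size
lba_to_fs_block() {
    echo $(( ($1 - $2) * 512 / $3 ))
}

# Usage (ext2/3/4 only; sdX/sdX1 are placeholders):
#   start=$(cat /sys/block/sdX/sdX1/start)
#   blk=$(lba_to_fs_block 4382015512 "$start" 4096)
#   debugfs -R "icheck $blk" /dev/sdX1       # block -> inode
#   debugfs -R "ncheck <inode>" /dev/sdX1    # inode -> path name
```

If icheck reports no inode for the block, the sector is probably in free space or filesystem metadata.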

Finally, you may be able to "repair" the bad sector with the "--repair-sector" parameter of hdparm. This writes zeros over the sector contents, but once done (for WD drives anyway, in my experience) the drive will be OK. I've not seen bad sectors re-appear in the same locations, and the SMART remapped sectors count did not increase.
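
For reference, the two hdparm invocations look roughly like this. The sketch deliberately only prints the commands rather than running them, since the repair step destroys the sector's contents; sector_repair_plan is a made-up name, and as far as I know hdparm refuses --repair-sector unless the confirmation flag is also given.

```shell
# Hypothetical helper: print (do not run) the hdparm commands for checking
# and then zeroing one sector. $1 = device, $2 = decimal sector number.
sector_repair_plan() {
    echo "hdparm --read-sector $2 $1"
    echo "hdparm --yes-i-know-what-i-am-doing --repair-sector $2 $1"
}

# Usage (sdX is a placeholder; review the output before running it as root):
#   sector_repair_plan /dev/sdX 4382015512
```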

Of course, YMMV.

KRs,
Marc
 
Old 11-29-2015, 03:47 PM   #22
bassmadrigal
LQ Guru
 
Registered: Nov 2003
Location: West Jordan, UT, USA
Distribution: Slackware
Posts: 8,792

Rep: Reputation: 6656
@MarcT, if he is seeing the same problem when running a live distro that isn't touching the hard drive, it seems unlikely that the hard drive is to blame for his problem.
 
Old 11-29-2015, 06:13 PM   #23
MarcT
Member
 
Registered: Jan 2009
Location: UK
Distribution: Slackware 14.2
Posts: 125

Rep: Reputation: 51
True - although some live distros will activate swap partitions if found, and could scan md raid and LVM partitions. There could be unexpected or unintended disk access.
The only way to rule it out with a live distro would be to disconnect the drive.
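
Short of unplugging it, you can at least check from the live session whether the disk is being used for swap. A small sketch, with active_swap being a made-up helper that reads a /proc/swaps-style file:

```shell
# Hypothetical helper: list active swap devices from a /proc/swaps-style file
# (the first line is a header). Defaults to the live /proc/swaps.
active_swap() {
    awk 'NR > 1 { print $1 }' "${1:-/proc/swaps}"
}

# Usage from the live session:
#   active_swap          # any /dev/sdXn listed here means that disk is in use
#   cat /proc/mdstat     # shows assembled md arrays, if any
```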

I still think it's worth disabling NCQ and trying to reproduce this freeze. If there is a disk problem it will often lead to a more accurate I/O error message.

However, apparent bad sectors could also be a symptom of some other hardware issue (e.g. a failing CPU core, bad RAM, a bad SATA cable, a marginal power supply, etc.).
 
Old 12-14-2015, 01:31 AM   #24
Sylvester Ink
Member
 
Registered: Jun 2010
Distribution: Slackware
Posts: 112

Original Poster
Rep: Reputation: 35
Sorry for the slow updates, but the holiday season plus work projects have limited my available time to diagnose these issues.

I ran memtest for about 9 hours and didn't get a single error, so I think it's pretty safe to say the memory is fine.
I ran the Mersenne stress test for 15 minutes on mode 1, and about 6 minutes on mode 2. Neither showed errors, but I'll set it to run overnight, just in case. Otherwise, I can rule out CPU issues.
I think the next area to check is whether the network card is the issue. I'll pull it and use the integrated networking on the motherboard (currently disabled), and see if I still get the freeze.

[EDIT]
After 7 hours of the stress test, I have no errors, so I think it's safe to say that the CPU is fine as well.
[/EDIT]

Last edited by Sylvester Ink; 12-14-2015 at 08:49 AM.
 
Old 12-24-2015, 07:05 PM   #25
Sylvester Ink
Member
 
Registered: Jun 2010
Distribution: Slackware
Posts: 112

Original Poster
Rep: Reputation: 35
Okay, it seems the problem was with the network card. I have one on my motherboard and one in a PCI slot. I use the PCI card and have the motherboard networking disabled due to a previous issue on an older motherboard. So it seems like the PCI card was either damaged or conflicting with some other hardware. In any case, once I pulled it and switched to the motherboard networking, everything has been running just fine after about a week of usage. I'll post an update if the problem occurs again, but for now I think I can mark this solved.

Thanks for your help, and I apologize for the delayed responses on my part. (Again, it's that time of year.)
 
  



Tags
crash, downloads, freeze




All times are GMT -5. The time now is 07:38 AM.
