Slackware
This Forum is for the discussion of Slackware Linux.
Distribution: Slackware 14.2 soon to be Slackware 15
Posts: 699
Hard drive error on different boxes.
I keep getting these errors:
Code:
[23828.353436] ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[23828.353445] ata7.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 15 pio 16392 in
opcode=0x4a 4a 01 00 00 10 00 00 00 08 00res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[23828.353447] ata7.00: status: { DRDY }
[23828.353455] ata7: hard resetting link
[23828.814436] ata7: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[23828.818477] ata7.00: configured for UDMA/100
[23828.819012] ata7: EH complete
These show up when copying a large amount of data to a hard drive. Sometimes the hard drive disconnects, and running blkid will not show it as present at all until I reboot the box. After a reboot it is fine, for a while. If it sits idle I have no problems; it's just when I'm copying a large amount of data to the drive.
So I moved the process to a different box, thinking I might have a bad hard drive, and I get the same exact error. Different box, different drive, the only thing they really have in common is they are both running Slackware 14.2 64 bit.
This is something new - I've been doing this process for years. It started maybe a month ago. I run slackpkg to keep the systems current. Is it possible there is some kernel/driver bug that was recently introduced? It seems that if this were the case, it would not be just me.
Does this make sense, and does anyone recognize the above error and maybe have some ideas where to go? I'm pretty much stumped here.
What is the SMART status of the drives? (use smartctl or gsmartcontrol)
If the drives report no errors, try changing the SATA cables. If the errors still appear after changing cables, well, it could be a hardware problem in the motherboard. But first things first, check the drives for errors, then replace cables.
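For anyone who wants to script that first check across several drives, here is a minimal sketch. It assumes smartmontools is installed, that the script runs as root, and that the drives are ATA/SATA (SCSI/SAS drives print a different health line, which this does not handle). The function names are made up for illustration.

```python
import re
import subprocess

def parse_health(smartctl_output):
    """Extract the overall health verdict from `smartctl -H` text.

    Returns the verdict word (e.g. "PASSED") or None if the line is absent.
    Only the ATA-style result line is recognised here.
    """
    m = re.search(r"overall-health self-assessment test result:\s*(\S+)",
                  smartctl_output)
    return m.group(1) if m else None

def check_device(device):
    """Run `smartctl -H <device>` and return the parsed verdict."""
    # smartctl's exit status is a bitmask, so the text is parsed instead
    # of relying on the return code alone.
    result = subprocess.run(["smartctl", "-H", device],
                            capture_output=True, text=True)
    return parse_health(result.stdout)
```

Calling `check_device("/dev/sda")` would return the verdict for that drive; the device path is only a placeholder. A PASSED verdict says nothing about cables or the controller, so it does not replace the cable swap suggested above.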
Original Poster
smartctl -H <device> always showed passed. I even did a destructive badblocks scan, with no errors. smartctl shows no reallocated sectors at all, and a long self-test passed. I've run whatever diags I can on both drives, and they always passed. I'm tending to rule out hardware failure because I did this on two separate boxes and got the same error on both of them. Either that or this is an amazing coincidence in that the same failure is occurring on both mobos and hard drives at the same time...
I'm going to play some more with this - move the drives, replace cables, try again. This is yet another of those weird things that happens from time to time where you want to go check the phase of the moon or check cosmic ray count...
Quote:
So I moved the process to a different box, thinking I might have a bad hard drive, and I get the same exact error. Different box, different drive, the only thing they really have in common is they are both running Slackware 14.2 64 bit.
Then likely the drives are not the problem.
Quote:
This is something new - I've been doing this process for years. It started maybe a month ago.
That was about when the kernel was updated to 4.4.74 and then a few days later to 4.4.75. If you know how, would not hurt to restore the 4.4.38 kernel and see what happens.
Original Poster
Quote:
Originally Posted by upnort
Then likely the drives are not the problem.
That was about when the kernel was updated to 4.4.74 and then a few days later to 4.4.75. If you know how, would not hurt to restore the 4.4.38 kernel and see what happens.
I think the 4.4.75 came out about June 30, and that is about the right timing. I will indeed revert back to the 4.4.38 and try again. Thanks for that, I didn't think about reverting to an older kernel, though you would think if I suspected a kernel update, I would do so...<sigh>... retirement is in 5 and a half years, and I'm going to head south of the border and leave it all behind...can't happen soon enough...
In the meantime, I'll do that, test it, and report back.
Original Poster
I reverted to the 4.4.38 kernel, and it still happened. I then disabled NCQ, and the errors dropped to about 5% of their previous frequency, and the drive recovers fully each time. Very strange.
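The post does not say how NCQ was disabled. Two mechanisms that exist on Linux are writing 1 to /sys/block/<disk>/device/queue_depth for a single disk, or booting with libata.force=noncq. The sketch below only reads the sysfs value to confirm the current state. The disk name is a placeholder and the function names are hypothetical.

```python
from pathlib import Path

def queue_depth(disk, sys_block=Path("/sys/block")):
    """Read the command queue depth the kernel is using for a disk."""
    path = Path(sys_block) / disk / "device" / "queue_depth"
    return int(path.read_text().strip())

def ncq_in_use(disk, sys_block=Path("/sys/block")):
    """True when more than one command can be queued at a time.

    A depth of 1 means NCQ is not being used, either because it was
    turned off or because the drive or controller does not support it.
    """
    return queue_depth(disk, sys_block) > 1
```

For example, `ncq_in_use("sda")` should return False after NCQ has been turned off for that disk. The `sys_block` parameter exists only so the function can be pointed at a test directory.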
A bit of research shows that this particular error tends to pop up from time to time, and has done so for many years. It is infrequent, and I haven't seen anyone come up with a definitive fix, but most blame driver-specific compatibility issues with various SATA controllers as the primary culprit.
Original Poster
This continues to happen on any drive that gets a lot of write activity, but it no longer drops the drive offline. IDK why, but it doesn't seem to actually cause any problems, so I think I'm just going to sit back and ignore it for now....
Hi, question: is this an external drive that stuff gets copied to? Maybe something is not good in the enclosure, or the cable is easily loosened when it gets touched. The tiny USB3-B connectors that go into external drives are awful; the newer USB3-C is much tighter. One of my external drives gave up after too many write interruptions due to such a wonky USB3-B connection....
Original Poster
Internal drives, all of them. Checked it today, and the kernel log has these:
Code:
[11002.859967] ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[11002.859976] ata7.00: cmd a0/00:00:00:08:00/00:00:00:00:00/a0 tag 25 pio 16392 in
opcode=0x4a 4a 01 00 00 10 00 00 00 08 00res 40/00:03:00:00:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
[11002.859977] ata7.00: status: { DRDY }
[11002.859980] ata7: hard resetting link
[11003.320968] ata7: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[11003.325000] ata7.00: configured for UDMA/100
[11003.325520] ata7: EH complete
So after all this time, I just realized that ata7 is the DVD player, not the drives involved in the copy. And there is no disc in the drive. Yet it only does this when a copy process is occurring.
Shame on me for not noticing this sooner! DOH!
But why does a copy process from one drive to another cause this to happen to the DVD player?
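The log lines themselves carry the clue that ata7 is not a hard disk. The ATA command byte a0 is PACKET, which is how ATAPI devices such as optical drives are addressed, and opcode 0x4a is the SCSI/MMC GET EVENT STATUS NOTIFICATION command, commonly used to poll a drive for media-change events. A minimal sketch that pulls this out of a dmesg excerpt follows. The regexes are based only on the lines quoted in this thread, and the lookup table is deliberately tiny. It does not explain why the timeouts coincide with heavy copying.

```python
import re

# Patterns modelled on the libata lines quoted in this thread.
CMD_RE = re.compile(r"(ata\d+)\.\d+: cmd ([0-9a-f]{2})/")
OPCODE_RE = re.compile(r"opcode=0x([0-9a-f]{2})")

ATA_CMD_PACKET = 0xA0  # ATA PACKET command, used for ATAPI devices
KNOWN_CDB = {0x4A: "GET EVENT STATUS NOTIFICATION"}  # illustrative subset

def summarize(log):
    """Return one dict per failed command: port, ATAPI flag, CDB name."""
    entries = []
    current = None
    for line in log.splitlines():
        m = CMD_RE.search(line)
        if m:
            current = {
                "port": m.group(1),
                "atapi": int(m.group(2), 16) == ATA_CMD_PACKET,
                "cdb": None,
            }
            entries.append(current)
            continue
        m = OPCODE_RE.search(line)
        if m and current is not None:
            op = int(m.group(1), 16)
            # Unknown opcodes are reported as hex rather than guessed at.
            current["cdb"] = KNOWN_CDB.get(op, hex(op))
    return entries
```

Fed the excerpt above, this reports ata7 as an ATAPI device with a GET EVENT STATUS NOTIFICATION command timing out, which matches the observation that the errors come from the DVD drive rather than the disks being copied.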