LinuxQuestions.org
Old 10-13-2017, 11:01 PM   #1
Nz17
LQ Newbie
 
Registered: Oct 2017
Location: America
Distribution: Debian
Posts: 5

Rep: Reputation: Disabled
libata / SATA Error Messages During Linux Boot Sequence (exception Emask, interface fatal error, hard resetting link, status DRDY)


I've been having trouble with my computer for the past few weeks. More than 70% of the time, when I attempt to boot into Linux, the console's boot screen "hangs" after the initramfs and the Linux kernel have loaded, about 4 seconds after leaving the GRUB boot menu. The computer isn't actually hung, though, as it still echoes to the screen whatever I type. However, the boot doesn't proceed. I figure it's finally time to relent and seek some advice.

Sometimes I can get the computer to boot using the recovery mode entry from the GRUB boot menu; however, that's not a guarantee as sometimes the computer is frozen when choosing that path too.

I'm running the newest 64-bit Debian stable, FreeDOS, and Windows XP SP3. Things were running well until recently when all of these libATA errors began. The error messages showed up after upgrading from the previous Debian stable to the current stable with systemd. However, I do not believe that this is caused by the upgrade, as these error messages appear even when booting from CD-R or USB, and even when using other distros.

Strangely, Windows XP, FreeDOS, and the disk utility SpinRite don't report any trouble. I think that's because whatever these problems are, they aren't problems with my HDD itself. I believe this for several reasons: the troubles started while I was still using my 1 TB HDD; I've copied my files to a new 2 TB HDD and get the same problems even with the old drive disconnected; the problems are present when my DVD drive is disconnected; when both HDDs are disconnected from the controller card, the same problems are reported while I'm running from a live CD or live USB distro; and when my USB devices with on-board storage are physically disconnected from the computer, I still get libATA errors and the other symptoms.

I wish I had another desktop PC in which to try these SATA drives (HDD1, HDD2, DVD drive) to see how they behave, yet I only have the one. However, I believe that the problems must originate with either the mainboard or the PCI-e SATA controller card, as at this time they are the only things which I think could be the cause. I know that my mainboard was giving me problems with my old SATA HDD due to what is probably a faulty SATA controller on the mainboard. That's when I installed the PCI-e SATA controller. Things have been fine with that setup for a long time (1 - 2 years), but could the mainboard's problem have spread to the point where it is affecting the PCI-e ports or the SATA controller card?

Before I started using a SATA HDD, I was using a (P)ATA HDD, and I switched to a SATA drive because the mainboard's ATA controller chip was going bad. For a time, I used a PCI-e ATA controller card to replace the on-board ATA controller, but it seems that the two ATA DVD drives which were previously connected to the mainboard's ATA controller had gone bad. The drives going bad might have been caused by the bad ATA controller, yet I don't know for certain.

Thoughts and advice? What should I try next?

Thank you for your time.

Links
---
Code:
dmesg from a "good boot"
http://www.nz17.com/tmp/comp-trouble.../boot_good.txt

dmesg from a "half-good boot" (has the errors yet boots to the GUI desktop)
http://www.nz17.com/tmp/comp-trouble...t_halfgood.txt

dmesg from another "half-good boot"
http://www.nz17.com/tmp/comp-trouble..._halfgood2.txt

lsmod
http://www.nz17.com/tmp/comp-trouble...0-13/lsmod.txt

Computer hardware profile
http://www.nz17.com/tmp/comp-trouble...re-report.html
Code:
Example errors (quoted from dmesg output of "boot_halfgood.txt")
---
[    1.712056] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    1.716330] ata2.00: ATA-8: WDC WD20EARS-00MVWB0, 51.0AB51, max UDMA/133
[    1.716333] ata2.00: 3907029168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
[    1.721326] ata2.00: configured for UDMA/133
[    1.721558] scsi 1:0:0:0: Direct-Access     ATA      WDC WD20EARS-00M AB51 PQ: 0 ANSI: 5
[    1.745170] usbcore: registered new interface driver usbhid
[    1.745172] usbhid: USB HID core driver
[    1.749166] input: Plantronics Plantronics .Audio 655 DSP as /devices/pci0000:00/0000:00:1d.0/usb2/2-1/2-1:1.3/0003:047F:C008.0001/input/input3
[    1.766936] sr 0:0:0:0: [sr0] scsi3-mmc drive: 48x/12x writer dvd-ram cd/rw xa/form2 cdda tray
[    1.766940] cdrom: Uniform CD-ROM driver Revision: 3.20
[    1.767286] sr 0:0:0:0: Attached scsi CD-ROM sr0
[    1.767404] sd 1:0:0:0: [sda] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
[    1.767450] sd 1:0:0:0: [sda] Write Protect is off
[    1.767453] sd 1:0:0:0: [sda] Mode Sense: 00 3a 00 00
[    1.767473] sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    1.808230] plantronics 0003:047F:C008.0001: input,hiddev0,hidraw0: USB HID v1.00 Device [Plantronics Plantronics .Audio 655 DSP] on usb-0000:00:1d.0-1/input3
[    1.809613]  sda: sda1 sda2 sda3 sda4
[    1.810119] sd 1:0:0:0: [sda] Attached SCSI disk
[    1.820065] ata1.00: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0x6 frozen
[    1.820111] ata1.00: irq_stat 0x08000002, interface fatal error
[    1.820152] ata1.00: failed command: IDENTIFY PACKET DEVICE
[    1.820200] ata1.00: cmd a1/00:01:00:00:00/00:00:00:00:00/00 tag 12 pio 512 in
                        res 50/00:03:00:24:00/00:00:00:00:00/a0 Emask 0x10 (ATA bus error)
[    1.820244] ata1.00: status: { DRDY }
[    1.820278] ata1: hard resetting link
[    1.980044] usb 1-2: new high-speed USB device number 3 using ehci-pci
[    2.129355] usb 1-2: New USB device found, idVendor=05e3, idProduct=0608
[    2.129358] usb 1-2: New USB device strings: Mfr=0, Product=1, SerialNumber=0
[    2.129361] usb 1-2: Product: USB2.0 Hub
[    2.129938] hub 1-2:1.0: USB hub found
[    2.130223] hub 1-2:1.0: 4 ports detected
[    2.296045] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[    2.298762] ata1.00: configured for UDMA/133
[    2.300019] ata1: EH complete
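
For anyone who wants to pull just the libata error-handling lines out of a long dmesg, here is a minimal sketch. The function name `ata_errors` is made up for illustration, and the pattern only covers the message types that appear in the excerpt above; other libata messages would need to be added by hand.

```shell
#!/bin/sh
# Print only the libata error-handling lines from a kernel log.
# Reads the log on stdin, so it works on live dmesg output or a saved file.
# The pattern is limited to the message types seen in the excerpt above.
ata_errors() {
    grep -E 'ata[0-9]+(\.[0-9]+)?: (exception Emask|irq_stat|failed command|status:|hard resetting link|EH complete)'
}

# Usage:
#   dmesg | ata_errors
#   ata_errors < boot_halfgood.txt
```

Ordinary link-up and "configured for UDMA" lines are filtered out, so what remains is roughly one group of lines per error-handling episode.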
 
Old 10-14-2017, 09:38 AM   #2
Keruskerfuerst
Senior Member
 
Registered: Oct 2005
Location: Horgau, Germany
Distribution: Manjaro KDE, Win 10
Posts: 2,199

Rep: Reputation: 164
1. SATA cable defective
2. Drives defective
 
Old 10-14-2017, 09:56 AM   #3
Nz17
LQ Newbie
 
Registered: Oct 2017
Location: America
Distribution: Debian
Posts: 5

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by Keruskerfuerst
1. SATA cable defective
But I've swapped the cables. For this to be the case, both cables would have to be defective.

Quote:
Originally Posted by Keruskerfuerst
2. Drives defective
But that would require both the old drive and my recently purchased new drive to be defective. Plus the problem happens even when all the SATA drives are disconnected.

Last edited by Nz17; 10-14-2017 at 09:57 AM.
 
Old 10-14-2017, 10:21 AM   #4
jsbjsb001
Senior Member
 
Registered: Mar 2009
Location: Earth, unfortunately...
Distribution: Currently: OpenMandriva. Previously: openSUSE, PCLinuxOS, CentOS, among others over the years.
Posts: 3,881

Rep: Reputation: 2063
Quote:
Originally Posted by Nz17
But I've swapped the cables. For this to be the case, both cables would have to be defective.
...
But that would require both the old drive and my recently purchased new drive to be defective. Plus the problem happens even when all the SATA drives are disconnected.
If the following is from your PCIe add-in SATA card, then that card and/or the motherboard's PCIe slot may be going bad:

Code:
[    1.820111] ata1.00: irq_stat 0x08000002, interface fatal error
Generally speaking, your motherboard doesn't sound too healthy to me. In a nutshell, it is in all likelihood your PCIe add-in SATA card and/or your motherboard that is failing you.

If you can find another PC, you could install your PCIe SATA add-in card in it and see whether you get the same problems/messages. If so, it may just be the add-in card at fault. That said, the integrated SATA/ATA ports on your motherboard don't sound healthy either.

Last edited by jsbjsb001; 10-14-2017 at 12:17 PM. Reason: additions/re-worded post
 
Old 10-14-2017, 10:46 AM   #5
kilgoretrout
Senior Member
 
Registered: Oct 2003
Posts: 2,987

Rep: Reputation: 388
A few questions:
1. Are all drives properly detected in your bios setup?
2. Can you consistently boot to windows and does it run without issues? It's not really clear from your post.
3. What is the make and model of your power supply (PSU)?
4. Your hardware profile shows only one sata controller, an Asmedia 106x SATA Controller. Is your onboard sata controller disabled in your bios? If so what is it?
5. On those occasions when you can boot into linux, does it run properly without errors or issues?
6. Are you overclocking? If you are, set everything back to standard and see if you still have problems.
7. I assume this is not a home built system. Who is the manufacturer of your computer? Certain ones are known to use cheap components leading to erratic problems over time.

In my experience, erratic problems tend to be caused by either a bad or marginal psu or ram problems. Ram can be easily checked by running memtest overnight and seeing whether it reports any errors. A psu can cause erratic problems if voltages are going out of spec at random times; it tends to show up most on bootup when the psu is under heavy load. Certain lower end computer manufacturers use cheap psus to cut costs; those manufactured by Bestec are notoriously bad. You can usually check your voltages in your bios setup. But even if it checks out, the psu may still be malfunctioning under load. Your sata controller could be having issues, but even if that's the case, it's hard to see how that would affect your ability to boot from a usb flash drive. Also, the fact that you already lost your onboard sata controller may be an indication of larger motherboard issues; check for bulging capacitors.
 
Old 10-17-2017, 04:03 AM   #6
Nz17
LQ Newbie
 
Registered: Oct 2017
Location: America
Distribution: Debian
Posts: 5

Original Poster
Rep: Reputation: Disabled

Thanks for the suggestions and advice. The mainboard probably needs to be replaced; however, I'll try reseating the SATA card just to be sure. I'll post again once I've tried my SATA expansion card in another PCI-e socket and seen the results.
 
Old 10-21-2017, 05:25 AM   #7
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 16,292

Rep: Reputation: 2322
In addition to the good advice already offered, I would make a few points:

1. When people ask you questions, it's because they need the answers to help, and it's politeness on your part to address them. For numbered questions, give numbered replies.

2. From what's been said, I can't confirm or deny your diagnosis on the ATA controller, but let's take it as correct. Disk controllers are usually part of some major ASIC, and failure often causes localized heat distortions in the IC leading to further failure of the device. I'd start shopping if an ASIC starts failing. There are so many individual components in modern circuitry that frankly even electronics guys (like me) find it impossible to predict failure modes.

3. Things huff and puff especially at startup, but it's of little consequence unless we're talking about a mission-critical 24/7 server. A faulty ASIC on board multiplies this big time. In times past, I had a board with the infamous Via MVP3 complete with hardware fault (it was actually misconfigured to cope with crappy Creative soundblasters). It would declare a major crisis over the disk, and over the crappy SiS 6326 video chip (another disaster area), on boot, but then would work away fine. Those messages don't look good, but I don't see the need to do much, except back up regularly.

4. Back then, there was a generic pci driver, and a driver each for various ropey chipsets. If the generic driver was present, it was used. That had to be absent, so that it would use the correct driver. That usually meant a kernel recompile. I'm not up to date enough to know if a similar situation exists today, but it might. You certainly don't want a kernel driver for your failed ata chip. Disk Drivers (often motherboard chipset drivers) have to be compiled into the kernel, not modules, as modules get caught in a Catch-22 situation: You need the driver to see the module :-/.
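
On a current system, one way to see which kernel driver has actually claimed each SATA controller is the `Kernel driver in use:` line that `lspci -k` prints under every device. A rough sketch follows; the function name `sata_drivers` is invented for illustration, and it assumes the usual `lspci -k` layout of one unindented header line per device followed by indented detail lines.

```shell
#!/bin/sh
# Print the "Kernel driver in use" value for each SATA controller found in
# an `lspci -k` listing read from stdin.
# Assumes the usual layout: one unindented header line per device,
# followed by indented detail lines.
sata_drivers() {
    awk '
        # Unindented header line: remember whether this device is a SATA controller.
        /^[0-9a-f]/ { sata = ($0 ~ /SATA controller/) }
        # Indented detail line belonging to a SATA controller.
        sata && /Kernel driver in use:/ {
            sub(/^[ \t]*Kernel driver in use: */, "")
            print
        }
    '
}

# Usage:
#   lspci -k | sata_drivers
```

For AHCI-mode SATA controllers the result is typically `ahci`.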
 
Old 10-22-2017, 05:37 AM   #8
davcefai
Member
 
Registered: Dec 2004
Location: Malta
Distribution: Debian Sid
Posts: 863

Rep: Reputation: 45
Ideally you need to test in another computer.

Don't compare Windows to Linux too closely. They are fussy about different things so that what you can get away with on one may not work on the other. Can't remember specific instances.

There is a further test you could try with the power supply:

Connect a multimeter to an ATA power connector and watch it closely during boot up. Any fluctuations should lead you to suspect the PSU. If possible use an older, moving coil meter. It will respond faster.

Is the PSU powerful enough for the PC? There are sites where you can calculate the required wattage. ASUS or AMD spring to mind.

Have you tried disconnecting the DVD drive and any other drives, leaving only the boot drive connected?

Also have you tried connecting the boot drive to another SATA port?

These are all "desperate" measures. FWIW I think you are looking at a new motherboard.

Good Luck.
 
Old 02-13-2018, 03:15 AM   #9
Nz17
LQ Newbie
 
Registered: Oct 2017
Location: America
Distribution: Debian
Posts: 5

Original Poster
Rep: Reputation: Disabled
[Solution] IDENTIFY PACKET DEVICE ASMedia 106x 1062 Debian Linux Kernel

I found the solution! After four months, it is finally solved!

The Problem
The ASMedia 106x SATA controller chipset is buggy. The command "lspci -nn" lists mine as a "03:00.0 SATA controller [0106]: ASMedia Technology Inc. ASM1062 Serial ATA Controller [1b21:0612] (rev 01)" though the description of yours might vary.
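
To check quickly whether a machine has a device from this vendor, the `lspci -nn` listing can be searched for the vendor ID. This is only a sketch: the function name `has_asmedia` is hypothetical, and it matches any device with vendor ID `1b21` (taken from the `[1b21:0612]` in my listing), not only the 106x SATA controllers.

```shell
#!/bin/sh
# Succeeds if the `lspci -nn` listing on stdin contains a device whose PCI
# vendor ID is 1b21, the vendor part of the [1b21:0612] shown above.
# It does not check the device ID, so it matches any device from that vendor.
has_asmedia() {
    grep -qi '\[1b21:[0-9a-f]\{4\}\]'
}

# Usage:
#   lspci -nn | has_asmedia && echo "ASMedia device present"
```

Running `lspci -nn -d 1b21:` gives the same information directly, since `-d` filters by vendor and device ID.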

It appears that this is buggy hardware, and has been so since at least 2013-01-31. The ASMedia 106x series, for some odd reason, doesn't seem to properly handle certain commands sent to ATAPI devices (in my logs, the failed command is IDENTIFY PACKET DEVICE). When the ASMedia 106x series is sent such a command (such as during boot), the defectively designed hardware causes the kernel, and thus the computer, to stall.

This hardware-based bug affects the ASMedia 1061 and 1062 and occurs with any ATAPI drive (CD, DVD, Blu-ray, etc.).

(My guess is that the Microsoft Windows driver for the ASMedia 106x series includes a work-around so that this kernel-freezing result isn't triggered under Windows.)

Evidence of the Bug
https://bugzilla.redhat.com/show_bug.cgi?id=906532 (Thanks for posting, Reartes Guillermo - I found your post first.)
https://askubuntu.com/questions/2303...device-at-boot

The Solution
Linux kernel boot parameter "libata.atapi_passthru16=0"

Implementation (for Debian Linux)
  1. As root, or as a member of a group with read/write permissions, or using a form of "sudo," edit the file "/etc/grub.d/10_linux"
  2. Change line 113 (in my copy of the file; the line number may differ in other GRUB versions) from
    Code:
    args="$4"
    to
    Code:
    args="$4 libata.atapi_passthru16=0"
  3. Save the file "/etc/grub.d/10_linux"
  4. Run the command
    Code:
    update-grub
  5. Reboot
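
An alternative that avoids editing the script under /etc/grub.d (a modified copy can conflict with later GRUB package upgrades) is to add the option to the `GRUB_CMDLINE_LINUX_DEFAULT` setting in /etc/default/grub and then run `update-grub`. The helper below is only a sketch: the function name `add_kernel_param` is invented, and it assumes the setting is a single double-quoted line, as in a stock Debian file. Back up the file before trying it.

```shell
#!/bin/sh
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT in a GRUB defaults
# file, unless the parameter is already present on that line.
# Assumes the value is double-quoted on a single line and that the parameter
# contains no characters special to sed (/, &, \).
add_kernel_param() {
    file=$1
    param=$2
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*$param" "$file"; then
        return 0    # already there, nothing to do
    fi
    # Insert the parameter just before the closing double quote (GNU sed -i).
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 $param\"/" "$file"
}

# Usage (as root):
#   cp /etc/default/grub /etc/default/grub.bak
#   add_kernel_param /etc/default/grub libata.atapi_passthru16=0 && update-grub
```

Editing /etc/default/grub by hand works just as well; the function only makes the change repeatable.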
If your computer won't boot in the first place, then edit the kernel options at the boot menu before Linux is loaded. For example, with the GRUB2 boot manager/loader, select the line for your Linux, press the "E" key, and use the Emacs-like editor to add the kernel option of "libata.atapi_passthru16=0" alongside other kernel options such as "ro" or "quiet". A boot disk might help with this if your existing installation won't work. Once you can boot, follow the numbered steps above to implement a permanent installation of the kernel option into your boot loader's boot menu.
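
After rebooting, it is worth confirming that the option really reached the kernel. A small sketch, assuming the running kernel's command line is readable at /proc/cmdline; the function name `passthru16_disabled` is made up for this example.

```shell
#!/bin/sh
# Succeeds if the given kernel command line file (default: /proc/cmdline,
# i.e. the running kernel's command line) contains libata.atapi_passthru16=0.
passthru16_disabled() {
    cmdline_file=${1:-/proc/cmdline}
    # Split the command line into one option per line and look for an exact match.
    tr ' ' '\n' < "$cmdline_file" | grep -qxF 'libata.atapi_passthru16=0'
}

# Usage:
#   passthru16_disabled && echo "option is active"
```

With the option in effect, /sys/module/libata/parameters/atapi_passthru16 should also read 0.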

Happy hacking!

Last edited by Nz17; 02-13-2018 at 03:17 AM.
 
  

