LinuxQuestions.org

johan162 01-04-2009 03:24 PM

"SATA link down" on non-existing sata channel
 
I run SuSE 11.1 with kernel 2.6.27.10 and basically everything works fine. So why post here?

Well, I noticed that my /var/log/messages started growing very rapidly and discovered that the kernel is pumping out SATA channel errors.

Code:

Jan  4 22:11:00 lambda kernel: ata10: exception Emask 0x10 SAct 0x0 SErr 0x4000000 action 0xe frozen
Jan  4 22:11:00 lambda kernel: ata10: irq_stat 0x00000040, connection status changed
Jan  4 22:11:00 lambda kernel: ata10: SError: { DevExch }
Jan  4 22:11:00 lambda kernel: ata10: hard resetting link
Jan  4 22:11:00 lambda kernel: ata10: SATA link down (SStatus 0 SControl 300)
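
For readers trying to interpret the SErr value in the log above, here is a minimal sketch that maps an SError register value to the short names libata prints between the braces. The bit table is an assumption taken from the SERR_* constants in the kernel's include/linux/libata.h and should be checked against the source of the kernel actually running; the function name decode_serror is made up for this illustration.

```python
# Bit positions are assumed to follow the SERR_* constants in the kernel's
# include/linux/libata.h; the short names mirror the strings libata prints
# inside "SError: { ... }". Verify against your own kernel source.
SERROR_BITS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serror(value):
    """Return the names of all bits set in an SError register value."""
    names = []
    for bit in range(32):
        if value & (1 << bit):
            # Bits without a known name are reported by position.
            names.append(SERROR_BITS.get(bit, "bit%d" % bit))
    return names


if __name__ == "__main__":
    # Value taken from the "SErr 0x4000000" field in the log above.
    print(decode_serror(0x4000000))
```

For 0x4000000 only bit 26 is set, which agrees with the "SError: { DevExch }" line: the port is reporting that a device presence change was detected, not a data error.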

The first thing that surprised me is that I don't have 10 SATA channels on the MB. I have only 6+1.

Looking in boot.msg I can see that the 3 physical drives on the first three of the 6 SATA channels are in fact numbered from 3 and up and seemingly set up correctly as

Code:

<6>ata3: SATA max UDMA/133 abar m2048@0xf7ffc000 port 0xf7ffc100 irq 216
<6>ata4: SATA max UDMA/133 abar m2048@0xf7ffc000 port 0xf7ffc180 irq 216
<6>ata5: SATA max UDMA/133 abar m2048@0xf7ffc000 port 0xf7ffc200 irq 216
<6>ata6: SATA max UDMA/133 abar m2048@0xf7ffc000 port 0xf7ffc280 irq 216
<6>ata7: SATA max UDMA/133 abar m2048@0xf7ffc000 port 0xf7ffc300 irq 216
<6>ata8: SATA max UDMA/133 abar m2048@0xf7ffc000 port 0xf7ffc380 irq 216
<6>ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
<6>ata3.00: ATA-8: A-DATA SSD 300 Series, 080826, max UDMA/100
<6>ata3.00: 125206528 sectors, multi 0: LBA
<6>ata3.00: configured for UDMA/100
<6>ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
<6>ata4.00: ATA-8: WDC WD5001AALS-00L3B2, 01.03B01, max UDMA/133
<6>ata4.00: 976773168 sectors, multi 16: LBA48 NCQ (depth 31/32)
<6>ata4.00: configured for UDMA/133
<6>ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
<6>ata5.00: ATA-8: WDC WD5001AALS-00L3B2, 01.03B01, max UDMA/133
<6>ata5.00: 976773168 sectors, multi 16: LBA48 NCQ (depth 31/32)
<6>ata5.00: configured for UDMA/133
<6>ata6: SATA link down (SStatus 0 SControl 300)
<6>ata7: SATA link down (SStatus 0 SControl 300)
<6>ata8: SATA link down (SStatus 0 SControl 300)
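
The SStatus numbers in the link up/down lines can be split into their fields to see what the port itself reports. The sketch below assumes the value is printed in hexadecimal (libata uses a %X format for it) and uses the DET/SPD/IPM layout of the SATA SStatus register; the function names and description strings are made up for this illustration.

```python
# Field layout of the SATA SStatus register:
#   bits 3:0  DET  device detection
#   bits 7:4  SPD  negotiated interface speed
#   bits 11:8 IPM  interface power management state
DET_TEXT = {
    0: "no device detected, PHY communication not established",
    1: "device detected, PHY communication not established",
    3: "device detected, PHY communication established",
    4: "PHY offline (interface disabled or in loopback)",
}
SPD_TEXT = {0: "no speed negotiated", 1: "1.5 Gbps", 2: "3.0 Gbps", 3: "6.0 Gbps"}
IPM_TEXT = {0: "device not present", 1: "active", 2: "partial", 6: "slumber"}


def split_sstatus(hex_text):
    """Split an SStatus value, given as the hex string from the log, into (DET, SPD, IPM)."""
    value = int(hex_text, 16)
    det = value & 0xF
    spd = (value >> 4) & 0xF
    ipm = (value >> 8) & 0xF
    return det, spd, ipm


def describe_sstatus(hex_text):
    """Return a one-line human-readable description of an SStatus value."""
    det, spd, ipm = split_sstatus(hex_text)
    return "DET=%d (%s), SPD=%d (%s), IPM=%d (%s)" % (
        det, DET_TEXT.get(det, "reserved"),
        spd, SPD_TEXT.get(spd, "reserved"),
        ipm, IPM_TEXT.get(ipm, "reserved"),
    )


if __name__ == "__main__":
    # The two values that appear in the boot log above.
    for text in ("123", "0"):
        print("SStatus %s: %s" % (text, describe_sstatus(text)))
```

Read this way, "SStatus 123" on ata3 to ata5 is a present device running at 3.0 Gbps in the active state, while "SStatus 0" on ata6 to ata8 is simply an empty port with nothing attached.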

Since I don't have enough knowledge about the (S)ATA/IDE kernel structure I'm not sure how to pinpoint the SATA10 problem since this seems to be a "ghost" channel. All my real drives are working just fine and the only "real" problem this is causing is a very fast growing messages file. Since I started to observe this problem recently I suspect this could be a real MB issue.

Can anyone suggest a good next step to pinpoint this (and perhaps point to some ATA channel info so I can read up some more on how this actually works)?

My lspci looks like

Code:

00:00.0 Host bridge: Intel Corporation X58 I/O Hub to ESI Port (rev 12)
00:01.0 PCI bridge: Intel Corporation X58 I/O Hub PCI Express Root Port 1 (rev 12)
00:03.0 PCI bridge: Intel Corporation X58 I/O Hub PCI Express Root Port 3 (rev 12)
00:07.0 PCI bridge: Intel Corporation X58 I/O Hub PCI Express Root Port 7 (rev 12)
00:10.0 PIC: Intel Corporation X58 Physical and Link Layer Registers Port 0 (rev 12)
00:10.1 PIC: Intel Corporation X58 Routing and Protocol Layer Registers Port 0 (rev 12)
00:13.0 PIC: Intel Corporation X58 I/O Hub I/OxAPIC Interrupt Controller (rev 12)
00:14.0 PIC: Intel Corporation X58 I/O Hub System Management Registers (rev 12)
00:14.1 PIC: Intel Corporation X58 I/O Hub GPIO and Scratch Pad Registers (rev 12)
00:14.2 PIC: Intel Corporation X58 I/O Hub Control Status and RAS Registers (rev 12)
00:14.3 PIC: Intel Corporation X58 I/O Hub Throttle Registers (rev 12)
00:1a.0 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #4
00:1a.1 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #5
00:1a.2 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #6
00:1a.7 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #2
00:1b.0 Audio device: Intel Corporation 82801JI (ICH10 Family) HD Audio Controller
00:1c.0 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Port 1
00:1c.2 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Port 3
00:1c.4 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Port 5
00:1c.5 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Port 6
00:1d.0 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #1
00:1d.1 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #2
00:1d.2 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #3
00:1d.7 USB Controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #1
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90)
00:1f.0 ISA bridge: Intel Corporation 82801JIR (ICH10R) LPC Interface Controller
00:1f.2 SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI Controller
00:1f.3 SMBus: Intel Corporation 82801JI (ICH10 Family) SMBus Controller
02:00.0 VGA compatible controller: nVidia Corporation G94 [GeForce 9600 GT] (rev a1)
04:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller (rev 12)
05:00.0 SATA controller: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 03)
05:00.1 IDE interface: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 03)
06:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller (rev 12)
08:02.0 FireWire (IEEE 1394): VIA Technologies, Inc. VT6306 Fire II IEEE 1394 OHCI Link Layer Controller (rev c0)


Rgds
Johan

Bassy 01-04-2009 04:58 PM

Wow! I just bought a new S-ATA 160 GB hard drive and I'm downloading Open SuSE 11.1.

Was it a bad idea? Does SuSE 11.1 work fine with S-ATA or doesn't it?

:(

johan162 01-04-2009 05:47 PM

Possibly caused by badly connected SATA cables
 
Since I remembered that this problem only started after I installed the third HDD, I wondered whether I had disturbed some cables by mistake.

I took the server apart again and made very sure all SATA cables, USB cables etc were firmly attached and rebooted.

The problem has now disappeared.

However, I'm still very curious as to which device/unit/block device etc. the sata10 channel was assigned to (since I don't have more than 6+1 real SATA channels). Is it possible for some USB block devices (like SM/MMC card reader) to show up masqueraded as a SATA channel?

I just don't know enough about the kernel architecture regarding handling of block/HDD/SATA devices to know in detail what was really going on.

Regarding your question on SuSE 11.1, I would say that in general it is a fairly good release and the issues most people have seen are when they use KDE 4.1 (even though SuSE has backported a fair number of fixes from 4.2).

Remember though that if you use it as a server (perhaps running an IMAP server that uses the inotify mechanism) you will want to think twice, since the standard kernel shipped, 2.6.27.7, has a serious bug in the inotify handling that can cause unstoppable runaway processes that require a hard reboot of the server. You need to upgrade to a 2.6.27.10 kernel yourself (or wait until SuSE has a suitable update).

Johan

Bassy 01-04-2009 07:03 PM

Sorry for being lazy and not searching for this online, but since you seem to know: which KDE comes with SuSE 11.1?

johan162 01-05-2009 04:17 AM

too quick to declare victory
 
.. actually I was a bit too quick to declare this problem solved. It apparently had nothing to do with loose cables.

After ~4 minutes the logs start to fill up again, every 2 s, with blocks of

Code:

Jan  4 23:22:03 lambda kernel: ata10: exception Emask 0x10 SAct 0x0 SErr 0x4000000 action 0xe frozen
Jan  4 23:22:03 lambda kernel: ata10: irq_stat 0x00000040, connection status changed
Jan  4 23:22:03 lambda kernel: ata10: SError: { DevExch }
Jan  4 23:22:03 lambda kernel: ata10: hard resetting link
Jan  4 23:22:04 lambda kernel: ata10: SATA link down (SStatus 0 SControl 300)
Jan  4 23:22:04 lambda kernel: ata10: EH complete

How can I find out what physical device/chip is connected as ata10? (There are only 6+1 real ATA channels on the Intel ICH10 controller on my MB)

Johan
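
One possible way to approach the question of which chip owns ata10, sketched here under assumptions: the AHCI driver announces each port at boot with an "abar" address, and ports that share an abar sit on the same controller. The script below groups the port announcement lines from a boot log by that address. The regular expression only matches the AHCI-style line format quoted earlier in this thread, and the names PORT_RE and group_ports_by_abar are made up for this illustration.

```python
import re
import sys
from collections import defaultdict

# Matches AHCI port announcements such as:
#   <6>ata3: SATA max UDMA/133 abar m2048@0xf7ffc000 port 0xf7ffc100 irq 216
PORT_RE = re.compile(r"ata(\d+): SATA max \S+ abar (\S+) port (\S+) irq (\d+)")


def group_ports_by_abar(lines):
    """Map each abar string to the sorted list of ataN numbers announced on it."""
    groups = defaultdict(list)
    for line in lines:
        match = PORT_RE.search(line)
        if match:
            groups[match.group(2)].append(int(match.group(1)))
    return {abar: sorted(ports) for abar, ports in groups.items()}


if __name__ == "__main__":
    # Pass the path of a saved boot log (for example boot.msg) as the first
    # argument; without one, two lines quoted in this thread are used.
    if len(sys.argv) > 1:
        with open(sys.argv[1], errors="replace") as handle:
            log_lines = handle.readlines()
    else:
        log_lines = [
            "<6>ata3: SATA max UDMA/133 abar m2048@0xf7ffc000 port 0xf7ffc100 irq 216",
            "<6>ata4: SATA max UDMA/133 abar m2048@0xf7ffc000 port 0xf7ffc180 irq 216",
        ]
    for abar, ports in sorted(group_ports_by_abar(log_lines).items()):
        print("%s: %s" % (abar, ", ".join("ata%d" % p for p in ports)))
```

If ata10 turns up under a different abar than ata3 to ata8, the address after the "@" can be compared with the memory regions that lspci -v lists for the two SATA controllers (00:1f.2 and 05:00.0 in the listing earlier in the thread) to see which one it belongs to. Ports announced in another format, such as legacy IDE-style ports, are not matched by this pattern.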


All times are GMT -5.