SATA drive I/O fails under high load (ICH9)

Dria · 08-31-2009, 11:58 AM

It happened again under the same circumstances; therefore, this solution is not valid for this particular problem. I hope it helps someone else.

-------------

See this post in this thread for my resolution to the problem.

-------------

I'm running Debian Lenny (kernel 2.6.26-2) on a brand-new HP server with a SATA software RAID 1 on an Intel ICH9 controller. At times of (apparently) high disk load, the physical drive being written to throws an error and its partition gets knocked out of the RAID. Both drives are subject to this. It first occurred when I was attempting to take a full backup of a 7 GB imported database on the server, and it has happened a few times since during periods of high disk activity. I dd'ed zeroes to the drive for about 45 minutes without a problem, but deleting a ~72 GB file triggered it. Most recently, the error occurred again without any provocation I can see: it was 4:30 AM and the server was under no load to speak of. There were no new or unusual cron jobs running, and as far as I can tell there was absolutely nothing happening.

I suspect it's a driver issue, but I'm pretty lost. Both drives' SMART data gives no hint of a problem. I'm posting to cover my bases before I bug the kernel devs.

Following is some relevant system information. I will be quite happy to provide anything else necessary.

Code:

rpt-mail:~# uname -a
Linux rpt-mail 2.6.26-2-686 #1 SMP Sun Jul 26 21:25:33 UTC 2009 i686 GNU/Linux

lspci:

Code:

00:1f.5 IDE interface: Intel Corporation 82801I (ICH9 Family) 2 port SATA IDE Controller (rev 02) (prog-if 85 [Master SecO PriO])
        Subsystem: Hewlett-Packard Company Device 31f4
        Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 18
        I/O ports at 1c68 [size=8]
        I/O ports at 1c5c [size=4]
        I/O ports at 1c60 [size=8]
        I/O ports at 1c58 [size=4]
        I/O ports at 1c30 [size=16]
        I/O ports at 1c20 [size=16]
        Capabilities: [70] Power Management version 3
        Capabilities: [b0] PCIe advanced features <?>
        Kernel driver in use: ata_piix
        Kernel modules: ata_piix

Most recent spontaneous failure:

Code:

Aug 29 04:31:35 rpt-mail kernel: [3173292.745338] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Aug 29 04:31:35 rpt-mail kernel: [3173292.745338] ata1.00: BMDMA stat 0x25
Aug 29 04:31:35 rpt-mail kernel: [3173292.745338] ata1.00: cmd ca/00:08:88:ff:96/00:00:00:00:00/e0 tag 0 dma 4096 out
Aug 29 04:31:35 rpt-mail kernel: [3173292.745338]          res 51/10:08:88:ff:96/10:00:11:00:00/e0 Emask 0x81 (invalid argument)
Aug 29 04:31:35 rpt-mail kernel: [3173292.745338] ata1.00: status: { DRDY ERR }
Aug 29 04:31:35 rpt-mail kernel: [3173292.745338] ata1.00: error: { IDNF }
Aug 29 04:31:35 rpt-mail kernel: [3173293.053497] ata1.00: configured for UDMA/133
Aug 29 04:31:35 rpt-mail kernel: [3173293.053549] sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
Aug 29 04:31:35 rpt-mail kernel: [3173293.053639] sd 0:0:0:0: [sda] Sense Key : Aborted Command [current] [descriptor]
Aug 29 04:31:35 rpt-mail kernel: [3173293.053733] Descriptor sense data with sense descriptors (in hex):
Aug 29 04:31:35 rpt-mail kernel: [3173293.053790]         72 0b 14 00 00 00 00 0c 00 0a 80 00 00 00 00 00
Aug 29 04:31:35 rpt-mail kernel: [3173293.053903]         00 96 ff 88
Aug 29 04:31:35 rpt-mail kernel: [3173293.053967] sd 0:0:0:0: [sda] Add. Sense: Recorded entity not found
Aug 29 04:31:35 rpt-mail kernel: [3173293.054031] end_request: I/O error, dev sda, sector 9895816
Aug 29 04:31:35 rpt-mail kernel: [3173293.054083] end_request: I/O error, dev sda, sector 9895816
Aug 29 04:31:35 rpt-mail kernel: [3173293.054135] md: super_written gets error=-5, uptodate=0
Aug 29 04:31:35 rpt-mail kernel: [3173293.054187] raid1: Disk failure on sda2, disabling device.
Aug 29 04:31:35 rpt-mail kernel: [3173293.054187] raid1: Operation continuing on 1 devices.
Aug 29 04:31:35 rpt-mail kernel: [3173293.054292] ata1: EH complete
Aug 29 04:31:35 rpt-mail kernel: [3173293.078355] RAID1 conf printout:
Aug 29 04:31:35 rpt-mail kernel: [3173293.078355]  --- wd:1 rd:2
Aug 29 04:31:35 rpt-mail kernel: [3173293.078355]  disk 0, wo:1, o:0, dev:sda2
Aug 29 04:31:35 rpt-mail kernel: [3173293.078355]  disk 1, wo:0, o:1, dev:sdb2
Aug 29 04:31:35 rpt-mail kernel: [3173293.078357] RAID1 conf printout:
Aug 29 04:31:35 rpt-mail kernel: [3173293.078399]  --- wd:1 rd:2
Aug 29 04:31:35 rpt-mail kernel: [3173293.078438]  disk 1, wo:0, o:1, dev:sdb2
Aug 29 04:31:40 rpt-mail kernel: [3173301.075930] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Aug 29 04:31:40 rpt-mail kernel: [3173301.075930] ata1.00: BMDMA stat 0x25
Aug 29 04:31:40 rpt-mail kernel: [3173301.075930] ata1.00: cmd ca/00:08:e8:1d:52/00:00:00:00:00/e9 tag 0 dma 4096 out
Aug 29 04:31:40 rpt-mail kernel: [3173301.075930]          res 51/04:08:e8:1d:52/10:00:11:00:00/e9 Emask 0x1 (device error)
Aug 29 04:31:40 rpt-mail kernel: [3173301.075930] ata1.00: status: { DRDY ERR }
Aug 29 04:31:40 rpt-mail kernel: [3173301.075930] ata1.00: error: { ABRT }
Aug 29 04:31:47 rpt-mail kernel: [3173309.614242] ata1.00: both IDENTIFYs aborted, assuming NODEV
Aug 29 04:31:47 rpt-mail kernel: [3173309.614247] ata1.00: revalidation failed (errno=-2)
Aug 29 04:31:47 rpt-mail kernel: [3173309.614296] ata1: failed to recover some devices, retrying in 5 secs
Aug 29 04:31:52 rpt-mail kernel: [3173316.547752] ata1: hard resetting link
Aug 29 04:31:52 rpt-mail kernel: [3173317.788161] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Aug 29 04:31:52 rpt-mail kernel: [3173317.812276] ata1.00: configured for UDMA/133
Aug 29 04:31:52 rpt-mail kernel: [3173317.812335] ata1: EH complete
Aug 29 04:31:52 rpt-mail kernel: [3173317.812276] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
Aug 29 04:31:52 rpt-mail kernel: [3173317.812276] sd 0:0:0:0: [sda] Write Protect is off
Aug 29 04:31:52 rpt-mail kernel: [3173317.812276] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Aug 29 04:31:52 rpt-mail kernel: [3173317.903208] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
Aug 29 04:31:52 rpt-mail kernel: [3173317.903318] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
Aug 29 04:31:52 rpt-mail kernel: [3173317.903413] sd 0:0:0:0: [sda] Write Protect is off
Aug 29 04:31:52 rpt-mail kernel: [3173317.903459] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Aug 29 04:31:52 rpt-mail kernel: [3173317.910393] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
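
For anyone reading the log above: the cmd/res lines are libata's dump of the ATA taskfile registers, and they can be decoded by hand. The following is only a sketch. It assumes the usual libata layout (command or status, then feature or error, sector count and three LBA bytes, then the high-order bytes, then the device register) and treats the command as LBA28, which fits opcode 0xca (WRITE DMA). The name decode_taskfile and the dict keys are made up for illustration.

```python
import re

# ATA Error register bits, using the short names libata prints (e.g. "{ IDNF }").
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x20: "MC", 0x10: "IDNF",
              0x08: "MCR", 0x04: "ABRT", 0x02: "NM", 0x01: "AMNF"}

# Matches "cmd ca/00:08:88:ff:96/00:00:00:00:00/e0" and the equivalent "res" line.
TF_RE = re.compile(
    r"(cmd|res) ([0-9a-f]{2})/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})/"
    r"((?:[0-9a-f]{2}:){4}[0-9a-f]{2})/([0-9a-f]{2})")


def decode_taskfile(line):
    """Decode one libata cmd/res line, assuming an LBA28 command."""
    m = TF_RE.search(line)
    if m is None:
        raise ValueError("no libata taskfile found in line")
    kind = m.group(1)
    first = int(m.group(2), 16)  # command opcode (cmd) or status register (res)
    feat_or_err, nsect, lbal, lbam, lbah = (
        int(b, 16) for b in m.group(3).split(":"))
    device = int(m.group(5), 16)
    # LBA28: bits 27..24 live in the low nibble of the device register.
    lba28 = ((device & 0x0F) << 24) | (lbah << 16) | (lbam << 8) | lbal
    out = {"kind": kind, "lba28": lba28, "sectors": nsect}
    if kind == "cmd":
        out["command"] = first
    else:
        out["status"] = first
        out["errors"] = [name for bit, name in ERROR_BITS.items()
                         if feat_or_err & bit]
    return out


if __name__ == "__main__":
    cmd = decode_taskfile(
        "ata1.00: cmd ca/00:08:88:ff:96/00:00:00:00:00/e0 tag 0 dma 4096 out")
    print(hex(cmd["command"]), cmd["lba28"], cmd["sectors"])
```

Fed the first cmd line, this gives LBA 9895816 and 8 sectors, which agrees with the "sector 9895816" and "dma 4096" values the kernel printed. The two res lines decode to IDNF and ABRT respectively.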

amani · 08-31-2009, 12:50 PM

Please post the drive and driver details.

Dria · 08-31-2009, 02:21 PM

Code:

rpt-mail:~# hdparm -I /dev/sda

/dev/sda:

ATA device, with non-removable media
        Model Number:       GB0160EAPRR
        Serial Number:      WCAT25064510
        Firmware Revision:  HPG1
        Transport:          Serial, SATA 1.0a, SATA II Extensions, SATA Rev 2.5
Standards:
        Used: ATA/ATAPI-7 T13 1532D revision 4a
        Supported: 7 6 5 4 & some of 8
Configuration:
        Logical         max     current
        cylinders       16383   16383
        heads           16      16
        sectors/track   63      63
        --
        CHS current addressable sectors:   16514064
        LBA    user addressable sectors:  268435455
        LBA48  user addressable sectors:  312581808
        device size with M = 1024*1024:      152627 MBytes
        device size with M = 1000*1000:      160041 MBytes (160 GB)
Capabilities:
        LBA, IORDY(can be disabled)
        Queue depth: 32
        Standby timer values: spec'd by Standard, with device specific minimum
        R/W multiple sector transfer: Max = 16  Current = 0
        DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
             Cycle time: min=120ns recommended=120ns
        PIO: pio0 pio1 pio2 pio3 pio4
             Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
        Enabled Supported:
           *    SMART feature set
           *    Power Management feature set
                Write cache
           *    Look-ahead
           *    WRITE_BUFFER command
           *    READ_BUFFER command
           *    NOP cmd
           *    DOWNLOAD_MICROCODE
                Power-Up In Standby feature set
           *    SET_FEATURES required to spinup after power up
           *    48-bit Address feature set
           *    Mandatory FLUSH_CACHE
           *    FLUSH_CACHE_EXT
           *    SMART error logging
           *    SMART self-test
           *    General Purpose Logging feature set
           *    WRITE_{DMA|MULTIPLE}_FUA_EXT
           *    64-bit World wide name
           *    IDLE_IMMEDIATE with UNLOAD
           *    WRITE_UNCORRECTABLE_EXT command
           *    {READ,WRITE}_DMA_EXT_GPL commands
           *    Segmented DOWNLOAD_MICROCODE
           *    SATA-I signaling speed (1.5Gb/s)
           *    SATA-II signaling speed (3.0Gb/s)
           *    Native Command Queueing (NCQ)
           *    Phy event counters
                DMA Setup Auto-Activate optimization
           *    Software settings preservation
           *    SMART Command Transport (SCT) feature set
           *    SCT Long Sector Access (AC1)
           *    SCT LBA Segment Access (AC2)
           *    SCT Error Recovery Control (AC3)
           *    SCT Features Control (AC4)
           *    SCT Data Tables (AC5)
                unknown 206[12] (vendor specific)
                unknown 206[13] (vendor specific)
Logical Unit WWN Device Identifier: 50014ee11cbdf28
        NAA             : 5
        IEEE OUI        : 14ee
        Unique ID       : 11cbdf28
Checksum: correct

The controller is using the ata_piix driver. Is there any other specific information you need?
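
As a cross-check of the hdparm numbers against the kernel log, the sector counts can be turned into sizes with plain arithmetic. This is a small sketch, assuming 512-byte sectors as the kernel reports; capacity and needs_lba48 are hypothetical helper names.

```python
SECTOR_BYTES = 512  # the kernel log reports "512-byte hardware sectors"


def capacity(sectors, sector_bytes=SECTOR_BYTES):
    """Return the size in bytes, in MB (10**6) and in MiB (2**20), rounded down."""
    total = sectors * sector_bytes
    return {"bytes": total, "MB": total // 10**6, "MiB": total // 2**20}


def needs_lba48(lba):
    """True if the sector address does not fit in 28 bits."""
    return lba >= 2**28


if __name__ == "__main__":
    print(capacity(312581808))    # LBA48 user addressable sectors from hdparm
    print(needs_lba48(9895816))   # failing sector from the kernel log
```

The result lines up with the 160041 MBytes and 152627 MBytes figures in the hdparm output. The failing sector 9895816 is well inside the 28-bit range, so a 28-bit command is at least plausible for it.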

Dria · 12-30-2009, 12:17 PM

The issues stopped when I stopped poking it and started again when I did.

Observations:

Backup of 7G database fails (I can't remember what kind of operation this was)
Deletion of 72G file fails
SFTP transfer of 11G file to remote host fails
Creation of said 11G file succeeds
45 minutes of dd'ing (drive write without read) succeeds
Copy of directory with numerous small files adding up to 11G succeeds

I believe this is an issue with high drive read load, not something with writing. At the time of the most recent failure I was SFTPing an 11G file to a remote host -- it got 1.6G into the transfer and failed. The file was located on the /var partition, but both /var and / partitions were knocked out of the array. I hard-rebooted the server while the issue was going on and found that there was only one line in syslog about it although I saw many errors printed to the console, so those writes never made it. (If I had let the system recover I would have had those log entries, but the whole system locks up while it's happening. I could switch vtys, but SSH sessions failed and I couldn't actually type anything into the vtys.)

I plan to test whether a large file copy from partition to partition (both RAIDed) and from drive to drive (unRAIDed partitions) fails. It's running the same kernel as before, so my next step will probably be a kernel upgrade.
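
One rough way to generate the sustained read load described above, without involving SFTP or the network, is to stream a large file from the array and throw the data away. This is a minimal sketch and not a proper benchmark: sequential_read is a hypothetical helper, the path is a placeholder for whatever large file is at hand, and reads may be served from the page cache if the file was touched recently.

```python
import sys
import time


def sequential_read(path, chunk_bytes=1024 * 1024):
    """Read the whole file in chunks, discard the data, return (bytes, seconds)."""
    total = 0
    start = time.monotonic()
    # buffering=0 gives a raw file object, so each read() maps to an OS-level read.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_bytes)
            if not chunk:
                break
            total += len(chunk)
    return total, time.monotonic() - start


if __name__ == "__main__":
    # Usage: python readtest.py /path/to/large/file (placeholder path)
    if len(sys.argv) > 1:
        nbytes, seconds = sequential_read(sys.argv[1])
        print("read %d bytes in %.1f s" % (nbytes, seconds))
```

Watching the kernel log while this runs against a file on each partition in turn should show whether reads alone are enough to trigger the error.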

Dria · 02-08-2010, 12:34 PM

Okay. It's done it again, with new and exciting things. The first drive, which is the one that failed this time, has now logged SMART errors. Of interest is SMART attribute 188, "Command Timeout: A number of aborted operations due to HDD timeout. Normally this attribute value should be equal to zero and if the value is far above zero, then most likely there will be some serious problems with power supply or an oxidized data cable." When checking the specifications to see if this low-end server is really low-end enough to not have a beefy enough power supply to handle two drives, I discovered this little gem in the specs: "NOTE: Transfer Rate: 1.5 Gb/s SATA"

Well. My drives are being detected at 3.0 Gb/s.

The libata force=1.5Gbps options should be my friend if I can't get into the box to change the jumpers. I do not know if this is the problem but it seems a much more likely candidate than anything else.

H_TeXMeX_H · 02-08-2010, 01:34 PM

Quote:

Originally Posted by Dria

Okay. It's done it again, with new and exciting things. The first drive, which is the one that failed this time, has now logged SMART errors. Of interest is SMART attribute 188, "Command Timeout: A number of aborted operations due to HDD timeout. Normally this attribute value should be equal to zero and if the value is far above zero, then most likely there will be some serious problems with power supply or an oxidized data cable." When checking the specifications to see if this low-end server is really low-end enough to not have a beefy enough power supply to handle two drives, I discovered this little gem in the specs: "NOTE: Transfer Rate: 1.5 Gb/s SATA"

Well. My drives are being detected at 3.0 Gb/s.

The libata force=1.5Gbps options should be my friend if I can't get into the box to change the jumpers. I do not know if this is the problem but it seems a much more likely candidate than anything else.

Yup, I too think this is the problem. A number of chipsets have this problem with drives set at 3.0 Gb/s, so using a jumper to lower the speed should solve it.

dalai lama · 02-10-2010, 10:04 PM

The firmware on the disk is old. I suggest upgrading it to version HPG2.

http://h20000.www2.hp.com/bizsupport...eriesId=397642

You can run it from the command line, which should be easy.

Dria · 03-08-2010, 10:34 AM

There are no jumpers on the drives and no BIOS option to set, but putting libata force=1.5Gbps in my initrd did successfully force it to 1.5 and seems to have solved the problem.

dalai lama, thanks for the tip on the firmware -- I'll look into it.

Dria · 03-22-2010, 10:50 PM

It did the same thing again, so the SATA link speed wasn't the cause. Weird, since I figuratively hammered on it to test it and it did fine. My next options are the firmware, the power supply and/or cables, and a technique involving gravity and the roof.

H_TeXMeX_H · 03-23-2010, 03:55 AM

Quote:

Originally Posted by Dria

It did the same thing again, so the SATA link speed wasn't the cause. Weird, since I figuratively hammered on it to test it and it did fine. My next options are the firmware, the power supply and/or cables, and a technique involving gravity and the roof.

When it happened again, what speed was reported in dmesg? 1.5 or 3.0?

Dria · 03-25-2010, 04:32 PM

[ 3.233030] ata1: FORCE: PHY spd limit set to 1.5Gbps

If only it were that simple.

I have not had a chance to do the firmware or power/cable checks, but I will update when I have.
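
For checking the link speed question raised above across a whole dmesg or syslog dump, the relevant lines can be pulled out mechanically. This is a sketch only. It assumes the two message formats that appear in this thread ("SATA link up ... (SStatus ... SControl ...)" and "FORCE: PHY spd limit set to ...") and that the SPD field sits in bits 7..4 of SStatus and SControl, as in the SATA register layout. link_speeds and the dict keys are hypothetical names.

```python
import re

LINK_UP_RE = re.compile(
    r"(ata\d+): SATA link up ([\d.]+) Gbps "
    r"\(SStatus ([0-9A-Fa-f]+) SControl ([0-9A-Fa-f]+)\)")
FORCE_RE = re.compile(r"(ata\d+): FORCE: PHY spd limit set to ([\d.]+)Gbps")

# SPD field values; 0 means no speed negotiated (SStatus) or no limit (SControl).
SPD_GBPS = {1: 1.5, 2: 3.0, 3: 6.0}


def spd_field(reg_hex):
    """Extract bits 7..4 (SPD) from a register value printed in hex."""
    return (int(reg_hex, 16) >> 4) & 0xF


def link_speeds(log_text):
    """Collect the last seen forced limit and negotiated speed per ata port."""
    ports = {}
    for line in log_text.splitlines():
        m = FORCE_RE.search(line)
        if m:
            ports.setdefault(m.group(1), {})["forced_limit"] = float(m.group(2))
            continue
        m = LINK_UP_RE.search(line)
        if m:
            info = ports.setdefault(m.group(1), {})
            info["negotiated"] = float(m.group(2))
            info["sstatus_spd"] = SPD_GBPS.get(spd_field(m.group(3)))
            info["scontrol_limit"] = SPD_GBPS.get(spd_field(m.group(4)))
    return ports


if __name__ == "__main__":
    sample = (
        "[ 3.233030] ata1: FORCE: PHY spd limit set to 1.5Gbps\n"
        "[3173317.788161] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)\n")
    print(link_speeds(sample))
```

If a port shows a negotiated speed of 3.0 after a failure while the forced limit is 1.5, the limit did not stick; if both read 1.5, link speed can be ruled out with more confidence.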