SATA timeout problems in 2.6.19
Hi,
I'm having some problems with my hard disks, and I'd be grateful for any help. This has been going on for a while now. I was running 2.6.15.6 with three 250GB Maxtor SATA HDs on an Intel ICH6 controller (ata_piix), using software RAID. It used to be okay, but then my computer crashed a couple of times (this was a couple of months ago, so the details are hazy), and when it came back there was no /home or any other partition that lived on the RAID array (/ is on a separate SCSI disk). I eventually managed to reconstruct the array, but began getting kernel messages like this:

Code:
Dec 5 23:02:49 violator kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen

This repeats for all three SATA disks, with the 'configured for XXX' line changing through UDMA/133, UDMA/100, UDMA/66, UDMA/44, UDMA/33, UDMA/25, UDMA/16, PIO4, PIO3, PIO2, PIO1, PIO0. After PIO0 you get messages like this:

Code:
Dec 6 01:21:46 violator kernel: ata1.00: speed down requested but no transfer mode left

Code:
Dec 6 02:08:18 violator kernel: raid5: Disk failure on dm-2, disabling device. Operation continuing on 2 devices

Since then, I have replaced the SATA cables and the disks. I also tried replacing the controller with a Silicon Image 3124 PCI-X card (sata_sil24 driver), but this gave slightly different error messages (ata HSM violation). Although it seemed more stable (it didn't crash), it turned out that data written to it was corrupted. I know this by virtue of the fact that FLAC files I copied across no longer decompressed, and many files had different MD5 sums from their counterparts on the older array (I had both arrays in side by side, one on ICH6 and one on sata_sil24).

I tried upgrading to 2.6.19 (by way of various 2.6.17 and 2.6.18 kernels) and using ahci instead of ata_piix for the ICH6 controller, but I still get the error messages above. In fact, the messages above are from 2.6.19 with ahci, so they show my problem as it stands now, leaving the sata_sil24 controller aside.

Here's the final twist: the disks seem pretty stable under sysresccd, which is 2.6.16.10 (with ahci), and when booted into an older kernel (2.6.9, using ata_piix). Performance still suffers, as a lag develops when logging in, but I don't see those error messages in the kernel log (I think they were introduced with 2.6.18), and the machine more or less stays up.

I would really like to use 2.6.19, but this problem is really vexing me, especially as I don't know what else to try. I think I've ruled out any hardware problems, but basically I'm flummoxed. I've tried searching, and there is some material on LKML with similar error messages, but nothing quite like my problem. If anyone has any suggestions, I'd be most grateful.
|
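For anyone who wants to repeat the corruption check from the first post (comparing MD5 sums of the files on one array against their counterparts on the other), here is a minimal sketch in Python. It is not what the original poster ran; the function names are made up for illustration, and it assumes both arrays are mounted and hold the same directory layout.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(reference_root, copy_root):
    """Compare each file under reference_root with the same relative
    path under copy_root.

    Returns (mismatched, missing): two sorted lists of relative paths.
    """
    mismatched, missing = [], []
    for dirpath, _dirnames, filenames in os.walk(reference_root):
        for name in filenames:
            ref_path = os.path.join(dirpath, name)
            rel_path = os.path.relpath(ref_path, reference_root)
            copy_path = os.path.join(copy_root, rel_path)
            if not os.path.isfile(copy_path):
                # File never made it to the copy at all.
                missing.append(rel_path)
            elif md5_of_file(ref_path) != md5_of_file(copy_path):
                # File exists on both sides but the contents differ.
                mismatched.append(rel_path)
    return sorted(mismatched), sorted(missing)
```

Calling compare_trees("/mnt/old_array", "/mnt/new_array") (both mount points are hypothetical) returns the files whose checksums differ and the files that are absent from the copy. A non-empty first list on freshly copied data points at corruption somewhere in the write or read path, though it cannot say whether the controller, driver, cable, or disk is at fault.
|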
I second Ian's question....
I am experiencing a very similar problem; maybe it's arcmsr, maybe not.... Although, because your hw raid is Intel, it suggests that Areca may not be the cause. If anyone has any suggestions....
|
Same problem here. I have a Sony VGN-S580 laptop. I bought it new around 2 years ago, and since then I haven't been able to install ANY Linux distro on it. F.... SATA problem. I have no idea why, but the kernel developers and libata module developers just don't do anything about it. There seem to be quite a lot of people experiencing this with different SATA controllers, mostly on laptops. This thing has been driving me crazy. I tried all the solutions proposed on different forums (disabling ACPI, passing some other parameters to the kernel) and nothing worked for me. Somehow I managed to install Fedora Core 8 on it. The installation went smoothly, but now the installed system has the same problem. This is very surprising, since the kernel used during the installation is exactly the same as the one installed... I just ran updatedb on it and it didn't hang... However, it has already frozen a couple of times. I give up.
|