I have Promise and VIA SATA controllers on my motherboard with 1 and 2 drives connected, respectively, as well as an SI3114 RAID card with 4 eSATA connections, 2 of which are used by an external housing. The system is installed on a normal PATA drive. As of yesterday, all of these drives were working fine.
I had been having a problem with the system crashing when I copied files from one drive to another while watching a video file on the destination drive (I never tried the two separately, since I didn't have much time to test them independently). The log showed a series of these before crashing:
Code:
Jul 22 02:02:01 kaoru ainit: Memory: Failed to release semaphore
Jul 22 02:02:01 kaoru ainit: Error: No such file or directory
Jul 22 02:02:01 kaoru ainit: Memory: Failed to release SHM segment
I was told to run memtest, so I did, and it found 68 errors in just over 2 hours. Today, however, I realized that I had bumped the base clock speed up from 200 MHz to 205 MHz, probably a few weeks ago, for no reason at all (with the 11x multiplier, that is a 55 MHz gain). I set it back to auto, and memtest showed no errors after 3 hours (I'm going to let it run overnight once I finish this post). When I exited memtest and restarted the system, it had problems booting normally. It took 3-4 times as long to start services and get to the login screen. Fluxbox started fine, but Gnome was also much slower to start (it never finished because it took too long for me to wait). I couldn't see any errors, but the slow boot persisted for several reboots. I went through serviceconf (this is FC4) and disabled the services that weren't being used. I'm not sure if this is what fixed it, but it eventually started booting normally again. However, another problem came up.
It starts when it begins to unload the kernel. There is something about Red Hat Nash 4.something, and then it says ata9 disabled. Once the system boots, the drive I had been copying to and watching video on during the crashes won't load (I had even suspected earlier that this drive might be going bad. I'd run fsck on it and had some errors come up, so I backed it up, reformatted, and reloaded the data). All the other drives load, including another one on the same PCI RAID controller. Here is what dmesg|grep ata brings up:
Code:
libata version 1.20 loaded.
sata_via 0000:00:0f.0: version 1.1
sata_via 0000:00:0f.0: routed to hard irq line 10
ata1: SATA max UDMA/133 cmd 0xC000 ctl 0xB802 bmdma 0xA800 irq 16
ata2: SATA max UDMA/133 cmd 0xB400 ctl 0xB002 bmdma 0xA808 irq 16
ata1: SATA link up 1.5 Gbps (SStatus 113)
ata1: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4023 85:3469 86:3c01 87:4023 88:407f
ata1: dev 0 ATA-7, max UDMA/133, 781422768 sectors: LBA48
ata1: dev 0 configured for UDMA/133
scsi0 : sata_via
ata2: SATA link up 1.5 Gbps (SStatus 113)
ata2: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4023 85:3469 86:3c01 87:4023 88:407f
ata2: dev 0 ATA-7, max UDMA/133, 781422768 sectors: LBA48
ata2: dev 0 configured for UDMA/133
scsi1 : sata_via
sata_sil 0000:00:0d.0: version 0.9
ata3: SATA max UDMA/100 cmd 0xF882A080 ctl 0xF882A08A bmdma 0xF882A000 irq 17
ata4: SATA max UDMA/100 cmd 0xF882A0C0 ctl 0xF882A0CA bmdma 0xF882A008 irq 17
ata5: SATA max UDMA/100 cmd 0xF882A280 ctl 0xF882A28A bmdma 0xF882A200 irq 17
ata6: SATA max UDMA/100 cmd 0xF882A2C0 ctl 0xF882A2CA bmdma 0xF882A208 irq 17
ata3: SATA link down (SStatus 0)
scsi2 : sata_sil
ata4: SATA link up 1.5 Gbps (SStatus 113)
ata4: dev 0 cfg 49:2f00 82:746b 83:7f01 84:4023 85:7469 86:3c01 87:4023 88:207f
ata4: dev 0 ATA-7, max UDMA/133, 781422768 sectors: LBA48
ata4: dev 0 configured for UDMA/100
scsi3 : sata_sil
ata5: SATA link down (SStatus 0)
scsi4 : sata_sil
ata6: SATA link up 1.5 Gbps (SStatus 113)
ata6: dev 0 cfg 49:2f00 82:746b 83:7f61 84:4023 85:7469 86:3c41 87:4023 88:207f
ata6: dev 0 ATA-7, max UDMA/133, 781422768 sectors: LBA48
ata6: dev 0 configured for UDMA/100
scsi5 : sata_sil
sata_promise 0000:00:08.0: version 1.03
sata_promise PATA port found
ata7: SATA max UDMA/133 cmd 0xF883E200 ctl 0xF883E238 bmdma 0x0 irq 17
ata8: SATA max UDMA/133 cmd 0xF883E280 ctl 0xF883E2B8 bmdma 0x0 irq 17
ata9: PATA max UDMA/133 cmd 0xF883E300 ctl 0xF883E338 bmdma 0x0 irq 17
ata7: SATA link down (SStatus 0)
scsi6 : sata_promise
ata8: SATA link up 1.5 Gbps (SStatus 113)
ata8: PIO error
scsi7 : sata_promise
ata9: disabling port
scsi8 : sata_promise
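To make a listing like the one above easier to scan, here is a rough Python sketch that groups the messages by port and keeps only the state I care about (link up/down, configured, error, disabled). The function name summarize, the flag names, and the substring matching are my own assumptions based on the wording of the lines in this particular output, not anything defined by libata, so other kernel versions may need different patterns.

```python
import re

# A few lines copied from the dmesg output above, used as sample input.
SAMPLE = """\
ata4: SATA link up 1.5 Gbps (SStatus 113)
ata4: dev 0 configured for UDMA/100
ata7: SATA link down (SStatus 0)
ata8: SATA link up 1.5 Gbps (SStatus 113)
ata8: PIO error
ata9: disabling port
"""

# Matches lines of the form "ataN: message".
PORT_LINE = re.compile(r"^(ata\d+): (.*)$")


def summarize(dmesg_text):
    """Return {port: set of flags} for every ataN line in dmesg_text.

    The flags are hypothetical labels chosen for this sketch.
    """
    ports = {}
    for line in dmesg_text.splitlines():
        m = PORT_LINE.match(line.strip())
        if not m:
            continue
        port, msg = m.groups()
        flags = ports.setdefault(port, set())
        if "link up" in msg:
            flags.add("link up")
        elif "link down" in msg:
            flags.add("link down")
        elif "configured for" in msg:
            flags.add("configured")
        elif "error" in msg.lower():
            flags.add("error")
        elif "disabling port" in msg:
            flags.add("disabled")
    return ports


if __name__ == "__main__":
    summary = summarize(SAMPLE)
    # Sort numerically so ata10 would come after ata9.
    for port in sorted(summary, key=lambda p: int(p[3:])):
        print(port, ", ".join(sorted(summary[port])))
```

Feeding it the full dmesg|grep ata text instead of SAMPLE should give one line per port, which makes it quicker to compare runs after moving the drive between controllers.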
Plugging the drive into a different controller makes no difference to the dmesg output. If the SI controller is removed, it starts off with ata5 disabled instead of ata9, but the card was still plugged in when I tried the drive on a different controller, so I don't know whether dmesg would show ata5 or ata9 disabled in that case. It won't really bother me if the drive has died and I have to get it RMA'd, but I'd like a chance to clear it off first (non-destructively) and, if possible, recover a (relatively) small number of files that I hadn't burned a backup of yet. Any ideas?