I've done a lot of googling and can't find anything more than "this should do that..." hypotheticals. Here's what's going on:
2x SATA drives connected via a Sil3114 controller, running software RAID-1 (md) on kernel 2.6.9-11, CentOS 4.1 (roughly equivalent to Red Hat Enterprise Linux 4, Update 1).
One of the drives has visible bad sectors. That is, the drive's internal spare-sector reallocation pool is full, and it now returns "sector unreadable/unwritable" to the driver for certain parts of the disk. In practice, when the kernel is about 20% of the way through rebuilding the mirror, I start getting "sense key read error" and "unrecovered read error" from the failing disk. (The drive appears to the system as a SCSI device.)
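For what it's worth, smartctl (from smartmontools) can confirm that the reallocation pool is exhausted. The device name below is an example, and on a 2.6.9 libata setup like this one, older smartctl versions need `-d ata` to reach the ATA drive behind the SCSI emulation:

```shell
# Example device name -- under libata the disk shows up as /dev/sdX.
DISK=/dev/sda

# Dump the SMART attribute table; '-d ata' bypasses the SCSI emulation.
# (echo makes this a dry run; drop it to actually execute)
echo smartctl -d ata -A "$DISK"

# In the output: attribute 5 (Reallocated_Sector_Ct) at its ceiling plus a
# nonzero attribute 197 (Current_Pending_Sector) means the spare pool is
# full and the drive can no longer remap bad sectors on its own.
```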
Now, in theory, according to threads on the subject in linux.kernel on Usenet, the RAID driver should fail the drive when it has trouble with the disk -- though exactly what kind of trouble, and how much of it, I haven't been able to pin down. If I pull the drive while the system is running, the driver fails the disk after hanging for a few seconds, which is acceptable; I assume it detects an unrecoverable error and fails the drive appropriately.
The problem is that with these read errors it isn't failing the drive. My message log just fills with "Cannot read sector 200001... cannot read sector 200002" for hours, and presumably will keep doing so until it reaches the last bad sector. Meanwhile the RAID array is essentially unusable by anything else, so presumably the system will also stop serving web pages, etc., while in this state.
So I could just pull the drive completely; MD would fail it, and the system would go on about its business.
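For the record, the same failover can be triggered from userspace with mdadm instead of physically pulling the drive. The device names below are placeholders for the array and the bad member:

```shell
# Placeholders -- substitute your actual array and failing member:
MD=/dev/md0
BAD=/dev/sdb1

# Mark the member faulty, then remove it from the array.
# (echo makes this a dry run; drop it to actually execute)
echo mdadm "$MD" --fail "$BAD"
echo mdadm "$MD" --remove "$BAD"

# Afterwards, /proc/mdstat should show the mirror degraded but running.
```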
The problem is that this server will soon go into production at a co-location facility, where a response time of under an hour to swap a hard drive gets pretty expensive. Ideally I'm looking for a way to tell the RAID driver "fail the drive after X consecutive soft errors."
The consensus seems to be that the MD driver does not handle visible bad sectors well.
Alternatively, a knob in the sata_sil driver or the SCSI subsystem that tells it to disconnect the drive after N consecutive read/write errors would work just as well.
In theory it would be difficult to have a cron job automatically check the logs for these messages and manually fail the drive, since the array seems to lock up completely when this happens. Or if I had the cron job run every 5 seconds, maybe it would stay in memory and get a chance to act in time?
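A rough sketch of what that watchdog could look like, assuming the log path, error strings, and device names shown here (the grep pattern is based on the "unrecovered read error" lines quoted above -- match it against what actually appears in /var/log/messages):

```shell
# count_errors LOGFILE DISK -- count unrecovered-read-error lines for DISK.
count_errors() {
    grep -c "$2.*[Uu]nrecovered read error" "$1"
}

# Demo against a canned log excerpt (real use: point it at /var/log/messages):
cat > /tmp/sample.log <<'EOF'
kernel: sdb: Current: sense key: Medium Error
kernel: sdb: Add. Sense: Unrecovered read error
kernel: sdb: Add. Sense: Unrecovered read error
kernel: sda: ready
EOF

n=$(count_errors /tmp/sample.log sdb)
echo "errors=$n"

if [ "$n" -ge 2 ]; then
    # Threshold reached: fail the member out of the mirror.
    # (dry run -- drop the echo to actually execute)
    echo mdadm /dev/md0 --fail /dev/sdb1
fi
```

Whether a script like this would even get to run while the array is wedged is exactly the open question, of course.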
Any thoughts or suggestions on this would be greatly appreciated.
Thx.