SATA RAID disk failure detection
I have been running a few RAID 1 SATA-based servers using kernel 2.6.x, Linux kernel software RAID and a SATA controller on an ASUS motherboard.
On several occasions I have had problems with a cable (and once a single disk crash) that caused the box to become unresponsive, as it used 99.999% CPU failing to access the disk controller (the error text was, unfortunately, of course not in the log). I had no unusual problems replacing the disks and cables and recovering the array.
But I had hoped that the configuration would fail the bad disk and continue to run on the good one. This did not happen, as the bad cable apparently took down the controller, thereby losing the other disk as well.
Is there a way to avoid this? Would it, for example, be solved by buying another SATA controller and placing one of the two disks (or both) on it? Are some controllers more recommendable than others, and will I still be able to boot from both disks?
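To make concrete what "failing the bad disk" would look like from userspace, here is a minimal Python sketch that reads text in the usual /proc/mdstat layout and reports arrays running with fewer members than configured. The sample text, the device names and the function name degraded_arrays are made up for illustration and are not taken from my logs. It obviously only helps if the kernel actually marks the disk as failed instead of hanging on the controller.

```python
import re

# Illustrative sample in the usual /proc/mdstat layout (hypothetical devices).
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0](F)
      104320 blocks [2/1] [_U]

md1 : active raid1 sdb2[1] sda2[0]
      78043648 blocks [2/2] [UU]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return {array: [members flagged (F)]} for arrays missing a member."""
    result = {}
    current = None
    failed = []
    for line in mdstat_text.splitlines():
        # Array header line, e.g. "md0 : active raid1 sdb1[1] sda1[0](F)"
        header = re.match(r"^(md\d+)\s*:\s*(.*)$", line)
        if header:
            current = header.group(1)
            failed = re.findall(r"(\w+)\[\d+\]\(F\)", header.group(2))
            continue
        # Status line, e.g. "104320 blocks [2/1] [_U]": [configured/active]
        status = re.search(r"\[(\d+)/(\d+)\]", line)
        if current and status:
            configured, active = int(status.group(1)), int(status.group(2))
            if active < configured:
                result[current] = failed
            current = None
    return result


if __name__ == "__main__":
    # On a real box one would read open("/proc/mdstat").read() instead.
    print(degraded_arrays(SAMPLE_MDSTAT))
```

With the sample above, md0 would be reported as degraded with sda1 as the failed member, while md1 is left out because both of its members are active.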
Regards
Henrik