mdadm: how to avoid complete rebuild of RAID 6 array (6/8 active devices)
Hi Everyone,
First off, a little background on my setup.

OS: Ubuntu 7.10 i386 Server (2.6.22-14-server), upgraded to Ubuntu 8.04 i386 Server (2.6.24-19-server).

I have 8 SATA drives, organized into three md RAID arrays as follows:

/dev/md1: ext3 partition mounted as /boot, RAID 1 with 8 members (sda1/b1/c1/d1/e1/f1/g1/h1)
/dev/md2: ext3 partition mounted as /root, RAID 1 with 8 members (sda2/b2/c2/d2/e2/f2/g2/h2)
/dev/md3: ext3 partition mounted as /mnt/raid-md3, RAID 6 with 8 members (sda3/b3/c3/d3/e3/f3/g3/h3); this is the main data partition, holding 2.7 TiB worth of data

All the RAID member partitions are set to type "fd" (Linux RAID Autodetect).

Important note: 6 of the drives are connected to two Sil3114 SATA controller cards, while 2 of the drives are connected to the on-board SATA controller (I don't know which model it is).

After upgrading my Ubuntu installation to 8.04, an error message appeared on restart saying that my RAID arrays were degraded and the system was therefore unable to boot from them. At the time, not knowing the cause of the sudden RAID failure, I forced mdadm to start the arrays anyway (the 8-member RAID 1 arrays were no cause for concern, of course, but I wanted to back up the data on the degraded md3 array as soon as possible). Then it hit me: why would it recognize only 6 drives? Apparently the kernel has compatibility problems with certain SATA controllers, and my on-board controller chip was one of them. Sure enough, after moving all 8 drives to the Silicon Image controllers, every drive was recognized without any problems.

If the missing drives had been recognized before the array was ever brought up again, everything would have been fine. But unfortunately I forced mdadm (the --run switch) to bring it online with 2 missing members. This is when the problem began.
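Before touching the array again, it is worth comparing the md superblocks on all eight members. A minimal sketch (assuming the partition layout above; `--examine` only reads, but mdadm needs root):

```shell
# Forcing the degraded array online advanced the Events counter on the
# six active members; the two drives that were absent still carry the
# older count. --examine shows this per member without writing anything.
for dev in /dev/sd[a-h]3; do
    echo "== $dev =="
    mdadm --examine "$dev" | grep -E 'Events|Update Time|State'
done
```

Members whose Events counts agree are in sync with each other; a member whose count lags behind is the one mdadm will insist on rebuilding.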
I know that as soon as I re-add the two missing drives to the md3 (RAID 6) array, the system will attempt to rebuild the array using the data from the 6 remaining drives. Given the size of the array and the type of disk drives being used (off-the-shelf SATA drives with a bit error rate of 1 in 10^14 bits), I think it is highly likely that the system will encounter one or more bit errors during the rebuild. Anyway, I panicked and brought the md3 array down first to prevent possible further damage. So, at this stage, what I'm wondering is:

1. If mdadm encounters a bit error during a RAID 6 rebuild, will it just give up on that particular file and move on to recover the rest of the data on the array? Or will it trash the entire array?

2. Is it possible to cheat mdadm by somehow replacing the new RAID metadata on the 6 drives with the old metadata from the 2 drives? Will that make mdadm think the array is clean and consistent and that nothing ever happened?

Please do note that I did not write ANY new data onto the RAID 6 array from the time it was degraded until the time I brought it down (--stop).

Sorry for the long post, and thank you in advance for your time. I really hope to get this RAID array back up without data corruption, because I don't have a working backup of the array (I know, very stupid of me).

Dave
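The bit-error worry can be put on a rough numerical footing. A back-of-the-envelope sketch, assuming the rebuild re-reads roughly the full 2.7 TiB from the six surviving members and taking the vendor's 1-in-10^14 figure at face value:

```shell
#!/bin/sh
# Rough estimate of hitting at least one unrecoverable read error (URE)
# while re-reading ~2.7 TiB during the rebuild.
# Assumptions: 2.7 TiB total read, URE rate of 1e-14 per bit (vendor spec).
BITS=$(awk 'BEGIN { printf "%.0f", 2.7 * 2^40 * 8 }')
awk -v bits="$BITS" 'BEGIN {
    rate = 1e-14                  # URE probability per bit read
    expected = bits * rate        # expected number of UREs over the rebuild
    p = 1 - exp(-expected)        # Poisson approximation of P(>=1 URE)
    printf "expected UREs: %.3f\n", expected
    printf "P(at least one URE): %.1f%%\n", p * 100
}'
```

This works out to roughly a 1-in-5 chance of at least one URE over the whole rebuild, which is why the poster's concern is well founded rather than paranoia.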
Fortunately, it was your on-board controller that had the problem, though, and that only had two devices on it to fail.
The first thing anyone should do if they suspect data is at risk is STOP, THINK and possibly turn stuff off until they've read up on everything they need to know to attempt recovery.
Your only alternative now is to image all the RAID6 data to another set of drives/image files as a backup and then perform exactly what you intend to do now - an array rebuild. I would HIGHLY suggest that you do this. You can even make the RAID rebuild from a file image of a drive partition if necessary (the beauty of the "everything is a file" idea in Unix). This way, you can store an image of those RAID6 partitions on a computer somewhere and see WHAT WOULD HAPPEN if you were to rebuild the RAID with that data/parity before you actually mess about putting those disks/controllers into a machine.
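The "rebuild from file images" route can be sketched concretely with loop devices. Everything below is hypothetical: the image paths, loop numbers, and member list are placeholders to adapt, and all of it needs root:

```shell
# Attach each partition image (made with dd) to a loop device, then ask
# mdadm to assemble the array from the copies rather than the real disks.
losetup /dev/loop0 /home/user/data/sda3-image
losetup /dev/loop1 /home/user/data/sdb3-image
#   ...repeat for every imaged member...

# Assemble from the loop devices; the original drives stay untouched.
mdadm --assemble /dev/md3 /dev/loop0 /dev/loop1   # plus the remaining loops

mdadm --detail /dev/md3            # inspect array state before going further
mount -o ro /dev/md3 /mnt/test     # mount read-only to check the data
```

If the assembly of the copies goes wrong, nothing is lost: delete the damaged images, re-copy them from the untouched set, and try a different approach.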
It's quite likely that it will abandon an array rebuild as soon as it encounters a problem. It's also quite likely that the force option (which you are only SUPPOSED to use in circumstances where you have no other way to get working data back) will let you ignore those errors and continue the rebuild, which could potentially leave you with either a corrupt RAID (if the errors wipe out the locations holding the parity data, etc.), a corrupt filesystem (if the error hits the filesystem structures themselves), or a corrupt file or two (if the error hits inside a file's data).
Read (DO NOT MOUNT WITH THE WRITE OPTION) the data off those RAID6 partitions onto a large hard drive or existing filesystem (e.g. dd if=/dev/sda3 of=/home/user/data/sda3-image), power down all those drives, and see what happens when you try to rebuild the RAID from those file images (or make sure the file images are 100% safe and then try to rebuild the RAID from the drives themselves).

And in future... BACKUP. To a non-disk medium. RAID is USELESS against file/disk/controller corruption. It is USELESS against unreliable hardware. RAID *cannot* compensate for deliberate messing with its metadata. It is USELESS against hardware which degrades past its stated tolerances (e.g. three drives failing in a RAID6, etc.). RAID6 is USELESS against failures of more than a single disk while it's rebuilding.

Personally, I'd go out, buy a couple of the largest hard drives I could find and put images of the RAID6 partitions on BOTH of them. Then I'd stick one drive back in its box and put it somewhere safe, and make the other drive the ONLY drive in a machine. Then I'd attempt a RAID6 recovery on those image files and see what happens. If it all went well, I'd power on the original machine, wipe out its RAID6 and build a new empty one, then copy the data (NOT THE REPAIRED FILE IMAGES) from the recovered array (making sure that the other surviving copy of the original disks, the drive I put back in its box, was kept very safe).
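One caveat on the dd command quoted here: plain dd stops at the first unreadable sector, which is exactly what a marginal drive may produce. A hedged variant (the conv options are standard GNU dd; the paths are just examples):

```shell
# conv=noerror keeps dd going past read errors; sync pads each failed
# block with zeros so the rest of the image stays at the correct offset.
dd if=/dev/sda3 of=/home/user/data/sda3-image bs=64k conv=noerror,sync

# GNU ddrescue (a separate package) does the same job but logs the bad
# regions and retries them, which is usually the better tool here:
# ddrescue /dev/sda3 /home/user/data/sda3-image /home/user/data/sda3.map
```

Either way, image the healthy-looking drives too: for this recovery every member matters, not just the two that went missing.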
Ridiculous! -art
Sorry to bring up an old thread. I am also planning to build a software RAID 6 system with mdadm, starting with 8 drives and with the idea of expanding further down the line to 11. I'm planning to use two PCI-E controllers with 4 SATA ports each (http://www.newegg.com/Product/Produc...82E16816103058) and put the rest of the drives on the on-board controller. If a controller decides to kick the bucket, will I be able to just toss a new one in and resume normal operation? If not, what is the solution to get around this problem? Thanks!
The RAID driver assembles an array based on the UUIDs stored in the member partitions' superblocks. Therefore it doesn't matter if you replace the controllers.
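That UUID can be read off any member and pinned in the config file, which makes the controller-swap point easy to verify. A small sketch (run as root; the ARRAY line at the end is illustrative, not your actual UUID):

```shell
# The array UUID lives in each member's md superblock, not on the
# controller, so a controller swap changes nothing mdadm cares about.
mdadm --examine /dev/sda3 | grep -i uuid

# Emit ARRAY lines for all running arrays, keyed by UUID:
mdadm --detail --scan
# e.g. append the relevant line to /etc/mdadm/mdadm.conf so the array
# is always assembled by UUID regardless of device names or ports:
# ARRAY /dev/md3 level=raid6 num-devices=8 UUID=<uuid printed above>
```

With the UUID recorded in mdadm.conf, the members can move to any port on any controller and still assemble into the same array.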
My favorite solution is to buy all the spare parts right now, test them before you take the machine into production, and then store them in a safe place. That way you know your hardware replacement plan works, and the cost of those controllers is negligible compared to the data.

jlinkels