LQ Newbie
Registered: Dec 2002
Location: Edmonton, AB, Canada
Distribution: Debian, etc.
Posts: 9
Linux MD RAID woes
This problem may be due to some specific detail of my config, or essentially due to my foolish optimism over the state of Linux software 'Multiple Device' (MD) RAID support. I'm hoping someone will recognize enough of it to at least help me avoid breaking things in the same way again!
I've taken my old Asus A7V board (with a 1 GHz Duron CPU and 768 MB SDRAM) and set up what I intended to be a nice, decently performing, reliable, cheap file-and-compute server. I took a couple of 30 GB 7200 RPM ATA-100 Maxtor drives out of my other systems, bought a couple of newer 60 GB 7200 RPM ATA-133 Maxtor drives, and put 'em all on the ATA-100 bus built into the A7V's motherboard.
I booted a Knoppix CD, which let me bring up the pair of 60 GB drives as 2 of the 3 components of a raid-5 set, running in degraded mode. That way, I could copy the data over from the old systems' 30 GB drives before moving those drives into the A7V's box. Then I striped (MD raid-0) the two 30 GB drives together, hot-added the result into the raid-5 set, and let it 'reconstruct' its third component onto the striped pair. /proc/mdstat typically reported about 2000 KB/s of throughput for the reconstruction. I did all the RAID configuration with the mdadm utility, by the way.
The full configuration was something like this:
hda == 60 GB drive; hda1 == about 64 MB for '/boot' volume (see below)
hdb == 30 GB drive
hdc == 60 GB drive; hdc1 == about 64 MB for '/boot' volume (see below)
hdd == 30 GB drive
md0 == hda1 + hdc1 as raid-1 (mirror) '/boot' volume
md1 == hdb5 + hdd5 as raid-0 (stripe) of about 58 GB for '/' volume...
md2 == hdb6 + hdd6 as raid-0 (stripe) of about 200 MB for swap volume...
md5 == hda5 + hdc5 + md1 as raid-5 '/' volume
md6 == hda6 + hdc6 + md2 as raid-5 swap volume
This all seemed to work okay! I was a little surprised, but mostly relieved. All the partitions could be type FD (RAID Autodetect), and /etc/fstab could contain just MD device names, so even if I were to turn on the motherboard's slower ATA bus (which would become hda to hdd), the kernel should still be able to find all its filesystems, despite their suddenly being based on drives named hde to hdh. I worried a little that md1 and md2 might not get auto-assembled before the kernel went looking for md5 and md6, but I don't recall whether I tested for that. It seems like I must have, before re-running lilo, but I'm not sure I did.
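The assembly-order worry above can be made concrete. The following is only a minimal Python sketch, not anything the kernel or mdadm actually runs: it takes the layout listed earlier, treats each nested array as a dependency of its parent, and prints plain `mdadm --assemble` command lines in an order where md1 and md2 always come before md5 and md6. The names `ARRAYS`, `assembly_order`, and `assemble_commands` are made up for illustration, and the /dev/hdX names assume the drives are still on the bus where they were first configured.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Array layout copied from the configuration list: array -> member devices.
ARRAYS = {
    "md0": ["hda1", "hdc1"],
    "md1": ["hdb5", "hdd5"],
    "md2": ["hdb6", "hdd6"],
    "md5": ["hda5", "hdc5", "md1"],
    "md6": ["hda6", "hdc6", "md2"],
}


def assembly_order(arrays):
    """Return array names so that every nested array precedes its parent."""
    # Only members that are themselves arrays impose an ordering constraint.
    deps = {
        name: {member for member in members if member in arrays}
        for name, members in arrays.items()
    }
    return list(TopologicalSorter(deps).static_order())


def assemble_commands(arrays):
    """Build one 'mdadm --assemble' command line per array, in a safe order."""
    commands = []
    for name in assembly_order(arrays):
        members = " ".join("/dev/" + member for member in arrays[name])
        commands.append("mdadm --assemble /dev/{} {}".format(name, members))
    return commands


if __name__ == "__main__":
    for command in assemble_commands(ARRAYS):
        print(command)
```

The point of the sketch is only that md5 cannot be assembled from three components until md1 exists, so anything that brings arrays up in plain numeric or discovery order has to get lucky or be told about the nesting.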
After having this configuration up for a couple of days, I noticed (got notified by mdadm, actually; a nice feature) that hdc5 had been dropped from the md5 set, so md5 was running in degraded mode again. Yikes! Until I had time to investigate further, I wanted to see if this was a fluke, so I removed hdc5 from the set and hot-added it back in to let reconstruction begin. It started 'rebuilding', but this time at only about 450 KB/s! That doesn't seem to make sense. No way could the stripe of two (slightly older) 30 GB drives have more than four times the throughput of a 60 GB drive on the same bus! Could it?! This is on a Reiser filesystem, although that seems like it should be irrelevant.
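To put the two reported rebuild rates side by side, here is a back-of-envelope Python sketch. It assumes each raid-5 component is about 58 GB (the size quoted for the md1 stripe; the size of hdc5 is not stated and is assumed to match) and that the /proc/mdstat rate stays constant for the whole rebuild, which real rebuilds rarely do. The function name `rebuild_hours` is made up for illustration.

```python
def rebuild_hours(component_gb, rate_kb_per_s):
    """Hours needed to rewrite one component at a steady rebuild rate."""
    component_kb = component_gb * 1024 * 1024  # treating 1 GB as 2**20 KB
    return component_kb / rate_kb_per_s / 3600


if __name__ == "__main__":
    size_gb = 58  # assumed component size, taken from the ~58 GB md1 stripe
    rates = (
        ("first rebuild, onto the md1 stripe", 2000),
        ("second rebuild, onto hdc5", 450),
    )
    for label, rate in rates:
        hours = rebuild_hours(size_gb, rate)
        print("{}: {:.1f} h at {} KB/s".format(label, hours, rate))
```

Under those assumptions the first rebuild needs roughly 8.4 hours and the second roughly 37.5 hours, so a rebuild at 450 KB/s could easily still have been in progress when the power failed the next day.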
Well, I wish all I had here was a bit of curiosity about that throughput. But no, we had a series of power failures the next day, and at least one must have outlasted the UPS. I got home to find the machine at a kernel panic over not finding the root filesystem. I've spent a couple of days trying to coax it into assembling md5 again. No LILO options I could think of would even get me to single-user mode. So I attached a CD-ROM drive, turned the slower ATA buses back on, and booted Knoppix again. I'm able to assemble every array except md1 and md5. I expect hdc5 didn't get rebuilt (or the drive actually has problems), so unless I can get md1 to assemble, md5 is toast. I've discovered that the partitions on hdb and hdd (i.e., now hdf and hdh) were actually of type 83 (Linux), not FD (RAID Autodetect), but changing that seems to have made no difference (nor should it, since at this stage I'm trying to assemble the volumes by hand). Restoring from my only tape backup would really suck, if it's even complete and good (and I have my doubts; which is to say I have some lurking terror).
Was it utterly stupid to try building a raid-5 set with a stripe as one of its components?
Are there tools which might help me get md1 and (thus) md5 going?
Am I totally screwed, here? :-(