I've been struggling to properly configure a multilevel RAID for some time now. Having spent my afternoon smashing my head against the wall, I've decided to post here. Here's the story:
I have five hard drives: three 200GB drives, a 100GB drive, and a 120GB drive.
I created two partitions on the 120GB drive: a 20GB OS partition (not part of any array) and a 100GB partition (to be RAID0'd with the 100GB drive).
I created a RAID5 from the three 200GB drives and the 200GB RAID0. As the RAID0 is a block device, this should work just fine (or so I've been told).
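For anyone following along, the layout described above can be sketched roughly as below. This is only a dry run that prints the commands. The device names are the ones that show up in the syslog further down, but the exact creation options used originally are not recorded here, so treat the flags (and the helper names run and print_plan) as assumptions for illustration.

```shell
#!/bin/sh
# Dry-run sketch of the nested layout: nothing is executed, the mdadm
# commands are only printed. Device names come from the syslog in this
# post; the creation options are an assumption, not the original commands.

run() {
    # Print the command instead of running it.
    printf '+ %s\n' "$*"
}

print_plan() {
    # Inner array: stripe the two ~100GB partitions into /dev/md0.
    run mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/hda1 /dev/hdb3

    # Outer array: RAID5 over the three 200GB partitions plus /dev/md0.
    run mdadm --create /dev/md1 --level=5 --raid-devices=4 \
        /dev/sda1 /dev/sdb1 /dev/hdc1 /dev/md0
}

print_plan
```

The point of the sketch is the ordering: /dev/md0 has to exist before /dev/md1 can use it as a member.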
I can start the RAID5 just fine. However, upon reboot, the RAID5 refuses to recognize the RAID0 (/dev/md0) and runs in degraded mode.
I can rebuild the array using the following commands:
Code:
sudo mdadm --assemble --force /dev/md1 /dev/sda1 /dev/sdb1 /dev/hdc1
sudo mdadm --add /dev/md1 /dev/md0
sudo mdadm --detail --scan > /etc/mdadm/mdadm.conf
where /dev/md1 is the RAID5 and /dev/md0 is the RAID0 (which is working perfectly according to /proc/mdstat). Once the rebuild finishes, /proc/mdstat shows the RAID5 working perfectly as well, with all four members up. However, after the next reboot, it goes back into degraded mode.
Here's the relevant part of my syslog file:
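To make the "working perfectly according to /proc/mdstat" check repeatable after each reboot, something like the following could be used. It is a minimal sketch that assumes the usual /proc/mdstat layout, where a redundant array's status line carries a counter such as [4/3] (configured members / active members). The function name md_degraded is made up for this example.

```shell
#!/bin/sh
# Sketch: report whether an md array is degraded by reading the
# [configured/active] counter from an mdstat-style file.
# Usage: md_degraded md1            (reads /proc/mdstat)
#        md_degraded md1 some_file  (reads a saved copy)
# Prints "degraded" or "ok". Prints nothing if the array is not listed
# or has no such counter (RAID0 arrays do not show one).

md_degraded() {
    array="$1"
    mdstat="${2:-/proc/mdstat}"
    awk -v name="$array" '
        # Header line of the array we are interested in, e.g. "md1 : active ..."
        $1 == name && $2 == ":" { found = 1; next }
        # Indented status line under that header carrying the [n/m] counter.
        found && match($0, /\[[0-9]+\/[0-9]+\]/) {
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            if (n[1] + 0 > n[2] + 0) print "degraded"; else print "ok"
            exit
        }
        # Reached the next unindented entry without finding a counter.
        found && /^[^ \t]/ { exit }
    ' "$mdstat"
}
```

Running md_degraded md1 right after boot and again after the rebuild gives a quick before/after comparison without reading the whole file by eye.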
Code:
Nov 18 17:15:44 localhost kernel: md: md0 stopped.
Nov 18 17:15:44 localhost kernel: md: bind<hdb3>
Nov 18 17:15:44 localhost kernel: md: bind<hda1>
Nov 18 17:15:44 localhost kernel: md0: setting max_sectors to 128, segment boundary to 32767
Nov 18 17:15:44 localhost kernel: raid0: looking at hda1
Nov 18 17:15:44 localhost kernel: raid0: comparing hda1(97685632) with hda1(97685632)
Nov 18 17:15:44 localhost kernel: raid0: END
Nov 18 17:15:44 localhost kernel: raid0: ==> UNIQUE
Nov 18 17:15:44 localhost kernel: raid0: 1 zones
Nov 18 17:15:44 localhost kernel: raid0: looking at hdb3
Nov 18 17:15:44 localhost kernel: raid0: comparing hdb3(97683136) with hda1(97685632)
Nov 18 17:15:44 localhost kernel: raid0: NOT EQUAL
Nov 18 17:15:44 localhost kernel: raid0: comparing hdb3(97683136) with hdb3(97683136)
Nov 18 17:15:44 localhost kernel: raid0: END
Nov 18 17:15:44 localhost kernel: raid0: ==> UNIQUE
Nov 18 17:15:44 localhost kernel: raid0: 2 zones
Nov 18 17:15:44 localhost kernel: raid0: FINAL 2 zones
Nov 18 17:15:44 localhost kernel: raid0: zone 1
Nov 18 17:15:44 localhost kernel: raid0: checking hda1 ... contained as device 0
Nov 18 17:15:44 localhost kernel: (97685632) is smallest!.
Nov 18 17:15:44 localhost kernel: raid0: checking hdb3 ... nope.
Nov 18 17:15:44 localhost kernel: raid0: zone->nb_dev: 1, size: 2496
Nov 18 17:15:44 localhost kernel: raid0: current zone offset: 97685632
Nov 18 17:15:44 localhost kernel: raid0: done.
Nov 18 17:15:44 localhost kernel: raid0 : md_size is 195368768 blocks.
Nov 18 17:15:44 localhost kernel: raid0 : conf->hash_spacing is 195366272 blocks.
Nov 18 17:15:44 localhost kernel: raid0 : nb_zone is 2.
Nov 18 17:15:44 localhost kernel: raid0 : Allocating 8 bytes for hash.
Nov 18 17:15:44 localhost kernel: md: md1 stopped.
Nov 18 17:15:44 localhost kernel: md: bind<sdb1>
Nov 18 17:15:44 localhost kernel: md: bind<hdc1>
Nov 18 17:15:44 localhost kernel: md: bind<sda1>
Nov 18 17:15:44 localhost kernel: raid5: device sda1 operational as raid disk 0
Nov 18 17:15:44 localhost kernel: raid5: device hdc1 operational as raid disk 2
Nov 18 17:15:44 localhost kernel: raid5: device sdb1 operational as raid disk 1
Nov 18 17:15:44 localhost kernel: raid5: allocated 4203kB for md1
Nov 18 17:15:44 localhost kernel: raid5: raid level 5 set md1 active with 3 out of 4 devices, algorithm 2
Nov 18 17:15:44 localhost kernel: RAID5 conf printout:
Nov 18 17:15:44 localhost kernel: --- rd:4 wd:3 fd:1
Nov 18 17:15:44 localhost kernel: disk 0, o:1, dev:sda1
Nov 18 17:15:44 localhost kernel: disk 1, o:1, dev:sdb1
Nov 18 17:15:44 localhost kernel: disk 2, o:1, dev:hdc1
I'm running Debian (etch) with a custom-compiled 2.6.18 kernel, so RAID support and the like are compiled directly into the kernel (2.6.18 also brings a number of SATA improvements).
I've spent quite a bit of time trying to figure this one out. If anyone has the slightest clue why this is happening, your input would be greatly appreciated.
Thanks!