This isn't working for me.
I am testing mdadm RAID mirrors for (hopefully) production servers. My server has two SCSI drives (sda & sdb) on the same controller. My goal is to configure a system that can survive the loss of either disk in the system mirror without interruption, and that can still be rebooted afterward, with or without the missing disk.
I configured the mirrors during CentOS 5.3 / Anaconda installation as follows:
___1 Configured "Software RAID" partitions:
______1 sda1, sda2, sda3
______2 sdb1, sdb2, sdb3
___2 Configured RAID devices:
______1 /dev/md11 - raid 1 mirror including sda1 and sdb1
______2 /dev/md12 - raid 1 mirror including sda2 and sdb2
______3 /dev/md13 - raid 1 mirror including sda3 and sdb3
___3 mount points
______1 / on /dev/md11
______2 /var on /dev/md12
______3 swap on /dev/md13
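For readers who want to reproduce this layout outside the installer, the following is a minimal sketch of roughly equivalent mdadm commands. It is an assumption about what a command-line equivalent would look like, not what Anaconda actually ran. The device names come from the list above; the function name print_create_cmds is hypothetical, and the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Print (do not run) mdadm commands that would build the three RAID 1
# mirrors described above. The md/sd names are from the post; the exact
# flags are an assumed command-line equivalent of the installer setup.
print_create_cmds() {
    for n in 1 2 3; do
        echo "mdadm --create /dev/md1${n} --level=1 --raid-devices=2 /dev/sda${n} /dev/sdb${n}"
    done
}

print_create_cmds
```

Printing instead of executing keeps the sketch safe to paste on a live system, since mdadm --create rewrites RAID metadata on the named partitions.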
The installation went well, so I performed the following test:
___1 pulled sdb while the system was running
___2 system didn't crash
___3 re-inserted sdb
___4 re-scanned the bus (echo "- - - " > /sys/class/scsi_host/host0/scan)
___5 removed sdb1, sdb2, and sdb3 from md11, md12, and md13, respectively
______mdadm /dev/md1X --remove /dev/sdbX
___6 re-added sdb1, sdb2, and sdb3 to md11, md12, and md13
______mdadm /dev/md1X --add /dev/sdbX
___7 waited for the mirrors to rebuild
___8 rebooted
___9 verified that the mirrors were "clean"
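The remove/re-add steps and the final check above can be sketched as a small script. This is only an illustration under stated assumptions: print_readd_cmds and degraded_arrays are hypothetical helper names, the first function prints the mdadm commands with "X" expanded to 1..3 instead of running them, and the second assumes the usual /proc/mdstat status field in which a missing mirror member appears as "_" (for example [U_]).

```shell
#!/bin/sh
# Print the remove and re-add commands from steps 5 and 6 for one disk,
# expanding the "X" placeholder to partitions 1..3. Nothing is executed.
print_readd_cmds() {
    disk="$1"
    for n in 1 2 3; do
        echo "mdadm /dev/md1${n} --remove /dev/${disk}${n}"
        echo "mdadm /dev/md1${n} --add /dev/${disk}${n}"
    done
}

# Read /proc/mdstat-style text on stdin and print the name of every
# array whose member status field contains "_" (a missing member).
# Assumption: status fields look like [UU] (healthy) or [U_] (degraded).
degraded_arrays() {
    awk '
        /^md[0-9]+ :/       { name = $1 }
        /\[[U_]*_[U_]*\]/   { if (name != "") print name }
    '
}

print_readd_cmds sdb
# On the real system the check would be: degraded_arrays < /proc/mdstat
```

Empty output from degraded_arrays would suggest every mirror has both members; an array that is still rebuilding is reported as degraded until the resync finishes.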
Successful test, so I went a step further:
___1 shut down with both drives mounted and mirrors healthy
___2 pulled sdb
___3 booted
___4 got the error: "GRUB Hard Disk Error"
___5 followed the instructions above
______1 first of all, CentOS rescue couldn't mount my system under /mnt/sysimg
______2 I mounted it manually, chrooted, and ran /sbin/grub-install /dev/sda1
______3 Error: "/dev/sda1: Not found or not a block device."
______4 Same error with /dev/md11
______5 In fact, when chrooted to my sda1 partition, mount shows:
_________ /dev/md11 on / type ext3 (rw)
_________ proc on /proc type proc (rw)
______6 so where exactly is grub-install supposed to act?
___6 I tried to "de-mdadm" sda1, sda2, and sda3 as follows:
______1 changed partition type to 83 (82 for sda3) using fdisk
______2 mounted sda1 (read-write) and renamed /etc/mdadm.conf to /etc/mdadm.conf.bad
______3 edited /boot/grub.conf, changed /dev/md11 to /dev/sda1
______4 unmounted sda1
______5 zeroed the superblocks
_________mdadm --zero-superblock /dev/sdaX
___7 rebooted and still the same error: "GRUB Hard Disk Error"
Am I going about this wrong? How does one boot an mdadm-mirrored system after losing the "secondary" mirror disk? Seems like a fairly basic RAID feature to me. >:[
After figuring this out, I want to try pulling the "primary" disk (sda) and booting from sdb. I understand that requires some GRUB reconfiguration.
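For reference, a commonly described way to do that reconfiguration with GRUB legacy is to write boot code to the second disk's MBR while mapping it as (hd0), on the assumption that the BIOS presents sdb as the first disk once sda is gone. The sketch below is untested on this system and only prints the GRUB batch commands. print_grub_script is a hypothetical helper name, and (hd0,0) assumes the first partition holds /boot/grub, as it would with / on md11 and no separate /boot.

```shell
#!/bin/sh
# Print a GRUB legacy batch script that would install boot code on the
# MBR of the given disk. Assumptions: GRUB legacy (as on CentOS 5), and
# the first partition of the disk contains /boot/grub.
print_grub_script() {
    disk="$1"
    cat <<EOF
device (hd0) ${disk}
root (hd0,0)
setup (hd0)
quit
EOF
}

print_grub_script /dev/sdb
# To apply it for real (untested here): print_grub_script /dev/sdb | grub --batch
```

Unlike the grub-install attempts above, this targets the whole disk (the MBR) instead of a partition or an md device.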
Thanks for any help anyone can offer!
-- Aaron