raid1 mdadm repair degraded array with used good hard drive
I have a used but good hard drive that I'd like to use as a replacement for a removed hard drive in an existing RAID1 array.
mdadm --detail /dev/md0
0 0 0 -1 removed
1 8 17 1 active sync /dev/sdb1
I thought I needed to mark the removed drive as failed, but I cannot get mdadm to set it to "failed".
mdadm --manage /dev/md0 --fail /dev/sda1
But mdadm's response is:
mdadm: hot remove failed for /dev/sda1: no such device or address
I thought I had to mark the missing drive as "failed" to prevent the RAID1 array from mirroring in the wrong direction when I install my used-but-good disk. I should reformat the used drive first, right? I want to be sure the array does not automatically mirror from the replacement disk onto the good one. Any suggestions on how to proceed?
If you are using grub, be sure grub is written to the new drive. The following shows how to set up grub for each drive, though in your case only one will need to be set:
Oh! It goes without saying that you should probably be sure your backup is up-to-date before going too much further.
Thanks for answering my question about the direction of the recovery /mirroring.
I did the following:
root# sfdisk -d /dev/sdb | sfdisk /dev/sda
I kept getting "device busy" messages.
lsof and fuser revealed nothing. I tried umount and fuser -k in as many variations as I could think of, but with no effect, so I finally used -f to force it.
root# sfdisk -d /dev/sdb | sfdisk -f /dev/sda
That worked ok.
root# fdisk -l reported both devices with the same partitions, so I was feeling good about it.
Then I ran:
root# mdadm --manage /dev/md0 --add /dev/sda
(Oops, I forgot the 1 at the end.)
But it seemed OK. cat /proc/mdstat showed the rebuild percentage and, after about an hour, I had both drives in the mix.
When the repair finished, cat /proc/mdstat showed sda and sdb1 and the [UU].
But, there was no md0 so I thought maybe if I rebooted then md0 would magically start up.
Now I'm again without sda.
# cat /proc/mdstat
has me back where I started: sda is no longer in the array.
Do you think it failed on boot-up because it was expecting sda1, not sda?
The sda (rather than sda1) indicates that the array was assembled on the raw sda device, which sort of works but is hard to recover from, in that there is no partition entry to reference for details about the physical partition.
mdadm /dev/md0 --remove /dev/sda
(probably already removed)
Then partition sda again and repeat your add command as before, this time using sda1.
Run the grub commands as stated above to be sure the boot records are set correctly; otherwise it will appear to work, but if you lose a drive, you won't be able to boot. (You can still recover from a live CD, but that rather misses the point of RAID failover.)
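A small guard can catch the whole-disk mistake before the add is run. This is only a sketch: is_partition is a hypothetical helper, and it assumes the sdX-style device names used in this thread (it does not handle names such as nvme0n1p1). It checks the name only and does not probe the device.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the name looks like a partition
# (e.g. /dev/sda1) rather than a whole disk (e.g. /dev/sda).
# Assumption: sdX-style names. This is a name check, not a device probe.
is_partition() {
    case "$1" in
        /dev/sd[a-z]*[0-9]) return 0 ;;   # ends in a digit: treat as a partition
        *) return 1 ;;                    # anything else: refuse
    esac
}

dev=/dev/sda1
if is_partition "$dev"; then
    echo "ok to add: $dev"
    # mdadm --manage /dev/md0 --add "$dev"   # left commented out on purpose
else
    echo "refusing whole disk: $dev" >&2
fi
```

With dev=/dev/sda the script takes the else branch, which is the case that went wrong earlier in this thread.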
I re-added sda, only this time I used sda1 nomenclature as follows:
root# mdadm --manage /dev/md0 --add /dev/sda1
At completion of the resync:
root# cat /proc/mdstat
md0 : active raid1 sda1 sdb1
71681920 blocks [2/2] [UU]
unused devices: <none>
Aside from rebooting, is there anything else I can do to test this and/or ensure that md0 remains intact?
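One non-destructive check is to read the status field in /proc/mdstat after each step or reboot. The sketch below assumes the mdstat layout shown above, where the [UU] field follows the array's header line; md_state is a hypothetical helper name, not part of mdadm.

```shell
#!/bin/sh
# Hypothetical helper: report whether an md array listed in an mdstat-format
# file has all members up. Prints "healthy", "degraded", or "unknown".
# Usage: md_state md0            (reads /proc/mdstat)
#        md_state md0 somefile   (reads a saved copy)
md_state() {
    awk -v name="$1" '
        $1 == name { found = 1; next }        # header line, e.g. "md0 : active raid1 sda1 sdb1"
        found && match($0, /\[[U_]+\]/) {     # status field, e.g. [UU] or [_U]
            s = substr($0, RSTART, RLENGTH)
            print (index(s, "_") ? "degraded" : "healthy")
            done = 1
            exit
        }
        END { if (!done) print "unknown" }    # array not listed or no status field
    ' "${2:-/proc/mdstat}"
}
```

An underscore in the field (for example [_U]) marks a missing member, so the helper reports "degraded" for the state this thread started in.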
I must have been typing when you posted so I didn't see your response until just now. This raid is not my boot device. This raid has only user data and user applications on it.
Do I still need to do anything with grub?
Pleased to report that after adding as a partition (sda1) everything works properly...
I rebooted and issued cat /proc/mdstat to find both drives are now in the mix.
Tested it by failing and removing sdb1, system stayed up. All data and apps were still available but now on the newly added sda1 drive.
I added sdb1 back and, after it was completely mirrored, I rebooted and both sdb1 and sda1 are still part of raid1.
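For anyone repeating the failover test described above, the sequence can be sketched as below. ARRAY and MEMBER default to the names from this thread and are assumptions for any other system; run and failover_test are hypothetical helper names. DRY_RUN defaults to 1 here, so the commands are only printed until it is set to 0.

```shell
#!/bin/sh
# Sketch of the fail / remove / re-add test. Names are assumptions taken
# from this thread; adjust them before use.
ARRAY=${ARRAY:-/dev/md0}
MEMBER=${MEMBER:-/dev/sdb1}
DRY_RUN=${DRY_RUN:-1}      # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

failover_test() {
    run mdadm --manage "$ARRAY" --fail "$MEMBER"     # mark the member faulty
    run mdadm --manage "$ARRAY" --remove "$MEMBER"   # take it out of the array
    # Check here that the data is still reachable on the remaining member.
    run mdadm --manage "$ARRAY" --add "$MEMBER"      # re-add; a resync starts
}
```

Wait for the resync to finish (watch cat /proc/mdstat) before rebooting or failing the other member.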
You're welcome. You're right: grub is not involved if you're not booting from the RAID. I only raised it because that's where I'd got my foot in the bucket.