LinuxQuestions.org

-   Linux - Hardware (http://www.linuxquestions.org/questions/linux-hardware-18/)
-   -   raid1 mdadm repair degraded array with used good hard drive (http://www.linuxquestions.org/questions/linux-hardware-18/raid1-mdadm-repair-degraded-array-with-used-good-hard-drive-736084/)

catbird 06-27-2009 03:40 PM

raid1 mdadm repair degraded array with used good hard drive
 
I have a used but good hard drive that I'd like to use as a replacement for a removed hard drive in an existing RAID1 array.

mdadm --detail /dev/md0
0 0 0 -1 removed
1 8 17 1 active sync /dev/sdb1

I thought I needed to mark the removed drive as failed, but I cannot get mdadm to set it to "failed".

I issue
mdadm --manage /dev/md0 --fail /dev/sda1
But mdadm response is:

mdadm: hot remove failed for /dev/sda1: no such device or address

I thought I had to mark the removed drive as "failed" to prevent the RAID1 array from mirroring in the wrong direction when I install my used-but-good disk. Should I reformat the used drive first? I believe I must prevent the array from automatically trying to mirror in the wrong direction. Any suggestions on how to proceed?

Jerre Cope 06-29-2009 12:34 AM

Try:
Code:

cat /proc/mdstat
to see which physical drives are in your array and what their status is. If there really is an sda1, then you can simply:

Code:

mdadm /dev/md0 --remove /dev/sda1
Your RAID array rebuilds in the correct direction when you ADD the drive to the array: it knows the one you're adding must be the one to rebuild. Install your old drive, set the partition type to fd (Linux raid autodetect; the partition must be the same size or larger than the other drive's partition), then

Code:

mdadm --add /dev/md0 /dev/sda1
You can

Code:

cat /proc/mdstat
to see the rebuild status
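Putting those steps together, a minimal sketch assuming this thread's device names (/dev/md0, /dev/sda1, /dev/sdb), which may differ on your system. With DRY_RUN=1 (the default here) the commands are only printed, so nothing touches your disks until you set DRY_RUN=0 and run it as root:

```shell
# Sketch of the whole replacement sequence. Device names are this thread's
# layout and are assumptions; check `cat /proc/mdstat` for yours.
DRY_RUN="${DRY_RUN:-1}"   # default to printing only; set to 0 to execute

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"          # show the command instead of running it
    else
        "$@"
    fi
}

run mdadm /dev/md0 --remove /dev/sda1   # drop the stale member, if listed
run sfdisk -d /dev/sdb                  # dump the healthy disk's layout
run mdadm /dev/md0 --add /dev/sda1      # re-add; mdadm resyncs the new member
run cat /proc/mdstat                    # watch the rebuild percentage
```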

If you are using grub, you will want to be sure grub is written to the new drive. The following shows how to install grub on each drive, though in your case only one will need to be set up:

Code:

# grub

GNU GRUB version 0.95 (640K lower / 3072K upper memory)

[ Minimal BASH-like line editing is supported. For the first word, TAB
lists possible command completions. Anywhere else TAB lists the possible
completions of a device/filename.]

grub> root (hd0,0)
Filesystem type is ext2fs, partition type 0xfd

grub> setup (hd0)
Checking if "/boot/grub/stage1" exists... yes
Checking if "/boot/grub/stage2" exists... yes
Checking if "/boot/grub/e2fs_stage1_5" exists... yes
Running "embed /boot/grub/e2fs_stage1_5 (hd0)"... 15 sectors are embedded.
succeeded
Running "install /boot/grub/stage1 (hd0) (hd0)1+15 p (hd0,0)/boot/grub/stage2
/boot/grub/grub.conf"... succeeded
Done.

grub> root (hd1,0)
Filesystem type is ext2fs, partition type 0xfd

grub> setup (hd1)
Checking if "/boot/grub/stage1" exists... yes
Checking if "/boot/grub/stage2" exists... yes
Checking if "/boot/grub/e2fs_stage1_5" exists... yes
Running "embed /boot/grub/e2fs_stage1_5 (hd1)"... 15 sectors are embedded.
succeeded
Running "install /boot/grub/stage1 (hd1) (hd1)1+15 p (hd1,0)/boot/grub/stage2
/boot/grub/grub.conf"... succeeded
Done.

grub> quit

Oh! It goes without saying that you should probably be sure your backup is up-to-date before going too much further.

catbird 07-04-2009 12:19 PM

Thanks for answering my question about the direction of the recovery /mirroring.

I did the following:
root# sfdisk -d /dev/sdb | sfdisk /dev/sda

kept getting device busy messages

lsof and fuser revealed nothing. I tried umount and fuser -k in as many variations as I could think of, but to no effect, so I finally used -f to force it.

root# sfdisk -d /dev/sdb | sfdisk -f /dev/sda

That worked ok.
root# fdisk -l reported the same partitions on both devices, so I'm feeling good about it.
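To double-check a clone like that without trusting fdisk -l by eye, one can diff the two sfdisk dumps; a small sketch (same_layout is an invented helper name, not an sfdisk feature, and it masks the /dev/sdX prefixes so the device letters don't cause a spurious mismatch):

```shell
# Compare two `sfdisk -d` dumps; succeeds when the partition layouts match.
# (same_layout is a made-up helper name for this sketch.)
same_layout() {
    a=$(printf '%s\n' "$1" | sed 's|/dev/sd[a-z]|/dev/sdX|g')
    b=$(printf '%s\n' "$2" | sed 's|/dev/sd[a-z]|/dev/sdX|g')
    [ "$a" = "$b" ]
}

# Usage on a live system (as root):
#   same_layout "$(sfdisk -d /dev/sdb)" "$(sfdisk -d /dev/sda)" && echo match
```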

then I did command
root# mdadm --manage /dev/md0 --add /dev/sda
(oops I forgot the 1 at the end )

But it seemed ok. cat /proc/mdstat showed the rebuild percentage, and after about an hour I had both drives in the mix.

The repair started and when it was finished
cat /proc/mdstat
showed sda[2] and sdb1[1]
and the [UU]

But, there was no md0 so I thought maybe if I rebooted then md0 would magically start up.

I rebooted.

Now I'm again without sda.
# cat /proc/mdstat
has me back where I started sda is no longer in the array
and [_U]

Do you think it failed on boot-up because it was expecting sda1 not sda?

Thanks

Jerre Cope 07-04-2009 02:55 PM

The sda (rather than sda1) indicates that the array was assembled against the raw sda device, which sort of works, but is hard to recover from because there is no partition table describing the device's layout.

I think

mdadm /dev/md0 --remove /dev/sda

(probably already removed)

Then partition sda, and use your add command as before (this time with sda1).
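To spot that whole-disk mistake in /proc/mdstat output programmatically, a quick sketch (whole_disk_members is a hypothetical helper; it flags member tokens like sda[2] that name a bare disk rather than a partition):

```shell
# Print any md member that was added as a whole disk (e.g. sda[2]) rather
# than a partition (e.g. sda1[1]). whole_disk_members is an invented name.
whole_disk_members() {
    printf '%s\n' "$1" | tr ' ' '\n' | grep -E '^sd[a-z]\[[0-9]+\]$'
}

# Example with the mdstat line from this thread:
whole_disk_members 'md0 : active raid1 sda[2] sdb1[1]'   # prints "sda[2]"
```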

Run the grub commands as stated above to be sure the boot records are set correctly; otherwise it will appear to work, but if you lose a drive you won't be able to boot. (You can still recover from a live CD, but that rather misses the point of RAID failover.)

Good luck.

catbird 07-04-2009 04:01 PM

I re-added sda, only this time I used sda1 nomenclature as follows:
root# mdadm --manage /dev/md0 --add /dev/sda1

At completion of the resync

root# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda1[0] sdb1[1]
71681920 blocks [2/2] [UU]
unused devices: <none>

Aside from rebooting, is there anything else I can do to test this and/or ensure that md0 remains intact?
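Beyond rebooting, it's worth recording the array in mdadm's config so it assembles under the same name at boot: on many distros you append the output of mdadm --detail --scan to /etc/mdadm.conf (or /etc/mdadm/mdadm.conf on Debian-family systems; the exact path depends on your distro). And a small check for the healthy state can be scripted; md_clean below is an invented helper that just looks for the [UU] marker:

```shell
# Succeed when the given /proc/mdstat text shows both RAID1 mirrors up.
# (md_clean is a made-up helper name for this sketch.)
md_clean() {
    printf '%s\n' "$1" | grep -q '\[UU\]'
}

# Usage on a live system:
#   md_clean "$(cat /proc/mdstat)" && echo "array is clean"
```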

catbird 07-04-2009 04:04 PM

I must have been typing when you posted so I didn't see your response until just now. This raid is not my boot device. This raid has only user data and user applications on it.
Do I still need to do anything with grub?

Thanks

catbird 07-05-2009 09:36 AM

Pleased to report that after adding as a partition (sda1) everything works properly...
I rebooted and issued cat /proc/mdstat to find both drives are now in the mix.

Tested it by failing and removing sdb1; the system stayed up. All data and apps were still available, now served from the newly added sda1 drive.

I added sdb1 back and, after it was completely mirrored, I rebooted and both sdb1 and sda1 are still part of raid1.
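That test cycle can be sketched as follows, with this thread's device names; it is destructive on a real array, so DRY_RUN=1 (the default here) only prints the commands, and you set DRY_RUN=0 as root to actually exercise one:

```shell
# Failover drill as described above; prints the commands by default.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"          # dry-run: show the command only
    else
        "$@"
    fi
}

run mdadm /dev/md0 --fail /dev/sdb1     # simulate losing one mirror
run mdadm /dev/md0 --remove /dev/sdb1   # detach the failed member
run mdadm /dev/md0 --add /dev/sdb1      # re-add; mdadm resyncs it
run cat /proc/mdstat                    # wait for [UU] before rebooting
```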

Thanks again.

Jerre Cope 07-09-2009 12:31 AM

You're welcome. You're right that grub is not involved if you're not booting from the RAID. I only mentioned it because that's where I'd put my foot in the bucket before.

