LinuxQuestions.org

software raid1, underlying disks (https://www.linuxquestions.org/questions/linux-general-1/software-raid1-underlying-disks-4175464721/)

swamp-dog 06-04-2013 03:20 PM

software raid1, underlying disks
 
Hi Folks,

Three raid partitions, md0-md2.

md0 boot
md1 OS (centos 5.9)
md2 data

Device      Boot   Start      End        Blocks   Id  System
/dev/sda1   *          1       33        265041   fd  Linux raid autodetect
/dev/sda2             34     8388      67111537+  fd  Linux raid autodetect
/dev/sda3           8389   121601    909383422+   fd  Linux raid autodetect
/dev/sdb1   *          1       33        265041   fd  Linux raid autodetect
/dev/sdb2             34     8388      67111537+  fd  Linux raid autodetect
/dev/sdb3           8389   121601    909383422+   fd  Linux raid autodetect

Smartd is reporting pending sectors on /dev/sdb so I bought a pair of 2Tb disks. I have added one of the new disks live (this mobo is hot swappable). This appeared as /dev/sdp. Added it as a spare.

sfdisk -d /dev/sda | sfdisk /dev/sdp
dd if=/dev/sda bs=512 count=1 of=/dev/sdp

Failed /dev/sdb[1-3] and raid1 is currently re-syncing to /dev/sdp. Machine is physically difficult to get at so would prefer powering it off only the once.
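For reference, the whole hot-swap sequence can be sketched as a dry run (device names /dev/sda good, /dev/sdb failing, /dev/sdp new, as in this thread; the `run` helper only prints each command, so nothing here touches a disk):

```shell
#!/bin/sh
# Dry-run sketch of the hot-swap sequence described above. run() only
# prints each command; on the real machine you would execute them.
run() { printf '+ %s\n' "$*"; }

# 1. Clone the partition table from the good disk onto the new one.
run 'sfdisk -d /dev/sda | sfdisk /dev/sdp'

# 2. Add each new partition to its array as a spare.
run 'mdadm /dev/md0 --add /dev/sdp1'
run 'mdadm /dev/md1 --add /dev/sdp2'
run 'mdadm /dev/md2 --add /dev/sdp3'

# 3. Fail and remove the old partitions; md resyncs onto the spares.
run 'mdadm /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1'
run 'mdadm /dev/md1 --fail /dev/sdb2 --remove /dev/sdb2'
run 'mdadm /dev/md2 --fail /dev/sdb3 --remove /dev/sdb3'

# 4. Watch the resync progress.
run 'cat /proc/mdstat'
```

On a live box you would do steps 2 and 3 one array at a time and watch /proc/mdstat between them; the `--add`/`--fail`/`--remove` forms are the standard manage-mode syntax from mdadm(8).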

Question is this: when /dev/sdp is synced, can I pull the machine to bits, remove both the /dev/sda and /dev/sdb disks (scsi 1:0:0:0 & 2:0:0:0 respectively), stick /dev/sdp in their place *plus* its virgin (blank) 2Tb counterpart, and expect the machine to boot off what was /dev/sdp?

Any thoughts?
TIA

jlinkels 06-04-2013 06:54 PM

You recreated the partition table using sfdisk -d. That is good.

However, I don't understand the second step (the dd). There you copied the first 512 bytes again, overwriting the partition table you had just written.

It seems you wanted to copy the MBR boot code, but in that case you should have copied only the first 446 bytes and left the rest (bytes 446-512, where the partition table lives) alone. I think the dd step is not necessary and damages the partition table.
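The 446-byte point can be demonstrated on ordinary files rather than disks. The sketch below builds a fake 512-byte MBR sector and shows that `dd bs=446 count=1 conv=notrunc` copies only the boot-code area, leaving the 66 bytes holding the partition table and signature untouched (the file names here are made up for the demo):

```shell
# An MBR sector is 512 bytes: the first 446 hold boot code, and the
# remaining 66 hold the partition table and boot signature.

# Fake "source disk" sector: 446 bytes of 'A' boot code + 66 bytes of 'P'.
{ head -c 446 /dev/zero | tr '\0' 'A'; head -c 66 /dev/zero | tr '\0' 'P'; } > src.img
# Fake "target disk" sector: all 'T', standing in for its own partition table.
head -c 512 /dev/zero | tr '\0' 'T' > dst.img

# Copy only the boot-code area, as suggested above:
dd if=src.img of=dst.img bs=446 count=1 conv=notrunc 2>/dev/null

# First 446 bytes now come from src; the last 66 (the "table") are still 'T'.
tail -c 66 dst.img | tr -d 'T' | wc -c    # prints 0: table area untouched
```

A plain `bs=512 count=1` copy, by contrast, would overwrite those last 66 bytes too, which is exactly the problem with the dd command in the first post.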

I did this a long long time ago, but to give you some pointers:
  1. Copy the partition table using sfdisk -d, as you did
  2. Take out the failed disk and insert the new one and let them sync
  3. Now very important, you must install the boot loader (see below)
  4. Once syncing is complete and you installed the boot loader, fail the second (formerly) good disk, and follow the same procedure

Read this article.
http://www200.pair.com/mecham/raid/r...aded-etch.html
It is not fully applicable, since it starts by installing an OS on a degraded array, but it explains very well how to make your RAID bootable. Pay attention that both disks are made bootable; if not, the machine might not boot if the disk carrying the boot loader fails.
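One way to do that "make both disks bootable" step on a GRUB-legacy system such as CentOS 5 is the batch script below (shown here only as the script you would feed to `grub --batch`; the disk names are assumed from this thread). Mapping each disk to (hd0) in turn installs an MBR on each disk that boots from itself, so either disk can boot alone:

```shell
# GRUB-legacy batch commands for making both RAID1 members bootable.
# Printed for review here; on the real machine, pipe them to `grub --batch`.
# /boot is the first partition (md0), hence root (hd0,0).
GRUB_CMDS='device (hd0) /dev/sda
root (hd0,0)
setup (hd0)
device (hd0) /dev/sdb
root (hd0,0)
setup (hd0)
quit'

printf '%s\n' "$GRUB_CMDS"   # review first; then: printf '%s\n' "$GRUB_CMDS" | grub --batch
```

The `device (hd0) ...` trick makes GRUB treat each physical disk as the first BIOS disk while writing its MBR, which matters later in this thread when the BIOS remaps whichever disk it boots from to "slot 0".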

Since you are replacing the disks using mdadm's own re-assembly mechanism, the UUIDs of the disks should not cause problems, but check anyway that mdadm.conf is updated with the UUIDs of the newly assembled array.
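That last check can be sketched as follows (the `run` helper only prints the commands, and the sample ARRAY lines are illustrative placeholders, not values from this thread):

```shell
# Sketch: regenerate the ARRAY lines from the running arrays and compare
# them with /etc/mdadm.conf. run() only prints each command.
run() { printf '+ %s\n' "$*"; }

run 'mdadm --detail --scan'
# Typical output has one ARRAY line per array, e.g. (placeholder UUIDs):
#   ARRAY /dev/md0 level=raid1 num-devices=2 UUID=<uuid-of-md0>
#   ARRAY /dev/md1 level=raid1 num-devices=2 UUID=<uuid-of-md1>
#   ARRAY /dev/md2 level=raid1 num-devices=2 UUID=<uuid-of-md2>

# If these differ from the ARRAY lines already in /etc/mdadm.conf,
# append the fresh ones and prune the stale entries by hand:
run 'mdadm --detail --scan >> /etc/mdadm.conf'
```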

jlinkels

PS. You don't want to do this on a production machine. Use some old hardware and make test runs. I think it can even be done in a VM.

swamp-dog 06-06-2013 03:36 AM

Yes. It was just the first 446 bytes I wanted to copy. I think I'd best repeat the whole thing to be safe.

One further question. At the moment the new disk appears as /dev/sdp, but when I replace the old /dev/sdb it will become /dev/sdb - will mdadm be able to handle that via UUIDs? Also, bearing this in mind, how will it affect what I tell grub when I install the boot loader (which I had completely forgotten about!)?

TIA

jlinkels 06-06-2013 06:43 PM

Quote:

Originally Posted by swamp-dog (Post 4966453)
Yes. It was just the first 446 bytes I wanted to copy. I think I'd best repeat the whole thing to be safe.

No, there is no need for dd.
The sfdisk procedure is good and does what it needs to do. There is no need for a byte-for-byte copy of the MBR; it can only confuse things. Install grub as described on the web page I pointed to.

Quote:

Originally Posted by swamp-dog (Post 4966453)
One further question. Atm the new disk appeared as /dev/sdp but when I replace the old /dev/sdb it will become /dev/sdb - will mdadm be able to handle that via UUID's? Also bearing this in mind, how will this affect what I tell grub when I install the bootloader (which I had completely forgotten about!)?
TIA

I hope the new disk becomes /dev/sdb. It is highly suspicious that the last disk became /dev/sdp - very unusual. What happened to /dev/sdc through /dev/sdo? When you add the new sdb to the array (there is a procedure for that; check that web page or the man page again), mdadm takes care that the new UUID will be used. If the disk does not become /dev/sdb but something else, you can recreate the array with that something else, but you have to be certain the device names remain the same through a reboot.

For the boot loader it doesn't matter. Just do a grub install to the real /dev/sdb or /dev/sda, ignoring the RAID.

About device names at boot: in case you have different device names after installing the new disks, forget about changing them with udev. udev only acts after the array is assembled. It won't rename drives before assembly, so the RAID will assemble using whatever device names the drives get at boot, period.

Again, you should try this beforehand on a test system. mdadm is quite resilient and non-destructive, but Murphy's law is a strong contender when working on live data.

jlinkels

swamp-dog 06-08-2013 12:57 PM

With your help I fixed it. :-)

The missing disks are a red herring. There were 7 disks attached, two of which were usb caddies. They can be a bit flakey in that they sometimes reinitialise themselves *plus* they are LUKS encrypted. It being the weekend I can reboot with them turned off. The remaining three are simple ext3 "whole disk" partitions for primary backup purposes (caddies being for offsite). It just so happened that when I plugged in one of the new disks it appeared as /dev/sdp.

The main issue was the BIOS. When I have it boot off another drive it appears to map that drive to "slot 0" rather than the BIOS slot listed initially - difficult to be certain, as that BIOS info appears only fleetingly. Long and short of it, I must have been installing grub onto the wrong disk.

Things improved last night. I ripped the machine to bits and replaced both old disks with the new ones. It wouldn't boot (grub grub grub) until I reattached the old working disk, which appeared as /dev/sde(*) - that is what gave me the hint about the BIOS above. That, and mdadm picking /dev/sd[ab] for /dev/md[01] but electing for /dev/sde3 for /dev/md2, had me very confused(**) about which actual disks were being used for what.

Anyway, with the new disks connected to their proper/final sata connectors I successfully installed & verified grub to both new disks.

(*) Dunno why /dev/sde, that should be one of the primary backup disks.

(**) Cockpit error on my part, methinks. I was very reluctant to tell mdadm to fail /dev/sde3, but methinks mdadm (for some reason) picked it up in preference to /dev/sdb3. I don't understand why, but /var/log/messages ..

Jun 7 19:33:00 sdn kernel: md: Autodetecting RAID arrays.
Jun 7 19:33:00 sdn kernel: md: autorun ...
Jun 7 19:33:00 sdn kernel: md: considering sdb1 ...
Jun 7 19:33:00 sdn kernel: md: adding sdb1 ...
Jun 7 19:33:00 sdn kernel: md: sdb2 has different UUID to sdb1
Jun 7 19:33:00 sdn kernel: md: sdb3 has different UUID to sdb1
Jun 7 19:33:00 sdn kernel: md: md0 already running, cannot run sdb1
Jun 7 19:33:00 sdn kernel: md: export_rdev(sdb1)
Jun 7 19:33:00 sdn kernel: md: considering sdb2 ...
Jun 7 19:33:00 sdn kernel: md: adding sdb2 ...
Jun 7 19:33:00 sdn kernel: md: sdb3 has different UUID to sdb2
Jun 7 19:33:00 sdn kernel: md: md1 already running, cannot run sdb2
Jun 7 19:33:00 sdn kernel: md: export_rdev(sdb2)
Jun 7 19:33:00 sdn kernel: md: considering sdb3 ...
Jun 7 19:33:00 sdn kernel: md: adding sdb3 ...
Jun 7 19:33:00 sdn kernel: md: md2 already running, cannot run sdb3
Jun 7 19:33:00 sdn kernel: md: export_rdev(sdb3)
Jun 7 19:33:00 sdn kernel: md: ... autorun DONE.

..gave me a hint I should do exactly that.

On the upside, I've learnt more about mdadm than I ever thought I needed to know.

Cheers!

