mdadm: no such device: md0 -- RAID doesn't work after system recovery
I am trying to establish a recovery procedure for my file server, but I have problems booting from RAID.
In the server I have a RAID 1 array with 2 SATA disks and 5 partitions. I backed up the file server to tape by tarring the root directory. To make a test recovery I did this on the test server:
Then I removed the memory stick and rebooted. Grub boots, shows the menu, continues, and then says:

mdadm: no such device: md0
mdadm: no such device: md2
mdadm: no such device: md3
mdadm: no such device: md4

and boots into the busybox shell. Note that md1 (which is intended to be used as swap) is not among the error messages. In fact, I see a message that md1 is started successfully, but I don't recall the exact text. md1 is (like the other partitions) mentioned in fstab.

When I boot back into the live distro, which comes with RAID support, the arrays immediately resume syncing from where they were when I stopped the machine. When I mount the file systems again, the files are still there. Booting the restored system again drops me into busybox again. But in busybox I can issue:

mdadm --assemble /dev/md0 /dev/sd[ab]1

and even there the arrays are started and resume syncing from the point where they were in the live distro.

So my conclusion is that the arrays are sound, can be recognized and will function. They do so in at least two booted and running Linux environments, but refuse to do so in the copy of the server file system. There are some additional things:
Since the errors that I get are from mdadm, I don't think this is a boot loader problem where partitions cannot be found. The RAID driver is obviously included in initramfs. So why oh why would mdadm give these errors during booting while the arrays seem to be sound? Is there any pointer to a document which describes in detail at exactly what moment mdadm is started to assemble the arrays and make them accessible? And where does it look? Can that be different from the place it looks while the system is running?

jlinkels
I am a few steps closer to a solution.
As it seems, at boot time mdadm uses mdadm.conf to assemble the RAID arrays, and identifies each array by its UUID. Although no file systems are mounted at the time of booting, initramfs certainly is, and the mdadm.conf contained in initrd.img, together with the use of the UUIDs, is exactly what causes this problem: that file comes from the backup, so it still lists the UUIDs of the original arrays, not those of the arrays I just created on the empty disks.

There are two possible solutions. One, I can access the mdadm.conf on the backup, extract the UUIDs of the md devices, then stop and reassemble the arrays I just created on the empty disks using the options --update=uuid --uuid=nnn:nnn:nnn:nnn. Two, I can unpack the initrd.img file, do a scan of the newly created arrays, paste the result into the mdadm.conf, and repack the initrd.img file.

Both options work. The first is some more work with grep/awk/sed; the second requires handling initrd.img. Because I want to keep the live server and a backup server as identical as possible, I tend to choose the first solution.

jlinkels
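For reference, the first solution could be sketched roughly as below. This is only a minimal dry-run sketch, not the exact commands used: the function names list_array_uuids and print_reassemble_commands are made up for illustration, and the mapping of /dev/mdN to /dev/sda(N+1) and /dev/sdb(N+1) is an assumption extrapolated from md0 being built on /dev/sd[ab]1. It only prints the mdadm commands so they can be checked before running them by hand.

```shell
#!/bin/sh
# Dry-run sketch: read the ARRAY lines of a backed-up mdadm.conf and print
# the mdadm commands that would give the new arrays the old UUIDs.
# Nothing here calls mdadm itself.

# Print "<device> <uuid>" for every ARRAY line in the given mdadm.conf.
# Keywords in mdadm.conf are case-insensitive, so match uuid= in any case.
list_array_uuids() {
    awk '$1 == "ARRAY" {
        for (i = 3; i <= NF; i++)
            if (tolower(substr($i, 1, 5)) == "uuid=")
                print $2, substr($i, 6)
    }' "$1"
}

# Print (do not run) a stop and a reassemble command for each array.
# ASSUMPTION: /dev/mdN consists of /dev/sda(N+1) and /dev/sdb(N+1);
# adjust the member devices if the layout differs.
print_reassemble_commands() {
    list_array_uuids "$1" | while read -r dev uuid; do
        n=${dev#/dev/md}
        part=$((n + 1))
        echo "mdadm --stop $dev"
        echo "mdadm --assemble $dev --update=uuid --uuid=$uuid /dev/sda$part /dev/sdb$part"
    done
}
```

It would be called with the path to the restored configuration file, for example print_reassemble_commands /mnt/etc/mdadm/mdadm.conf (that path is hypothetical and depends on where the backup is mounted). Printing instead of executing keeps the destructive step under manual control.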