LinuxQuestions.org
Old 11-28-2009, 07:10 PM   #1
jlinkels
Senior Member
 
Registered: Oct 2003
Location: Bonaire
Distribution: Debian Lenny/Squeeze/Wheezy/Sid
Posts: 4,067

mdadm: no such device: md0 -- RAID doesn't work after system recovery


I am trying to establish a recovery procedure for my file server, but I have problems booting from RAID.

In the server I have RAID 1 across two SATA disks, with 5 partitions. I backed up the file server to tape by tarring the root directory.

To make a test recovery I did this on the test server:
  • booted the test server from USB using a live version of Debian Lenny. This live version is the same version as my live server, same kernel version 2.6.26-AMD64
  • created partitions on both sda and sdb from the sfdisk dump taken on the live server.
  • created the RAID1 array using: mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sd[ab]1. This is almost identical to what I did on the live server (see below). Created /dev/md1 .. /dev/md4 likewise. The arrays started to sync nicely.
  • formatted the boot partition with ext3, left md1 (intended for swap) alone, and formatted the other arrays md2 .. md4 as xfs.
  • created directory /mnt/restore and mounted /dev/md0 there. Created /mnt/restore/home, /mnt/restore/vmbackup, etc. Mounted /dev/md2 on /mnt/restore/home, /dev/md3 on /mnt/restore/vmbackup, etc. This is identical to the layout on the live server, except that everything is mounted under /mnt/restore/ instead of under /.
  • restored the tar: tar -C /mnt/restore/ -xvf /dev/st0. The restore went flawlessly. Each partition holds the data it should hold, i.e. all data is restored to the correct partitions.
  • chrooted into /mnt/restore/. Installed grub on both hard disks: root (hd0,0); setup (hd0). Same for hd1.
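
The steps listed above can be sketched as a small dry-run script. This is only an illustration, not the exact procedure used: it assumes partitioning from the sfdisk dump has already been done, it assumes md0 .. md4 map to partitions 1 .. 5 on each disk (the post only states this for md0), it leaves out the md4 mount point because the post does not name it, and it omits the interactive grub step. The names run and recover_plan are made up for this sketch. By default it only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch of the restore steps: prints each command instead of running it.
# Set DRY_RUN=0 only on a machine whose disks may be overwritten.
DRY_RUN="${DRY_RUN:-1}"

# run CMD ARGS...: echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

recover_plan() {
    # One RAID1 array per partition pair.
    # ASSUMPTION: md0..md4 are built from sd[ab]1..sd[ab]5.
    for n in 0 1 2 3 4; do
        p=$((n + 1))
        run mdadm --create "/dev/md$n" --level=1 --raid-devices=2 "/dev/sda$p" "/dev/sdb$p"
    done

    # md0 gets ext3, md1 (swap) is left alone, md2..md4 get xfs.
    run mkfs.ext3 /dev/md0
    for n in 2 3 4; do
        run mkfs.xfs "/dev/md$n"
    done

    # Mount everything below /mnt/restore instead of below /.
    run mkdir -p /mnt/restore
    run mount /dev/md0 /mnt/restore
    run mkdir -p /mnt/restore/home /mnt/restore/vmbackup
    run mount /dev/md2 /mnt/restore/home
    run mount /dev/md3 /mnt/restore/vmbackup
    # (the mount point for md4 is not named in the post, so it is omitted here)

    # Restore the tape backup into the mounted tree.
    run tar -C /mnt/restore/ -xvf /dev/st0
}

recover_plan
```

Reading the printed plan before running anything for real makes it easier to write the procedure down step by step, which was the goal of the test recovery.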

Then I removed the memory stick and rebooted. Grub boots, shows the menu, continues, and then says:
mdadm: no such device: md0
mdadm: no such device: md2
mdadm: no such device: md3
mdadm: no such device: md4

and boots into the busybox shell.
Note that md1 (which is intended to be used as swap) is not among the error messages. In fact, I see a message that md1 is started successfully, but I don't recall the exact text. md1 is (like the other partitions) mentioned in fstab.

When I boot back into the live distro, which comes with RAID support, the arrays immediately resume syncing where they left off when I stopped the machine. When I mount the file systems again, the files are still there.

Booting the restored system again drops me into busybox again. But in busybox I can issue: mdadm --assemble /dev/md0 /dev/sd[ab]1 and even there the arrays are started and resume syncing at the point where they were in the live distro.

So my conclusion is that the arrays are sound, can be recognized and will function. They do so in at least two booted and running Linux environments, but refuse to do so in the copy of the server file system.

There are some additional things:
  • when I created the array for the first time on the live server I did so from a running installation. I created a degraded RAID with a missing disk, copied the running installation to the RAID, booted from the RAID and added the missing disk. IMHO that should not make much difference.
  • while experimenting on the test server I did a couple of stupid things with the RAID arrays, like formatting the partitions before creating the array. That caused problems of course, so I corrected that. A number of times I added, failed, removed and re-added disks on the arrays. A lot of things, but I don't recall them all. Eventually I got everything right again. By the time I finally rebooted the restored system, I had done everything to the RAID arrays which Should Not Be Done. Yet at that time the test server did boot from the restored installation.
  • when I wanted to try again and note each step carefully for the real recovery procedure, I wanted to start from scratch. Therefore I zeroed the first 100 MB of both sda and sdb. When I did the recovery after that, the result was as described above.

Since the errors that I get are from mdadm, I don't think this is a boot loader problem where partitions cannot be found. The RAID driver is obviously included in initramfs. So why oh why would mdadm give these errors during booting while the arrays seem to be sound? Is there any pointer to a document which describes in detail exactly at what moment mdadm is started to assemble the arrays and make them accessible? And where does it look? Can that be different from the place it looks while the system is running?

jlinkels
 
Old 11-30-2009, 08:14 PM   #2
jlinkels
Original Poster
I am a few steps closer to a solution.

As it seems, at boot time mdadm uses mdadm.conf to assemble the RAID arrays, and identifies each array by its UUID.

Although no file systems are mounted at that point of booting, initramfs certainly is, and the mdadm.conf contained in initrd.img, combined with assembly by UUID, is exactly what causes this problem: the arrays I created on the empty disks got new UUIDs, which do not match the ones recorded in that mdadm.conf.

There are two possible solutions. One, I can access the mdadm.conf on the backup, extract the UUIDs of the md devices, then stop the arrays I just created on the empty disks and reassemble them using the options --update=uuid --uuid=nnn:nnn:nnn:nnn.
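
A minimal sketch of that first option, assuming the usual ARRAY line layout in mdadm.conf (ARRAY /dev/mdN ... UUID=...) and assuming mdN is built from partition N+1 on sda and sdb, which the post only states for md0. The function name uuid_commands is made up for this example. It prints the mdadm commands rather than running them, so they can be reviewed first.

```shell
#!/bin/sh
# uuid_commands CONF
# For every ARRAY line in CONF that carries a UUID, print the commands that
# stop the freshly created array and reassemble it with the old UUID.
# ASSUMPTION: /dev/mdN consists of /dev/sda(N+1) and /dev/sdb(N+1).
uuid_commands() {
    awk '
        $1 == "ARRAY" {
            dev = $2
            uuid = ""
            # find the UUID=... word, wherever it sits on the line
            for (i = 3; i <= NF; i++)
                if (tolower($i) ~ /^uuid=/)
                    uuid = substr($i, 6)
            if (uuid == "")
                next
            n = dev
            sub(/^\/dev\/md/, "", n)    # "/dev/md2" -> "2"
            p = n + 1                   # assumed partition number
            printf "mdadm --stop %s\n", dev
            printf "mdadm --assemble %s --update=uuid --uuid=%s /dev/sda%d /dev/sdb%d\n", dev, uuid, p, p
        }
    ' "$1"
}

# Example call (path is an assumption: the restored Debian config file):
#   uuid_commands /mnt/restore/etc/mdadm/mdadm.conf
```

With --update=uuid, the value given to --uuid is written to the array as its new UUID, so after this the arrays match what the mdadm.conf inside the unchanged initrd.img expects.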

Two, I can unpack the initrd.img file, do a scan of the newly created arrays, paste the result into the mdadm.conf, and repack the initrd.img file.
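
A rough sketch of the mdadm.conf part of that second option. The function name refresh_arrays and the file names in the comments are made up for illustration, and the unpack/repack commands in the comments are the common way to handle a gzip-compressed cpio initrd, not necessarily the exact commands used here. The function keeps every line of the old config except the stale ARRAY lines and appends the ARRAY lines from a fresh scan.

```shell
#!/bin/sh
# Context (comments only, adjust to the actual system):
#   unpack:  zcat initrd.img | cpio -id            # inside an empty work directory
#   scan:    mdadm --detail --scan > scan.txt      # describes the new arrays
#   repack:  find . | cpio -o -H newc | gzip > ../initrd.img.new
# ASSUMPTION: the config sits at etc/mdadm/mdadm.conf inside the unpacked tree.

# refresh_arrays OLDCONF SCANFILE
# Print a new mdadm.conf: all non-ARRAY lines of OLDCONF, followed by the
# ARRAY lines found in SCANFILE.
refresh_arrays() {
    # keep DEVICE, MAILADDR, comments etc., drop the stale ARRAY lines
    grep -v '^[[:space:]]*ARRAY[[:space:]]' "$1"
    # append the ARRAY lines that describe the newly created arrays
    grep '^ARRAY[[:space:]]' "$2"
}

# Example call:
#   refresh_arrays etc/mdadm/mdadm.conf scan.txt > mdadm.conf.new
```

Writing the result to a new file first, and only then moving it over the old mdadm.conf, leaves a way back if the scan output turns out to be incomplete.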

Both options work. The first takes some more work with grep/awk/sed, the second requires handling initrd.img.

Because I want to keep the live server and a backup server as identical as possible, I lean toward the first solution.

jlinkels
 
  

