LinuxQuestions.org


twinge 06-29-2009 09:55 AM

Help recovering a failed MD RAID-5 on openSUSE 11.1
 
Hello all,
I set up a server running openSUSE 11.1 a few weeks ago for my own use at home. It has 4x1TB hard drives, which I combined into a RAID-5 array using software RAID. This morning I turned on the server and found that it had rebooted itself, possibly after a power failure or brownout, and then failed to boot back up. The software RAID is giving me an error, and I have spent all of today searching the internet for a way to get the array back up but have come up with nothing. Any help would be appreciated. Here is some info on what is going on:

- Physically the hard drives are fine: I did a full sector-by-sector scan on each one using Western Digital's SMART utilities and they all passed without a flaw.
- /dev/md0 is a 16 GB RAID-0 array for swap consisting of /dev/sda2, /dev/sdb2, /dev/sdc2, /dev/sdd2. This array works fine.
- /dev/md1 is a 3 TB RAID-5 array for the OS and data consisting of /dev/sda3, /dev/sdb3, /dev/sdc3, /dev/sdd3. This is the array giving me issues.
- The /boot partition is /dev/sda1.

When I boot I receive this error message and am then dropped into an sh shell early in the boot process. I have access to some basic command-line utilities such as mdadm, but a lot of other tools are missing.

Code:

md: md1 stopped.
md: bind<sdb3>
md: bind<sdc3>
md: bind<sdd3>
md: bind<sda3>
raid5: device sda3 operational as raid disk 0
raid5: device sdd3 operational as raid disk 3
raid5: device sdc3 operational as raid disk 2
raid5: device sdb3 operational as raid disk 1
raid5: allocated 4288kB for md1
raid5: raid level 5 set md1 active with 4 out of 4 devices, algorithm 0
RAID5 conf printout:
 --- rd:4 wd:4
 disk 0, o:1, dev:sda3
 disk 1, o:1, dev:sdb3
 disk 2, o:1, dev:sdc3
 disk 3, o:1, dev:sdd3
md1: bitmap file is out of date, doing full recovery
md1: bitmap initialisation failed: -5
md1: failed to create bitmap (-5)
mdadm: failed to RUN_ARRAY /dev/md/1: Input/output error.
invalid root filesystem -- exiting to /bin/sh
$

Here is the info for my array and the four disks attached to it:

Code:

$mdadm -D /dev/md1
/dev/md1:
        Version : 1.00
  Creation Time : Thu Jun 18 15:15:16 2009
    Raid Level : raid5
  Used Dev Size : 972462464 (927.41 GiB 995.80 GB)
  Raid Devices : 4
  Total Devices : 4
    Persistence : Superblock is persistent

    Update Time : Sun Jun 28 10:50:07 2009
          State : active, Not Started
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

        Layout : left-asymmetric
    Chunk Size : 128K

          Name : linux:1
          UUID : e9a0da25:0bce6c41:0330678f:44257bba
        Events : 6303

    Number  Major  Minor  RaidDevice State
      0      8        3        0      active sync  /dev/sda3
      1      8      19        1      active sync  /dev/sdb3
      2      8      35        2      active sync  /dev/sdc3
      4      8      51        3      active sync  /dev/sdd3

$mdadm --examine /dev/sda3
/dev/sda3:
          Magic : a92b4efc
        Version : 1.0
    Feature Map : 0x1
    Array UUID : e9a0da25:0bce6c41:0330678f:44257bba
          Name : linux:1
  Creation Time : Thu Jun 18 15:15:16 2009
    Raid Level : raid5
  Raid Devices : 4

 Avail Dev Size : 1944925016 (927.41 GiB 995.80 GB)
    Array Size : 5834774784 (2782.24 GiB 2987.40 GB)
  Used Dev Size : 1944924928 (927.41 GiB 995.80 GB)
  Super Offset : 1944925272 sectors
          State : clean
    Device UUID : eabd52cf:8b404ce5:57000a4a:74617399

Internal Bitmap : -233 sectors from superblock
    Update Time : Sun Jun 28 10:50:07 2009
      Checksum : 5858713e - correct
        Events : 6303

        Layout : left-asymmetric
    Chunk Size : 128K

    Array Slot : 0 (0, 1, 2, failed, 3)
  Array State : Uuuu 1 failed

$mdadm --examine /dev/sdb3
/dev/sdb3:
          Magic : a92b4efc
        Version : 1.0
    Feature Map : 0x1
    Array UUID : e9a0da25:0bce6c41:0330678f:44257bba
          Name : linux:1
  Creation Time : Thu Jun 18 15:15:16 2009
    Raid Level : raid5
  Raid Devices : 4

 Avail Dev Size : 1944925016 (927.41 GiB 995.80 GB)
    Array Size : 5834774784 (2782.24 GiB 2987.40 GB)
  Used Dev Size : 1944924928 (927.41 GiB 995.80 GB)
  Super Offset : 1944925272 sectors
          State : active
    Device UUID : dbe61e90:6a957602:8ad6d54c:b561a4f6

Internal Bitmap : -233 sectors from superblock
    Update Time : Sun Jun 28 10:50:07 2009
      Checksum : 964bc582 - correct
        Events : 6303

        Layout : left-asymmetric
    Chunk Size : 128K

    Array Slot : 1 (0, 1, 2, failed, 3)
  Array State : uUuu 1 failed

$mdadm --examine /dev/sdc3
/dev/sdc3:
          Magic : a92b4efc
        Version : 1.0
    Feature Map : 0x1
    Array UUID : e9a0da25:0bce6c41:0330678f:44257bba
          Name : linux:1
  Creation Time : Thu Jun 18 15:15:16 2009
    Raid Level : raid5
  Raid Devices : 4

 Avail Dev Size : 1944925016 (927.41 GiB 995.80 GB)
    Array Size : 5834774784 (2782.24 GiB 2987.40 GB)
  Used Dev Size : 1944924928 (927.41 GiB 995.80 GB)
  Super Offset : 1944925272 sectors
          State : active
    Device UUID : cbd92876:238d1eb7:4bc2e26e:ca7d581a

Internal Bitmap : -233 sectors from superblock
    Update Time : Sun Jun 28 10:50:07 2009
      Checksum : 76beb802 - correct
        Events : 6303

        Layout : left-asymmetric
    Chunk Size : 128K

    Array Slot : 2 (0, 1, 2, failed, 3)
  Array State : uuUu 1 failed

$mdadm --examine /dev/sdd3
/dev/sdd3:
          Magic : a92b4efc
        Version : 1.0
    Feature Map : 0x1
    Array UUID : e9a0da25:0bce6c41:0330678f:44257bba
          Name : linux:1
  Creation Time : Thu Jun 18 15:15:16 2009
    Raid Level : raid5
  Raid Devices : 4

 Avail Dev Size : 1944925016 (927.41 GiB 995.80 GB)
    Array Size : 5834774784 (2782.24 GiB 2987.40 GB)
  Used Dev Size : 1944924928 (927.41 GiB 995.80 GB)
  Super Offset : 1944925272 sectors
          State : active
    Device UUID : b8eaa407:238e9d80:7b8e7eef:a7b56aaa

Internal Bitmap : -233 sectors from superblock
    Update Time : Sun Jun 28 10:50:07 2009
      Checksum : e267cdfe - correct
        Events : 6303

        Layout : left-asymmetric
    Chunk Size : 128K

    Array Slot : 4 (0, 1, 2, failed, 3)
  Array State : uuuU 1 failed

Note: /dev/sdd3 reports that it is using slot 4, but the slots listed are 0, 1, 2, failed, 3, which looks off to me. The array state also says that 1 device has failed, but I can't find which one it is, and the output of mdadm -D /dev/md1 says none have failed.
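
To compare the four members at a glance, the fields that matter here (Events, State, Array Slot) can be pulled out of each mdadm --examine listing. The following is a minimal sketch rather than a recovery procedure: summarize_examine is a made-up helper name, and it only parses text, so it changes nothing on disk. In the listings above all four members report Events 6303. Going by the mdadm -D table, where device Number 4 holds RaidDevice 3, the slot 4 entry on /dev/sdd3 and the "failed" marker may simply describe an unused slot 3 rather than a failed disk, but that reading is an assumption.

```shell
#!/bin/sh
# Sketch: condense `mdadm --examine` output (read from stdin) into one line
# so the member devices can be compared side by side.
# summarize_examine is a hypothetical helper name, not part of mdadm.
summarize_examine() {
    awk -F' : ' '
        # Strip leading and trailing blanks from a field.
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        { key = trim($1) }
        key == "Events"     { events = trim($2) }
        key == "State"      { state = trim($2) }
        # Keep only the slot number, not the list in parentheses.
        key == "Array Slot" { split(trim($2), parts, " "); slot = parts[1] }
        END { printf "events=%s state=%s slot=%s\n", events, state, slot }
    '
}

# On the real system (needs root and the actual member devices):
#   for dev in /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3; do
#       printf '%s ' "$dev"
#       mdadm --examine "$dev" | summarize_examine
#   done
```

If one member showed a lower Events count than the others, that would be the device to look at more closely.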

I have tried forcing the array to assemble, but the result is the same as when booting:

Code:

$mdadm -A -f /dev/md1
md: md1 stopped.
md: bind<sdb3>
md: bind<sdc3>
md: bind<sdd3>
md: bind<sda3>
raid5: device sda3 operational as raid disk 0
raid5: device sdd3 operational as raid disk 3
raid5: device sdc3 operational as raid disk 2
raid5: device sdb3 operational as raid disk 1
raid5: allocated 4288kB for md1
raid5: raid level 5 set md1 active with 4 out of 4 devices, algorithm 0
RAID5 conf printout:
 --- rd:4 wd:4
 disk 0, o:1, dev:sda3
 disk 1, o:1, dev:sdb3
 disk 2, o:1, dev:sdc3
 disk 3, o:1, dev:sdd3
md1: bitmap file is out of date, doing full recovery
md1: bitmap initialisation failed: -5
md1: failed to create bitmap (-5)
mdadm: failed to RUN_ARRAY /dev/md/1: Input/output error.
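
For reference, the -5 in the bitmap messages is the kernel's negative errno for EIO, which matches the "Input/output error" that mdadm prints. The --examine output gives a Super Offset of 1944925272 sectors and an Internal Bitmap at -233 sectors from the superblock. Assuming that offset is counted from the start of the superblock, the sketch below works out where the bitmap would sit on each member; bitmap_sector is a made-up helper name. The commented commands are read-only checks, and whether dd is available in the rescue shell is not known from the output above.

```shell
#!/bin/sh
# Sketch: locate the internal bitmap from the --examine fields.
# bitmap_sector is a hypothetical helper: it adds the signed bitmap offset
# to the superblock offset (both in 512-byte sectors).
bitmap_sector() {
    echo $(( $1 + $2 ))
}

# Values taken from the --examine output above.
bitmap_sector 1944925272 -233    # prints 1944925039

# Read-only checks on the real system (need root):
#   mdadm --examine-bitmap /dev/sda3
#   dd if=/dev/sda3 of=/dev/null bs=512 count=8 \
#      skip="$(bitmap_sector 1944925272 -233)"
```

A read error from dd at that position would point at the disk, while a clean read would suggest the problem lies in the bitmap contents instead.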

Does anyone have any suggestions on what else I could try to get this array up so I can get my data off of it? This is my first experience using MD. I have used a lot of hardware RAID solutions but never software RAID before. I am beginning to fear that my data may be gone forever. Any help would be very much appreciated.
Thank you,
David

leandean 07-01-2009 12:34 AM

Boot into rescue mode. Mount md1 and check menu.lst to make sure you are booting from the array and not trying to boot off just one drive. Chances are the machine rebooted after a kernel update and menu.lst was overwritten or modified. You didn't say what the hardware is. Some have their own personalities :) There is also a hot-key combo that will let the machine boot, but I can't remember what it is :( I have it written down at work but I'm not there. Getting old sucks!
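
A rough sketch of that menu.lst check, under a few assumptions: the first post says /boot is /dev/sda1, so menu.lst would live on that partition (grub/menu.lst on GRUB legacy) rather than on md1, and the mount point /mnt is only an example. list_root_args is a made-up helper name; it prints the root= argument of each kernel line so you can see whether it names the array or a single drive.

```shell
#!/bin/sh
# Sketch: show which root device each GRUB legacy boot entry uses.
# list_root_args is a hypothetical helper name.
list_root_args() {
    # Match only "kernel ..." lines and print their root=... argument.
    sed -n 's/^[[:space:]]*kernel[[:space:]].*[[:space:]]\(root=[^[:space:]]*\).*/\1/p' "$1"
}

# From the rescue system, assuming /dev/sda1 (/boot) can be mounted on /mnt:
#   mount -o ro /dev/sda1 /mnt
#   list_root_args /mnt/grub/menu.lst
#   umount /mnt
```

Mounting read-only keeps the check from touching the boot partition.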

leandean 07-02-2009 11:33 AM

Ctrl-D is the hotkey to boot the machine. After it boots, edit /lib/mkinitrd/boot-md.sh so that it looks like this:

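if [ "$md_dev" ] ; then
    # Try the configured assembly first, then fall back to scanning partitions.
    /sbin/mdadm $mdconf --auto=md $md_dev || /sbin/mdadm -Ac partitions $mdarg --auto=md $md_dev
fi
# Give the array a moment, then ask udev to re-process the md device.
sleep 1
echo change > /sys/block/md$md_minor/uevent
wait_for_events
fi  # presumably closes an outer "if" that is outside this excerpt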

Then run 'mkinitrd'.
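
Since a broken boot script can leave the machine unbootable again, it may be worth keeping a copy of the original before editing. This is only a sketch: backup_file is a made-up helper name, and /root as the backup location is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: keep a pristine copy of a file before editing it.
# backup_file is a hypothetical helper: copy FILE into DIR as FILE.orig,
# but never overwrite a backup that already exists.
backup_file() {
    dest="$2/$(basename "$1").orig"
    [ -e "$dest" ] || cp -p "$1" "$dest"
}

# On the booted system, as root:
#   backup_file /lib/mkinitrd/boot-md.sh /root
#   vi /lib/mkinitrd/boot-md.sh    # make the change shown above
#   mkinitrd                       # rebuild the initrd with the edited script
```

Storing the copy outside /lib/mkinitrd avoids any chance of mkinitrd picking up the backup file.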


All times are GMT -5.