LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Linux - Server (https://www.linuxquestions.org/questions/linux-server-73/)
-   -   RAID10 Recovery Issue - mdadm segfault (https://www.linuxquestions.org/questions/linux-server-73/raid10-recovery-issue-mdadm-segfault-704641/)

marc2112 02-14-2009 12:44 PM

RAID10 Recovery Issue - mdadm segfault
 
Hi folks,

I have a 4-disk software RAID10 that's in bad shape. Two of the four disks lost power and are now flagged as "faulty removed". Power has been restored, so I feel there should be a way to force the array back together, but none of the usual commands are working.

Here is the output of the commands I ran:
Code:

root@mclovin:~# uname -a
Linux mclovin 2.6.27-7-generic #1 SMP Fri Oct 24 06:42:44 UTC 2008 i686 GNU/Linux

root@mclovin:~# mdadm --examine /dev/sd[bcde]6
/dev/sdb6:
          Magic : a92b4efc
        Version : 00.90.00
          UUID : 6eab4bb6:261a7710:36ca2f4e:c0945bf0
  Creation Time : Thu Oct 25 09:34:43 2007
    Raid Level : raid10
  Used Dev Size : 486432000 (463.90 GiB 498.11 GB)
    Array Size : 972864000 (927.80 GiB 996.21 GB)
  Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0

    Update Time : Fri Feb 13 18:50:42 2009
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 2
  Spare Devices : 0
      Checksum : e484aa6a - correct
        Events : 12551546

        Layout : near=2, far=1
    Chunk Size : 64K

      Number  Major  Minor  RaidDevice State
this    0      8      22        0      active sync  /dev/sdb6

  0    0      8      22        0      active sync  /dev/sdb6
  1    1      8      38        1      active sync  /dev/sdc6
  2    2      0        0        2      faulty removed
  3    3      0        0        3      faulty removed
/dev/sdc6:
          Magic : a92b4efc
        Version : 00.90.00
          UUID : 6eab4bb6:261a7710:36ca2f4e:c0945bf0
  Creation Time : Thu Oct 25 09:34:43 2007
    Raid Level : raid10
  Used Dev Size : 486432000 (463.90 GiB 498.11 GB)
    Array Size : 972864000 (927.80 GiB 996.21 GB)
  Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0

    Update Time : Fri Feb 13 18:50:42 2009
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 2
  Spare Devices : 0
      Checksum : e484aa7c - correct
        Events : 12551546

        Layout : near=2, far=1
    Chunk Size : 64K

      Number  Major  Minor  RaidDevice State
this    1      8      38        1      active sync  /dev/sdc6

  0    0      8      22        0      active sync  /dev/sdb6
  1    1      8      38        1      active sync  /dev/sdc6
  2    2      0        0        2      faulty removed
  3    3      0        0        3      faulty removed
/dev/sdd6:
          Magic : a92b4efc
        Version : 00.90.00
          UUID : 6eab4bb6:261a7710:36ca2f4e:c0945bf0
  Creation Time : Thu Oct 25 09:34:43 2007
    Raid Level : raid10
  Used Dev Size : 486432000 (463.90 GiB 498.11 GB)
    Array Size : 972864000 (927.80 GiB 996.21 GB)
  Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0

    Update Time : Fri Feb 13 13:54:43 2009
          State : active
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
      Checksum : e3c4d08d - correct
        Events : 12547691

        Layout : near=2, far=1
    Chunk Size : 64K

      Number  Major  Minor  RaidDevice State
this    2      8      54        2      active sync  /dev/sdd6

  0    0      8      22        0      active sync  /dev/sdb6
  1    1      8      38        1      active sync  /dev/sdc6
  2    2      8      54        2      active sync  /dev/sdd6
  3    3      8      70        3      active sync  /dev/sde6
/dev/sde6:
          Magic : a92b4efc
        Version : 00.90.00
          UUID : 6eab4bb6:261a7710:36ca2f4e:c0945bf0
  Creation Time : Thu Oct 25 09:34:43 2007
    Raid Level : raid10
  Used Dev Size : 486432000 (463.90 GiB 498.11 GB)
    Array Size : 972864000 (927.80 GiB 996.21 GB)
  Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0

    Update Time : Fri Feb 13 13:54:43 2009
          State : active
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0
      Checksum : e3c4d09f - correct
        Events : 12547691

        Layout : near=2, far=1
    Chunk Size : 64K

      Number  Major  Minor  RaidDevice State
this    3      8      70        3      active sync  /dev/sde6

  0    0      8      22        0      active sync  /dev/sdb6
  1    1      8      38        1      active sync  /dev/sdc6
  2    2      8      54        2      active sync  /dev/sdd6
  3    3      8      70        3      active sync  /dev/sde6
root@mclovin:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : inactive sdb6[0](S) sde6[3](S) sdd6[2](S) sdc6[1](S)
      1945728000 blocks
     
md1 : active raid1 sdb7[0] sdc7[1]
      144448 blocks [2/2] [UU]
     
unused devices: <none>
root@mclovin:~# mdadm --detail /dev/md0
mdadm: md device /dev/md0 does not appear to be active.
root@mclovin:~# mdadm -R /dev/md0
mdadm: failed to run array /dev/md0: Input/output error
root@mclovin:~# mdadm -A -f /dev/md0 /dev/sd[bcde]6
mdadm: device /dev/md0 already active - cannot assemble it
root@mclovin:~# mdadm -S /dev/md0
mdadm: stopped /dev/md0
root@mclovin:~# mdadm -A -f /dev/md0 /dev/sd[bcde]6
mdadm: forcing event count in /dev/sdd6(2) from 12547691 upto 12551546
Segmentation fault


root@mclovin:~# fdisk -l

Disk /dev/sda: 46.1 GB, 46115758080 bytes
255 heads, 63 sectors/track, 5606 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x0008919e

  Device Boot      Start        End      Blocks  Id  System
/dev/sda1  *          1        5371    43142526  83  Linux
/dev/sda2            5372        5606    1887637+  5  Extended
/dev/sda5            5372        5606    1887606  82  Linux swap / Solaris

Disk /dev/sdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000a5252

  Device Boot      Start        End      Blocks  Id  System
/dev/sdb1              1      60801  488384001    5  Extended
/dev/sdb5              1        225    1807249+  82  Linux swap / Solaris
/dev/sdb6            244      60801  486432103+  fd  Linux raid autodetect
/dev/sdb7  *        226        243      144553+  fd  Linux raid autodetect

Partition table entries are not in disk order

Disk /dev/sdc: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000b4b68

  Device Boot      Start        End      Blocks  Id  System
/dev/sdc1              1      60801  488384001    5  Extended
/dev/sdc5              1        225    1807249+  82  Linux swap / Solaris
/dev/sdc6            244      60801  486432103+  fd  Linux raid autodetect
/dev/sdc7  *        226        243      144553+  fd  Linux raid autodetect

Partition table entries are not in disk order

Disk /dev/sdd: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000bb6d9

  Device Boot      Start        End      Blocks  Id  System
/dev/sdd1              1      60801  488384001    5  Extended
/dev/sdd5              1        243    1951834+  82  Linux swap / Solaris
/dev/sdd6            244      60801  486432103+  fd  Linux raid autodetect

Disk /dev/sde: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000c7297

  Device Boot      Start        End      Blocks  Id  System
/dev/sde1              1      60801  488384001    5  Extended
/dev/sde5              1        243    1951834+  82  Linux swap / Solaris
/dev/sde6            244      60801  486432103+  fd  Linux raid autodetect

Disk /dev/sdf: 120.0 GB, 120034123776 bytes
255 heads, 63 sectors/track, 14593 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x0009d18f

  Device Boot      Start        End      Blocks  Id  System
/dev/sdf1  *          1      14593  117218241  83  Linux

root@mclovin:~# dmesg |grep md0
[    5.461569] md: md0 stopped.
[  20.409181] md: md0 stopped.
[  20.534084] md: md0 stopped.
[ 6138.265051] raid10: not enough operational mirrors for md0
[ 6161.946195] md: md0 stopped.
[ 6182.582541] md: md0 stopped.
[ 6211.255438] md: md0 stopped.
[ 6224.076157] md: md0 stopped.
[ 6233.452433] md: md0 stopped.
[ 6238.108606] md: md0 stopped.
[ 6240.572650] md: md0 stopped.
[ 6243.812747] md: md0 stopped.
[ 6277.934102] md: md0 stopped.
[ 6283.862128] md: md0 stopped.
[ 6413.730319] raid10: not enough operational mirrors for md0
[ 6442.373353] md: md0 stopped.
[ 6445.273324] md: md0 stopped.
root@mclovin:~# dmesg |grep sd[bcde]6
[    5.016358]  sdb: sdb1 < sdb5 sdb6 sdb7 >
[    5.037718]  sdc1 < sdc5 sdc6 sdc7 >
[    5.070611]  sdd6 >
[    5.071221]  sde: sde1 < sde5 sde6 >
[    5.484069] md: bind<sdc6>
[    5.484371] md: bind<sdd6>
[    5.484609] md: bind<sde6>
[    5.484821] md: bind<sdb6>
[  20.409199] md: unbind<sdb6>
[  20.426577] md: export_rdev(sdb6)
[  20.426640] md: unbind<sde6>
[  20.455976] md: export_rdev(sde6)
[  20.456210] md: unbind<sdd6>
[  20.479675] md: export_rdev(sdd6)
[  20.479738] md: unbind<sdc6>
[  20.492755] md: export_rdev(sdc6)
[  20.498101] md: bind<sdc6>
[  20.498429] md: bind<sdd6>
[  20.498651] md: bind<sde6>
[  20.498856] md: bind<sdb6>
[  20.534101] md: unbind<sdb6>
[  20.556589] md: export_rdev(sdb6)
[  20.556609] md: unbind<sde6>
[  20.572589] md: export_rdev(sde6)
[  20.572607] md: unbind<sdd6>
[  20.588586] md: export_rdev(sdd6)
[  20.588603] md: unbind<sdc6>
[  20.604585] md: export_rdev(sdc6)
[  20.609955] md: bind<sdc6>
[  20.610247] md: bind<sdd6>
[  20.610476] md: bind<sde6>
[  20.610683] md: bind<sdb6>
[ 6138.185459] md: kicking non-fresh sde6 from array!
[ 6138.185730] md: unbind<sde6>
[ 6138.188532] md: export_rdev(sde6)
[ 6138.189523] md: kicking non-fresh sdd6 from array!
[ 6138.190810] md: unbind<sdd6>
[ 6138.192528] md: export_rdev(sdd6)
[ 6161.947067] md: unbind<sdb6>
[ 6161.948531] md: export_rdev(sdb6)
[ 6161.949486] md: unbind<sdc6>
[ 6161.952525] md: export_rdev(sdc6)
[ 6211.261448] md: bind<sdc6>
[ 6211.261754] md: bind<sdd6>
[ 6211.261987] md: bind<sde6>
[ 6211.262231] md: bind<sdb6>
[ 6224.076461] md: unbind<sdb6>
[ 6224.080538] md: export_rdev(sdb6)
[ 6224.080559] md: unbind<sde6>
[ 6224.084527] md: export_rdev(sde6)
[ 6224.084544] md: unbind<sdd6>
[ 6224.088525] md: export_rdev(sdd6)
[ 6224.088542] md: unbind<sdc6>
[ 6224.096522] md: export_rdev(sdc6)
[ 6233.457038] md: bind<sde6>
[ 6238.108630] md: unbind<sde6>
[ 6238.112528] md: export_rdev(sde6)
[ 6238.115798] md: bind<sdd6>
[ 6240.572673] md: unbind<sdd6>
[ 6240.576527] md: export_rdev(sdd6)
[ 6240.579518] md: bind<sdc6>
[ 6243.812772] md: unbind<sdc6>
[ 6243.816529] md: export_rdev(sdc6)
[ 6243.819605] md: bind<sdb6>
[ 6277.934307] md: unbind<sdb6>
[ 6277.936529] md: export_rdev(sdb6)
[ 6277.941371] md: bind<sdd6>
[ 6277.941615] md: bind<sde6>
[ 6277.941818] md: bind<sdc6>
[ 6283.862315] md: unbind<sdc6>
[ 6283.864531] md: export_rdev(sdc6)
[ 6283.864550] md: unbind<sde6>
[ 6283.868525] md: export_rdev(sde6)
[ 6283.868541] md: unbind<sdd6>
[ 6283.872524] md: export_rdev(sdd6)
[ 6283.881922] md: bind<sdc6>
[ 6283.882218] md: bind<sdd6>
[ 6283.882462] md: bind<sde6>
[ 6283.882727] md: bind<sdb6>
[ 6413.668352] md: kicking non-fresh sde6 from array!
[ 6413.668375] md: unbind<sde6>
[ 6413.676025] md: export_rdev(sde6)
[ 6413.676071] md: kicking non-fresh sdd6 from array!
[ 6413.676081] md: unbind<sdd6>
[ 6413.680034] md: export_rdev(sdd6)
[ 6442.373383] md: unbind<sdb6>
[ 6442.374905] md: export_rdev(sdb6)
[ 6442.374994] md: unbind<sdc6>
[ 6442.376495] md: export_rdev(sdc6)

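(Editorial note: the key detail in the `--examine` output above is the Events counter mismatch: sdb6/sdc6 are at 12551546 while sdd6/sde6 stopped at 12547691, which is why the kernel kicks the latter two as "non-fresh". A minimal sketch of the comparison, using the values reported in this thread:)

```shell
#!/bin/sh
# Compare the per-member Events counters from `mdadm --examine`.
# Members whose counter lags the freshest superblock get kicked as
# "non-fresh" at assembly time; --assemble --force bumps exactly this gap.
# Values are the ones reported for sdb6 sdc6 sdd6 sde6 above.
set -- 12551546 12551546 12547691 12547691
max=0
for ev in "$@"; do
    [ "$ev" -gt "$max" ] && max=$ev
done
for ev in "$@"; do
    echo "events=$ev lag=$((max - ev))"
done
# the last two members lag by 3855 events
```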

Pearlseattle 02-15-2009 08:00 AM

If you go here and search for the first occurrence of "mdadm" you'll see the two commands I used to reassemble the raid when two of my 4 HDDs failed in the raid5. It worked for me, and for another person on this forum, but it was only ever a raid5, and I didn't mess around with it much after it failed.
Hope it helps...

marc2112 02-15-2009 09:25 AM

Finally got this to work! The mdadm command I had originally issued to force-assemble the array, which was segfaulting, works with mdadm version 2.6.8 (2.6.7 is what ships with Xubuntu 8.10):

root@mclovin:~# /usr/src/mdadm-2.6.8/mdadm --assemble --force /dev/md0 /dev/sd[bcde]6
mdadm: forcing event count in /dev/sdd6(2) from 12547691 upto 12551546
mdadm: forcing event count in /dev/sde6(3) from 12547691 upto 12551546
mdadm: clearing FAULTY flag for device 2 in /dev/md0 for /dev/sdd6
mdadm: clearing FAULTY flag for device 3 in /dev/md0 for /dev/sde6
mdadm: /dev/md0 has been started with 4 drives.


---Marc
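
(Editorial note: since the fix here was simply a newer mdadm binary, a quick version check before retrying a force-assembly can save a segfault. A sketch, assuming GNU `sort -V` is available; the hard-coded 2.6.7 mirrors the Xubuntu 8.10 package, and the extraction of the real version is left as a comment:)

```shell
#!/bin/sh
# Is the installed mdadm at least 2.6.8, the version the
# force-assemble finally succeeded with in this thread?
need=2.6.8
have=2.6.7    # e.g.: mdadm --version 2>&1 | grep -oE '[0-9]+\.[0-9.]+' | head -1
# sort -V orders version strings numerically, segment by segment
lowest=$(printf '%s\n%s\n' "$need" "$have" | sort -V | head -n1)
if [ "$lowest" = "$have" ] && [ "$have" != "$need" ]; then
    echo "mdadm $have predates $need - build or install a newer one first"
else
    echo "mdadm $have should handle --assemble --force"
fi
```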

