netfoot
11-22-2012 12:56 PM
Cannot get RAID set to start
After a reboot, my RAID set won't start. Here is /proc/mdstat:
Code:
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md0 : inactive sdc3[2] sdd3[3] sda3[0] sde3[4]
7781050112 blocks
unused devices: <none>
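If I read this correctly, "inactive" means the kernel gathered four of the five members at boot but refused to actually start the array. My understanding (from the mdadm man page, so please correct me if I am wrong) is that this leftover inactive device has to be stopped before any manual assembly attempt:
Code:
# release the half-assembled, inactive remnant so its member
# partitions can be used again in a manual assembly
mdadm --stop /dev/md0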
There are five drives in the array, but /dev/sdb3 is missing from the listing above. All five drives are partitioned identically:
Code:
# sfdisk -l /dev/sdb
Disk /dev/sdb: 243201 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0
Device Boot Start End #cyls #blocks Id System
/dev/sdb1 0+ 30 31- 248976 82 Linux swap
/dev/sdb2 31 1026 996 8000370 83 Linux
/dev/sdb3 * 1027 243200 242174 1945262655 fd Linux raid autodetect
/dev/sdb4 0 - 0 0 0 Empty
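For completeness, the partition tables can be compared mechanically rather than by eye; sfdisk -d dumps each table in a diffable form, and the sed below (my own normalization, nothing standard) masks the device name so any real difference between the five drives stands out:
Code:
# dump each drive's partition table with the device name masked out,
# so the five dumps can be compared line by line
for d in a b c d e; do
    echo "== /dev/sd$d =="
    sfdisk -d /dev/sd$d | sed "s,/dev/sd$d,DISK,g"
done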
All swap partitions (including /dev/sdb1) appear to be functional, and I can mount, read, and write /dev/sdb2 without problems. Querying the member partitions gives me:
Code:
# mdadm --misc -Q /dev/sda3
/dev/sda3: is not an md array
/dev/sda3: device 0 in 5 device active raid5 /dev/md0. Use mdadm --examine for more detail.
# mdadm --misc -Q /dev/sdb3
/dev/sdb3: is not an md array
/dev/sdb3: device 1 in 5 device mismatch raid5 /dev/md0. Use mdadm --examine for more detail.
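The -Q output already hints at a mismatch on /dev/sdb3. A quick way to compare every member's superblock at once is to pull out just the distinguishing fields (as far as I understand, the event counter and update time are what mdadm uses to judge which superblocks are fresh):
Code:
# one line of context per member: the event counter and last-update
# stamp show which superblocks agree and which one is stale
mdadm --examine /dev/sd[abcde]3 | egrep '/dev/|Update Time|Events'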
Examining the drives in full gives conflicting results for /dev/sdb3 as opposed to the other four. Here, for example, is /dev/sda3, which shows four active/working drives and one failed:
Code:
# mdadm --misc --examine /dev/sda3
/dev/sda3:
Magic : a92b4efc
Version : 0.90.00
UUID : 4e77808f:197dbdcf:413393e8:3b8beff3
Creation Time : Sat May 22 18:38:19 2010
Raid Level : raid5
Used Dev Size : 1945262528 (1855.15 GiB 1991.95 GB)
Array Size : 7781050112 (7420.59 GiB 7967.80 GB)
Raid Devices : 5
Total Devices : 4
Preferred Minor : 0
Update Time : Wed Nov 21 17:16:58 2012
State : active
Active Devices : 4
Working Devices : 4
Failed Devices : 1
Spare Devices : 0
Checksum : 9eb7a564 - correct
Events : 4101971
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 0 8 3 0 active sync /dev/sda3
0 0 8 3 0 active sync /dev/sda3
1 1 0 0 1 faulty removed
2 2 8 35 2 active sync /dev/sdc3
3 3 8 51 3 active sync /dev/sdd3
4 4 8 67 4 active sync /dev/sde3
Note that the second drive (RaidDevice 1, which should be /dev/sdb3) is marked faulty/removed. Compare with what /dev/sdb3's own superblock says:
Code:
# mdadm --misc --examine /dev/sdb3
/dev/sdb3:
Magic : a92b4efc
Version : 0.90.00
UUID : 4e77808f:197dbdcf:413393e8:3b8beff3
Creation Time : Sat May 22 18:38:19 2010
Raid Level : raid5
Used Dev Size : 1945262528 (1855.15 GiB 1991.95 GB)
Array Size : 7781050112 (7420.59 GiB 7967.80 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 0
Update Time : Mon Jul 23 00:56:48 2012
State : clean
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Checksum : 9e27a008 - correct
Events : 2588290
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 1 8 19 1 active sync /dev/sdb3
0 0 8 3 0 active sync /dev/sda3
1 1 8 19 1 active sync /dev/sdb3
2 2 8 35 2 active sync /dev/sdc3
3 3 8 51 3 active sync /dev/sdd3
4 4 8 67 4 active sync /dev/sde3
Note five active/working drives with none failed and all listed as active sync, but also that its Update Time (July, versus November on the others) and Events count (2588290 versus 4101971) are far behind, as if the superblock on /dev/sdb3 stopped being updated months ago. syslog confirms it is being rejected as stale:
Code:
Nov 21 19:54:16 triphod kernel: md: kicking non-fresh sdb3 from array!
Nov 21 19:54:16 triphod kernel: 2: w=1 pa=0 pr=5 m=1 a=2 r=5 op1=0 op2=0
Nov 21 19:54:16 triphod kernel: 3: w=2 pa=0 pr=5 m=1 a=2 r=5 op1=0 op2=0
Nov 21 19:54:16 triphod kernel: 0: w=3 pa=0 pr=5 m=1 a=2 r=5 op1=0 op2=0
Nov 21 19:54:16 triphod kernel: 4: w=4 pa=0 pr=5 m=1 a=2 r=5 op1=0 op2=0
Nov 21 19:54:16 triphod kernel: raid5: cannot start dirty degraded array for md0
Nov 21 19:54:16 triphod kernel: RAID5 conf printout:
Nov 21 19:54:16 triphod kernel: --- rd:5 wd:4
Nov 21 19:54:16 triphod kernel: disk 0, o:1, dev:sda3
Nov 21 19:54:16 triphod kernel: disk 2, o:1, dev:sdc3
Nov 21 19:54:16 triphod kernel: disk 3, o:1, dev:sdd3
Nov 21 19:54:16 triphod kernel: disk 4, o:1, dev:sde3
Nov 21 19:54:16 triphod kernel: raid5: failed to run raid set md0
Nov 21 19:54:16 triphod kernel: md: pers->run() failed ...
Nov 21 19:54:16 triphod kernel: md: do_md_run() returned -5
Nov 21 19:54:16 triphod kernel: md: md0 still in use.
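Searching on the "cannot start dirty degraded array" message turns up a kernel module parameter that force-starts such an array at boot. I have not dared to try it, since parity may be inconsistent on a dirty array, but for reference (this is from the kernel's Documentation/md.txt, not something I have tested):
Code:
# appended to the kernel command line; tells md to start an array that is
# both dirty (unclean shutdown) and degraded (missing a member) anyway
md-mod.start_dirty_degraded=1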
I am at a loss as to what went wrong here and, more importantly, how to recover. Any constructive suggestions would be greatly appreciated!
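For what it is worth, the plan I have pieced together from the mdadm man page and similar threads is below. Before I run it, I would be grateful for a sanity check, since as I understand it --force rewrites event counters and there is no undo:
Code:
# 1. release the inactive remnant
mdadm --stop /dev/md0
# 2. force-assemble from the four fresh members, leaving out the stale sdb3
mdadm --assemble --force /dev/md0 /dev/sda3 /dev/sdc3 /dev/sdd3 /dev/sde3
# 3. if md0 comes up (degraded), add sdb3 back so it rebuilds from parity
mdadm --add /dev/md0 /dev/sdb3
# 4. watch the resync progress
cat /proc/mdstat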