LinuxQuestions.org > Linux - Server (https://www.linuxquestions.org/questions/linux-server-73/)
Can not get RAID set to start (https://www.linuxquestions.org/questions/linux-server-73/can-not-get-raid-set-to-start-4175438312/)

netfoot 11-22-2012 12:56 PM

Can not get RAID set to start
 
After a reboot, my RAID set won't start. Here is /proc/mdstat:

Code:

Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md0 : inactive sdc3[2] sdd3[3] sda3[0] sde3[4]
      7781050112 blocks
     
unused devices: <none>

There are five drives in the array. /dev/sdb3 is not shown above. All drives are partitioned exactly the same:

Code:

# sfdisk -l /dev/sdb

Disk /dev/sdb: 243201 cylinders, 255 heads, 63 sectors/track
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

  Device Boot Start    End  #cyls    #blocks  Id  System
/dev/sdb1          0+    30      31-    248976  82  Linux swap
/dev/sdb2        31    1026    996    8000370  83  Linux
/dev/sdb3  *  1027  243200  242174  1945262655  fd  Linux raid autodetect
/dev/sdb4          0      -      0          0    0  Empty
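
One quick way to verify the tables really are identical is to diff each drive's sfdisk dump against /dev/sda's, with the device names normalized:

Code:

# diff each drive's partition dump against /dev/sda's (bash, run as root)
for d in b c d e; do
    diff <(sfdisk -d /dev/sda | sed 's/sda/sdX/g') \
         <(sfdisk -d /dev/sd$d | sed "s/sd$d/sdX/g") \
        && echo "sd$d matches sda"
done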

All swap partitions (including /dev/sdb1) appear to be functional, and I can mount, read, and write /dev/sdb2 without problems. Querying the drives gives me:

Code:

# mdadm --misc -Q /dev/sda3
/dev/sda3: is not an md array
/dev/sda3: device 0 in 5 device active raid5 /dev/md0.  Use mdadm --examine for more detail.
# mdadm --misc -Q /dev/sdb3
/dev/sdb3: is not an md array
/dev/sdb3: device 1 in 5 device mismatch raid5 /dev/md0.  Use mdadm --examine for more detail.

Examining the drives gives conflicting results for /dev/sdb3 as opposed to the other four drives. Here, for example, is /dev/sda3, which shows four active/working drives and one failed:

Code:

# mdadm --misc --examine /dev/sda3
/dev/sda3:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 4e77808f:197dbdcf:413393e8:3b8beff3
  Creation Time : Sat May 22 18:38:19 2010
     Raid Level : raid5
  Used Dev Size : 1945262528 (1855.15 GiB 1991.95 GB)
     Array Size : 7781050112 (7420.59 GiB 7967.80 GB)
   Raid Devices : 5
  Total Devices : 4
Preferred Minor : 0

    Update Time : Wed Nov 21 17:16:58 2012
          State : active
 Active Devices : 4
Working Devices : 4
 Failed Devices : 1
  Spare Devices : 0
       Checksum : 9eb7a564 - correct
         Events : 4101971

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8        3        0      active sync   /dev/sda3

   0     0       8        3        0      active sync   /dev/sda3
   1     1       0        0        1      faulty removed
   2     2       8       35        2      active sync   /dev/sdc3
   3     3       8       51        3      active sync   /dev/sdd3
   4     4       8       67        4      active sync   /dev/sde3

Note the second drive is marked faulty/removed. Compare with /dev/sdb3:

Code:

# mdadm --misc --examine /dev/sdb3
/dev/sdb3:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 4e77808f:197dbdcf:413393e8:3b8beff3
  Creation Time : Sat May 22 18:38:19 2010
     Raid Level : raid5
  Used Dev Size : 1945262528 (1855.15 GiB 1991.95 GB)
     Array Size : 7781050112 (7420.59 GiB 7967.80 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0

    Update Time : Mon Jul 23 00:56:48 2012
          State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0
       Checksum : 9e27a008 - correct
         Events : 2588290

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       19        1      active sync   /dev/sdb3

   0     0       8        3        0      active sync   /dev/sda3
   1     1       8       19        1      active sync   /dev/sdb3
   2     2       8       35        2      active sync   /dev/sdc3
   3     3       8       51        3      active sync   /dev/sdd3
   4     4       8       67        4      active sync   /dev/sde3

Note five active/working drives with none failed, and all listed as active sync. Note also that /dev/sdb3's superblock has a much older Update Time (July vs. November) and a far lower event count (2588290 vs. 4101971), so its view of the array is months out of date. syslog shows:

Code:

Nov 21 19:54:16 triphod kernel: md: kicking non-fresh sdb3 from array!
Nov 21 19:54:16 triphod kernel: 2: w=1 pa=0 pr=5 m=1 a=2 r=5 op1=0 op2=0
Nov 21 19:54:16 triphod kernel: 3: w=2 pa=0 pr=5 m=1 a=2 r=5 op1=0 op2=0
Nov 21 19:54:16 triphod kernel: 0: w=3 pa=0 pr=5 m=1 a=2 r=5 op1=0 op2=0
Nov 21 19:54:16 triphod kernel: 4: w=4 pa=0 pr=5 m=1 a=2 r=5 op1=0 op2=0
Nov 21 19:54:16 triphod kernel: raid5: cannot start dirty degraded array for md0
Nov 21 19:54:16 triphod kernel: RAID5 conf printout:
Nov 21 19:54:16 triphod kernel:  --- rd:5 wd:4
Nov 21 19:54:16 triphod kernel:  disk 0, o:1, dev:sda3
Nov 21 19:54:16 triphod kernel:  disk 2, o:1, dev:sdc3
Nov 21 19:54:16 triphod kernel:  disk 3, o:1, dev:sdd3
Nov 21 19:54:16 triphod kernel:  disk 4, o:1, dev:sde3
Nov 21 19:54:16 triphod kernel: raid5: failed to run raid set md0
Nov 21 19:54:16 triphod kernel: md: pers->run() failed ...
Nov 21 19:54:16 triphod kernel: md: do_md_run() returned -5
Nov 21 19:54:16 triphod kernel: md: md0 still in use.
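
For completeness, here is a quick way to compare superblock freshness across all five members (the Update Time / Events mismatch stands out immediately):

Code:

# print the update time and event count for each RAID member partition
for d in /dev/sd[a-e]3; do
    echo "== $d =="
    mdadm --examine "$d" | grep -E 'Update Time|Events'
done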

I am at a loss to determine what went wrong and how to recover from the error. Any constructive suggestions would be greatly appreciated!

Ser Olmy 11-22-2012 02:22 PM

If you Google for "md: kicking non-fresh", you'll get lots of hits and quite a few suggestions.

It seems this can happen if the system goes through an unclean shutdown due to a power failure, a kernel panic, or the like: the kicked member is left with a stale superblock, which matches the older Update Time and lower event count on your /dev/sdb3. You will need to remove /dev/sdb3 from the array and then re-add it. Alternatively, you could stop the array and reassemble it.
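
Something along these lines should do it (untested; double-check the device names against your own setup before running anything):

Code:

# Option 1: re-add the kicked member, then force the array to run
mdadm /dev/md0 --add /dev/sdb3
mdadm /dev/md0 --run

# Option 2: stop the array and reassemble it; --force tells mdadm to
# proceed even though /dev/sdb3's metadata is out of date
mdadm --stop /dev/md0
mdadm --assemble --force /dev/md0 /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3 /dev/sde3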

netfoot 11-30-2012 02:13 PM

After googling as suggested, I got the array to run. Here's what I did:

Code:

# re-add the kicked partition (the array member is the partition, not the whole disk)
mdadm /dev/md0 --add /dev/sdb3
# start the array now that all five members are present
mdadm /dev/md0 --run

The array came up and resynced itself.
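
For anyone finding this later: resync progress can be watched while it runs, for example with:

Code:

# refresh the resync status every few seconds
watch -n 5 cat /proc/mdstat

# or query the array state directly
mdadm --detail /dev/md0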

