zanerock 05-28-2010 01:15 PM

mdadm phantom drive and info discrepancies
Server with 4 disk partitions in a RAID 5 array using md.

Yesterday the array failed with two devices showing as faulty. After rebooting from rescue, I was able to force the assembly and start the array and everything looks to be okay as far as the data goes, but when I run:


mdadm --examine /dev/sda3
I get (truncating to the interesting bits):


      Raid Level : raid5
  Raid Devices : 4

 Avail Dev Size : 1448195216 (690.55 GiB 741.48 GB)
    Array Size : 4344585216 (2071.66 GiB 2224.43 GB)
  Used Dev Size : 1448195072 (690.55 GiB 741.48 GB)

    Array Slot : 0 (0, 1, 2, failed, 3)
  Array State : Uuuu 1 failed



mdadm --detail /dev/md1


    Array Size : 2172292608 (2071.66 GiB 2224.43 GB)
  Used Dev Size : 1448195072 (1381.11 GiB 1482.95 GB)
  Raid Devices : 4
  Total Devices : 4

 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

Additionally, the 'slot' for devices a-d line up like this:

a - 0
b - 1
c - 2
d - 4 (!)

The first number for 'Array Size' from examine is twice as big as it should be, based on the output from detail and on comparison with a twinned server. And why do the 'Array State' and 'Array Slot' from examine indicate a 5th device that isn't shown anywhere else?

And how do I fix this?

tvynrylan 06-01-2010 07:14 PM

The command


mdadm --manage /dev/md1 --remove failed
may be helpful. As for the specific index that a drive gets assigned, I wouldn't worry too much about it. There are only four devices in your array; the fourth device was assigned index 4 when it was added for whatever reason (quite possibly because the RAID manager already thought that the same device was at index 3 or something).
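
To see which slot each member's own superblock records, a small loop over --examine might help. This is only a sketch: the helper names are made up, and it assumes an 'Events' line is present in your --examine output alongside the 'Array Slot' and 'Array State' lines you posted.

```shell
# Keep only the fields that are interesting when comparing members.
# 'Events' is an assumption; the other two appear in the output above.
slot_fields() {
    grep -E 'Array Slot|Array State|Events'
}

# Print the filtered --examine output for every device given as an argument.
examine_members() {
    for dev in "$@"; do
        echo "== $dev"
        mdadm --examine "$dev" | slot_fields
    done
}
```

Run as root, something like "examine_members /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3" would do it. Only /dev/sda3 appears in your post, so the other three names are guesses; substitute your real members.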

Remember that --examine (IIRC) is going to give you the RAID metadata from one device while --detail gives you what the RAID manager thinks. The fact that there's a discrepancy bothers me too, but I'm not sure what you would do about that short of forcing the rewrite of the RAID's metadata. Perhaps the remove command above will help?
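
For what it's worth, the 'Array Size' mismatch may just be a difference in units. Assuming --examine prints sizes in 512-byte sectors for this metadata version and --detail prints them in KiB (an assumption on my part, though the GiB figures in parentheses are identical), the numbers you posted agree. A quick check, with variable and function names made up for illustration:

```shell
# Assumption: --examine counts 512-byte sectors, --detail counts KiB,
# so two sectors make one KiB.
sectors_to_kib() {
    echo $(( $1 / 2 ))
}

examine_array_size=4344585216   # 'Array Size' from mdadm --examine /dev/sda3
detail_array_size=2172292608    # 'Array Size' from mdadm --detail /dev/md1
used_dev_size=1448195072        # 'Used Dev Size' (same number in both outputs)
raid_devices=4

if [ "$(sectors_to_kib "$examine_array_size")" -eq "$detail_array_size" ]; then
    echo "Array Size agrees once the units are converted"
fi

# RAID 5 keeps one device's worth of parity, so capacity is (n - 1) * per-device size.
echo $(( (raid_devices - 1) * used_dev_size ))   # 4344585216, the --examine figure
```

If that reading is right, the odd one out is the 'Used Dev Size' line from --detail: it shows the same raw number as --examine but doubles the GiB figure (1381.11 instead of 690.55), which looks like a sector count being displayed as KiB rather than a problem with the array itself.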

zanerock 06-02-2010 12:34 AM

Good suggestion, but no luck. I've been considering updating the device superblock/metadata and will likely do so this weekend.

