help with mdadm disk failure (https://www.linuxquestions.org/questions/linux-server-73/help-with-mdadm-disk-failure-645336/)

ufmale 05-28-2008 10:06 AM

help with mdadm disk failure
 
I previously created a RAID 5 array from 4 disks using mdadm (sd[e-h]1).
After using it for a while, one of the disks failed (mdadm reported the failure), and the disk disappeared from the output of "fdisk -l".
My OS is Red Hat EL4, kernel 2.6.9-22.

I restarted the machine and ran fdisk again; no surprise, the disk is back.

$ fdisk -l
Code:

Disk /dev/sde: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

  Device Boot      Start        End      Blocks  Id  System
/dev/sde1              1      60801  488384001  fd  Linux raid autodetect

Disk /dev/sdf: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

  Device Boot      Start        End      Blocks  Id  System
/dev/sdf1              1      60801  488384001  fd  Linux raid autodetect

Disk /dev/sdg: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

  Device Boot      Start        End      Blocks  Id  System
/dev/sdg1              1      60801  488384001  fd  Linux raid autodetect

Disk /dev/sdh: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

  Device Boot      Start        End      Blocks  Id  System
/dev/sdh1              1      60801  488384001  fd  Linux raid autodetect

I then tried to restart the RAID, but it failed.
I am sure that the drive is good, but somehow mdadm does not like it. How do I fix this problem?

[root@ss1 /]# mdadm --assemble /dev/md1 /dev/sd[e-h]1
Code:

mdadm: /dev/md1 assembled from 2 drives and 1 spare - not enough to start the array.
[root@ss1 /]# cat /proc/mdstat
Code:

Personalities :
md1 : inactive sde1[0] sdf1[4] sdh1[3] sdg1[2]
      1953535744 blocks
unused devices: <none>


pinniped 05-28-2008 06:26 PM

The message is "not enough to start the array", which presumably means the drive that failed held unique data. You will need to rebuild the array and restore from backup.

You can 'add' a device to an array to replace a faulty one, but if that drive held unique data (such as when you do plain striping), the operation really is impossible - you need to wipe and rebuild.
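
For completeness, when the array can still run degraded, swapping in a replacement member goes something like this (using /dev/sdf1 as a stand-in for whichever member actually failed):
Code:

# mark the bad member failed, then pull it out of the array
mdadm /dev/md1 --fail /dev/sdf1
mdadm /dev/md1 --remove /dev/sdf1
# add the replacement; md rebuilds its contents from parity
mdadm /dev/md1 --add /dev/sdf1
# watch the resync progress
cat /proc/mdstat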

tmwsiy2012 05-29-2008 08:59 AM

I would interpret that message differently.

In a RAID 5 array you will need a minimum of three active disks.

Your md1, assembled from 2 drives and a spare, does not meet this minimum requirement. You would need 3 drives and a spare, or 4 drives, to have a valid RAID 5.

I would try to investigate why mdadm is only seeing three of the four drives.
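
Something like this should show what mdadm thinks of each member (assuming the members are still /dev/sd[e-h]1):
Code:

# compare the superblocks; look for mismatched "Events" counts
# or a member flagged faulty/spare
mdadm --examine /dev/sd[e-h]1
# the kernel log usually records why a member was kicked out
dmesg | grep -iE 'md|raid'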

You should be fine any way you slice it if you only lost one drive in RAID 5; after all, that is the point of RAID 5: the parity written across the other drives is enough to rebuild a single drive's worth of data.
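
If three of the four members really are consistent, a forced assembly will usually bring the array up degraded (use with care; device names assumed from your post):
Code:

# ignore the stale event count and start the array degraded
mdadm --assemble --force /dev/md1 /dev/sd[e-h]1
# once it is running, re-add the remaining disk so it resyncs
mdadm /dev/md1 --add /dev/sdf1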

Anyway just putting in my 2 cents.

Please post back with any developments :)

ufmale 05-29-2008 09:59 AM

I am not sure either why mdadm sees only 3 drives. I checked each drive by reformatting them one at a time, then mounting each successfully.
I re-created the RAID 5 and it works now.
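
For reference, I re-created it with something like this (exact options from memory):
Code:

# WARNING: --create writes new superblocks and starts a full
# resync; any data previously on the array is gone
mdadm --create /dev/md1 --level=5 --raid-devices=4 /dev/sd[e-h]1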

The problem has happened many times now, and each time I have had to re-create the RAID 5 with mdadm. I am not sure if the problem is the disk itself.
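
Next time I will check the drives' SMART status first to see whether a disk is actually dying (this assumes smartmontools is installed):
Code:

# show overall health plus error/reallocated-sector counters
smartctl -a /dev/sde
# or run a short self-test, then re-check with -a afterwards
smartctl -t short /dev/sde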

