Marjonel Montejo 09-30-2009 08:16 PM

Software RAID (mdadm) - RAID 0 returns incorrect status for disk failure/disk removed
Problem Description: For RAID 0, /proc/mdstat is not updated when a drive fails or is removed. Running mdadm --detail /dev/md0 also shows stale information, as if no drive had failed or been removed.

I have also tested drive failure and drive removal on RAID 5, and there both /proc/mdstat and mdadm --detail return the expected result (/proc/mdstat shows that the drive failed).
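
For reference, the RAID 5 check described above can be sketched in Python. This is only a minimal sketch: it assumes the usual /proc/mdstat layout, where a failed member is tagged with (F) after its slot number. The sample text and device names below are made up for illustration, and parse_mdstat is a hypothetical helper, not part of mdadm.

```python
import re

# One member entry such as "sdc1[2]" or "sdc1[2](F)".
MEMBER_RE = re.compile(r"^(?P<dev>[^\[\s]+)\[(?P<slot>\d+)\](?P<flags>(?:\([A-Z]\))*)$")

# Illustrative sample only (hypothetical devices), not output from a real system.
SAMPLE = """\
Personalities : [raid0] [raid6] [raid5] [raid4]
md0 : active raid5 sdd1[3] sdc1[2](F) sdb1[1] sda1[0]
      2930279808 blocks level 5, 64k chunk, algorithm 2 [4/3] [UU_U]
unused devices: <none>
"""

def parse_mdstat(text):
    """Return {array: {"level": ..., "members": [...], "failed": [...]}}."""
    arrays = {}
    for line in text.splitlines():
        head, sep, rest = line.partition(" : ")
        # Only lines of the form "mdX : ..." describe an array.
        if not sep or not head.startswith("md"):
            continue
        level, members, failed = None, [], []
        for token in rest.split():
            m = MEMBER_RE.match(token)
            if m:
                members.append(m.group("dev"))
                # "(F)" marks a member the kernel has flagged as faulty.
                if "(F)" in m.group("flags"):
                    failed.append(m.group("dev"))
            elif level is None and token.startswith(("raid", "linear")):
                level = token
        arrays[head.strip()] = {"level": level, "members": members, "failed": failed}
    return arrays

if __name__ == "__main__":
    for name, info in parse_mdstat(SAMPLE).items():
        print(name, info["level"], "failed:", info["failed"])
```

On a live system the text would come from reading /proc/mdstat instead of SAMPLE. For a RAID 0 array the failed list would stay empty, which is exactly the behavior reported here.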

I would like to ask why drive failure/drive removal does not behave as expected in the RAID 0 case.

Ideas deeply appreciated.

i_grok 09-30-2009 08:27 PM

My suspicion is that this is because a RAID0 has no redundancy and cannot tolerate any failures.

When one disk of a RAID1 or RAID5 fails, the remaining drives continue operating normally. But as soon as one drive of a RAID0 fails, Linux can no longer perform any operations on the array.

If Linux does nothing, instead of marking the drive as failed, then there is a chance you can repair/replace/recover the failed drive and restart the array.
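
To illustrate why losing one member takes out the whole RAID0, here is a rough Python sketch of round-robin striping. It assumes equal-sized members and a simplified layout; chunk_location and disks_touched are hypothetical names for illustration, not anything from the md driver.

```python
def chunk_location(logical_chunk, n_disks):
    """Simplified RAID 0 layout: chunks are dealt round-robin across members.

    Returns (disk_index, chunk_offset_on_that_disk).
    """
    return logical_chunk % n_disks, logical_chunk // n_disks

def disks_touched(first_chunk, n_chunks, n_disks):
    """Which member disks hold any part of a run of consecutive chunks."""
    return sorted({chunk_location(c, n_disks)[0]
                   for c in range(first_chunk, first_chunk + n_chunks)})

if __name__ == "__main__":
    # Any run of at least n_disks chunks lands on every member.
    print(disks_touched(5, 3, 3))
```

With no parity or mirror copy, any data spanning at least as many chunks as there are members depends on every disk, so there is no degraded state for the array to fall back to.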

Marjonel Montejo 09-30-2009 10:23 PM

Thanks for the reply.

If this is the case, how can the user check which drive failed, or confirm the status of a RAID 0 array when one of its drives fails? Is there a way to have a DeviceDisappeared or Fail event sent through mdadm --monitor --mail when a RAID 0 array fails, or a way to have /proc/mdstat updated?
I am using mdadm v2.6.7.1 (Ubuntu distribution).

i_grok 10-01-2009 09:20 AM

Yes, you're exactly right. Running mdadm in monitor mode should produce a DeviceDisappeared event. See this section of the man page:

Marjonel Montejo 10-04-2009 06:15 PM

Thanks. Got the DeviceDisappeared working.
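
For anyone finding this later, a minimal sketch of an event handler in Python. It assumes mdadm --monitor is started with --program, which runs the given program with the event name, the md device, and sometimes a third argument. As far as I understand the man page, for a RAID 0 array that third argument is Wrong-Level on a DeviceDisappeared event, but treat that as an assumption. format_alert is a hypothetical name.

```python
import sys

def format_alert(argv):
    """Build a one-line message from the arguments passed by mdadm --program.

    argv is expected to be [EVENT, MD_DEVICE] or [EVENT, MD_DEVICE, DETAIL].
    """
    if len(argv) < 2:
        raise ValueError("expected: EVENT MD_DEVICE [DETAIL]")
    event, md_device = argv[0], argv[1]
    detail = argv[2] if len(argv) > 2 else None
    msg = f"mdadm event {event} on {md_device}"
    if detail:
        msg += f" (detail: {detail})"
    return msg

if __name__ == "__main__":
    # Only act when called with the arguments mdadm is expected to supply.
    if len(sys.argv) >= 3:
        print(format_alert(sys.argv[1:]))
```

Saved as an executable script (the path is up to you), it could be hooked in with something like mdadm --monitor --scan --program /path/to/handler, and the printed line could be sent to syslog or mail instead.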

