mdadm raid6 active despite 3 drive failures
I am currently having problems with my RAID partition. First, two disks (sde, sdf) were having trouble. Through smartctl I noticed there were some bad blocks, so I first set them to failed and then re-added them so that the RAID array would overwrite those blocks.
Since that didn't work, I went ahead and replaced the disks. The recovery process was slow, so I left things running overnight. This morning I found that another disk (sdb) had failed. Strangely enough, the array has not become inactive.
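For anyone following along, the fail/re-add cycle described above could look roughly like the sketch below. It only prints the command lines and does not run them. The array path /dev/md3 and the member partitions /dev/sde1 and /dev/sdf1 are assumptions taken from the mdstat output in this post, and the helper name readd_commands is made up for illustration. The exact commands actually used are not shown in this post.

```python
def readd_commands(array, member):
    """Build (but do not run) the mdadm manage-mode commands for one member."""
    return [
        ["mdadm", array, "--fail", member],    # mark the member as faulty
        ["mdadm", array, "--remove", member],  # detach it from the array
        ["mdadm", array, "--re-add", member],  # attach it again so md resyncs it
    ]

# Assumed device names, taken from the mdstat output quoted in this thread.
for member in ("/dev/sde1", "/dev/sdf1"):
    for cmd in readd_commands("/dev/md3", member):
        print(" ".join(cmd))
```

Printing instead of executing is deliberate: on an array that is already degraded, each of these commands should be checked by hand before it is run.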
md3 : active raid6 sdf1(S) sde1(S) sdak1 sdj1 sdk1 sdb1(F) sdan1 sdd1 sdc1 sdg1 sdi1 sdal1 sdam1 sdao1 sdh1
25395655168 blocks level 6, 64k chunk, algorithm 2 [15/12] [_UU__UUUUUUUUUU]
Does anyone have any recommendations on the steps to take to recover from or fix this problem? The disk is basically full, so I haven't written anything to it since the problem started.
Well, there's a good summary of RAID types here: https://secure.wikimedia.org/wikipedia/en/wiki/RAID. Basically, it says you only need 4 active disks to keep a RAID 6 running.
You seem to have 15(?) total disks, with 2 Syncing and one Failed; just replace the Failed one and continue.
Obviously, the less you use the RAID, the faster the syncs will complete.
When it's finished syncing, you need to do at least one of the following:
1. purge some space
2. add more disks
3. back up and replace with something else
I think that RAID6 has a fault tolerance of two partitions, which is why I'm so worried. With two disks turned into 'spares' and another one failing, I don't think there's any more tolerance for errors.
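To make that arithmetic concrete, here is a minimal Python sketch that parses the two mdstat lines quoted earlier in the thread and compares the number of missing slots with the two-member tolerance of RAID 6. The function name summarize and the constant RAID6_MAX_MISSING are made up for illustration, and the parsing only assumes the layout seen in that one output.

```python
import re

RAID6_MAX_MISSING = 2  # RAID 6 keeps two parity blocks per stripe

# The mdstat entry exactly as quoted in the question.
MDSTAT = """\
md3 : active raid6 sdf1(S) sde1(S) sdak1 sdj1 sdk1 sdb1(F) sdan1 sdd1 sdc1 sdg1 sdi1 sdal1 sdam1 sdao1 sdh1
25395655168 blocks level 6, 64k chunk, algorithm 2 [15/12] [_UU__UUUUUUUUUU]
"""

def summarize(mdstat_entry):
    """Extract flagged members and slot counts from one mdstat entry."""
    def flagged(flag):
        # Matches both "sdb1(F)" and the more usual "sdb1[5](F)" form.
        return re.findall(r"(\w+)(?:\[\d+\])?\(%s\)" % flag, mdstat_entry)

    status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", mdstat_entry)
    total, up = int(status.group(1)), int(status.group(2))
    missing = total - up
    return {
        "failed": flagged("F"),
        "spares": flagged("S"),
        "total": total,
        "up": up,
        "missing": missing,
        "beyond_tolerance": missing > RAID6_MAX_MISSING,
    }

info = summarize(MDSTAT)
print(info)
```

For the output quoted above, this reports 15 slots with 12 up, so three slots are missing against a tolerance of two. That matches the worry here: going by those numbers alone, there is no redundancy left, whatever the "active" label says.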