LinuxQuestions.org
Old 05-23-2008, 12:26 PM   #1
atarghe1
LQ Newbie
 
Registered: May 2008
Posts: 2

Rep: Reputation: 0
mdadm RAID 5 single drive failure


Last night we thought one of the drives in our 3-drive RAID 5 array, created using mdadm, had gone bad. Luckily the drive was okay. In the meantime, however, we spent a good amount of time trying to figure out how one would recover from a single drive failure in this situation using mdadm. We searched this forum and did not find any reports of someone successfully recovering from a single drive failure on an mdadm RAID 5. My question is: has anyone been successful in recovering from a single drive failure using an mdadm RAID 5? If so, what was the procedure? Unfortunately, this is an issue we never considered when creating our RAID using mdadm. If recovery is not truly possible, we will go with a hardware RAID. Thanks for your time!

-Andrew
 
Old 05-23-2008, 01:10 PM   #2
sayNoToDiamond
LQ Newbie
 
Registered: May 2008
Posts: 2

Rep: Reputation: 0
Sorry to hear about your experience Andrew!

Clearly, mdadm is not an actual RAID tool, as the RAID acronym contains the word 'redundant' and mdadm is incapable of recovering from an actual failure of a drive. All you can hope for is that the drive(s) that are marked as 'faulty' are not truly faulty ... mdadm only thinks they are. Then the trick is to force mdadm into deciding the disks aren't faulty after all and adding them back into the array.

In the event of an actual hardware failure, I would recommend pulling the faulty drives and replacing the remaining drives as well (they are sure to follow!). Sure, you lose your RAID, but at least you can be confident that your next mdadm RAID will have the best chance at a long, fruitful life before encountering any hiccups that render it completely useless.

Good luck!
 
Old 05-23-2008, 01:54 PM   #3
ufmale
Member
 
Registered: Feb 2007
Posts: 385

Rep: Reputation: 30
Quote:
Then the trick is to force mdadm into deciding the disks aren't faulty after all and adding them back into the array.
So if that happens, should we take the mdadm array down, put the drive back, and reassemble?
The last time it happened, I took the drive out and put it back in using mdadm --add, but mdadm did not pick it up; it still thought that the drive was bad.
 
Old 05-23-2008, 03:05 PM   #4
sayNoToDiamond
LQ Newbie
 
Registered: May 2008
Posts: 2

Rep: Reputation: 0
Quote:
Originally Posted by ufmale View Post
So if that happens, should we take the mdadm array down, put the drive back, and reassemble?
The last time it happened, I took the drive out and put it back in using mdadm --add, but mdadm did not pick it up; it still thought that the drive was bad.
First you should inspect the drive's md superblock using mdadm --examine to make sure that the superblocks match on each of the devices (even the supposedly faulty one).

If everything matches, what has worked for me in the past is using:

mdadm --assemble --force /dev/md0 <device 1> <device 2> ... <device N>

This should force mdadm to remove the faulty flag from the device. If the device is truly bad, then it still won't work, but if it was marked faulty due to power issues or a hardware controller failure then it should bring the RAID back up.
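
As a rough sketch of that check-then-force sequence (not a tested recipe): the array name /dev/md0 comes from the command above, but the member partitions /dev/sda2, /dev/sdb2 and /dev/sdc2 are only placeholders, and the run helper, the DRY_RUN variable and the force_assemble function are made up for illustration. Stopping the array first is my assumption for the case where it is still partially assembled. With DRY_RUN left at 1 the commands are only printed, so nothing touches the disks.

```shell
#!/bin/sh
# Sketch only: member names are placeholders, substitute your own.
DRY_RUN="${DRY_RUN:-1}"

# run: print the command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# force_assemble ARRAY MEMBER...
# Dumps each member's md superblock, then re-assembles the array with --force.
force_assemble() {
    array="$1"
    shift
    for dev in "$@"; do
        # Compare the array UUID and event counts across members by eye.
        run mdadm --examine "$dev"
    done
    # Assumption: a partially assembled array has to be stopped first.
    run mdadm --stop "$array"
    run mdadm --assemble --force "$array" "$@"
}

# Dry run by default: prints the five commands without executing them.
force_assemble /dev/md0 /dev/sda2 /dev/sdb2 /dev/sdc2
```

Once the printed commands look right for your devices, the same script can be run as root with DRY_RUN=0.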
 
Old 06-03-2008, 06:18 PM   #5
tux68
LQ Newbie
 
Registered: Jun 2008
Posts: 1

Rep: Reputation: 2
Quote:
Originally Posted by atarghe1 View Post
... has anyone been successful in recovering from a single drive failure using an mdadm RAID 5? If so, what was the procedure?
Hi Andrew,

We recently went through the same thing here: a drive in our RAID 5 array was marked faulty, and further investigation revealed that the drive was really quite dead. Although Google wasn't very helpful, the procedure turned out to be rather straightforward in the end.

1. Use "mdadm --manage /dev/md0 -r /dev/sdd2" to remove the member that was marked as faulty from the array (the member is the partition listed in /proc/mdstat, here sdd2).

2. Power down and replace the drive with a good drive.

3. Power up and set the partition table on the new drive to match that of the other drives in the array. Here we used "sfdisk -d /dev/sda | sfdisk /dev/sdd".

4. Add the proper partition on the new drive into the array: "mdadm --manage /dev/md0 -a /dev/sdd2".

5. Sit back and wait for the recovery to happen. You can "cat /proc/mdstat" to watch its progress; you should see something like:

Personalities : [raid5]
md0 : active raid5 sdd2[4] sdc2[2] sdb2[1] sda2[0]
      731985408 blocks level 5, 256k chunk, algorithm 2 [4/3] [UUU_]
      [===>.................] recovery = 19.7% (48253056/243995136) finish=59.1min speed=55184K/sec
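
If you would rather have a one-line summary than read the whole file, something like the following might do. It is only a sketch: mdstat_summary is a made-up helper name, and it assumes the mdstat layout shown above (an "mdN :" header line, a status line ending in something like [UUU_], and an optional recovery or resync line). It prints the array name, the member status, and the recovery percentage, or "-" when no rebuild is running.

```shell
#!/bin/sh
# mdstat_summary: read /proc/mdstat-style text on stdin and print one line
# per md array: name, member status (e.g. UUU_), and rebuild percentage.
mdstat_summary() {
    awk '
        # Array header line, e.g. "md0 : active raid5 sdd2[4] ..."
        /^md[0-9]+ :/ { name = $1 }
        # Status line ends with the member map, e.g. "[4/3] [UUU_]"
        name != "" && /\[[U_]+\]/ {
            status = $NF
            gsub(/\[|\]/, "", status)
            state[name] = status
            order[++n] = name
        }
        # Progress line, e.g. "... recovery = 19.7% (...) finish=..."
        name != "" && /(recovery|resync) *=/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) pct[name] = $i
        }
        END {
            for (i = 1; i <= n; i++) {
                a = order[i]
                printf "%s %s %s\n", a, state[a], ((a in pct) ? pct[a] : "-")
            }
        }
    '
}

# Only read the real file when it exists (it is absent without md support).
if [ -r /proc/mdstat ]; then
    mdstat_summary < /proc/mdstat
fi
```

An underscore in the status column marks a missing or rebuilding member, so a healthy four-drive array would show UUUU.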


You can get more detailed instructions for Raid 1 here:

http://www.howtoforge.com/replacing_hard_disks_in_a_raid1_array

These are essentially the same steps that worked here for RAID 5.
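
For anyone who wants the numbered steps in one place, here is a hedged sketch of them as a script. The device names are the ones from this post and will differ on your machine; run, DRY_RUN and the two function names are helpers invented for illustration. It removes the RAID partition (sdd2, as listed in /proc/mdstat) rather than the whole disk, and it uses the long forms --remove and --add of the -r and -a options. DRY_RUN defaults to 1, so by default it only prints what it would do.

```shell
#!/bin/sh
# Sketch of the replacement procedure; device names are from this post.
ARRAY=/dev/md0
GOOD_DISK=/dev/sda   # healthy disk whose partition table gets copied
NEW_DISK=/dev/sdd    # replacement disk (its partition table is overwritten)
MEMBER=/dev/sdd2     # RAID partition on the failed / replacement disk
DRY_RUN="${DRY_RUN:-1}"

# run: print a command line; evaluate it only when DRY_RUN=0.
run() {
    echo "+ $1"
    if [ "$DRY_RUN" = "0" ]; then
        eval "$1"
    fi
}

# Before powering down: drop the failed member from the array.
# A member not yet marked faulty needs "mdadm --manage ARRAY --fail" first.
remove_failed() {
    run "mdadm --manage $ARRAY --remove $MEMBER"
}

# After swapping the drive: copy the partition layout, re-add, show progress.
rebuild_on_new_disk() {
    run "sfdisk -d $GOOD_DISK | sfdisk $NEW_DISK"
    run "mdadm --manage $ARRAY --add $MEMBER"
    run "cat /proc/mdstat"
}

case "${1:-plan}" in
    remove)  remove_failed ;;
    rebuild) rebuild_on_new_disk ;;
    plan)    # default: print the whole plan without executing anything
             DRY_RUN=1
             remove_failed
             echo "# power down, replace the drive, power up"
             rebuild_on_new_disk ;;
esac
```

Running it with no argument prints the plan; "remove" and "rebuild" with DRY_RUN=0 (as root) carry out the two halves on either side of the physical drive swap.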

Hope this helps,
Sean

Last edited by tux68; 06-03-2008 at 06:36 PM.
 
2 members found this post helpful.
Old 06-04-2008, 07:57 AM   #6
atarghe1
LQ Newbie
 
Registered: May 2008
Posts: 2

Original Poster
Rep: Reputation: 0
Much appreciated. That was exactly what I was interested in. Thanks for taking the time to post that information. I imagine that will help many others in this situation!

Last edited by atarghe1; 06-04-2008 at 07:59 AM.
 
Old 06-24-2008, 08:28 AM   #7
Chazz422
LQ Newbie
 
Registered: Mar 2004
Location: Philadelphia, PA
Distribution: Suse 12 (KDE)
Posts: 3

Rep: Reputation: 0
I had a drive in my 4-drive RAID 5 array fry a couple of nights ago. I searched Google and came across this helpful information, which also worked for me. Thanks for the detailed howto.
 
Old 12-14-2012, 06:20 PM   #8
TylerD75
Member
 
Registered: Aug 2004
Location: Norway
Distribution: Gentoo
Posts: 94

Rep: Reputation: 17
Thank You!

Thank you for this answer tux68!

Haven't had a drive "dropout" for ages, so I had forgotten how to do this...
This made my day!
 
  


All times are GMT -5.