LinuxQuestions.org
Old 06-09-2008, 01:10 AM   #1
rjstephens
LQ Newbie
 
Registered: Jun 2004
Posts: 11

Rep: Reputation: 0
recovering software raid - disk marked as failed


Hi

I have a 4-disk RAID5 software RAID array that I'm desperately trying to get my data off.

Of the 4 disks in the array, one is totally wrecked. Of the other three, mdadm picks them up but refuses to activate the array because one of them is marked as faulty.

I can't find any way to tell mdadm to ignore the faulty status of the disk and assemble the array anyway. The only thing I can see that MIGHT do it is an mdadm --build command, but that seems incredibly risky.

Any help would be greatly appreciated.

-Richard
 
Old 06-09-2008, 01:41 AM   #2
Vit77
Member
 
Registered: Jun 2008
Location: Toronto, Canada
Distribution: Mandriva, RHEL, Mageia, SuSE
Posts: 130

Rep: Reputation: 17
Mark the faulty disk as BAD: mdadm /dev/md0 --fault /dev/sda1
The data in the array should be accessible after that.

To remove the disk from RAID: mdadm /dev/md0 --remove failed

Be sure to use proper device names.
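A sketch for double-checking the device names before failing or removing anything (assuming the array is /dev/md0 and the bad member is /dev/sda1; substitute your own names):

```shell
# See which component devices the kernel thinks are in the array
cat /proc/mdstat

# Per-device state (active / faulty / removed) for the array
mdadm --detail /dev/md0

# Only then mark and remove the bad member, using the names shown above
mdadm /dev/md0 --fail /dev/sda1
mdadm /dev/md0 --remove /dev/sda1
```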
 
Old 06-09-2008, 01:56 AM   #3
rjstephens
LQ Newbie
 
Registered: Jun 2004
Posts: 11

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by Vit77 View Post
Mark the faulty disk as BAD: mdadm /dev/md0 --fault /dev/sda1
The data in the array should be accessible after that.

To remove the disk from RAID: mdadm /dev/md0 --remove failed

Be sure to use proper device names.
I don't see how that would help.
The wrecked disk is no longer in the array. The disk I'm having problems with is already marked as faulty. I've tried removing it from the array, but that gives me an error:
# mdadm /dev/md0 --remove /dev/sda1
mdadm: hot remove failed for /dev/sda1: No such device
#


even though the device appears in /proc/mdstat and i can see it is definitely there

# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : inactive hde1[1] sda1[4](F) sdb1[2]
488390080 blocks

unused devices: <none>
#

mdadm --run /dev/md0 fails as well:

# mdadm --run /dev/md0
[ 593.455078] raid5: device hde1 operational as raid disk 1
[ 593.455143] raid5: device sdb1 operational as raid disk 2
[ 593.455202] raid5: not enough operational devices for md0 (2/4 failed)
[ 593.455261] RAID5 conf printout:
[ 593.455315] --- rd:4 wd:2
[ 593.455370] disk 1, o:1, dev:hde1
[ 593.455425] disk 2, o:1, dev:sdb1
[ 593.455479] raid5: failed to run raid set md0
[ 593.455535] md: pers->run() failed ...
mdadm: failed to run array /dev/md0: Input/output error
#
 
Old 06-09-2008, 02:21 AM   #4
Vit77
Member
 
Registered: Jun 2008
Location: Toronto, Canada
Distribution: Mandriva, RHEL, Mageia, SuSE
Posts: 130

Rep: Reputation: 17
I'm sorry, but
Quote:
Originally Posted by rjstephens View Post
2/4 failed
#
You might try adding the failed drives back one at a time, but I'm afraid it won't succeed.

RAID5 can't run with only 2 of its 4 disks.

Last edited by Vit77; 06-09-2008 at 02:23 AM.
 
Old 06-09-2008, 02:39 AM   #5
rjstephens
LQ Newbie
 
Registered: Jun 2004
Posts: 11

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by Vit77 View Post
I'm sorry, but

You might try adding the failed drives back one at a time, but I'm afraid it won't succeed.

RAID5 doesn't work with 2 disks.
uhh

OK, of the 2 disks that are not working, one is completely wrecked. The other works fine but is marked as failed for some reason.

How do I clear the failed state? Isn't it just a flag in the superblock?
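If it helps, the superblock can be dumped directly; a sketch, using the member names from /proc/mdstat above:

```shell
# Dump the md superblock of the member that claims to be faulty;
# the "State" line and "Events" counter are what mdadm goes by
mdadm --examine /dev/sda1

# Compare against a known-good member; --assemble --force can
# adjust superblocks when the event counts are close
mdadm --examine /dev/hde1
```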
 
Old 06-09-2008, 02:46 AM   #6
Vit77
Member
 
Registered: Jun 2008
Location: Toronto, Canada
Distribution: Mandriva, RHEL, Mageia, SuSE
Posts: 130

Rep: Reputation: 17
##########

Last edited by Vit77; 06-09-2008 at 02:51 AM. Reason: sorry, wrong keys...
 
Old 06-09-2008, 03:02 AM   #7
jschiwal
Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 654Reputation: 654Reputation: 654Reputation: 654Reputation: 654Reputation: 654
Could you use the --force option with an mdadm assemble command?

Quote:
Originally Posted by mdadm manpage
MODES
mdadm has several major modes of operation:

Assemble
Assemble the parts of a previously created array into an active
array. Components can be explicitly given or can be searched
for. mdadm checks that the components do form a bona fide
array, and can, on request, fiddle superblock information so as
to assemble a faulty array.
 
Old 06-09-2008, 03:30 AM   #8
rjstephens
LQ Newbie
 
Registered: Jun 2004
Posts: 11

Original Poster
Rep: Reputation: 0
If I stop the array and try as you suggest, here is what happens:


# mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdb1 /dev/hde1
[ 6126.057516] md: md0 stopped.
[ 6126.061879] md: bind<sdb1>
[ 6126.061966] md: bind<sda1>
[ 6126.062378] md: bind<hde1>
mdadm: /dev/md0 assembled from 2 drives - not enough to start the array.
# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : inactive hde1[1](S) sda1[4](S) sdb1[2](S)
732584256 blocks

unused devices: <none>
# mdadm --run /dev/md0
[ 6253.994781] raid5: device hde1 operational as raid disk 1
[ 6253.994845] raid5: device sdb1 operational as raid disk 2
[ 6253.994904] raid5: not enough operational devices for md0 (2/4 failed)
[ 6253.994964] RAID5 conf printout:
[ 6253.995018] --- rd:4 wd:2
[ 6253.995072] disk 1, o:1, dev:hde1
[ 6253.995127] disk 2, o:1, dev:sdb1
[ 6253.995182] raid5: failed to run raid set md0
[ 6253.995238] md: pers->run() failed ...
mdadm: failed to run array /dev/md0: Input/output error
# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : inactive hde1[1] sda1[4](F) sdb1[2]
488390080 blocks

unused devices: <none>
#
 
Old 06-10-2008, 02:02 AM   #9
jschiwal
Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 654Reputation: 654Reputation: 654Reputation: 654Reputation: 654Reputation: 654
All I can think of is running SpinRite on sda1. However, I don't know whether you have a hardware problem or the data on the drive is corrupted. I think the --force option should cause mdadm to attempt to use the faulty member, in case the drive data isn't actually bad.

Maybe the data really is corrupted.
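If the hardware is suspect, it may be safer to clone the drive before running any repair tool on it; a sketch using GNU ddrescue, assuming a spare target disk at /dev/sdc (adjust the device names to match your system):

```shell
# First pass: copy everything readable, skipping slow retries on bad areas
# -f forces writing to a block device, -n skips the scraping phase
ddrescue -f -n /dev/sda /dev/sdc rescue.log

# Second pass: go back and retry the bad areas a few times
ddrescue -f -r3 /dev/sda /dev/sdc rescue.log
```

You can then attempt the array recovery against the clone, leaving the original untouched.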

Last edited by jschiwal; 06-10-2008 at 02:03 AM.
 
Old 06-10-2008, 03:29 AM   #10
rjstephens
LQ Newbie
 
Registered: Jun 2004
Posts: 11

Original Poster
Rep: Reputation: 0
OK, I'll try SpinRite then. Thanks for the advice.

-Richard
 
  

