Linux - Server
This forum is for the discussion of Linux software used in a server-related context.
I have a similar problem: a RAID 5 array with 11 drives, one of which (sdh) developed problems. I tried to rebuild, but sdb failed midway and my array was degraded.
I followed some advice to recover the data using mdadm -C /dev/md0 /dev/sd[efghiabcdkj]1, both from the command line and through Webmin, but the original drive order (sde[0], sdf[2], sdg[3].... sdk[9], sdj[10]) was messed up and the array was reordered as sda[0]... sdk[10]. When I tried mounting it, I received a "VFS: ext3 file system not found" error...
I've tried for a week now to recover the data (personal data I've saved over the last 20 years and work data I've spent the last 2 years on), but to no avail. Any help is greatly appreciated. Thanks in advance.
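A likely reason for the reordering (an inference, not something stated in the post): the shell expands a bracket pattern such as sd[efghiabcdkj]1 into the matching names in sorted order, so the order of the letters inside the brackets never reaches mdadm. A minimal sketch using empty stand-in files in a scratch directory; the glob_order helper is made up for the demonstration and no real devices are touched:

```shell
#!/bin/sh
# Create empty stand-in files named like the partitions in the post.
demo=$(mktemp -d)
for l in a b c d e f g h i j k; do
    : > "$demo/sd${l}1"
done

# glob_order (hypothetical helper): expand the same bracket pattern that
# was passed to mdadm and print the result in the order the shell yields.
glob_order() {
    ( cd "$demo" && echo sd[efghiabcdkj]1 )
}

glob_order
```

The names come out in alphabetical order, sda1 through sdk1, whatever order the letters have inside the brackets. To hand mdadm a specific order, each device would have to be written out explicitly.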
Again, I've had the same problem. I wrote a script to try recreating the array with every possible ordering of the drives. After each creation, I tried to mount the drive as ext3. When the mount succeeds, you've got a working array.
Sorry I don't have the script to show you but it's fairly straightforward. Create a drive order string by cycling through the various drives (including "missing") in each position, skipping drives that are already in the string. Create the array using the string and try to mount the device as ext3. Eventually you should get one that works.
With 11 drives (12 when you include "missing"), a brute force approach is going to be tedious but it should eventually work.
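To put a number on "tedious", here is a small sketch that counts the orderings a brute-force search would have to try, assuming every arrangement of k slots drawn from n candidates is attempted. The count_orderings name is made up for illustration:

```shell
#!/bin/sh
# count_orderings N K: ordered ways to fill K slots from N candidates,
# i.e. N * (N-1) * ... * (N-K+1). Hypothetical helper for illustration.
count_orderings() {
    n=$1; k=$2; total=1; i=0
    while [ "$i" -lt "$k" ]; do
        total=$(( total * (n - i) ))
        i=$(( i + 1 ))
    done
    echo "$total"
}

count_orderings 6 5     # 5 slots from 5 drives plus "missing"
count_orderings 11 11   # 11 slots, all 11 drives present
count_orderings 12 11   # 11 slots from 11 drives plus "missing"
```

This prints 720, 39916800 and 479001600. A 5-drive search is small, but for 11 drives it pays to cut the search down first, for example by fixing the positions of any drives whose original slot is already known.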
With 11 drives, however, you should consider hot spares and/or RAID 6. Or replace all of your drives with larger ones to reduce the number of drives. The more drives in a RAID array, the greater the heat problems and the probability of drive failure.
OK, here's the script I have used successfully more than once. It can easily be modified for more or fewer drives. Note the nested for loops, which need to be adjusted to match the number of drives you actually have. I've just had a need to use this script once again, so I know it works. It took less than a minute to salvage 2T of broken RAID5 array.
Some notes:
- each for loop after the first contains an if to prevent duplicate partitions from being used in a --create. If you have a lot of partitions in your array, this can get very awkward.
- you need a ./yes file containing the letter "Y" to feed the mdadm --create command, or you get asked for a lot of confirmations.
- the mdadm --stop command doesn't always work, so I've put it in a loop. The array must be stopped between attempts!
- I've put missing at the end of the drive list so that it will try to reconstruct using all the drives if possible
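Code:
#!/bin/bash
# Placeholder values: set these to match your own system before running.
# "missing" goes last so that orderings using every real drive are tried first.
drives="/dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 missing"
mddevice=/dev/md0
mountpoint=/mnt/recovery

for a in $drives; do
  for b in $drives; do
    if [ "$b" != "$a" ]; then
      for c in $drives; do
        if [ "$c" != "$a" -a "$c" != "$b" ]; then
          for d in $drives; do
            if [ "$d" != "$a" -a "$d" != "$b" -a "$d" != "$c" ]; then
              for e in $drives; do
                if [ "$e" != "$a" -a "$e" != "$b" -a "$e" != "$c" -a "$e" != "$d" ]; then
                  echo "attempting to create $mddevice with $a $b $c $d $e"
                  # ./yes answers the confirmation prompts from mdadm --create
                  if mdadm --create "$mddevice" --level=5 --raid-devices=5 $a $b $c $d $e < ./yes; then
                    # a successful ext3 mount means this ordering is the right one
                    if mount -t ext3 "$mddevice" "$mountpoint"; then
                      exit
                    fi
                  fi
                  # the array must be stopped before the next attempt
                  while ! mdadm --stop "$mddevice"; do
                    sleep 0.25
                  done
                fi
              done
            fi
          done
        fi
      done
    fi
  done
done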
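The nesting above grows by one level per drive. As a rough alternative for larger arrays, the same search can be written recursively. The sketch below is a dry run: try_order only prints the mdadm command it would run. The permute and try_order names and the device list are placeholders, not part of the original script.

```shell
#!/bin/sh
# Placeholder values for the dry run; adjust to your own system.
mddevice=/dev/md0
drives="/dev/sda1 /dev/sdb1 /dev/sdc1 missing"

# try_order DEV...: called once per ordering. This version only prints
# the command; replace the echo with the real create, mount and stop
# steps once the printed output looks right.
try_order() {
    echo "mdadm --create $mddevice --level=5 --raid-devices=$# $*"
}

# permute SLOTS CHOSEN: fill SLOTS more positions, never reusing an
# entry that is already in the space-separated CHOSEN list.
permute() {
    if [ "$1" -eq 0 ]; then
        try_order $2          # unquoted on purpose: split into devices
        return
    fi
    for dev in $drives; do
        case " $2 " in
            *" $dev "*) continue ;;   # already used in this ordering
        esac
        permute $(( $1 - 1 )) "$2 $dev"
    done
}

permute 3 ""    # 3-device array: tries each of the 24 orderings
```

Because the number of slots is an argument, the same function covers any array size without adding loops by hand.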
Last edited by garydale; 10-27-2010 at 11:12 PM.
Reason: corrected typo
(I've successfully recovered from a 2 disk failure in a software RAID5 array of 7 drives without losing much data, so it's certainly possible)
It's also *much easier* to pull the disks out of a machine and drop them into an entirely different system running a different Linux distribution and even a different architecture, e.g. PPC to x86 or Sparc. Doing that with a hardware RAID card can cause driver issues and all sorts of other problems.
Software RAID is incredibly flexible.
mdadm and fsck are usually all you'll need to recover from any sort of Software RAID5 issue in Linux. You will lose data, but depending on the amount of activity on the filesystem it can be surprisingly little.
It's possible? That gives me a bit of hope. I know this thread is old, but it's the first that came up on google. So maybe somebody can still help me.
Short story: I had a 3-disk RAID5 array, mdadm software RAID. Two of the disks got their first 13.5 GB overwritten; the third one is fine.
Is there any way to recover the data? The disks are all 1 TB, so even though the first 13.5 GB has gotten overwritten, the other 986.5 GB of data could be OK.
I've seen wonders done with RAID5 before, how would I go about making this miracle come to be?
Recovering data is a two-step process, although I'd advise you to work from copies of the original drives in all cases. You can get 1T drives for under $100, so how much is your data worth?
First, get the RAID array back up and running.
Second, use standard data recovery techniques to recover your files. This can vary depending on the file system you are using and the exact damage to the file system. The fact that you are dealing with a RAID5 array is not really relevant unless you were going to try a hardware recovery service.
In your case, overwriting 13.5G is pretty severe damage. I don't think it will be easy. Good luck.
Yep, very true. Making copies is invaluable. I wish I'd had the patience to wait for the replacement drives to arrive instead of starting the recovery process without making backups =(
Okay, it was fixed, and all I lost was some large, old files that could be easily restored from backup.
The way to go about this was to recreate the RAID array with the --assume-clean option, using the one good disk and one overwritten disk.
Then add the third disk to the array, and it will start to resync. This in itself is not enough, because a large part of the data is missing. However, since there was an LVM physical volume (PV) on this array, that information could be retrieved. Using the backup copies of the LVM information, I was able to recreate the exact same PV again.
Ran pvscan, vgscan and lvscan, and my volumes were mountable again. Ran an XFS check on the mounted volume; it found about 13GB of 'problems', damaged video files mostly, but those could be restored from backup.
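The sequence described above might look roughly like the following. This is a dry-run sketch that only prints commands, not the poster's exact invocation. Every device name, the volume group name, the PV UUID and the backup path are placeholders, and the slot order given to mdadm --create (including where "missing" goes) is an assumption that has to match the original array.

```shell
#!/bin/sh
# Dry run: print the recovery commands instead of executing them.
# All values below are placeholders for illustration.
md=/dev/md0
good=/dev/sdX1                    # the intact member
damaged=/dev/sdY1                 # one of the overwritten members
third=/dev/sdZ1                   # the other overwritten member
vg=myvg                           # volume group name
pv_uuid=PV-UUID-FROM-BACKUP-FILE  # UUID recorded in the LVM backup
lvm_backup=/etc/lvm/backup/$vg    # assumed location of the backup copy

recovery_plan() {
    # 1. Recreate the array degraded, without triggering a resync.
    echo "mdadm --create $md --level=5 --raid-devices=3 --assume-clean $good $damaged missing"
    # 2. Add the third disk; md rebuilds it from the other two.
    echo "mdadm --manage $md --add $third"
    # 3. Recreate the PV with its old UUID, then restore the VG metadata.
    echo "pvcreate --uuid $pv_uuid --restorefile $lvm_backup $md"
    echo "vgcfgrestore -f $lvm_backup $vg"
    # 4. Rescan and activate the volumes.
    echo "pvscan"
    echo "vgscan"
    echo "vgchange -ay $vg"
    echo "lvscan"
}

recovery_plan
```

Printing the plan first makes it possible to review every device name before anything is written to a disk.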
All in all, the recovery was speedy, successful and unexpected. Thanks for the help.
However, I do now have, once more, a newfound respect for backups. If I hadn't backed up the data, the loss would have been significant. Remember, people: RAID is only your first line of defense, and it's not a backup.
Thank you for the instructions.
I tried this on a RAID 5 with 3 disks in my little NAS. One drive was acting up, and I pulled out another, so technically the array was down to one disk for some amount of time.
It did not boot up correctly the next time.
So I made an image of each drive: dd if=/dev/... of=/dev/...
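A minimal sketch of that imaging step, with a verification pass added. Unlike the drive-to-drive copy above, it assumes the copy goes to an image file, so the byte-for-byte comparison is meaningful. The clone_image name and the example paths are made up. For a drive with read errors, a tool such as GNU ddrescue may cope better than plain dd.

```shell
#!/bin/sh
# clone_image SRC DST: copy SRC to the image file DST, then verify that
# the copy is byte-for-byte identical. Hypothetical helper; SRC can be a
# block device or a plain file.
clone_image() {
    dd if="$1" of="$2" bs=1048576 2>/dev/null || return 1
    cmp -s "$1" "$2"
}

# Example call (placeholder paths, commented out so nothing runs by accident):
# clone_image /dev/sdX /mnt/backup/sdX.img && echo "copy verified"
```

The function returns non-zero if either the copy or the comparison fails, so it can be chained with && before any recovery attempt touches the originals.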
Then I put my clone drives in the NAS, found the drives from a shell, and used the command posted earlier in the thread.
Code:
mdadm --create /dev/md0 --level=5 --raid-devices=3 --name=MyRAIDDisk0 /dev/scsi/c0b0t0u0 /dev/scsi/c1b0t0u0 /dev/scsi/c2b0t0u0
mdadm: /dev/scsi/c0b0t0u0 appears to be part of a raid array:
level=raid5 devices=3 ctime=Thu Feb 17 10:02:38 2011
mdadm: /dev/scsi/c1b0t0u0 appears to be part of a raid array:
level=raid5 devices=3 ctime=Thu Feb 17 10:02:38 2011
mdadm: /dev/scsi/c2b0t0u0 appears to be part of a raid array:
level=raid5 devices=3 ctime=Thu Feb 17 10:02:38 2011
Continue creating array? y
mdadm: array /dev/md0 started.
Well, this came out like the example. But the RAID did not reappear in the web admin. The volumes were not there either, but it was rebuilding the third disk.
I was not sure how to mount this, because I knew it had several partitions but they were not identified.
So I uploaded a little file from cgsecurity called testdisk_static and made it executable. I ran it and chose to analyze md0. This found the Linux LVM partition. So I said a prayer, then rebooted in the hope that the NAS mapper would find md0 again and read the LVM.