LinuxQuestions.org
Linux - Server This forum is for the discussion of Linux Software used in a server related context.

Old 10-29-2007, 09:23 AM   #1
auroraglacialis
LQ Newbie
 
Registered: Oct 2007
Posts: 9

Rep: Reputation: 0
RAID 6 failure - 3 (out of 7) disks failed but 2 of them are ok, recovery possible?


(edit: misleading title. should be "RAID 6 failure - 3 (out of 7) disks failed but 2 of them are ok, recovery possible?")

Hi.

I used to have a RAID 6 (created with mdadm) consisting of 7 HDDs (5+2). This summer, one disk died and I removed it. I figured that there were still (5+1) disks left, so one disk of redundancy remained, and I continued working with it until I could buy a new HDD. But then disaster struck and took out another drive; /proc/mdstat told me there were now only 5 drives, so I had no reserve.

I checked the disk that gave up last with SMART and it came out ok, so I added it back in the RAID as a new drive and it started syncing. At about 2% the PC froze; upon reboot only 3 drives were in the RAID. I checked the missing drives with mdadm -E and they were ok, superblocks and all. So I figured I had to manually add them to the RAID for some reason.

Now the big mistake was to use --add for the first device I tried to get back in the RAID; looking at the RAID info, it was added as "spare". Then strange things happened again with the PC and I checked for hardware errors. I found that 2 IDE controllers were not behaving well any longer, probably causing all the trouble.

Now to get the data back I tried to copy every HD that was at one time part of the RAID to image files (with dd if=/dev/hdx of=/mnt/backup/hdx), so I would only have to use the on-board controller. So now I have 6 files which are images of the 6 HDDs that were formerly installed: 4 of them were pretty much untouched, one was added as a new drive that started syncing while the RAID was still active, and one was added as spare while the RAID was inactive.
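Before experimenting on the images, it may be worth confirming that each one is a faithful copy of its disk. The following is only a minimal Python sketch of that idea, not something done in the post: the helper names sha256_of and image_matches are made up for illustration, and reading a raw device such as /dev/hdx normally requires root.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file or block device, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so a whole-disk image never has to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_matches(source_path, image_path):
    """True when the image is byte-for-byte identical to its source."""
    return sha256_of(source_path) == sha256_of(image_path)
```

With the paths from the dd command above, a check would look like image_matches("/dev/hdx", "/mnt/backup/hdx"). A mismatch, or a read error on the source, would suggest the copy was taken while the controller was still misbehaving.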

Trying to simply assemble the RAID from these images fails with an I/O error; assembling read-only gives no valid filesystem.

I figure that the data should well be there, since one drive was only 2% written on (the rest should still contain the old data from a time it was still in the RAID) and one drive was just added as spare (it was not written at all unless adding it as a spare deletes the contents). And I basically need only one of them to have 5 valid disks to start the RAID.
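The arithmetic behind "I need only one of them" can be sketched in a few lines of Python. The image labels (disk1 to disk4, diskA, diskB) are placeholders I made up for the six image files, and the sketch only counts members. It says nothing about whether mdadm will accept a given set, which depends on the superblocks.

```python
from itertools import combinations

TOTAL_MEMBERS = 7       # 5 data + 2 parity, as described in the post
PARITY_MEMBERS = 2      # RAID 6 tolerates two missing members
MIN_MEMBERS = TOTAL_MEMBERS - PARITY_MEMBERS

# Hypothetical labels for the six images and the state each one is in.
images = {
    "disk1": "untouched",
    "disk2": "untouched",
    "disk3": "untouched",
    "disk4": "untouched",
    "diskA": "resync started, about 2% overwritten",
    "diskB": "re-added as spare",
}


def candidate_sets(images, needed):
    """Yield every set of `needed` images that contains all untouched ones."""
    untouched = [name for name, state in images.items() if state == "untouched"]
    others = [name for name in images if name not in untouched]
    for extra in combinations(others, needed - len(untouched)):
        yield untouched + list(extra)


for members in candidate_sets(images, MIN_MEMBERS):
    print(members)
```

Under these assumptions there are exactly two candidate sets: the four untouched images plus either diskA or diskB.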

Now how can I recover at least some data? Somehow telling the RAID to put the "spare" back in the place where it was before dropping out? Recovering some data from the HDD that was in the RAID, dropped out and was added as a new HDD until it was about 2% in the resyncing process?

Hope someone can help me. I put some personal and business files on the RAID which are lost now.

Many thanks
Aurora

Last edited by auroraglacialis; 10-29-2007 at 04:54 PM. Reason: misleading title
 
Old 10-31-2007, 02:22 AM   #2
aylen
LQ Newbie
 
Registered: Oct 2007
Posts: 2

Rep: Reputation: 0
Trying various options at this critical stage would be dangerous. I think contacting RAID Recovery Services like Disk Doctors Labs Inc. would be the best solution for your problem.
 
Old 11-01-2007, 10:11 AM   #3
auroraglacialis
LQ Newbie
 
Registered: Oct 2007
Posts: 9

Original Poster
Rep: Reputation: 0
private solution required

Hi.
Since I am not a company and do not have the means to spend a lot of money on this, I'd prefer a private solution. I made image-copies of the original drives and kept the original drives, so I could try some recovery with the drive images and leave the original HDDs untouched. Of course read-only recovery is preferable though. I have backups of some of the data and some data is not critical (photoscans, Audio-CD copies, old Photoshop-Files), but about 5% of the data is not in backups and not recoverable from other sources (Photo printouts etc).

If the definite answer is: "The data is not recoverable at all or only by professionals for hundreds of " then that is a definite answer, too. I would have to start restoring the backups etc which means a lot of work and I will be missing some files, but at least I could free the HDDs and backup-HDDs and start filling them with data again.

Greetings
Aurora
 
Old 11-09-2007, 04:14 PM   #4
koflanagan
LQ Newbie
 
Registered: Mar 2005
Location: San Antonio
Posts: 20

Rep: Reputation: 0
Hmm 3 out of 7 disk are dead in a raid 6.. I would say is not recoverable..
 
Old 11-10-2007, 12:25 PM   #5
auroraglacialis
LQ Newbie
 
Registered: Oct 2007
Posts: 9

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by koflanagan View Post
Hmm 3 out of 7 disk are dead in a raid 6.. I would say is not recoverable..
No, actually only one disk is really dead. Two more, however, are somehow out-of-date. One was removed from the array and re-added as a new disk, but the recovery-process stopped at 2% due to a Controller failure. Another disk was also removed from the RAID and then re-added, but it was taken up as a "spare" disk, not re-integrated in the RAID, although nothing on the disk was changed (it was part of the array before and just got kicked out due to the same Controller failure).

So basically I started with a RAID with 7 disks, then
* one disk died and was removed.
* RAID had 6 drives left
* An unknown error kicked disk A out
* RAID had 5 drives left
* Disk A was checked with SMART and came up error-free
* Disk A was added to the RAID
* Array started "recovery" on Disk A and aborted at 2%
* After reboot, Disk A and Disk B were missing from the Array
* YUK - only 4 Drives in the Array
* Did mdadm --add on Disk B, effectively adding it as "spare"
* YUK now it's 4 drives and 1 spare although Disk B was the 5th Disk!
* Controller was determined to be the source of the problem
* Made a dd-Copy of all HDDs to experiment with recovery options
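
To double-check the drive counts in the list above, the sequence can be replayed as a small state table. This is only an illustrative Python sketch: the labels "dead" and "C" to "F" are made up for the disks the list does not name, and the state names (active, removed, rebuilding, spare) are my own shorthand rather than mdadm output.

```python
def replay(events, members):
    """Apply each event in order and record how many members are active after it."""
    state = {name: "active" for name in members}
    history = []
    for description, changes in events:
        state.update(changes)
        active = sum(1 for status in state.values() if status == "active")
        history.append((description, active))
    return state, history


# Seven members at the start; all labels except A and B are hypothetical.
members = ["dead", "A", "B", "C", "D", "E", "F"]

events = [
    ("one disk died and was removed", {"dead": "removed"}),
    ("unknown error kicked disk A out", {"A": "removed"}),
    ("disk A re-added, recovery aborted at 2%", {"A": "rebuilding"}),
    ("after reboot disks A and B were missing", {"A": "removed", "B": "removed"}),
    ("mdadm --add on disk B", {"B": "spare"}),
]

state, history = replay(events, members)
for description, active in history:
    print(f"{active} active after: {description}")
```

The replay ends with 4 active members and Disk B marked as spare, one member short of the 5 a 7-disk RAID 6 needs to run.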

So Disk B should still contain the data, unless adding it as spare (however that happened) deleted all the content.
And Disk A should at least contain 98% of the data that was in there before the recovery started.

It is not vital to recover all of the content, it would be ok to recover most of it.
 
  


All times are GMT -5.
