LinuxQuestions.org
Old 10-17-2009, 02:22 PM   #1
eRJe
Member
 
Registered: May 2005
Location: Netherlands
Distribution: Slackware 14.1 Kernel 3.12.1
Posts: 103

Rep: Reputation: 16
Recovering (tools) RAID5


Hi folks,

A few weeks ago, 2 of the 6 drives in my RAID5 array failed. It turned out that only one was really bad, and I could add the second drive back to the array (I marked the broken drive as failed). Unfortunately, I was not able to mount the array anymore. I'm getting the typical error: wrong fs type, bad option, bad superblock...

I examined the array and this is what I got:

Code:
mdadm -D /dev/md1

        Version : 00.90.03
  Creation Time : Wed Jul 29 09:26:08 2009
     Raid Level : raid5
     Array Size : 2441919680 (2328.80 GiB 2500.53 GB)
  Used Dev Size : 488383936 (465.76 GiB 500.11 GB)
   Raid Devices : 6
  Total Devices : 5
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Sat Oct 17 20:08:54 2009
          State : clean, degraded
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0

         Layout : Left-symmetric
     Chunk Size : 64K

           UUID : 7342878c:38a74bb:f463d13d:0af5195f
         Events : 0.50

Number   Major  Minor  RaidDevice  State
  0        8      33      0        active sync   /dev/sdc1
  1        8      17      1        active sync   /dev/sdb1
  2        8      49      2        active sync   /dev/sdd1
  3        8       1      3        active sync   /dev/sda1
  4        8      65      4        active sync   /dev/sde1
  5        0       0      5        removed
What caught my eye was the strange order of drives. I'm pretty sure I created the array in alphabetical order, i.e. sda, sdb, sdc, sdd, sde, sdf.
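
One way to read the slot order out of a listing like the one above without eyeballing it. This is a minimal sketch, assuming the `mdadm -D` output is piped in on stdin; the function name `slot_order` is made up for this example, and it only parses text without touching any device.

```shell
#!/bin/sh
# slot_order: hypothetical helper, reads `mdadm -D` output on stdin and
# prints the member devices sorted by RaidDevice slot (0, 1, 2, ...).
# Slots without a device (state "removed") are printed as "missing".
slot_order() {
    awk '
        # Device table rows: Number Major Minor RaidDevice State [device]
        # (some mdadm versions print "-" as the Number of a removed slot)
        $1 ~ /^[0-9-]+$/ && $4 ~ /^[0-9]+$/ {
            slot[$4 + 0] = ($NF ~ /^\/dev\//) ? $NF : "missing"
            if ($4 + 0 > max) max = $4 + 0
            rows++
        }
        END {
            if (!rows) exit 1
            for (i = 0; i <= max; i++)
                printf "%s%s", (i ? " " : ""), ((i in slot) ? slot[i] : "missing")
            print ""
        }'
}

# Usage (needs root, so it is left commented out here):
#   mdadm -D /dev/md1 | slot_order
```

The printed list is in the order the member devices would have to be given to `mdadm --create` if the array ever had to be re-created by hand.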

I had heard good things about TestDisk, so I gave it a try. TestDisk was able to find the directory structure on the array, which gave me some hope. However, going too deep into the directory tree caused TestDisk to exit with a fault.

Since I needed some extra storage anyway, I ordered four 1 TB disks, created a new 3 TB RAID5 array on them, and made 5 disk dumps. Now I can play with the images instead of the drives.

Code:
dd if=/dev/sda1 of=/mnt/backup/sda1.img bs=64k
dd if=/dev/sdb1 of=/mnt/backup/sdb1.img bs=64k
dd if=/dev/sdc1 of=/mnt/backup/sdc1.img bs=64k
dd if=/dev/sdd1 of=/mnt/backup/sdd1.img bs=64k
dd if=/dev/sde1 of=/mnt/backup/sde1.img bs=64k
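
Before experimenting on the images, it may be worth confirming that each one is a byte-for-byte copy of its source. A minimal sketch, assuming `cmp` is available; `verify_image` is a name invented here. Note that this reads every source partition a second time, which may not be wise on a disk that is already failing.

```shell
#!/bin/sh
# verify_image: hypothetical helper, compares a source (partition or file)
# with its image byte for byte and reports the result.
#   $1 = source, e.g. /dev/sda1      $2 = image, e.g. /mnt/backup/sda1.img
verify_image() {
    if cmp -s "$1" "$2"; then
        echo "OK   $2"
    else
        echo "DIFF $2"
        return 1
    fi
}

# Usage with the names from the post (needs read access to the partitions):
#   for d in sda1 sdb1 sdc1 sdd1 sde1; do
#       verify_image "/dev/$d" "/mnt/backup/$d.img"
#   done
```

If a source disk has unreadable sectors, plain `dd` stops at the first read error; `conv=noerror,sync` or GNU ddrescue are the usual ways around that.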
Then I did

Code:
losetup /dev/loop0 /mnt/backup/sda1.img
losetup /dev/loop1 /mnt/backup/sdb1.img
losetup /dev/loop2 /mnt/backup/sdc1.img
losetup /dev/loop3 /mnt/backup/sdd1.img
losetup /dev/loop4 /mnt/backup/sde1.img
And eventually

Code:
mdadm --create /dev/md2 --level=5 --raid-devices=6 --auto=part /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3 /dev/loop4 missing
Note the new order of disks, i.e. sda, sdb, sdc, sdd, sde.

I still couldn't mount the array; same error as before. To test the image array, I ran TestDisk on it. TestDisk couldn't find anything, so I recreated the image array, but this time I used the same order of disks as the original array, i.e. sdc, sdb, sdd, sda, sde. I ran TestDisk again and yes, there was the directory structure again.
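
Since the images are named after the original partitions, the loop devices have to be passed to `mdadm --create` in the slot order the old array reported, not in loop0..loop4 order. The sketch below only prints the command rather than running it. The geometry options (`--metadata=0.90`, `--chunk=64`, `--layout=left-symmetric`) are copied from the `mdadm -D` output earlier in this post, on the assumption that a re-created array has to match the original geometry; `--assume-clean` is added as a precaution so that mdadm does not start a resync on the images. `loop_for` and `create_cmd` are made-up names.

```shell
#!/bin/sh
# loop_for: hypothetical mapping from an original partition name to the
# loop device its image was attached to with losetup in this post.
loop_for() {
    case "$1" in
        sda1) echo /dev/loop0 ;;
        sdb1) echo /dev/loop1 ;;
        sdc1) echo /dev/loop2 ;;
        sdd1) echo /dev/loop3 ;;
        sde1) echo /dev/loop4 ;;
        missing) echo missing ;;
        *) echo "unknown member: $1" >&2; return 1 ;;
    esac
}

# create_cmd: prints (does not run) an mdadm --create command whose member
# list follows the given slot order. Geometry values are taken from the
# mdadm -D output of the original array.
create_cmd() {
    members=""
    for part in "$@"; do
        dev=$(loop_for "$part") || return 1
        members="$members $dev"
    done
    echo "mdadm --create /dev/md2 --assume-clean --metadata=0.90" \
         "--level=5 --raid-devices=$# --chunk=64 --layout=left-symmetric$members"
}

# Slot order as reported by the original array: sdc, sdb, sdd, sda, sde, (removed)
create_cmd sdc1 sdb1 sdd1 sda1 sde1 missing
```

Printing the command first makes it easy to double-check the member order before anything is written to the images.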

This got me thinking. I'm pretty sure I did not create the array in this weird order. However, it seems I did... right?

I've read on some boards that I should run disk check utilities, but others say I shouldn't. The original RAID array was formatted as ReiserFS.

Anyone with some clever ideas?

Robbert
 
Old 10-18-2009, 01:32 AM   #2
Jerre Cope
Member
 
Registered: Oct 2003
Location: Texas (central)
Distribution: ubuntu,Slackware,knoppix
Posts: 323

Rep: Reputation: 37
Not such a clever idea: you didn't manage to swap cables during the repair, did you? Perhaps you did set up the drives in order, but during the repair the cables got swapped, either at the motherboard or at the drive end. I think I've seen some BIOSes that let you flip drive addresses too.

Sounds like you found your data in the end. I don't use reiserfs anymore since it's been orphaned, but it's never let me down. Neither has mdadm.
 
Old 10-18-2009, 02:50 AM   #3
eRJe
Member
 
Registered: May 2005
Location: Netherlands
Distribution: Slackware 14.1 Kernel 3.12.1
Posts: 103

Original Poster
Rep: Reputation: 16
I was very cautious when I disconnected the drives. I actually marked them to be sure I would connect them in the same order again. They are connected to a Promise 4x SATA controller. Maybe the controller messed up the order of the drives after I reconnected them?

Unfortunately I still do not have my data back. Even though TestDisk showed me the directory structure, I haven't managed to actually restore anything with it.

So I would still be very interested in any hints and ideas about what I could try next, either to repair the superblock or some tool that could recover my data.
 
Old 10-18-2009, 11:38 PM   #4
Jerre Cope
Member
 
Registered: Oct 2003
Location: Texas (central)
Distribution: ubuntu,Slackware,knoppix
Posts: 323

Rep: Reputation: 37
If mdadm thinks the array is OK, maybe it is the file system

Have you tried:

reiserfsck --rebuild-tree /dev/md2

I've used this to recover from bad hardware errors. It's for desperate times, but since you're dealing with loopback maps, it's worth a try.
 
Old 10-21-2009, 02:09 AM   #5
eRJe
Member
 
Registered: May 2005
Location: Netherlands
Distribution: Slackware 14.1 Kernel 3.12.1
Posts: 103

Original Poster
Rep: Reputation: 16
I gave it a try. It took more than 24 hours to finish the rebuild. It found quite a lot of stuff, but in the end only 20% got recovered...

I guess I can be sure now that the (weird) order of disks is correct, right? If it wasn't, I don't think reiserfsck would have found anything.

So that leaves me no other option than to accept my losses...
 
Old 10-21-2009, 05:30 PM   #6
Jerre Cope
Member
 
Registered: Oct 2003
Location: Texas (central)
Distribution: ubuntu,Slackware,knoppix
Posts: 323

Rep: Reputation: 37
Maybe try smartmontools and check for other bad drives. Test the drive you thought had failed again. Maybe only one of your images is bad, and another dd from another machine might produce a cleaner copy.
 
Old 02-19-2010, 10:24 AM   #7
eRJe
Member
 
Registered: May 2005
Location: Netherlands
Distribution: Slackware 14.1 Kernel 3.12.1
Posts: 103

Original Poster
Rep: Reputation: 16
Hi,

I'd just like to give a final report on my attempt to rebuild my RAID 5 array.

The procedure of imaging the disks, attaching the images as loop devices and recreating the RAID array on them works quite well! After hours and hours of rebuilding, I thought I had recovered lots of data, but unfortunately all of that data turned out to be crap.

I think my Promise SATA controller really messed things up when the first disk got kicked out of the array (this WD HD turned out to be really broken). I'm not sure why the second disk got kicked out, causing the array to fail. However, before the second disk got kicked out, I think the RAID array did some rebuilding, which resulted in corrupted data.

Rebuilding corrupted data doesn't fix anything of course, so my master plan was doomed to fail. It was as if all the data got scrambled together, i.e. when I played an MP3, it would play the first 3 seconds, then a few seconds from halfway through the song, etc. So something went wrong when all the little pieces had to be put back in order...

Anyway, in an attempt to stay just a little bit positive, I am trying to convince myself that I at least learned something from it. (Yay!)

Now I have 2 RAID-5 arrays (double trouble?): one big one with all my data, and one smaller one built from the old (working) disks to make backups of the really "I-don't-wanna-lose-this-data-again" stuff!

And, more importantly, I keep an mdadm.conf file with the actual RAID setup information. Next time I WILL know the order of disks! Oh, and I installed a little script that checks the RAID status now and then.
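
For anyone wanting to do the same, a rough sketch of both pieces. Recording the layout is typically a one-liner with `mdadm --detail --scan`, shown here as a comment because it needs root and the config path varies by distribution. The status check is only a guess at what such a script could look like, not the script actually installed: it reads text in `/proc/mdstat` format and prints the name of every array whose status field (e.g. `[UUUUU_]`) contains an underscore, meaning a member is missing. `degraded_arrays` is a hypothetical name.

```shell
#!/bin/sh
# Record the current array layout (run as root; path varies by distribution):
#   mdadm --detail --scan >> /etc/mdadm.conf

# degraded_arrays: hypothetical helper, reads /proc/mdstat-style text on
# stdin and prints the name of each array with a missing member.
degraded_arrays() {
    awk '
        # Array header line, e.g. "md1 : active raid5 sdc1[0] ..."
        /^md[^ ]* :/ { name = $1 }
        # Status field with at least one "_", e.g. "[6/5] [UUUUU_]"
        /\[[U_]*_[U_]*\]/ && name != "" { print name; name = "" }
    '
}

# Usage, e.g. from cron:
#   bad=$(degraded_arrays < /proc/mdstat)
#   [ -n "$bad" ] && echo "Degraded RAID arrays: $bad"
```

`mdadm --monitor` can send mail on the same events; the parser above is just the simplest thing that works without extra setup.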

Thanks for all the input, sorry for keeping you waiting so long :-)

Robbert
 
Old 02-19-2010, 10:42 AM   #8
Jerre Cope
Member
 
Registered: Oct 2003
Location: Texas (central)
Distribution: ubuntu,Slackware,knoppix
Posts: 323

Rep: Reputation: 37
I use combinations of Linux software RAID 1 and rsync which I think is simpler and covers the event of hardware other than the hard drives failing.

(Server 1 RAID1 ) >== rsync ==> (Server 2 RAID1 )

It's also easier to migrate your data once bigger drives become available.

Thanks for sharing.
 
Old 02-19-2010, 10:51 AM   #9
eRJe
Member
 
Registered: May 2005
Location: Netherlands
Distribution: Slackware 14.1 Kernel 3.12.1
Posts: 103

Original Poster
Rep: Reputation: 16
That was quick :-)

I am actually considering placing a NAS behind a VPN router somewhere far away from my server. This NAS will be a lot smaller and will only have a backup of the most valuable data.

Thanks again!
 
  

