LinuxQuestions.org
Linux - Server: This forum is for the discussion of Linux software used in a server-related context.

Old 07-21-2010, 05:21 PM   #1
jml48197
LQ Newbie
 
Registered: Jul 2010
Posts: 6

Rep: Reputation: 0
Broken RAID 5 (11 drives in mdadm) -- data recovery/RAID reconstruction needed -- ple


Hi there:

Thanks for reading this thread and I thank you in advance for any help you can provide.

So this is what happened... I noticed that my mdadm RAID 5 array, with drives ordered /dev/sd[EFGHIABCDKJ]1, reported a failed drive -- /dev/sdb1. I stopped the array and ran smartctl -t long /dev/sdb1, which passed.

So I added /dev/sdb1 back to /dev/md0 with mdadm --add. While it was rebuilding, /dev/sdh1 went offline (the data cable must have been knocked loose while I was moving from FL to MI), and now the array state is degraded. I checked both drives with smartctl again and both passed.

I read advice on a forum suggesting mdadm -C /dev/md0 /dev/sd[efghiabcdkj]1, but the array resynced with the drive order scrambled (sd[abcdefghijk]1 instead of sd[efghiabcdkj]1). I then tried mdadm -Af /dev/md0 but got a missing-superblock error.
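In hindsight, what I should have done first is read the original drive order straight out of the md superblocks instead of guessing. A read-only sketch, assuming the old metadata is still present on the members:

```shell
# Read-only: mdadm --examine reads the md superblock on each member,
# so it works even when the array will not assemble. With 0.90
# metadata, the "this" line shows the slot each drive occupied.
for d in /dev/sd[a-k]1; do
    echo "== $d =="
    mdadm --examine "$d" | egrep 'UUID|Raid Devices|Events|this'
done
```

Matching Event counts tell you which members are in sync; mismatched UUIDs reveal a drive that was never part of the array (like a stray flash drive).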

I came across another post stating that I should run mdadm -C --assume-clean /dev/md0 /dev/sd[efghia MISSING cdkj]1, then add /dev/sdb1, and then mdadm --assemble /dev/md0 --update=resync -- but I had a flash drive plugged into my server, which got assigned /dev/sdi1 (oops)... Anyway, I pulled the plug quickly, halted the system, removed the flash drive, and repeated the steps.

================================================================================
fdisk -l reports:
Disk /dev/hda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/hda1 * 1 3187 25599546 7 HPFS/NTFS
/dev/hda2 3188 60801 462784455 5 Extended
/dev/hda5 3188 9561 51199123+ 7 HPFS/NTFS
/dev/hda6 9562 28045 148472698+ 83 Linux
/dev/hda7 28046 28835 6345643+ 82 Linux swap / Solaris
/dev/hda8 28836 60801 256766863+ 83 Linux

Disk /dev/sda: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sda1 * 1 182402 1465138552+ 83 Linux

Disk /dev/sdb: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdb1 * 1 182402 1465138552+ fd Linux raid autodetect

Disk /dev/sdc: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdc1 * 1 182402 1465138552+ 83 Linux

Disk /dev/sdd: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdd1 * 1 182402 1465138552+ 83 Linux

Disk /dev/sde: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sde1 * 1 182401 1465136001 83 Linux

Disk /dev/sdf: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdf1 * 1 182401 1465136001 83 Linux

Disk /dev/sdg: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdg1 * 1 182401 1465136001 83 Linux

Disk /dev/sdh: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdh1 * 1 182401 1465136001 83 Linux

Disk /dev/sdi: 1500.3 GB, 1500301910016 bytes
16 heads, 63 sectors/track, 2907021 cylinders
Units = cylinders of 1008 * 512 = 516096 bytes

Device Boot Start End Blocks Id System
/dev/sdi1 * 1 2907021 1465138552+ 83 Linux

Disk /dev/sdj: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdj1 * 1 182402 1465138552+ 83 Linux

Disk /dev/sdk: 1500.3 GB, 1500301910016 bytes
255 heads, 63 sectors/track, 182401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sdk1 * 1 182402 1465138552+ 83 Linux

Disk /dev/md0: 0 MB, 0 bytes
2 heads, 4 sectors/track, 0 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md0 doesn't contain a valid partition table


================================================================================

So I am guessing that my plugging in the flash drive messed up the number of heads reported for all the other drives, except the one drive that was left out of the first mdadm -C because its device name had been taken by the flash drive.

So... bottom line: the resync has now completed (diskstats shows reads but no writes to disk), but I am unable to mount the array -- I get a "VFS: Can't find ext3 filesystem on dev md0" message.
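For reference, the non-destructive checks I'm running before touching the array again look like this (a sketch; /mnt is just an example mount point):

```shell
# All of these are read-only and change nothing on disk.
mdadm --detail /dev/md0            # confirm drive order, chunk size, state
dumpe2fs -h /dev/md0 | head -20    # is there an ext3 superblock at the start?
fsck.ext3 -n /dev/md0              # -n: check only, never write
mount -o ro /dev/md0 /mnt          # read-only mount attempt
```

If dumpe2fs finds no superblock at offset 0, the filesystem is probably shifted by a wrong drive order or chunk size, not gone.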

Current status: R-Studio reports some data, testdisk is still analyzing my partition, and I aborted Raid Reconstructor because it reported it would take about 20 days to complete...

Any hints on how I can recover my data? Any suggestions will be greatly appreciated, because I am starting a new job and cannot afford to look disorganized after the bad run of events this past week. Thanks... J
 
Old 07-23-2010, 03:23 AM   #2
zeno0771
Member
 
Registered: Jun 2006
Location: Northern IL
Distribution: Arch64
Posts: 106

Rep: Reputation: 19
Do you have an mdadm.conf for this array? If it's set up to do one thing and you tell mdadm to do something else with a /dev/md0 that is already defined in mdadm.conf, you could have a problem.
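For what it's worth, defining the array by UUID instead of by device order sidesteps device renaming entirely; something like this (the UUID here is made up):

```
# /etc/mdadm.conf -- identify members by superblock UUID, not device names
DEVICE partitions
ARRAY /dev/md0 level=raid5 num-devices=11 UUID=1a2b3c4d:5e6f7a8b:9c0d1e2f:3a4b5c6d
```

With that in place, a flash drive grabbing /dev/sdi can't confuse assembly, because members are matched by the UUID stored in their superblocks.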

I've had drives mis-ordered in RAID arrays and suffered no ill effects; as long as the drive names themselves are consistent (e.g. it's always the same 3/5/7/10 letters) you shouldn't have problems with that.

I'd say get a copy of Parted Magic and try to assemble/mount the array within it (Parted Magic has mdadm), separate from the installed OS. Also, avoid running a bunch of diagnostic stuff on an array all at once or it'll take forever and possibly return inaccurate results; do one test at a time.

Just out of curiosity, why did you choose RAID-5 with so many drives? RAID-5 is only fault-tolerant to one drive, regardless of the size of the array. In fact, as the array gets bigger, the probability of losing a second drive (and with it the whole array) increases.
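To put rough numbers on it: suppose each drive independently has a 5% chance of failing in a given period (an arbitrary figure, purely for illustration). The chance that at least one of n drives fails is 1-(1-p)^n:

```shell
# P(at least one of n drives fails) = 1 - (1-p)^n, with p = 0.05 assumed.
for n in 3 5 11; do
  awk -v n="$n" 'BEGIN { printf "%2d drives: %.1f%% chance of at least one failure\n", n, (1 - 0.95^n) * 100 }'
done
# prints 14.3%, 22.6%, and 43.1% respectively
```

During a rebuild the same math applies to the n-1 surviving drives, and that is exactly the window in which RAID-5 has no protection left.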
 
Old 07-26-2010, 02:07 PM   #3
jml48197
LQ Newbie
 
Registered: Jul 2010
Posts: 6

Original Poster
Rep: Reputation: 0
Thanks for your suggestion... I'll give it a try and get back to you. I did have an mdadm.conf file for the array, and /dev/md0 was already defined in it (not by UUID but by drive order).

As for why I used RAID-5, my best guess is that it was a cost-saving measure (power and drive cost). In hindsight, I should have gone with RAID-6 and a more appropriate filesystem: XFS.
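For the record, the capacity trade-off I was weighing looks roughly like this for my 11 x 1.5 TB drives (back-of-envelope, ignoring metadata overhead):

```shell
# Usable capacity for n drives of s TB each under common md layouts.
awk -v n=11 -v s=1.5 'BEGIN {
  printf "RAID-5 : %.2f TB usable, survives 1 drive failure\n", (n-1)*s
  printf "RAID-6 : %.2f TB usable, survives any 2 drive failures\n", (n-2)*s
  printf "RAID-10: %.2f TB usable (about half the raw capacity)\n", n*s/2
}'
# RAID-5 : 15.00 TB, RAID-6 : 13.50 TB, RAID-10: 8.25 TB
```

So RAID-6 would have cost me one drive's worth of space for a second drive's worth of fault tolerance.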
 
Old 07-26-2010, 02:10 PM   #4
jml48197
LQ Newbie
 
Registered: Jul 2010
Posts: 6

Original Poster
Rep: Reputation: 0
Is anyone familiar with testdisk and RAID? I tried testdisk's analyze function several times; it ran for about three days and then stalled.
 
Old 07-27-2010, 12:57 PM   #5
zeno0771
Member
 
Registered: Jun 2006
Location: Northern IL
Distribution: Arch64
Posts: 106

Rep: Reputation: 19
Quote:
Originally Posted by jml48197
As for why I used RAID-5, my best guess is that it was a cost-saving measure (power and drive cost). In hindsight, I should have gone with RAID-6 and a more appropriate filesystem: XFS.
If I may suggest RAID-10...

In your case there would be minimal net change in power consumption, and you'd have much better fault tolerance and performance (in fact, writes can be almost twice as fast). I speak from experience: I started with RAID-5 as well, but as my storage needs grew past 3 physical drives, the shrinking fault tolerance made me nervous. Also, mdadm does RAID-10 natively now (that is, in one step), no more stripe-then-mirror.
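The one-step creation I mean looks like this (a sketch only; the device names and count are placeholders, not a recipe for your current array):

```shell
# mdadm builds RAID-10 directly, no separate stripe-over-mirrors step.
# --layout=n2 (the default "near" layout) keeps two copies of every block.
mdadm --create /dev/md0 --level=10 --raid-devices=4 --layout=n2 /dev/sd[b-e]1
```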

Of course, all of this depends on exactly what you're using the array for; mine is a combination of storage and hosting VMs, which means a lot of small writes with the occasional big one. I don't regret my decision one bit; the performance is better and I feel safer between backups.

Also, +1 on XFS.
 
  

