LinuxQuestions.org
Slackware This Forum is for the discussion of Slackware Linux.

Old 12-17-2009, 09:16 PM   #1
WindowBreaker
Member
 
Registered: Oct 2005
Distribution: Slackware
Posts: 228

Rep: Reputation: 40
Trouble rebuilding RAID-5 array after power outage


After a power outage last night, I am unable to assemble my RAID-5 array. All four external drives power on fine and show up when I run:
Code:
fdisk -l
Quote:
/dev/sdd1 1 79041 634896801 fd Linux raid autodetect
/dev/sde1 1 79041 634896801 fd Linux raid autodetect
/dev/sdf1 1 79041 634896801 fd Linux raid autodetect
/dev/sdg1 1 79041 634896801 fd Linux raid autodetect
I tried assembling the array using the following methods:
Code:
mdadm -As
And the more explicit command:
Code:
mdadm -vv -A /dev/md0 --uuid=894fb698:08203e63:ca10a67a:b78d8f10 --no-degraded /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1
Both give me the same output as follows:
Quote:
mdadm: looking for devices for /dev/md0
mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot 3.
mdadm: /dev/sde1 is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sdf1 is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sdg1 is identified as a member of /dev/md0, slot 0.
mdadm: added /dev/sdg1 to /dev/md0 as 0
mdadm: added /dev/sdf1 to /dev/md0 as 1
mdadm: added /dev/sde1 to /dev/md0 as 2
mdadm: added /dev/sdd1 to /dev/md0 as 3
mdadm: /dev/md0 assembled from 1 drive (out of 4), but not started.
The /proc/mdstat shows the following:
Quote:
Personalities :
md0 : inactive sdd1[3](S) sde1[2](S) sdf1[1](S) sdg1[0](S)
2539586816 blocks

unused devices: <none>
Is there anything I can do to safely reassemble my RAID-5 array? I have over 3 years of nightly backups on this array, which is 2 TB in size. I could really use some help.

Thanks in advance.
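
A note for readers: in the /proc/mdstat output above the array is marked inactive and every member carries an (S) flag, which means md has bound the partitions to md0 but has not started the array. The sketch below, assuming bash and awk, pulls that information out of mdstat-formatted text. The helper name list_inactive is purely illustrative and is not part of mdadm.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of mdadm): list inactive md arrays and the
# members that mdstat-formatted text flags with (S).
# Usage: list_inactive [file]   (defaults to /proc/mdstat)
list_inactive() {
    awk '$2 == ":" && $3 == "inactive" {
        printf "%s:", $1
        for (i = 4; i <= NF; i++)
            if ($i ~ /\(S\)$/) {
                dev = $i
                sub(/\[.*/, "", dev)   # strip the "[3](S)" suffix
                printf " %s", dev
            }
        printf "\n"
    }' "${1:-/proc/mdstat}"
}
```

Fed the output quoted in this post, it should report md0 together with sdd1, sde1, sdf1 and sdg1.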
 
Old 12-18-2009, 12:42 AM   #2
TL_CLD
Member
 
Registered: Sep 2006
Posts: 366

Rep: Reputation: 45
How about:
Code:
mdadm --run /dev/md0
 
Old 12-18-2009, 11:47 AM   #3
WindowBreaker
Member
 
Registered: Oct 2005
Distribution: Slackware
Posts: 228

Original Poster
Rep: Reputation: 40
Quote:
Originally Posted by TL_CLD
How about:
Code:
mdadm --run /dev/md0
Tried that; here's the error I got:
Quote:
mdadm: failed to run array /dev/md0: Input/output error
What the hell is going on? I thought linux RAID was more reliable than this shit!

Running the following command on all four component drives shows they are "clean" (see output below). So what's going on? Is there some mdadm black magic I'm supposed to do to get this array back online?

Code:
mdadm -E /dev/sde1
Quote:
/dev/sde1:
Magic : a92b4efc
Version : 00.90.00
UUID : 894fb698:08203e63:ca10a67a:b78d8f10
Creation Time : Sun Apr 8 10:38:55 2007
Raid Level : raid5
Used Dev Size : 634896704 (605.48 GiB 650.13 GB)
Array Size : 1904690112 (1816.45 GiB 1950.40 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 4

Update Time : Wed Dec 16 19:10:25 2009
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Checksum : 735506cd - correct
Events : 4

Layout : left-symmetric
Chunk Size : 64K

Number Major Minor RaidDevice State
this 2 8 65 2 active sync /dev/sde1

0 0 8 97 0 active sync /dev/sdg1
1 1 8 81 1 active sync /dev/sdf1
2 2 8 65 2 active sync /dev/sde1
3 3 8 49 3 active sync /dev/sdd1

Last edited by WindowBreaker; 12-18-2009 at 11:52 AM.
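
A note for readers: when mdadm says "assembled from 1 drive (out of 4)" although a superblock reports clean, a common next step is to compare the Events counter and Update Time of every member, because members whose counters lag behind are treated as stale during assembly. Only the output for /dev/sde1 is shown in this post, so the other three may differ. A minimal sketch, assuming bash, awk and the 0.90 "Name : value" output format quoted above; examine_field is an illustrative name, not an mdadm feature.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `mdadm -E` output on stdin and print the value of
# one "Name : value" field (for example Events, State or "Update Time").
examine_field() {
    awk -F' : ' -v key="$1" '
        { name = $1; gsub(/^[ \t]+|[ \t]+$/, "", name) }   # trim padding
        name == key { print $2; exit }'
}

# Intended use (needs root and the real devices, so it is left commented out):
# for d in /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1; do
#     printf '%s events=%s updated=%s\n' "$d" \
#         "$(mdadm -E "$d" | examine_field Events)" \
#         "$(mdadm -E "$d" | examine_field 'Update Time')"
# done
```

If one or more members show a lower Events value or an older Update Time than the rest, that would explain why mdadm refuses to start the array without being forced.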
 
Old 12-18-2009, 12:25 PM   #4
WindowBreaker
Member
 
Registered: Oct 2005
Distribution: Slackware
Posts: 228

Original Poster
Rep: Reputation: 40
UPDATE: Trouble rebuilding RAID-5 array after power outage

I ran the file command against each of my component drives' partitions and here is the result:

Code:
file -sL /dev/sdd1
Quote:
/dev/sdd1: X11 SNF font data, MSB first
Code:
file -sL /dev/sde1
Quote:
/dev/sde1: data
Code:
file -sL /dev/sdf1
Quote:
/dev/sdf1: data
Code:
file -sL /dev/sdg1
Quote:
/dev/sdg1: Linux rev 1.0 ext3 filesystem data (needs journal recovery)
I'd like to run fsck using an alternate superblock. How do I determine which blocks are alternate superblocks?

I've tried the following with no success:

Code:
dumpe2fs /dev/sdd1
Quote:
dumpe2fs 1.41.3 (12-Oct-2008)
dumpe2fs: Filesystem revision too high while trying to open /dev/sdd1
Couldn't find valid filesystem superblock.
Code:
dumpe2fs -o superblock=32768 /dev/sdd1
Quote:
dumpe2fs 1.41.3 (12-Oct-2008)
dumpe2fs: Bad magic number in super-block while trying to open /dev/sdd1
Couldn't find valid filesystem superblock.
The superblock location was taken from running dumpe2fs on the only apparently valid partition, /dev/sdg1. Since the drives are all identical in make, model, and size, the superblock locations should be the same.

Code:
e2fsck -b 163840 /dev/sdd1
Quote:
e2fsck 1.41.3 (12-Oct-2008)
e2fsck: Bad magic number in super-block while trying to open /dev/sdd1

The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>

So what are my options? I cannot afford to lose years of backup data because of a fu**ing power outage!
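
Some background that may explain the file results above: with 0.90 metadata the ext3 filesystem lives on /dev/md0 as a whole, not on any single member, and its blocks are spread across the members in 64K chunks with rotating parity. Only the member holding the first data chunk (slot 0, here /dev/sdg1) contains the primary ext3 superblock at byte 1024, which is consistent with /dev/sdg1 being the only partition that file recognizes. Backup superblock numbers are array block numbers, so passing them to e2fsck -b on a member partition is not expected to work. The sketch below, assuming bash, the left-symmetric layout as implemented by the md driver, and data starting at offset 0 of each member, maps an array byte offset to a member slot. The function name is illustrative, and the 4 KiB filesystem block size is an assumption.

```shell
#!/usr/bin/env bash
# Sketch: map a byte offset inside a left-symmetric RAID-5 array to the member
# slot and byte offset that hold it (assumes data starts at offset 0 of each
# member, as with 0.90 metadata).
# Usage: raid5_ls_map ARRAY_OFFSET_BYTES RAID_DEVICES CHUNK_BYTES
raid5_ls_map() {
    local off=$1 n=$2 chunk=$3
    local data_disks=$(( n - 1 ))
    local chunk_no=$(( off / chunk ))            # which data chunk of the array
    local stripe=$(( chunk_no / data_disks ))    # row across all members
    local idx=$(( chunk_no % data_disks ))       # data chunk within that row
    local parity=$(( data_disks - stripe % n ))  # parity slot rotates per row
    local slot=$(( (parity + 1 + idx) % n ))     # data starts after the parity slot
    echo "slot=$slot offset=$(( stripe * chunk + off % chunk )) parity_slot=$parity"
}

raid5_ls_map 1024 4 65536        # primary ext3 superblock (byte 1024)
raid5_ls_map 134217728 4 65536   # block 32768, assuming 4 KiB filesystem blocks
```

Both calls report slot 0, and the first row's parity lands on slot 3 (/dev/sdd1), so the start of that partition is parity rather than filesystem data.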
 
Old 12-18-2009, 03:06 PM   #5
Chuck56
Member
 
Registered: Dec 2006
Location: Colorado, USA
Distribution: Slackware
Posts: 930

Rep: Reputation: 479
Have you thought about rerunning "create" instead of "assemble" on the array? It's counterintuitive, but if create sees an existing array it will resync the existing array instead of recreating it. I know that works for RAID1. I've not used RAID5, so YMMV.
 
Old 12-18-2009, 04:35 PM   #6
WindowBreaker
Member
 
Registered: Oct 2005
Distribution: Slackware
Posts: 228

Original Poster
Rep: Reputation: 40
Quote:
Originally Posted by Chuck56
Have you thought about rerunning "create" instead of "assemble" on the array? It's counterintuitive, but if create sees an existing array it will resync the existing array instead of recreating it. I know that works for RAID1. I've not used RAID5, so YMMV.
My concern is that create won't see the array and will destroy my data. I know the data is in there somewhere. I think what happened is the power went out during a drive-intensive backup, so I ended up with inconsistent filesystem data structures. But with spare superblocks and all, I thought it would be possible to repair.

Do you know if create will destroy my data?
 
Old 12-18-2009, 05:12 PM   #7
Chuck56
Member
 
Registered: Dec 2006
Location: Colorado, USA
Distribution: Slackware
Posts: 930

Rep: Reputation: 479
You should get a notification from mdadm if it finds a recoverable array with a yes/no to proceed. I don't know what will happen if it doesn't recognize the array.
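
A note for readers on whether create destroys data: mdadm --create always writes new md superblocks, and the old contents only remain readable if the new array uses exactly the same level, chunk size, layout, metadata version and device order as the old one. The sketch below only prints a candidate command built from the mdadm -E output earlier in the thread. It is an assumption about how one might approach this, not a tested recovery procedure. --assume-clean is included so that mdadm skips the initial resync, and --metadata=0.90 is spelled out because newer mdadm releases default to 1.x metadata. The function name is illustrative.

```shell
#!/usr/bin/env bash
# Sketch: print (do not run) an `mdadm --create` command that mirrors the
# parameters reported by `mdadm -E` in this thread. Devices must be passed in
# slot order 0..3.
print_recreate_cmd() {
    local md=$1; shift
    printf 'mdadm --create %s --assume-clean --metadata=0.90 --level=5 ' "$md"
    printf -- '--raid-devices=%d --chunk=64 --layout=left-symmetric' "$#"
    printf ' %s' "$@"
    printf '\n'
}

# Slot order taken from the device table in the -E output: sdg1, sdf1, sde1, sdd1.
print_recreate_cmd /dev/md0 /dev/sdg1 /dev/sdf1 /dev/sde1 /dev/sdd1
```

Check the printed command against the -E output of every member before running anything, and if at all possible work on copies of the disks rather than the originals.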
 
Old 12-19-2009, 02:34 AM   #8
mRgOBLIN
Slackware Contributor
 
Registered: Jun 2002
Location: New Zealand
Distribution: Slackware
Posts: 999

Rep: Reputation: 231
Well, when working with live data like this it can be pretty hairy.

You could try "mdadm -Af /dev/md0"
(--assemble --force)

But if the data is really crucial you may be better off asking the experts by sending a polite email to linux-raid AT THE HOST vger.kernel.org

I feel your frustration but when talking to them I'd refrain from comments such as this

Quote:
What the hell is going on? I thought linux RAID was more reliable than this shit!
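
A note for readers: the forced assembly suggested in this post would typically need the inactive array stopped first, because its members are already bound to /dev/md0. A hedged sketch, assuming bash; DRY_RUN and the run helper are illustrative names, and by default the script only prints what it would do.

```shell
#!/usr/bin/env bash
# Sketch: stop the half-assembled array, then retry with --force, which lets
# mdadm accept members whose event counters disagree.
DRY_RUN=${DRY_RUN:-1}    # set DRY_RUN=0 to actually execute (needs root)

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

force_assemble() {
    local md=$1; shift
    # Only attempt the assembly if the stop step succeeded.
    run mdadm --stop "$md" &&
        run mdadm --assemble --force --verbose "$md" "$@"
}

force_assemble /dev/md0 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1
```

Forcing assembly tells mdadm to trust superblocks it would otherwise consider out of date, so it is worth recording the mdadm -E output of every member beforehand.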
 
Old 12-21-2009, 12:03 PM   #9
WindowBreaker
Member
 
Registered: Oct 2005
Distribution: Slackware
Posts: 228

Original Poster
Rep: Reputation: 40
Quote:
Originally Posted by Chuck56
Have you thought about rerunning "create" instead of "assemble" on the array? It's counterintuitive, but if create sees an existing array it will resync the existing array instead of recreating it. I know that works for RAID1. I've not used RAID5, so YMMV.
You were correct. I created a new array consisting of the four component drives, and it is now "rebuilding". However, only 3 of the 4 drives are being used. I'll wait and see if the rebuild completes successfully, then see about adding the fourth component drive to the array.

Thanks!
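
A note for readers: the 3-of-4 state described here matches documented mdadm behaviour. When a RAID-5 array is created without --force, mdadm starts it degraded with the last device as a spare and then rebuilds onto that spare. The sketch below, assuming bash with grep and awk, reads the rebuild percentage from mdstat-formatted text; the helper name recovery_pct is illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the recovery/resync percentage found in
# mdstat-formatted text (defaults to /proc/mdstat); prints nothing when idle.
recovery_pct() {
    grep -oE '(recovery|resync) *= *[0-9.]+%' "${1:-/proc/mdstat}" |
        awk '{ print $NF }'
}

# Example polling loop (commented out so the snippet has no side effects):
# while pct=$(recovery_pct) && [ -n "$pct" ]; do
#     echo "rebuild at $pct"; sleep 60
# done
```

Once the percentage disappears from /proc/mdstat, mdadm --detail /dev/md0 shows whether all four members ended up active.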
 
Old 02-01-2010, 03:03 PM   #10
WindowBreaker
Member
 
Registered: Oct 2005
Distribution: Slackware
Posts: 228

Original Poster
Rep: Reputation: 40
[SOLVED] Trouble rebuilding RAID-5 array after power outage

Well, I could not rebuild the array successfully with 3/4 drives. I had one drive that became physically unresponsive, and another that apparently had some corruption due to the power outage.

What I did was physically remove the problematic drive, and build a brand-spanking new array with the other 3 drives. I lost a few years' worth of nightly backups!

For anyone reading, I highly suggest instead of a basic RAID-5 array, take the time to build a proper RAID-10 array, allowing multiple drive failures before the entire array is lost.
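
A note for readers, to put numbers on that trade-off: here is a rough sketch (bash, illustrative function name) of usable capacity and guaranteed failure tolerance for four members of the size reported earlier in the thread. A four-drive RAID-10 with two copies is only guaranteed to survive one failure; it survives a second only if that failure hits the other mirror pair, whereas RAID-6 survives any two.

```shell
#!/usr/bin/env bash
# Sketch: usable capacity (KiB) and guaranteed survivable failures for common
# md levels, given N equal members of SIZE KiB. RAID-10 assumes 2 copies.
# Usage: raid_summary LEVEL N SIZE_KIB
raid_summary() {
    local level=$1 n=$2 size=$3 usable tolerate
    case $level in
        5)  usable=$(( (n - 1) * size )); tolerate=1 ;;
        6)  usable=$(( (n - 2) * size )); tolerate=2 ;;
        10) usable=$(( n * size / 2 ));   tolerate=1 ;;
        *)  echo "unsupported level: $level" >&2; return 1 ;;
    esac
    echo "raid$level usable_kib=$usable guaranteed_failures=$tolerate"
}

# 634896704 KiB is the Used Dev Size reported by mdadm -E in this thread.
for level in 5 6 10; do
    raid_summary "$level" 4 634896704
done
```

The RAID-5 figure, 1904690112 KiB, matches the Array Size that mdadm -E reported for the original array; RAID-6 and RAID-10 both give up one more drive's worth of space in exchange for the extra redundancy.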
 
Old 02-02-2010, 03:19 AM   #11
zordrak
Member
 
Registered: Feb 2008
Distribution: Slackware
Posts: 595

Rep: Reputation: 116
Quote:
Originally Posted by WindowBreaker
Well, I could not rebuild the array successfully with 3/4 drives. I had one drive that became physically unresponsive, and another that apparently had some corruption due to the power outage.

What I did was physically remove the problematic drive, and build a brand-spanking new array with the other 3 drives. I lost a few years' worth of nightly backups!

For anyone reading, I highly suggest instead of a basic RAID-5 array, take the time to build a proper RAID-10 array, allowing multiple drive failures before the entire array is lost.
FWIW I recommend RAID61 if the data *really* matters.
 
  


All times are GMT -5.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration