LinuxQuestions.org
07-29-2008, 11:26 AM   #1
Rascale
LQ Newbie
 
Registered: Oct 2003
Distribution: RH ES 3, ES 4, SLES 10.2
Posts: 9

Rep: Reputation: 0
recover dirty, degraded software raid 1 after power failure


Hi,

I have a Red Hat ES 3 box with a software RAID 1 array that lost power (my boss was trying to figure out what was plugged into the UPS. Oops, he found out!). It booted back up, but the array came up dirty and degraded.

----------------- dmesg -----------------
md: superblock update time inconsistency -- using the most recent one
md: freshest: sdc1
md: kicking non-fresh sdb1 from array!

----------------------------------------
# mdadm -Q --detail /dev/md0
/dev/md0:
Version : 00.90.00
Creation Time : Tue Feb 15 07:30:50 2005
Raid Level : raid1
Array Size : 35881024 (34.22 GiB 36.74 GB)
Device Size : 35881024 (34.22 GiB 36.74 GB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Thu Jul 17 01:59:43 2008
State : dirty, degraded
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0

UUID : 335f77df:7b4443c4:45a84a39:94c75971
Events : 0.56

Number Major Minor RaidDevice State
0 0 0 0 faulty removed
1 8 33 1 active sync /dev/sdc1
----------------------------------------

/dev/sdb1 thinks everything is OK, although its superblock is the older one (Events 0.48 vs. 0.56 on /dev/sdc1):

# mdadm -E /dev/sdb1
/dev/sdb1:
Magic : a92b4efc
Version : 00.90.00
UUID : 335f77df:7b4443c4:45a84a39:94c75971
Creation Time : Tue Feb 15 07:30:50 2005
Raid Level : raid1
Device Size : 35881024 (34.22 GiB 36.74 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0

Update Time : Tue May 29 03:24:41 2007
State : dirty
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Checksum : bcd14504 - correct
Events : 0.48


Number Major Minor RaidDevice State
this 0 8 17 0 active sync /dev/sdb1

0 0 8 17 0 active sync /dev/sdb1
1 1 8 33 1 active sync /dev/sdc1
----------------------------------------

/dev/sdc1 complains:

# mdadm -E /dev/sdc1
/dev/sdc1:
Magic : a92b4efc
Version : 00.90.00
UUID : 335f77df:7b4443c4:45a84a39:94c75971
Creation Time : Tue Feb 15 07:30:50 2005
Raid Level : raid1
Device Size : 35881024 (34.22 GiB 36.74 GB)
Raid Devices : 2
Total Devices : 1
Preferred Minor : 0

Update Time : Thu Jul 17 01:59:43 2008
State : dirty
Active Devices : 1
Working Devices : 1
Failed Devices : 0
Spare Devices : 0
Checksum : bef44f37 - correct
Events : 0.56


Number Major Minor RaidDevice State
this 1 8 33 1 active sync /dev/sdc1

0 0 0 0 0 faulty removed
1 1 8 33 1 active sync /dev/sdc1
----------------------------------------
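
The two superblock dumps above can also be compared mechanically. This is only a rough sketch: md_field and events_cmp are made-up helper names, and it assumes the Events counter is printed as two dot-separated integers, which is what the 0.48 and 0.56 values above suggest. It only reads text, so it changes nothing on disk.

```shell
#!/bin/sh
# md_field NAME: read `mdadm -E` output on stdin and print the value of the
# first "label : value" line whose label equals NAME (hypothetical helper).
md_field() {
    awk -v name="$1" '
        {
            i = index($0, " : ")
            if (i == 0) next
            label = substr($0, 1, i - 1)
            gsub(/^[ \t]+|[ \t]+$/, "", label)
            if (label == name) {
                value = substr($0, i + 3)
                gsub(/^[ \t]+|[ \t]+$/, "", value)
                print value
                exit
            }
        }'
}

# events_cmp A B: print "newer", "older" or "equal" for counter A relative to B.
# Assumption: the counter looks like "high.low", so the two halves are compared
# as integers rather than treating the whole value as a decimal fraction.
events_cmp() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "[.]"); nb = split(b, y, "[.]")
        ah = (na > 1) ? x[1] : 0; al = (na > 1) ? x[2] : x[1]
        bh = (nb > 1) ? y[1] : 0; bl = (nb > 1) ? y[2] : y[1]
        if (ah + 0 != bh + 0) { print ((ah + 0 > bh + 0) ? "newer" : "older"); exit }
        if (al + 0 != bl + 0) { print ((al + 0 > bl + 0) ? "newer" : "older"); exit }
        print "equal"
    }'
}

# Usage on a live system (needs root and mdadm):
#   a=$(mdadm -E /dev/sdb1 | md_field Events)
#   b=$(mdadm -E /dev/sdc1 | md_field Events)
#   events_cmp "$a" "$b"
```

The member that comes out "older" is the one md kicked as non-fresh, so it is the one to resync from the other, not the other way round.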

What's the best way to recover from this? Can I just force a fresh new superblock onto /dev/sdb1? I'm pretty sure the drive is OK. I do have a spare drive available.

Thanks!

->R
 
07-29-2008, 12:51 PM   #2
mostlyharmless
Senior Member
 
Registered: Jan 2008
Distribution: Slackware 14.1 (multilib) with kernel 3.15.5
Posts: 1,549
Blog Entries: 12

Rep: Reputation: 177
This looks like a good opportunity for a backup first

I don't *know*, but I would think that using --add to put /dev/sdb1 back into /dev/md0 should work. Using the spare first might be safer. I'm interested to see if anyone else has a better idea.
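
A guarded version of that --add idea might look like the sketch below. The function names and the MDSTAT and RUN variables are invented for illustration; it assumes the array is running (degraded) when the member is added and that /proc/mdstat shows the usual member map such as [_U]. By default it only prints the mdadm command instead of running it.

```shell
#!/bin/sh
# md_is_degraded ARRAY: read /proc/mdstat-style text on stdin and succeed
# (exit 0) only if ARRAY's member map, e.g. [_U], shows a missing member.
md_is_degraded() {
    awk -v md="$1" '
        $1 == md && $2 == ":" { in_array = 1; next }
        in_array && NF == 0   { in_array = 0 }
        in_array && match($0, /\[[U_]+\]/) {
            map = substr($0, RSTART, RLENGTH)
            found = 1
            degraded = (index(map, "_") > 0)
            exit
        }
        END { if (found && degraded) exit 0; exit 1 }'
}

# readd_member ARRAY_DEV MEMBER: add MEMBER back only if the array is degraded.
# Dry run unless RUN=1 is set; MDSTAT can point at a saved copy for testing.
readd_member() {
    name=${1##*/}
    if md_is_degraded "$name" < "${MDSTAT:-/proc/mdstat}"; then
        if [ "${RUN:-0}" = 1 ]; then
            mdadm "$1" --add "$2"
        else
            echo "mdadm $1 --add $2"
        fi
    else
        echo "$1 is not degraded; nothing to add" >&2
        return 1
    fi
}

# Usage:
#   readd_member /dev/md0 /dev/sdb1          # print the command only
#   RUN=1 readd_member /dev/md0 /dev/sdb1    # actually run it (as root)
```

The same call works for the spare drive: pass its partition instead of /dev/sdb1.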
 
07-30-2008, 03:06 PM   #3
kenoshi
Member
 
Registered: Sep 2007
Location: SF Bay Area, CA
Distribution: CentOS, SLES 10+, RHEL 3+, Debian Sarge
Posts: 159

Rep: Reputation: 32
Do a spot backup before you do anything else.

Rebuild the array. A power outage causes all kinds of problems unless you have a controller with a battery-backed write cache.

If /dev/sdb is more than 3 years old, replace it with the spare. You should rotate out old drives once every 3 years anyway.

Forgot to add...tell your boss to fire himself

Last edited by kenoshi; 07-30-2008 at 03:09 PM.
 
07-31-2008, 12:00 PM   #4
Rascale
LQ Newbie
 
Registered: Oct 2003
Distribution: RH ES 3, ES 4, SLES 10.2
Posts: 9

Original Poster
Rep: Reputation: 0
Thanks for your comments. Here's what I did to recover.
1. Shut down all processes and databases using the array. lsof /dev/md0 is your friend.
2. Full backup, in addition to the usual nightly ones.
3. Stop the array: mdadm -S /dev/md0
4. Add the drive back into the array. In this case:
mdadm /dev/md0 --add /dev/sdb1
5. Sit back and watch progress: watch -n 1 cat /proc/mdstat
6. Restart. dmesg says:
raid1: device sdc1 operational as mirror 1
raid1: device sdb1 operational as mirror 0
raid1: raid set md0 active with 2 out of 2 mirrors
md: ... autorun DONE.
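
For anyone who would rather script step 5 than stare at watch, here is a rough sketch. The helper names and the MDSTAT variable are made up, and it assumes the rebuild line in /proc/mdstat looks like "recovery = 8.5% (...)" or "resync = ...", which may differ between kernel versions.

```shell
#!/bin/sh
# md_recovery_pct ARRAY: read /proc/mdstat-style text on stdin and print the
# resync/recovery percentage for ARRAY, or nothing if no rebuild is reported.
md_recovery_pct() {
    awk -v md="$1" '
        $1 == md && $2 == ":" { in_array = 1; next }
        in_array && NF == 0   { in_array = 0 }
        in_array && match($0, /(recovery|resync) *= *[0-9.]+%/) {
            s = substr($0, RSTART, RLENGTH)
            sub(/^[^=]*= */, "", s)   # drop the "recovery = " prefix
            sub(/%$/, "", s)          # drop the trailing percent sign
            print s
            exit
        }'
}

# wait_for_rebuild ARRAY [INTERVAL]: poll every INTERVAL seconds (default 10)
# until no rebuild is reported for ARRAY. MDSTAT can point at a saved copy.
wait_for_rebuild() {
    while pct=$(md_recovery_pct "$1" < "${MDSTAT:-/proc/mdstat}"); [ -n "$pct" ]; do
        echo "$1: rebuild at ${pct}%"
        sleep "${2:-10}"
    done
    echo "$1: no rebuild in progress"
}

# Usage:
#   wait_for_rebuild md0 5
```

Once it reports no rebuild in progress, the member map in /proc/mdstat is still worth a look by eye before restarting anything.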

Leave work early and make the boss do Helpdesk, life is good
 
  

