Old 07-03-2009, 12:36 PM   #1
LQ Newbie
Registered: Jul 2009
Posts: 2

Rep: Reputation: 0
md device failure (help!)

Hi all,

my md array just crapped out on me. I'm partly responsible, since one of the devices in the RAID5 array died some time ago and I neglected to replace it, but I don't think that's the whole problem now.

When I assemble the array I get the following:
root@server:~# mdadm --assemble --verbose /dev/md0 /dev/sdb1 /dev/sdc1 /dev/sdd1
mdadm: looking for devices for /dev/md0
mdadm: /dev/sdb1 is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sdc1 is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot 3.
mdadm: no uptodate device for slot 0 of /dev/md0
mdadm: added /dev/sdb1 to /dev/md0 as 1
mdadm: added /dev/sdd1 to /dev/md0 as 3
mdadm: added /dev/sdc1 to /dev/md0 as 2
mdadm: /dev/md0 assembled from 2 drives - not enough to start the array.

(slot 0 is the long-dead drive)
The output of "mdadm --examine" for two of the drives (sdc & sdd) is similar and looks like this:
State: Clean
Active Devices: 2
Working Devices: 2
Failed Devices: 1
Events: 1923796

while the output for sdb looks different:
State: active
Active Devices: 3
Working Devices: 3
Failed Devices: 0
Events: 1923787

Note the difference in the Events counter and the state. My guess is that sdb is out of sync with the rest.
I tried "mdadm --assemble --force --update=summaries" to bring the stray Events counter up to date, per a recommendation I saw in a forum, but the command segfaults.
I tried strace-ing it, and it faults right after reading 4K of data from /dev/sdb1.
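For reference, this is how I lined the counters up side by side (just a quick loop over the members):

```shell
# print each member's event counter so the stale one stands out
for d in /dev/sdb1 /dev/sdc1 /dev/sdd1; do
    echo "$d: $(mdadm --examine "$d" | grep -i 'events')"
done
```

That's where the 1923787 vs. 1923796 difference shows up.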

To summarize: I'm not sure what to do next. I've read in forums that I should try to re-create the array, but I fear that will completely destroy the data (I'm not sure what creating an array from previously-arrayed disks actually does).

Any help will be appreciated, really!


-- Shachar
Old 07-03-2009, 06:45 PM   #2
Registered: May 2006
Location: BE
Distribution: Debian/Gentoo
Posts: 412

Rep: Reputation: 48
Well, for a start, if you have the space, dd each disk to make sure you have a backup. That way, if something does go wrong, you can always get back to the current point in time.
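Something along these lines (the destination path is just a placeholder for wherever you mount the big spare disk):

```shell
# image each member before touching the array; conv=noerror,sync keeps
# going past read errors and pads the bad blocks instead of aborting
for d in sdb sdc sdd; do
    dd if="/dev/$d" of="/mnt/backup/$d.img" bs=1M conv=noerror,sync \
        || echo "could not image $d"
done
```

You can later dd an image straight back onto a disk if an experiment goes wrong.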

Did you put a new disk in the RAID and try to rebuild it, or are you still trying all of this with the failed disk?

Can you not see the contents of your RAID? It should still work when only one disk fails.
Old 07-04-2009, 04:08 AM   #3
LQ Newbie
Registered: Jul 2009
Posts: 2

Original Poster
Rep: Reputation: 0
I am planning to go and buy a big disk to dd all the block devices onto it before making any changes.

But - as I said, this is not the first disk failure. I had a previous failure and didn't replace it.

I cannot see the contents of the RAID array since it won't start with two disks (out of four). However, I'm not sure this is really a disk failure. From what I can tell, the array somehow managed to keep writing to two of the three disks while one was left behind and marked faulty, even though I don't see any read/write errors on that disk.

My question is: what can be done to "mark" this disk as fine, with the same Events count, so I can start the array, even at the cost of minor data loss?

Also - I found a post somewhere saying that "mdadm --build /dev/md0 --chunk-size=64 --raid-level=5 --devices /dev/sdb1 /dev/sdc1 /dev/sdd1 missing" worked for someone recovering from a similar (but not identical) condition. Does "build" destroy data, or does it just reset the md superblock metadata? Will my logical volumes survive this?

Old 07-06-2009, 02:08 AM   #4
Registered: May 2006
Location: BE
Distribution: Debian/Gentoo
Posts: 412

Rep: Reputation: 48
Sorry for the delay in answering.

You should back up your disks to another one using dd. That way, you have more than one go at getting your data back.

I suggest you read the man pages to make sure you understand what each mdadm option does.
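If you do end up re-creating, what people usually mean is --create with --assume-clean, not --build (as far as I know, --build only handles superblock-less legacy levels and doesn't do RAID5). A sketch, assuming your geometry (4 devices, the dead one in slot 0) and the 64k chunk size from the post you quoted; verify the chunk size and device order against your --examine output first:

```shell
# DANGEROUS: run only after imaging every disk with dd, and with the
# device order exactly matching the old slots. --assume-clean skips
# the initial resync, so mdadm rewrites only the superblocks and
# leaves the data blocks alone -- but level, chunk size, and device
# order must all match the original array. "missing" holds the place
# of the dead slot-0 drive.
mdadm --create /dev/md0 --level=5 --raid-devices=4 --chunk=64 \
      --assume-clean missing /dev/sdb1 /dev/sdc1 /dev/sdd1

# then sanity-check read-only before mounting read-write:
#   fsck -n /dev/md0      (or vgscan && lvscan if LVM sits on top)
```

If the geometry is wrong, the array will assemble but the data will look like garbage; that's why imaging the disks first matters so much.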

Best of luck getting your data back. You should have had backups, and you should have replaced the disk as soon as it failed, or at least had a spare disk that would have started rebuilding the RAID as soon as there was a failure. A good option for you might have been RAID6.


