Old 07-25-2009, 08:51 PM   #1
ufmale
Member
 
Registered: Feb 2007
Posts: 385

Rep: Reputation: 30
mdadm question. Disk failed 3 out of 5.


I set up a RAID 5 or 6 (I'm not sure which) across five disks using mdadm on Red Hat.

I just found out that the disks failed quite some time ago.
I guess the data are probably lost, but I want to check with the experts here who may be able to help me retrieve it.

I am fairly sure the disks themselves have not gone bad, since this has happened before:
if I take the "failed" disks and reformat them, I can still use them again. I think the problem is that mdadm marked the disks as failed somehow.

Code:
[root@eh3 /]# /sbin/mdadm --assemble /dev/md1 /dev/sd[g-k]1
mdadm: /dev/md1 assembled from 2 drives - not enough to start the array.
[root@eh3 /]# cat /proc/mdstat
Personalities :
md1 : inactive sdh1[3] sdi1[4] sdg1[1]
      1465151808 blocks
unused devices: <none>
Above is what is shown after I reboot the machine and try to assemble the array. Can someone tell me which disks have failed?
Are they sdh1[3], sdi1[4], and sdg1[1]?

Is there any way I can get the data off these disks?
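A rough key to the /proc/mdstat notation above (a sketch of how md reports arrays; the device names are just the ones from this output):

Code:
# md1 : inactive sdh1[3] sdi1[4] sdg1[1]
#
# "inactive"     - too few members were found to start the array
# "sdh1[3]" etc. - the number in brackets is md's device number for that
#                  member (for an active member it matches its slot); it
#                  is not a failure marker
# a member the kernel has failed is flagged (F), e.g. "sdh1[3](F)"
#
# The RAID level (5 vs 6) is recorded in each member's superblock:
mdadm --examine /dev/sdg1 | grep 'Raid Level'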
 
Old 07-26-2009, 02:47 PM   #2
esaym
Member
 
Registered: Nov 2006
Distribution: Lots of Debian
Posts: 165

Rep: Reputation: 32
What do "fdisk -l" and "for i in `ls /dev/sd[g-k]`; do smartctl -a $i; done"
show? It sounds like the drives aren't even plugged in...
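A runnable version of those checks, as a sketch (it assumes the five members are still /dev/sdg through /dev/sdk):

Code:
fdisk -l                      # does the kernel see all five disks and their partitions?
for i in /dev/sd[g-k]; do     # SMART identity and health for each member disk
    smartctl -a "$i"
done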
 
Old 07-27-2009, 12:51 PM   #3
ufmale
Member
 
Registered: Feb 2007
Posts: 385

Original Poster
Rep: Reputation: 30
sd[g-k]1 used to be the 5 disks in md1. After the reboot, I used "fdisk -l" to check the disks. I can see all of them, but mdadm refuses to assemble the array.
 
Old 07-27-2009, 02:25 PM   #4
dxangel
Member
 
Registered: Nov 2008
Location: London, UK
Distribution: CentOS, RedHat, Ubuntu
Posts: 79

Rep: Reputation: 18
what does mdadm --detail show?
 
Old 07-28-2009, 09:18 AM   #5
ufmale
Member
 
Registered: Feb 2007
Posts: 385

Original Poster
Rep: Reputation: 30
Quote:
Originally Posted by dxangel View Post
what does mdadm --detail show?
It would not show any info since mdadm refused to assemble the disks.

Code:
[root@eh3 ~]# /sbin/mdadm --detail
mdadm: No devices given.
[root@eh3 ~]# /sbin/mdadm --detail /dev/md0
mdadm: md device /dev/md0 does not appear to be active.
[root@eh3 ~]# /sbin/mdadm --detail /dev/md1
mdadm: md device /dev/md1 does not appear to be active.
[root@eh3 ~]#
 
Old 07-30-2009, 08:38 AM   #6
esaym
Member
 
Registered: Nov 2006
Distribution: Lots of Debian
Posts: 165

Rep: Reputation: 32
Try this:
mdadm --examine /dev/sd[g-k]1

You might be able to force it to start with:

/sbin/mdadm --assemble --force /dev/md1 /dev/sd[g-k]1
or
mdadm --assemble --scan --force

The run option might do something, not sure:
/sbin/mdadm --assemble --run --force /dev/md1 /dev/sd[g-k]1


There are a lot of options for assemble mode; --force should do most of the work, though.
http://man-wiki.net/index.php/8:mdadm
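If one of the forced assembles does get the array started, something like this is a sensible next step before trusting it (a sketch; /dev/md1 and the /mnt mount point are assumptions):

Code:
cat /proc/mdstat              # is md1 now active, even if degraded?
mdadm --detail /dev/md1       # per-member state and event counts
mount -o ro /dev/md1 /mnt     # mount read-only first and copy the data off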

Post back results please
 
Old 08-03-2009, 06:03 PM   #7
ufmale
Member
 
Registered: Feb 2007
Posts: 385

Original Poster
Rep: Reputation: 30
Forced assembly does not work. It said 3 disks are not sufficient to assemble the array.

Here is the result of --examine. I have no idea what it means.

Code:
/sbin/mdadm --examine /dev/sd[g-k]1
/dev/sdg1:
Magic : a92b4efc
Version : 00.90.00
UUID : 9a108dd8:d8fd3620:df7c4ee0:aaa5350e
Creation Time : Tue Mar 24 19:30:08 2009
Raid Level : raid5
Device Size : 488383936 (465.76 GiB 500.11 GB)
Raid Devices : 5
Total Devices : 4
Preferred Minor : 1

Update Time : Thu Jun 4 08:04:34 2009
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 1
Spare Devices : 0
Checksum : 5774d686 - correct
Events : 0.391714

Layout : left-symmetric
Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8      129        1      active sync   /dev/sdi1
   0     0       8       97        0      active sync   /dev/sdg1
   1     1       8      129        1      active sync   /dev/sdi1
   2     2       0        0        2      faulty removed
   3     3       8      145        3      active sync
   4     4       8      161        4      active sync
/dev/sdh1:
Magic : a92b4efc
Version : 00.90.00
UUID : 9a108dd8:d8fd3620:df7c4ee0:aaa5350e
Creation Time : Tue Mar 24 19:30:08 2009
Raid Level : raid5
Device Size : 488383936 (465.76 GiB 500.11 GB)
Raid Devices : 5
Total Devices : 4
Preferred Minor : 1

Update Time : Mon Jun 15 13:22:36 2009
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 3
Spare Devices : 0
Checksum : 5783a4d7 - correct
Events : 0.391714

Layout : left-symmetric
Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8      145        3      active sync
   0     0       8       97        0      active sync   /dev/sdg1
   1     1       0        0        1      faulty removed
   2     2       0        0        2      faulty removed
   3     3       8      145        3      active sync
   4     4       8      161        4      active sync
/dev/sdi1:
Magic : a92b4efc
Version : 00.90.00
UUID : 9a108dd8:d8fd3620:df7c4ee0:aaa5350e
Creation Time : Tue Mar 24 19:30:08 2009
Raid Level : raid5
Device Size : 488383936 (465.76 GiB 500.11 GB)
Raid Devices : 5
Total Devices : 4
Preferred Minor : 1

Update Time : Mon Jun 15 13:22:36 2009
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 3
Spare Devices : 0
Checksum : 5783a4e9 - correct
Events : 0.391714

Layout : left-symmetric
Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     4       8      161        4      active sync
   0     0       8       97        0      active sync   /dev/sdg1
   1     1       0        0        1      faulty removed
   2     2       0        0        2      faulty removed
   3     3       8      145        3      active sync
   4     4       8      161        4      active sync
[root@eh3 /]#
 
Old 08-03-2009, 06:19 PM   #8
esaym
Member
 
Registered: Nov 2006
Distribution: Lots of Debian
Posts: 165

Rep: Reputation: 32
That is only showing 3 disks; you are missing /dev/sdj1 and /dev/sdk1. Are the others even plugged in?

What does smartctl -a /dev/sdj && smartctl -a /dev/sdk say?
Also check cat /proc/partitions.
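Spelled out as a sketch (/dev/sdj and /dev/sdk are the two members that did not show up above):

Code:
smartctl -a /dev/sdj          # SMART data - proves the drive is at least answering
smartctl -a /dev/sdk
cat /proc/partitions          # every partition the kernel currently knows about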
 
Old 08-10-2009, 09:55 AM   #9
ufmale
Member
 
Registered: Feb 2007
Posts: 385

Original Poster
Rep: Reputation: 30
Quote:
Originally Posted by esaym View Post
That is only showing 3 disks; you are missing /dev/sdj1 and /dev/sdk1. Are the others even plugged in?

What does smartctl -a /dev/sdj && smartctl -a /dev/sdk say?
Also check cat /proc/partitions.
You are right... I cannot believe I missed that.
Here is what I did next, after rebooting the machine and making sure
all the disks are seen by fdisk.


Code:
[root@eh3]# /sbin/mdadm --examin /dev/sd[b-f]1
/dev/sdb1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 9a108dd8:d8fd3620:df7c4ee0:aaa5350e
  Creation Time : Tue Mar 24 19:30:08 2009
     Raid Level : raid5
    Device Size : 488383936 (465.76 GiB 500.11 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0

    Update Time : Mon Aug 10 09:50:50 2009
          State : clean
 Active Devices : 3
Working Devices : 4
 Failed Devices : 3
  Spare Devices : 1
       Checksum : 5f428037 - correct
         Events : 0.62953324

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     0       8       17        0      active sync   /dev/sdb1
   0     0       8       17        0      active sync   /dev/sdb1
   1     1       0        0        1      faulty removed
   2     2       0        0        2      faulty removed
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       33        2      spare   /dev/sdc1
/dev/sdc1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 9a108dd8:d8fd3620:df7c4ee0:aaa5350e
  Creation Time : Tue Mar 24 19:30:08 2009
     Raid Level : raid5
    Device Size : 488383936 (465.76 GiB 500.11 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0

    Update Time : Mon Aug 10 09:50:50 2009
          State : clean
 Active Devices : 3
Working Devices : 4
 Failed Devices : 3
  Spare Devices : 1
       Checksum : 5f42804e - correct
         Events : 0.62953324

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     5       8       33        5      spare   /dev/sdc1
   0     0       8       17        0      active sync   /dev/sdb1
   1     1       0        0        1      faulty removed
   2     2       0        0        2      faulty removed
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       33        5      spare   /dev/sdc1
/dev/sdd1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 9a108dd8:d8fd3620:df7c4ee0:aaa5350e
  Creation Time : Tue Mar 24 19:30:08 2009
     Raid Level : raid5
    Device Size : 488383936 (465.76 GiB 500.11 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0

    Update Time : Sat Aug  8 19:51:12 2009
          State : clean
 Active Devices : 4
Working Devices : 5
 Failed Devices : 1
  Spare Devices : 1
       Checksum : 57d0a1b8 - correct
         Events : 0.748650

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     1       8       49        1      active sync   /dev/sdd1
   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       49        1      active sync   /dev/sdd1
   2     2       0        0        2      faulty removed
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       33        5      spare   /dev/sdc1
/dev/sde1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 9a108dd8:d8fd3620:df7c4ee0:aaa5350e
  Creation Time : Tue Mar 24 19:30:08 2009
     Raid Level : raid5
    Device Size : 488383936 (465.76 GiB 500.11 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0

    Update Time : Mon Aug 10 09:50:50 2009
          State : clean
 Active Devices : 3
Working Devices : 4
 Failed Devices : 3
  Spare Devices : 1
       Checksum : 5f42806f - correct
         Events : 0.62953325

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8       65        3      active sync   /dev/sde1
   0     0       8       17        0      active sync   /dev/sdb1
   1     1       0        0        1      faulty removed
   2     2       0        0        2      faulty removed
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       33        2      spare   /dev/sdc1
/dev/sdf1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 9a108dd8:d8fd3620:df7c4ee0:aaa5350e
  Creation Time : Tue Mar 24 19:30:08 2009
     Raid Level : raid5
    Device Size : 488383936 (465.76 GiB 500.11 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0

    Update Time : Mon Aug 10 09:50:50 2009
          State : clean
 Active Devices : 3
Working Devices : 4
 Failed Devices : 3
  Spare Devices : 1
       Checksum : 5f428083 - correct
         Events : 0.62953326

         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     4       8       81        4      active sync   /dev/sdf1
   0     0       8       17        0      active sync   /dev/sdb1
   1     1       0        0        1      faulty removed
   2     2       0        0        2      faulty removed
   3     3       8       65        3      active sync   /dev/sde1
   4     4       8       81        4      active sync   /dev/sdf1
   5     5       8       33        2      spare   /dev/sdc1
Also, which disks have actually failed here? Is it sdd1,
or sdc1 and sdf1? I have some spare disks that I can add to the RAID.
Code:
[root@eh3]# cat /proc/mdstat
Personalities : [raid5]
md0 : active raid5 sdb1[0] sdc1[5] sdf1[4] sde1[3] sdd1[6](F)
      1953535744 blocks level 5, 64k chunk, algorithm 2 [5/3] [U__UU]

unused devices: <none>
[root@eh3]#
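If the array stays up, the usual way to swap one of those spare disks in for a dead member is something like this (a sketch; /dev/sdX is a placeholder for the replacement disk):

Code:
sfdisk -d /dev/sdb | sfdisk /dev/sdX    # copy the partition layout from a healthy member
mdadm /dev/md0 --add /dev/sdX1          # add the new partition; md rebuilds onto it
                                        # once enough members are active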

Last edited by ufmale; 08-10-2009 at 09:56 AM.
 
Old 08-10-2009, 02:01 PM   #10
esaym
Member
 
Registered: Nov 2006
Distribution: Lots of Debian
Posts: 165

Rep: Reputation: 32
Looks like the array changed from /dev/md1 /dev/sd[g-k]1
to /dev/md0 /dev/sd[b-f]1 ?

Yes:

Quote:
md0 : active raid5 sdb1[0] sdc1[5] sdf1[4] sde1[3] sdd1[6](F)
1953535744 blocks level 5, 64k chunk, algorithm 2 [5/3] [U__UU]
That shows 5 disks with 2 missing ("[5/3] [U__UU]"); disk 6, "sdd1[6](F)", is failed. Re-add it with


mdadm --add /dev/md0 /dev/sdd1

I take it that this is a 4 disk array with one spare?
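As a sketch (assuming the failed member really is /dev/sdd1 and the array is the md0 shown in that mdstat output):

Code:
mdadm /dev/md0 --remove /dev/sdd1     # clear the failed slot if it is still listed
mdadm /dev/md0 --add /dev/sdd1        # it comes back as a spare; md only rebuilds
                                      # onto it if enough other members are active
watch cat /proc/mdstat                # rebuild progress, if any, shows up here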

Last edited by esaym; 08-10-2009 at 05:11 PM.
 
Old 08-10-2009, 02:43 PM   #11
ufmale
Member
 
Registered: Feb 2007
Posts: 385

Original Poster
Rep: Reputation: 30
Quote:
Originally Posted by esaym View Post
Looks like the array changed from /dev/md1 /dev/sd[g-k]1
to /dev/md0 /dev/sd[b-f]1 ?

Yes:



That shows 5 disks with 2 missing ("[5/3] [U__UU]"); disk 6, "sdd1[6](F)", is failed. Re-add it with


mdadm --add /dev/md0 /dev/sdd1

I take it that this is a 4 disk array with one spare?

I set up 2 RAID arrays, md0 and md1. Both of them are currently failed,
and I am trying to fix md0 first.

I did --remove /dev/sdd1 and then --add /dev/sdd1 back, but it still shows
[U__UU]. However, the "sdd1[6](F)" is gone; it just shows "sdd1[5]" now, without the "(F)". How can we tell which disks are failed?
Code:
# cat /proc/mdstat
Personalities : [raid5]
md0 : active raid5 sdd1[5] sdb1[0] sdc1[6] sdf1[4] sde1[3]
      1953535744 blocks level 5, 64k chunk, algorithm 2 [5/3] [U__UU]
By the way, this is a 5-disk array set up as RAID 5.

Last edited by ufmale; 08-10-2009 at 02:45 PM.
 
Old 08-10-2009, 03:21 PM   #12
ufmale
Member
 
Registered: Feb 2007
Posts: 385

Original Poster
Rep: Reputation: 30
Quote:
Originally Posted by ufmale View Post
I set up 2 RAID arrays, md0 and md1. Both of them are currently failed,
and I am trying to fix md0 first.

I did --remove /dev/sdd1 and then --add /dev/sdd1 back, but it still shows
[U__UU]. However, the "sdd1[6](F)" is gone; it just shows "sdd1[5]" now, without the "(F)". How can we tell which disks are failed?
Code:
# cat /proc/mdstat
Personalities : [raid5]
md0 : active raid5 sdd1[5] sdb1[0] sdc1[6] sdf1[4] sde1[3]
      1953535744 blocks level 5, 64k chunk, algorithm 2 [5/3] [U__UU]
By the way, this is a 5-disk array set up as RAID 5.

I rebooted the machine again and tried to reassemble it:


Code:
# /sbin/mdadm --assemble --update=summaries  --force /dev/md0 /dev/sd[b-f]1
mdadm: /dev/md0 assembled from 3 drives and 2 spares - not enough to start the array.
# /sbin/mdadm --stop /dev/md0
[root@evvspeech3 charoe]# /sbin/mdadm --assemble --update=super-minor --run --force /dev/md0 /dev/sd[b-f]1
mdadm: failed to RUN_ARRAY /dev/md0: Invalid argument
# cat /proc/mdstat
Personalities : [raid5]
md0 : inactive sdb1[0] sdc1[6] sdd1[5] sdf1[4] sde1[3]
      2441919680 blocks
unused devices: <none>
I am really confused about the spare disks. I don't remember setting up any disk as a spare. Does mdadm assign them automatically?

Last edited by ufmale; 08-10-2009 at 03:23 PM.
 
Old 08-10-2009, 05:22 PM   #13
esaym
Member
 
Registered: Nov 2006
Distribution: Lots of Debian
Posts: 165

Rep: Reputation: 32
Yes, I think spares are handled automatically. If the array is working right and you add another drive to it with --add, it will be added as a spare. I don't know why you are now showing 2 spares. The only way to see spares is with /sbin/mdadm --examine /dev/sd[b-f]1. In your last post sdc was the spare. I guess you could try to add sdc and sdd back and try to reassemble. Any reason for the "--update=super-minor"? That updates the superblock of each drive, and the superblock is the only place where the array info is stored, so if that gets messed up....
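One way to see what each member's superblock currently claims (a sketch; device names as in this thread, and the grep pattern is just a convenience):

Code:
mdadm --examine /dev/sd[b-f]1 | egrep '^/dev/|Update Time|Events|^this'
# "this" is the role the disk claims for itself (active slot vs. spare);
# members whose Events count or Update Time lag behind the others are the
# ones --assemble refuses to use without --force.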
 
Old 08-11-2009, 03:21 PM   #14
ufmale
Member
 
Registered: Feb 2007
Posts: 385

Original Poster
Rep: Reputation: 30
Quote:
Originally Posted by esaym View Post
Yes, I think spares are handled automatically. If the array is working right and you add another drive to it with --add, it will be added as a spare. I don't know why you are now showing 2 spares. The only way to see spares is with /sbin/mdadm --examine /dev/sd[b-f]1. In your last post sdc was the spare. I guess you could try to add sdc and sdd back and try to reassemble. Any reason for the "--update=super-minor"? That updates the superblock of each drive, and the superblock is the only place where the array info is stored, so if that gets messed up....

I wasn't sure what I was doing; I just tried different options, including "--update=super-minor". Now, after rebooting a couple of times, it does not seem to assemble at all. I checked with fdisk and saw that all the disks are there.
I tried different things: taking out one disk at a time, or taking sdc and sdd out and putting them back. Nothing works.

Code:
# /sbin/mdadm --assemble --force /dev/md0 /dev/sd[b-f]1
mdadm: /dev/md0 assembled from 0 drives and 1 spare - not enough to start the array.
Any more suggestions that I can try?
 
Old 08-11-2009, 03:48 PM   #15
esaym
Member
 
Registered: Nov 2006
Distribution: Lots of Debian
Posts: 165

Rep: Reputation: 32
Quote:
Originally Posted by ufmale View Post
I wasn't sure what I was doing; I just tried different options, including "--update=super-minor". Now, after rebooting a couple of times, it does not seem to assemble at all. I checked with fdisk and saw that all the disks are there.
I tried different things: taking out one disk at a time, or taking sdc and sdd out and putting them back. Nothing works.

Code:
# /sbin/mdadm --assemble --force /dev/md0 /dev/sd[b-f]1
mdadm: /dev/md0 assembled from 0 drives and 1 spare - not enough to start the array.
Any more suggestions that I can try?
It looks dead to me
 
  

