LinuxQuestions.org
Old 10-08-2012, 05:51 PM   #1
artagel
LQ Newbie
 
Registered: Dec 2009
Posts: 6

Rep: Reputation: 0
Getting raid6 array to mount


Hi,
So I have a raid6 array of 22 drives. A drive went bad a couple of days ago, and this morning, as I was about to remove it, another one failed. In my system it's very hard to tell which physical drive is the bad one (design flaw), so I pulled the one I thought it was. It was the wrong one, so I put it back in (yes, I know, that was dumb...). At that point the array was showing up as failed and the drive was showing up as a spare. I rebooted the system hoping it would reassemble the array properly. Each time, it would start the array but refuse to run it because a drive was missing. The missing drive was /dev/sdc, which I could add back with mdadm --add /dev/md0 /dev/sdc, but whenever I did, it always came back as a spare.

I noticed that the event count on sdc was off: sdc reported a total of 23 drives (it had become the spare), while the other drives had newer superblocks and reported 19 of 22 drives. Every time I tried to assemble I got this:
Code:
gigantor:~# mdadm --assemble --force --scan --verbose
mdadm: looking for devices for /dev/md0
mdadm: cannot open device /dev/sdu: Device or resource busy
mdadm: /dev/sdu has wrong uuid.
mdadm: /dev/sdt is identified as a member of /dev/md0, slot 21.
mdadm: /dev/sds is identified as a member of /dev/md0, slot 20.
mdadm: /dev/sdr is identified as a member of /dev/md0, slot 18.
mdadm: /dev/sdq is identified as a member of /dev/md0, slot 17.
mdadm: /dev/sdp is identified as a member of /dev/md0, slot 16.
mdadm: /dev/sdo is identified as a member of /dev/md0, slot 13.
mdadm: /dev/sdn is identified as a member of /dev/md0, slot 12.
mdadm: /dev/sdm is identified as a member of /dev/md0, slot 10.
mdadm: /dev/sdl is identified as a member of /dev/md0, slot 9.
mdadm: /dev/sdk is identified as a member of /dev/md0, slot 4.
mdadm: /dev/sdj is identified as a member of /dev/md0, slot 15.
mdadm: /dev/sdi is identified as a member of /dev/md0, slot 5.
mdadm: /dev/sdh is identified as a member of /dev/md0, slot 6.
mdadm: /dev/sdg is identified as a member of /dev/md0, slot 19.
mdadm: /dev/sdf is identified as a member of /dev/md0, slot 7.
mdadm: /dev/sde is identified as a member of /dev/md0, slot 8.
mdadm: /dev/sdd is identified as a member of /dev/md0, slot 14.
mdadm: /dev/sdc is identified as a member of /dev/md0, slot 22.
mdadm: /dev/sdb is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sda is identified as a member of /dev/md0, slot 0.
mdadm: added /dev/sdb to /dev/md0 as 1
mdadm: no uptodate device for slot 2 of /dev/md0
mdadm: no uptodate device for slot 3 of /dev/md0
mdadm: added /dev/sdk to /dev/md0 as 4
mdadm: added /dev/sdi to /dev/md0 as 5
mdadm: added /dev/sdh to /dev/md0 as 6
mdadm: added /dev/sdf to /dev/md0 as 7
mdadm: added /dev/sde to /dev/md0 as 8
mdadm: added /dev/sdl to /dev/md0 as 9
mdadm: added /dev/sdm to /dev/md0 as 10
mdadm: no uptodate device for slot 11 of /dev/md0
mdadm: added /dev/sdn to /dev/md0 as 12
mdadm: added /dev/sdo to /dev/md0 as 13
mdadm: added /dev/sdd to /dev/md0 as 14
mdadm: added /dev/sdj to /dev/md0 as 15
mdadm: added /dev/sdp to /dev/md0 as 16
mdadm: added /dev/sdq to /dev/md0 as 17
mdadm: added /dev/sdr to /dev/md0 as 18
mdadm: added /dev/sdg to /dev/md0 as 19
mdadm: added /dev/sds to /dev/md0 as 20
mdadm: added /dev/sdt to /dev/md0 as 21
mdadm: added /dev/sdc to /dev/md0 as 22
mdadm: added /dev/sda to /dev/md0 as 0
mdadm: /dev/md0 assembled from 19 drives and 1 spare - not enough to start the array.
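The event-count mismatch described above is what keeps sdc out: mdadm drops members whose superblock event counter lags the rest, and --add on a failed array only registers the disk as a spare. A minimal sketch for comparing the counters, assuming `mdadm --examine` output contains an "Events : N" line (as the mdadm -D report elsewhere in this thread does):

```shell
# Sketch, assuming an "Events : N" line in `mdadm --examine` output.
# Members whose counter lags the others have stale superblocks.
events_of() {
    # expects --examine text on stdin, prints just the event counter
    awk '/Events/ {print $NF}'
}

# Usage on a live system (requires root; device glob is an example):
#   for d in /dev/sd[a-t]; do
#       printf '%s %s\n' "$d" "$(mdadm --examine "$d" | events_of)"
#   done
```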
It appears all my superblocks are OK; it's just that I can't get the array to recognize that sdc belongs in it at the proper slot. After much googling and experimentation, being careful not to try anything destructive like zeroing superblocks, I finally tried to recreate the array.

Code:
gigantor:/tmp/src/mdadm-3.2.5# mdadm --create /dev/md0 --assume-clean --level=6 --chunk=16 --metadata=0.90 --uuid=3cd93aff:18032678:261503f8:d1eb9e65 --raid-devices=22 /dev/sd[a-t] missing missing
mdadm: /dev/sda appears to be part of a raid array:
    level=raid6 devices=22 ctime=Sun Dec  6 22:25:51 2009
mdadm: /dev/sdb appears to be part of a raid array:
    level=raid6 devices=22 ctime=Sun Dec  6 22:25:51 2009
mdadm: /dev/sdc appears to be part of a raid array:
    level=raid6 devices=22 ctime=Sun Dec  6 22:25:51 2009
mdadm: /dev/sdd appears to be part of a raid array:
    level=raid6 devices=22 ctime=Sun Dec  6 22:25:51 2009
mdadm: /dev/sde appears to be part of a raid array:
    level=raid6 devices=22 ctime=Sun Dec  6 22:25:51 2009
mdadm: partition table exists on /dev/sde but will be lost or
       meaningless after creating array
mdadm: /dev/sdf appears to be part of a raid array:
    level=raid6 devices=22 ctime=Sun Dec  6 22:25:51 2009
mdadm: /dev/sdg appears to be part of a raid array:
    level=raid6 devices=22 ctime=Sun Dec  6 22:25:51 2009
mdadm: /dev/sdh appears to be part of a raid array:
    level=raid6 devices=22 ctime=Sun Dec  6 22:25:51 2009
mdadm: /dev/sdi appears to be part of a raid array:
    level=raid6 devices=22 ctime=Sun Dec  6 22:25:51 2009
mdadm: /dev/sdj appears to be part of a raid array:
    level=raid6 devices=22 ctime=Sun Dec  6 22:25:51 2009
mdadm: /dev/sdk appears to be part of a raid array:
    level=raid6 devices=22 ctime=Sun Dec  6 22:25:51 2009
mdadm: /dev/sdl appears to be part of a raid array:
    level=raid6 devices=22 ctime=Sun Dec  6 22:25:51 2009
mdadm: /dev/sdm appears to be part of a raid array:
    level=raid6 devices=22 ctime=Sun Dec  6 22:25:51 2009
mdadm: /dev/sdn appears to be part of a raid array:
    level=raid6 devices=22 ctime=Sun Dec  6 22:25:51 2009
mdadm: /dev/sdo appears to be part of a raid array:
    level=raid6 devices=22 ctime=Sun Dec  6 22:25:51 2009
mdadm: /dev/sdp appears to be part of a raid array:
    level=raid6 devices=22 ctime=Sun Dec  6 22:25:51 2009
mdadm: /dev/sdq appears to be part of a raid array:
    level=raid6 devices=22 ctime=Sun Dec  6 22:25:51 2009
mdadm: /dev/sdr appears to be part of a raid array:
    level=raid6 devices=22 ctime=Sun Dec  6 22:25:51 2009
mdadm: /dev/sds appears to be part of a raid array:
    level=raid6 devices=22 ctime=Sun Dec  6 22:25:51 2009
mdadm: /dev/sdt appears to be part of a raid array:
    level=raid6 devices=22 ctime=Sun Dec  6 22:25:51 2009
Continue creating array? y
mdadm: array /dev/md0 started.
gigantor:/tmp/src/mdadm-3.2.5# mdadm -D /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Mon Oct  8 21:15:58 2012
     Raid Level : raid6
     Array Size : 29302769920 (27945.30 GiB 30006.04 GB)
  Used Dev Size : 1465138496 (1397.26 GiB 1500.30 GB)
   Raid Devices : 22
  Total Devices : 20
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Mon Oct  8 21:15:58 2012
          State : clean, degraded 
 Active Devices : 20
Working Devices : 20
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 16K

           UUID : 3cd93aff:18032678:6154c110:0f190746 (local to host gigantor)
         Events : 0.1

    Number   Major   Minor   RaidDevice State
       0       8        0        0      active sync   /dev/sda
       1       8       16        1      active sync   /dev/sdb
       2       8       32        2      active sync   /dev/sdc
       3       8       48        3      active sync   /dev/sdd
       4       8       64        4      active sync   /dev/sde
       5       8       80        5      active sync   /dev/sdf
       6       8       96        6      active sync   /dev/sdg
       7       8      112        7      active sync   /dev/sdh
       8       8      128        8      active sync   /dev/sdi
       9       8      144        9      active sync   /dev/sdj
      10       8      160       10      active sync   /dev/sdk
      11       8      176       11      active sync   /dev/sdl
      12       8      192       12      active sync   /dev/sdm
      13       8      208       13      active sync   /dev/sdn
      14       8      224       14      active sync   /dev/sdo
      15       8      240       15      active sync   /dev/sdp
      16      65        0       16      active sync   /dev/sdq
      17      65       16       17      active sync   /dev/sdr
      18      65       32       18      active sync   /dev/sds
      19      65       48       19      active sync   /dev/sdt
      20       0        0       20      removed
      21       0        0       21      removed
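Recreating with --create --assume-clean, as above, only preserves data when the device order, chunk size, and metadata version exactly match the original array; a wrong order yields a scrambled device that won't mount. One way to double-check order beforehand is to read each member's own slot from its superblock. A sketch, assuming the 0.90 --examine device table ends with a "this" row laid out like the -D table above (Number, Major, Minor, RaidDevice, State):

```shell
# Sketch: extract the RaidDevice (slot) a disk claims for itself from
# `mdadm --examine` output. Assumes a 0.90-style table whose "this" row
# has columns: this Number Major Minor RaidDevice State Device.
slot_of() {
    awk '$1 == "this" {print $5}'
}

# Usage on a live system (requires root; device glob is an example):
#   for d in /dev/sd[a-t]; do
#       echo "$d -> slot $(mdadm --examine "$d" | slot_of)"
#   done
```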
When I tried to mount the array I got:

Code:
gigantor:/etc# mount -a
mount: wrong fs type, bad option, bad superblock on /dev/md0,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so
gigantor:/etc# dmesg | tail
[14304.524311] md0: detected capacity change from 0 to 30006036398080
[14304.525335]  md0: unknown partition table
[14347.702810] FAT: utf8 is not a recommended IO charset for FAT filesystems, filesystem will be case sensitive!
[14347.704070] FAT: bogus number of reserved sectors
[14347.705203] VFS: Can't find a valid FAT filesystem on dev md0.
[14347.705368] qnx4: wrong fsid in superblock.
[14443.038299] FAT: utf8 is not a recommended IO charset for FAT filesystems, filesystem will be case sensitive!
[14443.039588] FAT: bogus number of reserved sectors
[14443.040778] VFS: Can't find a valid FAT filesystem on dev md0.
[14443.041002] qnx4: wrong fsid in superblock.
When I run fsck /dev/md0:

Code:
gigantor:/etc# fsck /dev/md0
fsck from util-linux-ng 2.17.2
fsck.jfs version 1.1.12, 24-Aug-2007
processing started: 10/8/2012 21.37.25
Using default parameter: -p
The current device is:  /dev/md0

The superblock does not describe a correct jfs file system.

If device /dev/md0 is valid and contains a jfs file system,
then both the primary and secondary superblocks are corrupt
and cannot be repaired, and fsck cannot continue.

Otherwise, make sure the entered device /dev/md0 is correct.
fdisk -l /dev/md0 shows:
Code:
gigantor:/tmp/src/mdadm-3.1.1# fdisk -l /dev/md0

Disk /dev/md0: 30006.0 GB, 30006036398080 bytes
2 heads, 4 sectors/track, -1 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 16384 bytes / 327680 bytes
Disk identifier: 0x00000000

Disk /dev/md0 doesn't contain a valid partition table
Although... I think that's probably correct, since the filesystem was made directly on /dev/md0.


cat /proc/mdstat shows:
Code:
gigantor:/etc# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] 
md0 : active (auto-read-only) raid6 sdt[19] sds[18] sdr[17] sdq[16] sdp[15] sdo[14] sdn[13] sdm[12] sdl[11] sdk[10] sdj[9] sdi[8] sdh[7] sdg[6] sdf[5] sde[4] sdd[3] sdc[2] sdb[1] sda[0]
      29302769920 blocks level 6, 16k chunk, algorithm 2 [22/20] [UUUUUUUUUUUUUUUUUUUU__]
      
unused devices: <none>
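The numbers in that output hang together: [22/20] means 20 of 22 members are active, which is exactly the RAID6 floor, and the block count matches 20 data-bearing devices times the per-device size. A quick arithmetic check, using values copied from the output above:

```shell
# Arithmetic sanity check using values from the mdadm -D / mdstat
# output above. RAID6 dedicates 2 devices' worth of space to parity,
# so a 22-device array must have at least 20 members to start and
# holds 20 devices' worth of data.
raid_devices=22
parity=2
min_to_start=$((raid_devices - parity))
echo "minimum members to start: $min_to_start"    # prints 20

used_dev_kib=1465138496            # "Used Dev Size" in 1 KiB blocks
array_kib=$(( (raid_devices - parity) * used_dev_kib ))
echo "array size: $array_kib"      # prints 29302769920, as in /proc/mdstat
```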
Can anyone help point me in the right direction? I'm thinking my data is still there, but perhaps the array isn't being assembled in the correct order. I don't have a backup of this raid since it's over 20TB, and my drive failures had some unfortunate timing... but I know I still have 20 good drives. If I can get the raid running again, I can add 2 new drives and rebuild the parity.

Thanks for any help.
-Dan
 
Old 10-08-2012, 06:59 PM   #2
markie83
LQ Newbie
 
Registered: Dec 2009
Posts: 27

Rep: Reputation: 0
OK, I will preface this by stating that I only have experience with raid-1, but what worked for me (to get the data out when a drive went bad) was to boot off a different hard disk or livecd and let the other distro/livecd auto-discover and auto-assemble the raid (it will usually show up as /dev/md127 when auto-assembled). Mount it, copy your data to another safe place... then go back to the original install and redo your array from scratch.


YMMV, but it has worked for me with raid 1... good luck, you're going to need it.
 
Old 10-09-2012, 05:28 AM   #3
artagel
LQ Newbie
 
Registered: Dec 2009
Posts: 6

Original Poster
Rep: Reputation: 0
Thanks for the advice. I did try to boot from a liveusb today. It didn't auto-find my raid array, but I could assemble it and get it to run degraded like before.
Still couldn't mount it. I tried fsck /dev/md0 again (which wouldn't run on my Debian installation), and it surprisingly ran. At this point I was scared it could mess something up if the drives weren't in the proper order in the array, but once it finished, I was able to mount md0 normally and see all my files.

Rebooted back into Debian and tried to assemble the array, and now it's saying some devices are busy and it can only find 17... odd. Debian's raid is rather confused for some reason. Even when I do mdadm -E /dev/sd?, the "this" line doesn't match up with the actual device. Part of the problem, I suspect, is that my damn drives power up in a different order every time, so they get different letters.

Rebooted one more time... it didn't auto-mount the array. I modified my /etc/mdadm/mdadm.conf slightly to remove the DEVICES information and tried to assemble again; this time it assembled fine.

My mdadm.conf file now looks like this:
Code:
#DEVICES /dev/sd[a-z]
ARRAY /dev/md0 metadata=0.90 UUID=3cd93aff:18032678:6154c110:0f190746
MAILADDR root
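For reference, a hedged sketch of what an mdadm.conf for this array could look like. Note that the keyword documented in mdadm.conf(5) is DEVICE (singular), and "DEVICE partitions" tells mdadm to consider every device in /proc/partitions rather than a fixed glob, which copes with drives that change letters between boots:

```
# Sketch only; ARRAY line copied from the conf above, DEVICE line is
# the documented keyword with the "partitions" shorthand.
DEVICE partitions
ARRAY /dev/md0 metadata=0.90 UUID=3cd93aff:18032678:6154c110:0f190746
MAILADDR root
```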
Rebooted one more time to see if it will come up by itself now. It doesn't... I'll troubleshoot that more later. I can assemble, so I can at least get to my data.

Perhaps my issue the whole time was the DEVICES line? I doubt it... fsck probably did fix something. Either way, I'll be ordering some replacement drives tonight, plus 2 extras. For some reason they don't really sell 7200rpm 1.5TB drives in Seoul...
Thanks for the help.
-Dan
 