This is my first time setting up a RAID array, and I am still learning how to work on Linux, so please excuse me if I have gone and done something horribly wrong...
I have a RAID 10 array split between 4 identical 1TB drives. While I was out of town, I noticed that I could no longer access my data (this is set up as a file server). When I got home, I saw a message from S.M.A.R.T. saying that drive 1 is failing. (I have already sent out for a replacement from Seagate.)
Now the entire array is inactive and I cannot seem to get it to come back up. Whenever I try to mount it using "mount /dev/md0" I get the message "mount: wrong fs type, bad option, bad superblock on /dev/md0..."
I am not sure if this is because of the failing disk (which is still attached) or due to another problem. Can somebody please give me some direction here? I will post the output of mdadm --examine for all 4 drives below.
Code:
badwolf@BadWolfNAS:~$ sudo mdadm --examine /dev/sda
[sudo] password for badwolf:
/dev/sda:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 8a03fe95:57d0e35b:17a1c935:542697cf
Name : BadWolfNAS:0 (local to host BadWolfNAS)
Creation Time : Fri Jun 20 09:30:38 2014
Raid Level : raid10
Raid Devices : 4
Avail Dev Size : 1953263024 (931.39 GiB 1000.07 GB)
Array Size : 1953262592 (1862.78 GiB 2000.14 GB)
Used Dev Size : 1953262592 (931.39 GiB 1000.07 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 1b39271c:0d77ea8a:08aa2f72:f9a290f8
Update Time : Sun Jul 20 13:31:57 2014
Checksum : 18581c63 - correct
Events : 96
Layout : near=2
Chunk Size : 512K
Device Role : Active device 1
Array State : AAAA ('A' == active, '.' == missing)
badwolf@BadWolfNAS:~$ sudo mdadm --examine /dev/sdb
/dev/sdb:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 8a03fe95:57d0e35b:17a1c935:542697cf
Name : BadWolfNAS:0 (local to host BadWolfNAS)
Creation Time : Fri Jun 20 09:30:38 2014
Raid Level : raid10
Raid Devices : 4
Avail Dev Size : 1953263024 (931.39 GiB 1000.07 GB)
Array Size : 1953262592 (1862.78 GiB 2000.14 GB)
Used Dev Size : 1953262592 (931.39 GiB 1000.07 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : active
Device UUID : 6312bd9c:8d390f1b:d0eb7d4f:278baaf1
Update Time : Fri Aug 1 10:44:37 2014
Checksum : ff8870b7 - correct
Events : 58413
Layout : near=2
Chunk Size : 512K
Device Role : Active device 2
Array State : A.AA ('A' == active, '.' == missing)
badwolf@BadWolfNAS:~$ sudo mdadm --examine /dev/sdc
/dev/sdc:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 8a03fe95:57d0e35b:17a1c935:542697cf
Name : BadWolfNAS:0 (local to host BadWolfNAS)
Creation Time : Fri Jun 20 09:30:38 2014
Raid Level : raid10
Raid Devices : 4
Avail Dev Size : 1953263024 (931.39 GiB 1000.07 GB)
Array Size : 1953262592 (1862.78 GiB 2000.14 GB)
Used Dev Size : 1953262592 (931.39 GiB 1000.07 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 1bab37ac:416f4c8c:6f934a3e:cd13fcd6
Update Time : Tue Sep 30 07:48:06 2014
Checksum : 54afaf70 - correct
Events : 144278
Layout : near=2
Chunk Size : 512K
Device Role : Active device 3
Array State : A..A ('A' == active, '.' == missing)
badwolf@BadWolfNAS:~$ sudo mdadm --examine /dev/sdf
/dev/sdf:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 8a03fe95:57d0e35b:17a1c935:542697cf
Name : BadWolfNAS:0 (local to host BadWolfNAS)
Creation Time : Fri Jun 20 09:30:38 2014
Raid Level : raid10
Raid Devices : 4
Avail Dev Size : 1953263024 (931.39 GiB 1000.07 GB)
Array Size : 1953262592 (1862.78 GiB 2000.14 GB)
Used Dev Size : 1953262592 (931.39 GiB 1000.07 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : e2e4aae4:2c2f3339:c15b1bd8:9d2eca89
Update Time : Mon Sep 29 15:13:15 2014
Checksum : 86a7a3d6 - correct
Events : 144270
Layout : near=2
Chunk Size : 512K
Device Role : Active device 0
Array State : A..A ('A' == active, '.' == missing)
badwolf@BadWolfNAS:~$
I have had intermittent issues where the RAID device would not start on boot, but I have always been able to fix it either by restarting or by manually starting it after logging in. All I would like to do is get this functional (without any data loss, of course) until I get the replacement drive. Thanks in advance!
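As a general note, when an md device refuses to mount like this, the first non-destructive checks are usually the kernel's own view of the array; a minimal sketch, assuming the array is /dev/md0 as above:
Code:
# Show which md arrays the kernel currently knows about and the state of their members
cat /proc/mdstat

# Show the assembled array's status, if the array exists at all right now
sudo mdadm --detail /dev/md0

# Look for recent md/disk errors that would explain the failed mount
dmesg | tail -n 50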
Actually, more information would be helpful. What flavor of Linux do you run? How exactly is your RAID set up? Is it a RAID controller that may be accessed via BIOS during POST?
Sorry, I run Ubuntu Server 14.04.1 LTS, and the RAID is a software array that I configured through Webmin. I do not have access to the array during POST.
Quote:
Originally Posted by BadWolf88
I have a RAID 10 array split between 4 identical 1TB drives.
Although you have shown mdadm output for all four drives, the Array State recorded in each superblock degrades step by step (AAAA, then A.AA, then A..A), so the RAID's membership looks unstable rather than the result of a single failure. The last output shows not just one but two non-active disks.
Quote:
Originally Posted by BadWolf88
Now the entire array is inactive and I cannot seem to get it to come back up. Whenever I try to mount it using "mount /dev/md0" I get the message "mount: wrong fs type, bad option, bad superblock on /dev/md0..."
I am not sure if this is because of the failing disk (which is still attached) or due to another problem. Can somebody please give me some direction here? I will post the output of mdadm --examine for all 4 drives below.
Code:
badwolf@BadWolfNAS:~$ sudo mdadm --examine /dev/sda
/dev/sda:
..
Device Role : Active device 1
Array State : AAAA ('A' == active, '.' == missing)
badwolf@BadWolfNAS:~$ sudo mdadm --examine /dev/sdb
/dev/sdb:
..
Device Role : Active device 2
Array State : A.AA ('A' == active, '.' == missing)
badwolf@BadWolfNAS:~$ sudo mdadm --examine /dev/sdc
/dev/sdc:
..
Device Role : Active device 3
Array State : A..A ('A' == active, '.' == missing)
badwolf@BadWolfNAS:~$ sudo mdadm --examine /dev/sdf
/dev/sdf:
..
Device Role : Active device 0
Array State : A..A ('A' == active, '.' == missing)
badwolf@BadWolfNAS:~$
Despite the mdadm output above for sda and sdf, those drives are unknown according to the /proc/mdstat output; instead there is an sdd.
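The superblocks above also carry very different event counters (96, 58413, 144278, 144270), which is the clearest sign the members have drifted apart over time. A small sketch for pulling out just those fields, assuming the same four device names:
Code:
# Compare the event counter, last update time and recorded array state of each member;
# a member whose Events value lags far behind the highest one is stale
for d in /dev/sda /dev/sdb /dev/sdc /dev/sdf; do
    echo "== $d =="
    sudo mdadm --examine "$d" | grep -E 'Events|Update Time|Array State'
done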
Sorry for the late response; it's been a long week. Clearly something has gone awry here. What can I do to rebuild my RAID without losing all of my data? And if I can't rebuild, is there any way to recover the data onto another drive so I can start over? Much of this data does not have a backup, as this was my solution when I ran out of space on my other machines. By the way, in case it's not clear, there should be 5 drives total in this system: a small, independent primary drive that holds the OS and such, plus 4 identical 1TB drives in the RAID 10 configuration.
Also, the last drive listed in the mdadm --examine output is the one that is failing and will be replaced.
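As an aside, before trying anything that rewrites metadata, the conservative move with data that has no backup is to image the members first. A minimal sketch, assuming GNU ddrescue (package gddrescue on Ubuntu) and a hypothetical destination mounted at /mnt/rescue with enough free space:
Code:
# Install GNU ddrescue; the binary is called ddrescue
sudo apt-get install gddrescue

# Image a member to a file; the map file lets the copy be resumed after errors.
# Repeat for each drive you want a safety copy of, especially the failing one.
sudo ddrescue -d -r3 /dev/sda /mnt/rescue/sda.img /mnt/rescue/sda.map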
Please post the output of "mdadm --detail --scan", and for each array device in the result, post the output of
Code:
mdadm --detail <device>
eg:
Code:
# mdadm --detail --scan
ARRAY /dev/md/127_0 metadata=0.90 UUID=ada89324:7a461870:40065847:3426227a
# mdadm --detail /dev/md/127_0
/dev/md/127_0:
Version : 0.90
Creation Time : Wed Apr 29 08:11:33 2009
Raid Level : raid10
Array Size : 2930271744 (2794.52 GiB 3000.60 GB)
Used Dev Size : 1465135872 (1397.26 GiB 1500.30 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 127
Persistence : Superblock is persistent
Update Time : Fri Oct 3 14:48:25 2014
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : near=2
Chunk Size : 256K
UUID : ada89324:7a461870:40065847:3426227a
Events : 0.1504
Number Major Minor RaidDevice State
0 8 17 0 active sync /dev/sdb1
1 8 33 1 active sync /dev/sdc1
2 8 49 2 active sync /dev/sdd1
3 8 65 3 active sync /dev/sde1
I have a feeling your system has split your array into multiple, fragmented pieces: one drive thinks it belongs to its own "4-drive" RAID 10 that is missing the other three, another drive thinks the same about a different fragment, and so on. Hopefully the above commands will clarify things.
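If it does turn out that way, the usual non-destructive first attempt is to stop whatever half-assembled array exists and let mdadm re-read every superblock; a quick sketch, assuming the array is /dev/md0:
Code:
# Stop the partially assembled array (this does not touch on-disk data)
sudo mdadm --stop /dev/md0

# Re-scan all superblocks and try to assemble again, with verbose output
sudo mdadm --assemble --scan --verbose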
Last edited by suicidaleggroll; 10-03-2014 at 03:54 PM.
So I have been trying to research this on my own as well. While doing so, I have rebooted a couple of times and re-run a few things, and I have noticed that some of the outputs have changed, specifically the contents of mdstat. Here they are, just in case it helps make sense of anything.
Am I correct in thinking that the (S) means spare? I shouldn't have any spare drives in this setup, so if that's the case, it is a problem.
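For what it's worth, a device suffixed with (S) in /proc/mdstat is indeed being held as a spare, but an array the kernel could not start tends to list every member that way, so it is not by itself proof that real spares were created; re-checking is cheap:
Code:
# Members shown like "sdX[n](S)" are currently being held as spares by the kernel
cat /proc/mdstat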
The drive that needs to be replaced (according to S.M.A.R.T.) is sda (disk 0).
This may be of use as well...
Code:
badwolf@BadWolfNAS:~$ sudo mdadm --assemble --scan --verbose
mdadm: looking for devices for /dev/md/0
mdadm: /dev/sda has wrong uuid.
mdadm: /dev/sdb has wrong uuid.
mdadm: /dev/sdc has wrong uuid.
mdadm: /dev/sdd has wrong uuid.
mdadm: looking for devices for /dev/md0
mdadm: /dev/sda is identified as a member of /dev/md0, slot 0.
mdadm: /dev/sdb is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sdc is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sdd is identified as a member of /dev/md0, slot 3.
mdadm: added /dev/sda to /dev/md0 as 0 (possibly out of date)
mdadm: added /dev/sdb to /dev/md0 as 1 (possibly out of date)
mdadm: added /dev/sdc to /dev/md0 as 2 (possibly out of date)
mdadm: added /dev/sdd to /dev/md0 as 3
mdadm: /dev/md0 assembled from 1 drive - not enough to start the array.
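Only one member was considered current, which is why assembly gave up. When the remaining members are believed to be basically intact, mdadm can be told to accept out-of-date superblocks with --force; because this rewrites event counters, it is worth imaging the disks (or at least the failing one) first. A minimal sketch, using the device names from the output above; the disk S.M.A.R.T. flagged can be left out of the list if you would rather mdadm not touch it:
Code:
# Stop whatever partial assembly is currently registered
sudo mdadm --stop /dev/md0

# Force assembly from the listed members; mdadm keeps the freshest superblock
# and bumps the stale ones so the array can start (degraded if necessary)
sudo mdadm --assemble --force --verbose /dev/md0 /dev/sda /dev/sdb /dev/sdc /dev/sdd

# If it assembles but refuses to run, this starts it in degraded mode
# sudo mdadm --run /dev/md0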