
LinuxQuestions.org (/questions/)
-   Linux - Software (https://www.linuxquestions.org/questions/linux-software-2/)
-   -   mdadm RAID5 after power outtage (https://www.linuxquestions.org/questions/linux-software-2/mdadm-raid5-after-power-outtage-740429/)

Riallin 07-16-2009 01:32 AM

mdadm RAID5 after power outage
 
I've searched all over the internet, and I may have made the mistake of following one piece of advice after another, which could well have cost me my data...

Here's the story:

A few days ago, my RAID5 array of 3x1TB drives went down due to a power outage. Bootup stalled because the array failed to mount. I hit Ctrl-D to continue booting and took a look at what I was dealing with.

I "mdadm -E" all of my drives, and they are all clean and identical (not a single drive had been marked unclean or removed) so I tried to assemble using "mdadm -A --scan" which led to an array creation degraded.

The original array (md0) was set up as such:
Slot 0 - /dev/sdb1
Slot 1 - /dev/sdc1
Slot 2 - /dev/sda1

However, after this assemble I only had sdc1 and sda1. The odd thing, though, is that instead of coming up as _UU, it came up as U_U.

I hot-added sdb1 and checked /proc/mdstat, and to my surprise it was now UU_ and resyncing sdc1 instead of sdb1. Not only that, but the array was now slotted "sda1 sdb1 sdc1 - resync", which meant my superblock had moved, right?

I couldn't mount, as the superblock was missing (which "made sense" in my mind), so I looked around for how to rearrange the slots. After finding nothing, I let the array resync, hoping that would fix everything.

This ended with the slots still wrong and the array still unmountable due to "wrong fs or superblock missing" errors, so I uninstalled mdadm, moved /etc/mdadm.conf elsewhere, rebooted, and reinstalled mdadm.

I then tried an assemble scan (which didn't work, obviously because I had moved the mdadm.conf file), so I put it back and assembled. Once again, wrong order, impossible to mount.

I then made the horrible mistake of creating a new array, in hopes that this would fix the problem:
mdadm --create -l 5 -n3 -x 0 /dev/md0 /dev/sdb1 /dev/sdc1 /dev/sda1

Which caused an interesting output (which I've lost) that was along the lines of:
mdadm: /dev/sdb1 appears to contain an ext2fs file system
size=1953519872 mtime=[some time]
mdadm: /dev/sdb1 appears to be part of a raid array:
level=raid5 devices=3 ctime=[some time]

This was repeated for only two of the three drives, with a final line stating that the size was assumed to be "976759936", which looked obviously wrong to me. But I didn't really pay attention to the output and just pressed Y.

It went ahead and created the array again, in the wrong order.
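(Recreating on top of an existing array is a last resort, and when it is done it is usually done with --assume-clean, the original metadata version and chunk size, and an explicit device order, checking the result read-only before anything is allowed to resync. A sketch, assuming the original array used the 0.90 metadata, 64K chunk and left-symmetric layout shown in the -E output further down:)

# Recreate without triggering a resync; superblocks are rewritten, data is not
mdadm --create /dev/md0 --assume-clean --level=5 --raid-devices=3 \
      --metadata=0.90 --chunk=64 --layout=left-symmetric \
      /dev/sdb1 /dev/sdc1 /dev/sda1

# Read-only filesystem check; if it fails, stop the array and try another device order
fsck.ext4 -n /dev/md0
mdadm --stop /dev/md0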

So today I decided to try and outsmart it again.

I created a *new* array, this time with a missing slot:
mdadm --create -l 5 -n 3 -x 0 /dev/md1 /dev/sdb1 /dev/sdc1 missing
mdadm: /dev/sdb1 appears to contain an ext2fs file system
size=1953519872K mtime=Tue Jul 7 01:22:10 2009
mdadm: /dev/sdb1 appears to be part of a raid array:
level=raid5 devices=3 ctime=Mon Jul 13 23:13:59 2009
mdadm: /dev/sdc1 appears to be part of a raid array:
level=raid5 devices=3 ctime=Mon Jul 13 23:13:59 2009
Continue creating array? y
mdadm: array /dev/md1 started.

So then I checked mdstat, which showed:
md1 : active raid5 sdc1[1] sdb1[0]
1953519872 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]

Success, right?
Wrong...

mdadm --detail /dev/md1 showed:
clean, degraded, superblock is persistent, array size is 1953519872, used dev size 976759936, and slot order correct.
However, it was still unable to mount: "wrong fs type, bad option, bad superblock on /dev/md1."

So then I added /dev/sda1 to the array (it slotted itself into the correct position, thankfully), and it began to rebuild.

Mount? Nope.

I even tried mounting with all of the alternate superblocks, with no luck.
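(For the record, ext2/3/4 take an alternate superblock through the sb= mount option, and the offset there is counted in 1K units, so the backup at filesystem block 32768 on a 4K-block filesystem is sb=131072. A sketch; the 4K block size is an assumption based on the filesystem size:)

# Mount read-only via the first backup superblock
# (32768 * 4 = 131072, because sb= is expressed in 1K units)
mount -t ext4 -o ro,sb=131072 /dev/md1 /mnt/raid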

I really want to save this array, though I am already through the denial and anger stages, and am now at bargaining/depression.

Any help would be *amazing*, even if it's just telling me how to mount this array (or even just one drive) so I can see what directories are on it. I don't specifically recall everything that was stored there, and if I had a listing I could feasibly recover the data by collecting it all again from other drives. (I understand the data could be full of holes; if so, I'd only want the filenames, not to "save" each file.)

All drives are ext4, created through GParted. I am running stock Mythbuntu (Ubuntu 9.04 Jaunty) with mdadm downloaded separately. I have not run fsck on any of the drives (I couldn't, which is probably a good thing).




Here is some information that will probably be asked for:


root@wakwak:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid5 sda1[3] sdc1[1] sdb1[0]
1953519872 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]
[==>..................] recovery = 13.2% (129647696/976759936) finish=404.4min speed=34907K/sec

unused devices: <none>



root@wakwak:~# mdadm -E /dev/sda1
/dev/sda1:
Magic : a92b4efc
Version : 00.90.00
UUID : 36c49ebd:eca81d19:59b9feb3:f6770ef0 (local to host wakwak)
Creation Time : Thu Jul 16 00:21:36 2009
Raid Level : raid5
Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 1953519872 (1863.02 GiB 2000.40 GB)
Raid Devices : 3
Total Devices : 3
Preferred Minor : 1

Update Time : Thu Jul 16 00:32:08 2009
State : clean
Active Devices : 2
Working Devices : 3
Failed Devices : 1
Spare Devices : 1
Checksum : ebbfba2b - correct
Events : 8

Layout : left-symmetric
Chunk Size : 64K

Number Major Minor RaidDevice State
this 3 8 1 3 spare /dev/sda1

0 0 8 17 0 active sync /dev/sdb1
1 1 8 33 1 active sync /dev/sdc1
2 2 0 0 2 faulty removed
3 3 8 1 3 spare /dev/sda1



root@wakwak:~# mdadm -E /dev/sdb1
/dev/sdb1:
Magic : a92b4efc
Version : 00.90.00
UUID : 36c49ebd:eca81d19:59b9feb3:f6770ef0 (local to host wakwak)
Creation Time : Thu Jul 16 00:21:36 2009
Raid Level : raid5
Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 1953519872 (1863.02 GiB 2000.40 GB)
Raid Devices : 3
Total Devices : 3
Preferred Minor : 1

Update Time : Thu Jul 16 00:32:08 2009
State : clean
Active Devices : 2
Working Devices : 3
Failed Devices : 1
Spare Devices : 1
Checksum : ebbfba3b - correct
Events : 8

Layout : left-symmetric
Chunk Size : 64K

Number Major Minor RaidDevice State
this 0 8 17 0 active sync /dev/sdb1

0 0 8 17 0 active sync /dev/sdb1
1 1 8 33 1 active sync /dev/sdc1
2 2 0 0 2 faulty removed
3 3 8 1 3 spare /dev/sda1



root@wakwak:~# mdadm -E /dev/sdc1
/dev/sdc1:
Magic : a92b4efc
Version : 00.90.00
UUID : 36c49ebd:eca81d19:59b9feb3:f6770ef0 (local to host wakwak)
Creation Time : Thu Jul 16 00:21:36 2009
Raid Level : raid5
Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
Array Size : 1953519872 (1863.02 GiB 2000.40 GB)
Raid Devices : 3
Total Devices : 3
Preferred Minor : 1

Update Time : Thu Jul 16 00:32:08 2009
State : clean
Active Devices : 2
Working Devices : 3
Failed Devices : 1
Spare Devices : 1
Checksum : ebbfba4d - correct
Events : 8

Layout : left-symmetric
Chunk Size : 64K

Number Major Minor RaidDevice State
this 1 8 33 1 active sync /dev/sdc1

0 0 8 17 0 active sync /dev/sdb1
1 1 8 33 1 active sync /dev/sdc1
2 2 0 0 2 faulty removed
3 3 8 1 3 spare /dev/sda1



root@wakwak:~# mdadm --detail /dev/md1
/dev/md1:
Version : 00.90
Creation Time : Thu Jul 16 00:21:36 2009
Raid Level : raid5
Array Size : 1953519872 (1863.02 GiB 2000.40 GB)
Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
Raid Devices : 3
Total Devices : 3
Preferred Minor : 1
Persistence : Superblock is persistent

Update Time : Thu Jul 16 00:32:08 2009
State : clean, degraded, recovering
Active Devices : 2
Working Devices : 3
Failed Devices : 0
Spare Devices : 1

Layout : left-symmetric
Chunk Size : 64K

Rebuild Status : 11% complete

UUID : 36c49ebd:eca81d19:59b9feb3:f6770ef0 (local to host wakwak)
Events : 0.8

Number Major Minor RaidDevice State
0 8 17 0 active sync /dev/sdb1
1 8 33 1 active sync /dev/sdc1
3 8 1 2 spare rebuilding /dev/sda1
root@wakwak:~# mdadm -E --scan
ARRAY /dev/md1 level=raid5 num-devices=3 UUID=36c49ebd:eca81d19:59b9feb3:f6770ef0
spares=1



root@wakwak:~# mount /dev/md1 /mnt/raid
mount: wrong fs type, bad option, bad superblock on /dev/md1,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so



root@wakwak:~# mount -t ext4 /dev/md1 /mnt/raid
mount: wrong fs type, bad option, bad superblock on /dev/md1,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so
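(The mount error itself is generic; the actual reason usually ends up in the kernel log, as the message hints. For example:)

# See why the ext4 driver refused the mount
dmesg | tail -n 20
dmesg | grep -iE 'ext4|md1'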



root@wakwak:~# dumpe2fs /dev/md1 | grep superblock
dumpe2fs 1.41.4 (27-Jan-2009)
Primary superblock at 0, Group descriptors at 1-117
Backup superblock at 32768, Group descriptors at 32769-32885
Backup superblock at 98304, Group descriptors at 98305-98421
Backup superblock at 163840, Group descriptors at 163841-163957
Backup superblock at 229376, Group descriptors at 229377-229493
Backup superblock at 294912, Group descriptors at 294913-295029
Backup superblock at 819200, Group descriptors at 819201-819317
Backup superblock at 884736, Group descriptors at 884737-884853
Backup superblock at 1605632, Group descriptors at 1605633-1605749
Backup superblock at 2654208, Group descriptors at 2654209-2654325
Backup superblock at 4096000, Group descriptors at 4096001-4096117
Backup superblock at 7962624, Group descriptors at 7962625-7962741
Backup superblock at 11239424, Group descriptors at 11239425-11239541
Backup superblock at 20480000, Group descriptors at 20480001-20480117
Backup superblock at 23887872, Group descriptors at 23887873-23887989
Backup superblock at 71663616, Group descriptors at 71663617-71663733
Backup superblock at 78675968, Group descriptors at 78675969-78676085
Backup superblock at 102400000, Group descriptors at 102400001-102400117
Backup superblock at 214990848, Group descriptors at 214990849-214990965
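(Those backup locations can be handed to a strictly read-only e2fsck to see whether the filesystem is recognisable at all; -n answers "no" to everything and never writes. A sketch, assuming a 4K filesystem block size:)

# Read-only check against the first backup superblock (no changes are made)
e2fsck -n -b 32768 -B 4096 /dev/md1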



root@wakwak:~# dumpe2fs /dev/sdb1 | grep superblock
dumpe2fs 1.41.4 (27-Jan-2009)
Primary superblock at 0, Group descriptors at 1-117
Backup superblock at 32768, Group descriptors at 32769-32885
Backup superblock at 98304, Group descriptors at 98305-98421
Backup superblock at 163840, Group descriptors at 163841-163957
Backup superblock at 229376, Group descriptors at 229377-229493
Backup superblock at 294912, Group descriptors at 294913-295029
Backup superblock at 819200, Group descriptors at 819201-819317
Backup superblock at 884736, Group descriptors at 884737-884853
Backup superblock at 1605632, Group descriptors at 1605633-1605749
Backup superblock at 2654208, Group descriptors at 2654209-2654325
Backup superblock at 4096000, Group descriptors at 4096001-4096117
Backup superblock at 7962624, Group descriptors at 7962625-7962741
Backup superblock at 11239424, Group descriptors at 11239425-11239541
Backup superblock at 20480000, Group descriptors at 20480001-20480117
Backup superblock at 23887872, Group descriptors at 23887873-23887989
Backup superblock at 71663616, Group descriptors at 71663617-71663733
Backup superblock at 78675968, Group descriptors at 78675969-78676085
Backup superblock at 102400000, Group descriptors at 102400001-102400117
Backup superblock at 214990848, Group descriptors at 214990849-214990965



root@wakwak:~# dumpe2fs /dev/sdc1 | grep superblock
dumpe2fs 1.41.4 (27-Jan-2009)
dumpe2fs: Bad magic number in super-block while trying to open /dev/sdc1
Couldn't find valid filesystem superblock.



root@wakwak:~# dumpe2fs /dev/sda1 | grep superblock
dumpe2fs 1.41.4 (27-Jan-2009)
dumpe2fs: Filesystem revision too high while trying to open /dev/sda1
Couldn't find valid filesystem superblock.


[NEW MDADM.CONF FILE]
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default, scan all partitions (/proc/partitions) for MD superblocks.
# alternatively, specify devices to scan, using wildcards if desired.
DEVICE partitions

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md0 level=raid5 num-devices=3 UUID=880c329a:c20bdf9c:59b9feb3:f6770ef0

# This file was auto-generated on Tue, 14 Jul 2009 21:26:36 -0500
# by mkconf $Id$




[OLD MDADM.CONF FILE]
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default, scan all partitions (/proc/partitions) for MD superblocks.
# alternatively, specify devices to scan, using wildcards if desired.
DEVICE partitions

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
#ARRAY /dev/md0 level=raid5 num-devices=3 UUID=bba00649:66fad961:59b9feb3:f6770ef0

# This file was auto-generated on Sun, 05 Jul 2009 23:42:22 -0500
# by mkconf $Id$
ARRAY /dev/md0 level=raid5 num-devices=3 metadata=00.90 UUID=bba00649:66fad961:59b9feb3:f6770ef0
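(Neither conf file matches the recreated array any more, since --create generated a new UUID. The ARRAY line can be regenerated from the running array; a sketch, using the conf path mentioned above and Ubuntu's initramfs tooling:)

# Remove the stale ARRAY line from the conf file, then append the current one
mdadm --detail --scan >> /etc/mdadm.conf

# Rebuild the initramfs so the array is found at boot
update-initramfs -u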

chrism01 07-16-2009 01:44 AM

Actually, you could try mounting one disk at a time and running fsck, but I'm not sure whether it would say anything about the disk being part of an array.
One thing, though: ext4 is brand new, i.e. not the default on most Linux distributions, so it may not be fully baked (debugged) yet. I'd stick with ext3 and let others 'test' ext4 for you (in future).
You may need to pass extra parameters to the dumpe2fs command to make it recognise ext4. I think that's the case with ext3; otherwise it assumes plain ext2 (and similarly with the related commands).
Sorry I can't be more help.
I'll be interested to see what others say.
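(For what it's worth, e2fsprogs 1.41 already understands ext4, so dumpe2fs needs no special flags for that; what it does accept is an explicit superblock and block size, which helps when the primary superblock is damaged. A sketch, assuming a 4K block size:)

# Read the filesystem via the first backup superblock instead of the primary one
dumpe2fs -o superblock=32768 -o blocksize=4096 /dev/md1 | less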

r0b0 07-16-2009 07:46 AM

Quote:

Originally Posted by Riallin (Post 3609111)
A few days ago, my RAID5 array of 3x1TB drives went down due to a power outage. Bootup stalled because the array failed to mount. I hit Ctrl-D to continue booting and took a look at what I was dealing with.

I ran "mdadm -E" on all of my drives, and they were all clean and identical (not a single drive had been marked unclean or removed), so I tried to assemble with "mdadm -A --scan", which brought the array up degraded.

The original array (md0) was set up as such:
Slot 0 - /dev/sdb1
Slot 1 - /dev/sdc1
Slot 2 - /dev/sda1

However, after this assemble I only had sdc1 and sda1. The odd thing, though, is that instead of coming up as _UU, it came up as U_U.

I hot-added sdb1 and checked /proc/mdstat, and to my surprise it was now UU_ and resyncing sdc1 instead of sdb1. Not only that, but the array was now slotted "sda1 sdb1 sdc1 - resync", which meant my superblock had moved, right?

I couldn't mount, as the superblock was missing (which "made sense" in my mind), so I looked around for how to rearrange the slots. After finding nothing, I let the array resync, hoping that would fix everything.

This ended with the slots still wrong and the array still unmountable due to "wrong fs or superblock missing" errors

Just an idea: if I understand you correctly, the first time around you booted with only two disks plugged in, and then you hot-added the third one.

Now comes my speculation to explain your issues: when you booted with only two disks plugged in, they were assigned sda and sdb (not sda and sdc as before). Then, when you added the third drive, it was assigned sdc, and of course it said it was resyncing sdc1, which was *correct*.
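(One way to rule device-name reshuffling in or out is to map each sdX to a physical drive and to the slot it claims in its own superblock; the by-id symlinks follow the drive serial numbers regardless of probe order. Roughly:)

# Which physical drive (by serial number) is currently sda/sdb/sdc?
ls -l /dev/disk/by-id/ | grep -E 'sd[abc]1?$'

# Which slot does each member claim in its own md superblock?
mdadm -E /dev/sd[abc]1 | grep -E '^/dev|this'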

The issue with "broken superblock" might have been solved by a simple fsck before or after the resync and you could have been just fine...

I know that this probably doesn't really help to get your data right now but still...

R.

Riallin 07-16-2009 01:57 PM

Quote:

Now comes my speculation to explain your issues: when you booted with only two disks plugged in, they were assigned sda and sdb (not sda and sdc as before). Then, when you added the third drive, it was assigned sdc, and of course it said it was resyncing sdc1, which was *correct*.
Actually, I had all the disks plugged in, but I created the array with "missing", so /dev recognized all 3 drives while the array was built with only 2 of them. Then I ran "mdadm --add /dev/md1 /dev/sda1" and it resynced.

Riallin 07-18-2009 09:08 PM

help!
 
So nobody has any ideas on this?

Even if I just want to take a peek at the directories and files, without trying to copy them? If this thing really is dead, it would be ideal to at least be able to recover a *listing* of the files that were on there.

Under Windows I can do such a thing with a simple partition-recovery program. Does nothing like that exist for Linux?
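(For a bare listing without mounting, debugfs can open an ext2/3/4 device read-only and walk the directory tree even when the filesystem refuses to mount; -c is "catastrophic" mode, which skips reading the inode and block bitmaps. A sketch; it will only get as far as the surviving metadata allows:)

# List the root directory of the array without mounting it
debugfs -c -R 'ls -l /' /dev/md1

# Or interactively, to walk into subdirectories:
debugfs -c /dev/md1
#   debugfs:  ls -l /some/directory
#   debugfs:  quit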

/very frustrated

