mdadm RAID5 after power outage
I've searched all over the internet, and I may have made the mistake of following one piece of advice after another, which may well have led to data loss...
Here's the story: a few days ago, my RAID5 array of 3x1TB drives went down due to a power outage. Bootup stalled on a failed mount of the array, so I hit Ctrl-D to finish booting and took a look at what I was dealing with. I ran "mdadm -E" on all of my drives, and they were all clean and identical (not a single drive had been marked unclean or removed), so I tried to assemble with "mdadm -A --scan", which brought the array up degraded.

The original array (md0) was set up as:

Slot 0 - /dev/sdb1
Slot 1 - /dev/sdc1
Slot 2 - /dev/sda1

However, on this assemble I only had sdc1 and sda1. The odd thing is that instead of assembling as _UU, it came up U_U. I hot-added sdb1, checked /proc/mdstat, and to my surprise it was now UU_ and it was resyncing sdc1 instead of sdb1. Not only that, but the array was now slotted "sda1 sdb1 sdc1 - resync", which meant my superblock had moved, right? I couldn't mount, as the superblock was missing (which "made sense" in my mind), so I looked around for how to rearrange the slots. After finding nothing, I let the array resync, hoping that would set things right. That ended with the slots still wrong and the array unmountable with "wrong fs or superblock missing" errors, so I uninstalled mdadm, moved /etc/mdadm.conf elsewhere, rebooted, and reinstalled mdadm. I then tried an assemble scan (which didn't work, obviously, because I had moved mdadm.conf), so I put it back and assembled. Once again: wrong order, impossible to mount.
I then made the horrible mistake of creating a new array, in hopes that this would fix the problem:

mdadm --create -l 5 -n 3 -x 0 /dev/md0 /dev/sdb1 /dev/sdc1 /dev/sda1

which produced an interesting output (which I've lost) along the lines of:

mdadm: /dev/sdb1 appears to contain an ext2fs file system
    size=1953519872  mtime=[some time]
mdadm: /dev/sdb1 appears to be part of a raid array:
    level=raid5 devices=3 ctime=[some time]

repeated for only two of the three drives, plus a final line stating the size was assumed to be 976759936, which was obviously wrong, BUT I didn't really pay attention to the output and just pressed Y. It went ahead and created the array again in the wrong order.

So today I decided to try to outsmart it. I created a *new* array, this time with a missing slot:

mdadm --create -l 5 -n 3 -x 0 /dev/md1 /dev/sdb1 /dev/sdc1 missing
mdadm: /dev/sdb1 appears to contain an ext2fs file system
    size=1953519872K  mtime=Tue Jul  7 01:22:10 2009
mdadm: /dev/sdb1 appears to be part of a raid array:
    level=raid5 devices=3 ctime=Mon Jul 13 23:13:59 2009
mdadm: /dev/sdc1 appears to be part of a raid array:
    level=raid5 devices=3 ctime=Mon Jul 13 23:13:59 2009
Continue creating array? y
mdadm: array /dev/md1 started.

So then I checked mdstat, which showed:

md1 : active raid5 sdc1[1] sdb1[0]
      1953519872 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]

Success, right? Wrong... mdadm --detail /dev/md1 showed: clean, degraded, superblock persistent, array size 1953519872, used dev size 976759936, and the slot order correct. However, the array was still unmountable: "wrong fs type, bad option, bad superblock on /dev/md1." So then I added /dev/sda1 back to the array (it fit itself into the correct slot, thankfully), and it has begun to rebuild. Mount? Nope. I even tried mounting with each of the alternate superblocks, with no luck.

I really want to save this array. I'm already through the denial and anger stages, and am now at bargaining/depression.
Any help would be *amazing*, even if it's just telling me how I could mount this array (or even just one drive) to see what directories are on it, so that I at least know what I had, as I don't specifically recall all of it. If I knew, I could feasibly recover the data by collecting it all again from other drives. (I understand the data could be completely full of holes; if so, I would only want the filenames, not to "save" each file.) All drives are ext4, created through GParted. I am running stock Mythbuntu (Ubuntu 9.04 Jaunty) with downloaded mdadm. I have not run fsck on any of the drives (I'm unable to, which is a good thing, I guess).

Here is the potentially asked-for information:

root@wakwak:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid5 sda1[3] sdc1[1] sdb1[0]
      1953519872 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]
      [==>..................]  recovery = 13.2% (129647696/976759936) finish=404.4min speed=34907K/sec
unused devices: <none>

root@wakwak:~# mdadm -E /dev/sda1
/dev/sda1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 36c49ebd:eca81d19:59b9feb3:f6770ef0 (local to host wakwak)
  Creation Time : Thu Jul 16 00:21:36 2009
     Raid Level : raid5
  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
     Array Size : 1953519872 (1863.02 GiB 2000.40 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 1
    Update Time : Thu Jul 16 00:32:08 2009
          State : clean
 Active Devices : 2
Working Devices : 3
 Failed Devices : 1
  Spare Devices : 1
       Checksum : ebbfba2b - correct
         Events : 8
         Layout : left-symmetric
     Chunk Size : 64K

      Number   Major   Minor   RaidDevice State
this     3       8        1        3      spare   /dev/sda1
   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       0        0        2      faulty removed
   3     3       8        1        3      spare   /dev/sda1

root@wakwak:~# mdadm -E /dev/sdb1
/dev/sdb1:
[same fields as for /dev/sda1 above, except:]
       Checksum : ebbfba3b - correct

      Number   Major   Minor   RaidDevice State
this     0       8       17        0      active sync   /dev/sdb1
   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       0        0        2      faulty removed
   3     3       8        1        3      spare   /dev/sda1

root@wakwak:~# mdadm -E /dev/sdc1
/dev/sdc1:
[same fields as for /dev/sda1 above, except:]
       Checksum : ebbfba4d - correct

      Number   Major   Minor   RaidDevice State
this     1       8       33        1      active sync   /dev/sdc1
   0     0       8       17        0      active sync   /dev/sdb1
   1     1       8       33        1      active sync   /dev/sdc1
   2     2       0        0        2      faulty removed
   3     3       8        1        3      spare   /dev/sda1

root@wakwak:~# mdadm --detail /dev/md1
/dev/md1:
        Version : 00.90
  Creation Time : Thu Jul 16 00:21:36 2009
     Raid Level : raid5
     Array Size : 1953519872 (1863.02 GiB 2000.40 GB)
  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 1
    Persistence : Superblock is persistent
    Update Time : Thu Jul 16 00:32:08 2009
          State : clean, degraded, recovering
 Active Devices : 2
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 1
         Layout : left-symmetric
     Chunk Size : 64K
 Rebuild Status : 11% complete
           UUID : 36c49ebd:eca81d19:59b9feb3:f6770ef0 (local to host wakwak)
         Events : 0.8

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1
       3       8        1        2      spare rebuilding   /dev/sda1

root@wakwak:~# mdadm -E --scan
ARRAY /dev/md1 level=raid5 num-devices=3 UUID=36c49ebd:eca81d19:59b9feb3:f6770ef0
   spares=1

root@wakwak:~# mount /dev/md1 /mnt/raid
mount: wrong fs type, bad option, bad superblock on /dev/md1,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so

root@wakwak:~# mount -t ext4 /dev/md1 /mnt/raid
mount: wrong fs type, bad option, bad superblock on /dev/md1,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so

root@wakwak:~# dumpe2fs /dev/md1 | grep superblock
dumpe2fs 1.41.4 (27-Jan-2009)
  Primary superblock at 0, Group descriptors at 1-117
  Backup superblock at 32768, Group descriptors at 32769-32885
  Backup superblock at 98304, Group descriptors at 98305-98421
  Backup superblock at 163840, Group descriptors at 163841-163957
  Backup superblock at 229376, Group descriptors at 229377-229493
  Backup superblock at 294912, Group descriptors at 294913-295029
  Backup superblock at 819200, Group descriptors at 819201-819317
  Backup superblock at 884736, Group descriptors at 884737-884853
  Backup superblock at 1605632, Group descriptors at 1605633-1605749
  Backup superblock at 2654208, Group descriptors at 2654209-2654325
  Backup superblock at 4096000, Group descriptors at 4096001-4096117
  Backup superblock at 7962624, Group descriptors at 7962625-7962741
  Backup superblock at 11239424, Group descriptors at 11239425-11239541
  Backup superblock at 20480000, Group descriptors at 20480001-20480117
  Backup superblock at 23887872, Group descriptors at 23887873-23887989
  Backup superblock at 71663616, Group descriptors at 71663617-71663733
  Backup superblock at 78675968, Group descriptors at 78675969-78676085
  Backup superblock at 102400000, Group descriptors at 102400001-102400117
  Backup superblock at 214990848, Group descriptors at 214990849-214990965

root@wakwak:~# dumpe2fs /dev/sdb1 | grep superblock
dumpe2fs 1.41.4 (27-Jan-2009)
[identical list of superblock locations as for /dev/md1 above]

root@wakwak:~# dumpe2fs /dev/sdc1 | grep superblock
dumpe2fs 1.41.4 (27-Jan-2009)
dumpe2fs: Bad magic number in super-block while trying to open /dev/sdc1
Couldn't find valid filesystem superblock.

root@wakwak:~# dumpe2fs /dev/sda1 | grep superblock
dumpe2fs 1.41.4 (27-Jan-2009)
dumpe2fs: Filesystem revision too high while trying to open /dev/sda1
Couldn't find valid filesystem superblock.

[NEW MDADM.CONF FILE]
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default, scan all partitions (/proc/partitions) for MD superblocks.
# alternatively, specify devices to scan, using wildcards if desired.
DEVICE partitions

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md0 level=raid5 num-devices=3 UUID=880c329a:c20bdf9c:59b9feb3:f6770ef0

# This file was auto-generated on Tue, 14 Jul 2009 21:26:36 -0500
# by mkconf $Id$

[OLD MDADM.CONF FILE]
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default, scan all partitions (/proc/partitions) for MD superblocks.
# alternatively, specify devices to scan, using wildcards if desired.
DEVICE partitions

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
#ARRAY /dev/md0 level=raid5 num-devices=3 UUID=bba00649:66fad961:59b9feb3:f6770ef0

# This file was auto-generated on Sun, 05 Jul 2009 23:42:22 -0500
# by mkconf $Id$
ARRAY /dev/md0 level=raid5 num-devices=3 metadata=00.90 UUID=bba00649:66fad961:59b9feb3:f6770ef0 |
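(Edit) Since I mentioned trying the alternate superblocks: one detail worth noting is that dumpe2fs reports backup superblock locations in filesystem blocks, while mount's sb= option expects the location in 1 KiB units, so the numbers have to be converted first. A minimal sketch of the arithmetic, assuming the 4 KiB filesystem block size typical for an array this size (check the "Block size:" line of dumpe2fs to confirm):

```shell
# Convert a dumpe2fs backup-superblock block number into the 1 KiB-unit
# value that mount's sb= option expects. 4096 is an assumed block size.
FS_BLOCK_SIZE=4096          # bytes per filesystem block (verify with dumpe2fs)
BACKUP_BLOCK=32768          # first backup superblock from the dumpe2fs output
SB_ARG=$(( BACKUP_BLOCK * FS_BLOCK_SIZE / 1024 ))
echo "try: mount -t ext4 -o ro,sb=$SB_ARG /dev/md1 /mnt/raid"
# → try: mount -t ext4 -o ro,sb=131072 /dev/md1 /mnt/raid
```

Mounting read-only (ro) seems wise here in any case, so nothing else gets written to the array.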
Actually, you could try mounting one disk at a time and running fsck on it, though I'm not sure what it would say about the disk being part of an array.
One thing though: ext4 is brand new, i.e. not yet the default on most Linux distributions, so it may not be fully baked (debugged). I'd stick with ext3 and let others 'test' ext4 for you (in future). You may also need to pass extra parameters to dumpe2fs to make it recognise ext4; I think that's the case with ext3, which is otherwise assumed to be plain ext2 (and similarly with related commands). Sorry I can't be more help. I'll be interested to see what others say. |
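To expand on the dumpe2fs point: its -h flag prints just the superblock header, and the "Filesystem features" line there is where ext3 (has_journal) and ext4 (extents, etc.) announce themselves, so you can tell which flavour you're looking at without mounting. A small self-contained demo on a scratch image (mke2fs and dumpe2fs are both part of e2fsprogs; the /tmp path is arbitrary, and no root is needed since it's a plain file):

```shell
# Create a tiny scratch ext2 filesystem in a regular file...
dd if=/dev/zero of=/tmp/fs-demo.img bs=1024 count=512 2>/dev/null
mke2fs -F -q /tmp/fs-demo.img
# ...then dump only the superblock header and pick out the feature list.
dumpe2fs -h /tmp/fs-demo.img 2>/dev/null | grep -i "filesystem features"
```

Run against /dev/md1 (or a member partition) instead of the scratch image, the same command shows what the filesystem actually claims to be.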
Now comes my speculation to explain your issues: when you booted with only two disks plugged in, they were assigned sda and sdb (not sda and sdc as before). Then when you added the third drive, it was assigned sdc, and of course md said it was resyncing sdc1, which was *correct*. The "broken superblock" issue might have been solved by a simple fsck before or after the resync, and you could have been just fine... I know this probably doesn't help you get your data back right now, but still... R. |
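Building on that point about device letters: the slot recorded in each member's superblock is independent of whatever sdX name the kernel hands out on a given boot, so the "this" line of mdadm -E is the thing to trust when checking order. A hedged sketch of pulling the slot out of that line; the sample text is copied from the examine output earlier in the thread:

```shell
# The "this" row of `mdadm -E` has columns:
# Number, Major, Minor, RaidDevice, State -- RaidDevice is the slot.
examine_line='this     1       8       33        1      active sync   /dev/sdc1'
slot=$(echo "$examine_line" | awk '$1 == "this" { print $5 }')
echo "RaidDevice slot: $slot"
# → RaidDevice slot: 1
```

Doing this for each member before any assemble or (especially) any --create would show the true on-disk order regardless of how the drives were enumerated that day.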
help!
So nobody has any ideas on this?
Even if all I want is to take a peek at the directories and files, without trying to copy them? If this thing really is dead, it would still be ideal to recover a *listing* of the files that were on it, so that I know what I had. Under Windows I can do such a thing with a simple partition-saving program. Does nothing like that exist for Linux? /very frustrated |
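For getting just a listing, debugfs from e2fsprogs may be worth a try: it reads ext2/3/4 structures directly and opens read-only by default, so it can sometimes list directories even when mount refuses. A hedged sketch; the -c "catastrophic" flag skips loading the bitmaps on a damaged filesystem, and the block size and backup superblock numbers are assumptions taken from the dumpe2fs output earlier in the thread:

```shell
# Against the array itself (read-only; nothing is written):
#   debugfs -c -R 'ls -l /' /dev/md1
# Or pointing it at a backup superblock (-s) with an assumed 4 KiB
# filesystem block size (-b):
#   debugfs -c -b 4096 -s 32768 -R 'ls -l /' /dev/md1
#
# Self-contained demo of the same 'ls' request on a scratch image (no root):
dd if=/dev/zero of=/tmp/list-demo.img bs=1024 count=512 2>/dev/null
mke2fs -F -q /tmp/list-demo.img
debugfs -R 'ls /' /tmp/list-demo.img 2>/dev/null
```

If debugfs can open the array at all, 'ls' alone would already answer the "what directories were on there" question without copying a single file.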