I have (had?) a four-disk mdadm RAID-5 array (3 TB per disk) that has failed, and I am trying to recover the data. One disk was inexplicably dropped from the RAID set, and then a second disk seemingly failed while the array was resyncing onto a replacement disk. A fairly typical RAID-5 failure mode, from what I can gather.
We made disk images of all four disks with dd, all of which completed successfully. Running mdadm --examine on the images returns some mildly reassuring information:
Code:
0-W1F0B8MX.img
Update Time : Tue Jul 29 10:27:53 2014
Events : 83003
Used Dev Size : 5860530176 (2794.52 GiB 3000.59 GB)
1-Z1F3TLXG.img
Update Time : Tue Jul 29 10:27:53 2014
Events : 83003
Used Dev Size : 5860530176 (2794.52 GiB 3000.59 GB)
2-Z1F2PGD4.img
Update Time : Tue Jul 29 10:27:53 2014
Events : 83003
Used Dev Size : 5860530176 (2794.52 GiB 3000.59 GB)
3-W1F0C8M1.img
Update Time : Tue Jul 29 09:56:37 2014
Events : 82996
Used Dev Size : 5860530176 (2794.52 GiB 3000.59 GB)
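For reference, the imaging and examine steps were nothing exotic; roughly the following, where /dev/sdb stands in for whichever member disk was being copied:
Code:
# copy one member disk into an image file (repeated for each of the four disks)
dd if=/dev/sdb of=0-W1F0B8MX.img bs=1M
# read the md superblock directly from the image file
mdadm --examine 0-W1F0B8MX.img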
We've mounted these disk images read-only on loopback devices, each with an overlay file, so that nothing gets written back to the images.
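For each image, the read-only loopback plus overlay was set up roughly along these lines (the loop device numbers, overlay file size and snapshot chunk size shown are illustrative rather than the exact values we used):
Code:
# read-only loop device backed by the image
losetup -r -f --show 0-W1F0B8MX.img                 # -> /dev/loop0
# sparse file to absorb any writes, plus a loop device for it
dd if=/dev/zero of=overlay0.cow bs=1M count=0 seek=4096
losetup -f --show overlay0.cow                      # -> /dev/loop4
# device-mapper snapshot: reads come from the image, writes land in the overlay
dmsetup create overlay0 --table "0 $(blockdev --getsz /dev/loop0) snapshot /dev/loop0 /dev/loop4 P 8"
We then tried to reassemble the array from the overlay devices using: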
Code:
mdadm --assemble /dev/md123 /dev/mapper/overlay{0,1,2,3}
This seems to execute correctly, and the RAID array appears to be assembled properly:
Code:
# mdadm -D /dev/md123
/dev/md123:
Version : 1.2
Creation Time : Mon Aug 4 15:50:47 2014
Raid Level : raid5
Array Size : 8790405120 (8383.18 GiB 9001.37 GB)
Used Dev Size : 2930135040 (2794.39 GiB 3000.46 GB)
Raid Devices : 4
Total Devices : 3
Persistence : Superblock is persistent
Update Time : Tue Aug 5 10:12:24 2014
State : clean, degraded
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Name : magenta:123 (local to host magenta)
UUID : 85e7f480:e00f8d18:abe8eb93:6f0b558e
Events : 6
Number   Major   Minor   RaidDevice   State
   0      253       5        0        active sync   /dev/dm-5
   1      253       7        1        active sync   /dev/dm-7
   2        0       0        2        removed
   3      253       6        3        active sync   /dev/dm-6
However, it is not possible to mount the assembled array, because the original ext4 filesystem is either not found or unreadable:
Code:
# mount /dev/md123 /mnt
mount: you must specify the filesystem type
# /sbin/fsck.ext4 /dev/md123
e2fsck 1.41.12 (17-May-2010)
/sbin/fsck.ext4: Superblock invalid, trying backup blocks...
/sbin/fsck.ext4: Bad magic number in super-block while trying to open /dev/md123
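(As I understand it, "Bad magic number in super-block" means e2fsck cannot find the ext4 magic 0xEF53 at byte offset 1080 of the device: the superblock starts at offset 1024 and the magic field sits 56 bytes into it. A quick read-only way to see what is actually at that offset would be something like the following.)
Code:
# dump the two bytes where the ext4 magic (53 ef, little-endian) should live
dd if=/dev/md123 bs=1 skip=1080 count=2 2>/dev/null | od -An -tx1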
Where did I go wrong? Should I have used "--create --assume-clean" instead of "--assemble"?
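By "--create --assume-clean" I mean something along the lines of the command below, run against the overlays, with the device order matching the original RaidDevice numbering and "missing" in place of the dropped member. This is just to illustrate the question; it is not something we have run:
Code:
mdadm --create /dev/md123 --assume-clean --metadata=1.2 --level=5 \
      --raid-devices=4 --chunk=512 --layout=left-symmetric \
      /dev/mapper/overlay0 /dev/mapper/overlay1 missing /dev/mapper/overlay3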
The chunk sizes are the same on the old and new arrays, but the one discrepancy I did find is in the Used Dev Size. Examining the original disks gives:
Code:
Used Dev Size : 5860530176 (2794.52 GiB 3000.59 GB)
whereas the newly assembled array reports a slightly smaller per-device size:
Code:
Used Dev Size : 2930135040 (2794.39 GiB 3000.46 GB)
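The two raw numbers are not in the same units, though: judging from the GB figures in parentheses, --examine reports Used Dev Size in 512-byte sectors while -D reports it in 1 KiB blocks, so the actual difference is much smaller than it looks at first glance:
Code:
# original members (512-byte sectors) vs new array (1 KiB blocks), in bytes
echo $(( 5860530176 * 512 ))                        # 3000591450112 (3000.59 GB)
echo $(( 2930135040 * 1024 ))                       # 3000458280960 (3000.46 GB)
echo $(( 5860530176 * 512 - 2930135040 * 1024 ))    # 133169152, i.e. 127 MiB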
Is this a problem, and if so, how do I get around it? Thanks.