I'm running a server at work (FC3, kernel 2.6.12) that holds nearly all of the company's documents. I've been at this for going on 22 hours, so apologies if I leave out important details; the whole thing is looking a bit fuzzy right now. The storage is an mdadm RAID5, with /dev/md0 built from /dev/sda1, /dev/sdb1, and /dev/sdc1. The boot partition is on /dev/hda, so I can bring the machine up and down readily despite the RAID problems. All of the drives except /dev/hda are Seagate Barracuda 7200 160GB SATA drives.
Long story short, I noticed yesterday that one of the RAID5 drives (sdb) was offline with errors, swapped it for one of our hot spares, and let the array start rebuilding. But sda failed before the rebuild was done. I've gone through a bunch of different permutations trying to get things to work (swapping between the old sdb and the new sdb, switching SATA controllers, etc.). Somewhere along the way I did something BAD and probably assembled the array incorrectly, followed by an fsck that showed a LOT of errors. (Damn.)
However, by working through the device-order permutations with one member given as missing, i.e. (A missing B), (C missing B), etc., I have been able to resurrect a /dev/md0 which, if I run
Code:
dd if=/dev/md0 count=512 skip=xxxxxx | strings
shows me what looks like a lot of valid data. I can identify pieces of text documents, Word files, etc. I'm still holding out a glimmer of hope that this means I'm not royally screwed.
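A more targeted check than eyeballing strings output would be to look for the ext2/ext3 superblock magic number. This is only a sketch and has not been run on this machine. It relies on the on-disk layout: the magic 0xEF53 is stored little-endian (bytes 53 ef) at byte offset 56 within a superblock, and the primary superblock starts 1024 bytes into the device. The helper name sb_magic_at is made up for illustration.

```shell
#!/bin/sh
# sb_magic_at DEVICE BYTE_OFFSET
# Prints, as hex, the two bytes found at BYTE_OFFSET + 56 on DEVICE.
# If an ext2/ext3 superblock starts at BYTE_OFFSET, this prints "53ef".
# Hypothetical helper name; read-only, it never writes to DEVICE.
sb_magic_at() {
    dd if="$1" bs=1 skip=$(( $2 + 56 )) count=2 2>/dev/null \
        | od -An -tx1 | tr -d ' \n'
}

# Primary superblock, 1024 bytes into the device:
#   sb_magic_at /dev/md0 1024
# A backup at block 32768, assuming a 4 KiB block size:
#   sb_magic_at /dev/md0 $((32768 * 4096))
```

If the primary location does not print 53ef but some backup location does, that would suggest the array is assembled in the right order and only the start of the filesystem is damaged. If no location matches, the member order or chunk layout is more likely wrong.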
The problem is that, although dumpe2fs works pretty well, e2fsck can't seem to find a valid superblock no matter where I tell it to look. I've tried things like -b 8192000, 8192001, 32768000, 32768001, etc. (I'm not sure why all the docs show the -b argument as an odd number while dumpe2fs shows an even one, so I experimented.) Whatever I do, it just says 'invalid argument' and:
Code:
The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
etc.
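On the odd-versus-even question, a sketch of how the backup superblock locations are normally derived may help. It assumes the filesystem was made with the sparse_super feature (backups in group 1 and in groups that are powers of 3, 5, and 7) and the default of 8 × block size blocks per group; neither assumption has been confirmed for this filesystem. With 1 KiB blocks the first data block is block 1, which makes the backups odd (8193, 24577, ...); with 4 KiB blocks it is block 0, which makes them even (32768, 98304, ...). The function name backup_superblocks is made up for illustration.

```shell
#!/bin/sh
# backup_superblocks BLOCK_SIZE TOTAL_BLOCKS
# Prints candidate backup superblock block numbers, one per line.
# Assumes sparse_super and the default blocks-per-group (8 * block size).
backup_superblocks() {
    bs=$1
    total=$2
    bpg=$((bs * 8))                      # blocks per group (assumed default)
    first=0
    if [ "$bs" -eq 1024 ]; then first=1; fi   # 1 KiB filesystems start at block 1
    groups=$(( (total - first + bpg - 1) / bpg ))

    # Groups holding a backup: 1, then powers of 3, 5 and 7.
    list=""
    if [ "$groups" -gt 1 ]; then list="1"; fi
    for base in 3 5 7; do
        g=$base
        while [ "$g" -lt "$groups" ]; do
            list="$list $g"
            g=$((g * base))
        done
    done

    for g in $(printf '%s\n' $list | sort -n); do
        echo $((g * bpg + first))
    done
}

# Example: a 4 KiB-block filesystem of 1,000,000 blocks
#   backup_superblocks 4096 1000000
```

For that example the function prints 32768, 98304, 163840, 229376, 294912, 819200, and 884736. Each value is a block number for e2fsck's -b option, and the matching block size goes to -B, for instance e2fsck -b 32768 -B 4096 /dev/md0. The real block size and block count should come from the dumpe2fs output rather than from these assumed defaults.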
Can anyone give me suggestions for what I might try to recover/rebuild the filesystem?
Sorry I don't have more code examples; the system is booted single-user right now and I can't get files on or off of it. I'm pretty sure no LVM is involved here, because /etc/fstab just lists
Code:
/dev/md0 /server ...
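The fstab entry is only indirect evidence, so here is a sketch of a direct check, assuming the usual LVM2 on-disk format in which a physical volume carries a label starting with the text LABELONE in one of its first four 512-byte sectors. The helper name has_lvm_label is made up for illustration, and the check only reads from the device.

```shell
#!/bin/sh
# has_lvm_label DEVICE
# Exit status 0 if the LVM2 label magic "LABELONE" appears in the first
# four 512-byte sectors of DEVICE, non-zero otherwise.
# Hypothetical helper name; read-only.
has_lvm_label() {
    dd if="$1" bs=512 count=4 2>/dev/null | LC_ALL=C grep -q 'LABELONE'
}

# Example:
#   if has_lvm_label /dev/md0; then echo "LVM label found"; else echo "no LVM label"; fi
```

A "no LVM label" result on a damaged array is not conclusive, since the first sectors may themselves have been overwritten.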
Very scared here... our last full backup seems to have been in January.
Thanks,
Greg