Andy Waddington 04-27-2007 06:15 AM

RAID 5 vanished in Ubuntu 6.10
I have a Sun Ultra 80 with eight SCSI discs, running Ubuntu 6.10 (kernel 2.6.17) for Sparc64.

The first (146 GB) disc is the main system disc; the second (73 GB) disc was a scratch disc used for temporary data and swap space. The other six discs (all 73 GB) are in an external Sun 711 disc array and were configured as a software RAID 5 array.

I noticed that something bad was happening with the singleton 73 GB drive (/dev/sdb). In fact the device had failed and gone offline, so I rebooted the system with "shutdown -r now". On coming back up, the drive stayed offline, so the first drive of my RAID array came up as /dev/sdb and the system tried to mount /dev/sdb1 in the usual place. fsck decided that something was wrong with the filesystem: it told me that my ext3 journal was corrupt, that it had deleted the journal and that the filesystem was now ext2. It then decided that the superblock was wrong and declined to make any further attempt to mount the disc, asking me to run fsck manually.

I powered off, replaced the dead drive and rebooted. The replacement drive duly came up as /dev/sdb and I was able to mount filesystems from it. The drives of my RAID 5 array were all now back in their proper places as /dev/sdc to /dev/sdh, but the RAID array would not start, presumably because the fsck during the previous boot had corrupted something on the first drive.

If I cat /proc/mdstat, it says:

unused devices: <none>

mdadm --monitor --oneshot /dev/md0
DeviceDisappeared on /dev/md0 unknown device
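
The same check can be scripted. This is only a minimal sketch, assuming the usual /proc/mdstat layout in which each assembled array gets a line beginning with its name followed by a colon (for example "md0 : active raid5 ..."). The function name active_md_arrays is hypothetical, and the sample text is just the line quoted above.

```python
import re

def active_md_arrays(mdstat_text):
    """Return the names of md arrays listed in /proc/mdstat-style text."""
    arrays = []
    for line in mdstat_text.splitlines():
        # Assumption: array lines start with the device name, e.g. "md0 : ..."
        match = re.match(r"^(md\w+)\s*:", line)
        if match:
            arrays.append(match.group(1))
    return arrays

# The output quoted above contains no array line at all.
sample = "unused devices: <none>\n"
print(active_md_arrays(sample))  # → []
```

An empty list here says the same thing as the mdadm message: the kernel has not assembled /dev/md0.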

All six drives of my RAID array are physically fine, but the array doesn't start. Presumably the entire content of the corrupted drive could be restored from the other drives if it were marked as failed and then replaced, but this doesn't seem to be happening automatically. This is the first time I've tried to use RAID, and I'm looking forward to seeing how the redundancy is going to save me. However, the tools present in Ubuntu 6.10 didn't match the online RAID HOWTOs I've read (/etc/raidtab seems to be a thing of the past), so I had to work out the setup by trial and error and from a few odd bits of info on the web. The documentation doesn't seem to have caught up yet with telling me how to get my data back. Does anyone have a simple recipe?
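
The expectation that one drive's content can be rebuilt from the others rests on XOR parity. The following is a minimal sketch of that idea only, not of how the md driver is implemented: it assumes a single stripe of five data blocks and one parity block, and it ignores the way a real RAID 5 layout rotates parity across the discs. The helper xor_blocks and the block contents are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Five hypothetical data blocks plus one parity block, as in a six-disc stripe.
data = [bytes([i] * 4) for i in (1, 2, 3, 4, 5)]
parity = xor_blocks(data)

# Pretend the first block is lost; XOR of the survivors and the parity restores it.
rebuilt = xor_blocks(data[1:] + [parity])
assert rebuilt == data[0]
```

Because XOR is its own inverse, any single missing block in a stripe can be recovered this way, which is why losing one member should in principle be survivable.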

I don't think I will actually lose anything if I have to rebuild the array from scratch, as this 350 GB array is backing up other discs on other computers and I can restore everything, given enough time. It is fortunate that this failure occurred before I started to accumulate unique data on this array. But obviously, a 350 GB array is not much use as a backup if it can be made to disappear quite this easily, so I'm keen to learn how easily it can be restored, particularly before I risk creating original content on it. This array was meant to increase my data security!
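
For reference, the nominal capacity of a RAID 5 array is one member's worth less than the total, since one disc's worth of space holds parity. A small sketch of that arithmetic, assuming equal-sized members; the function name raid5_usable_gb is hypothetical, and the gap between the nominal result and the roughly 350 GB quoted above is presumably down to unit conventions and formatting overhead.

```python
def raid5_usable_gb(disc_count, disc_size_gb):
    """Nominal usable capacity of a RAID 5 array with equal-sized members."""
    if disc_count < 3:
        raise ValueError("RAID 5 needs at least three discs")
    # One disc's worth of space across the array is taken up by parity.
    return (disc_count - 1) * disc_size_gb

print(raid5_usable_gb(6, 73))  # → 365
```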

TIA, Andy

