Assistance using fsck to recover ext3 LVM partition
Hey all,
I am running kernel 2.6.17-1.2186_FC5smp. I have four 250GB drives: hda, hdb, sda, sdb. hdb has a boot, root, and swap partition, totalling about 20GB of space. The remainder of hdb is part of an LVM volume group, along with the other three drives. The logical volume in that group (/dev/VolGroup/Video) is configured to mount at /video. It is formatted as ext3. This box is dedicated to running MythTV.

I noticed a few days ago that MythTV wasn't running. My attempts to start the service failed. Upon further investigation I discovered that I couldn't access /video, despite the rest of the computer working fine. It had probably been in this working-but-no-access-to-/video state for several days.

I rebooted. Upon reboot, fsck fails on the /video partition. It says:

"fsck.ext3: Attempt to read block from filesystem resulted in short read while trying to open /dev/VolGroup/Video. Could this be a zero-length partition?"

I drop to single user mode, at the "Repair filesystem" prompt. Here things get sticky: pvscan says "Locking type 1 initialisation failed". I suspect that LVM support is not enabled in this single user mode, so I can't run proper diagnostics on the drive. Attempting to use a different superblock gets me nowhere (not surprising, as I don't think this is working properly through LVM):

e2fsck -b 32 /dev/VolGroup/Video
"Bad magic number in super-block while trying to open /dev/VolGroup/Video"

I know not to run fsck on the raw partitions themselves; of course that would hose things.

How do I get these LVM volumes grouped together so I can run a proper check of the drive? I'm hoping I can try a different superblock and get things up and running again, but using fsck isn't exactly my specialty. I have run SMART diagnostics on all the drives and they all pass the long tests. Thanks!
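A side note on the "e2fsck -b 32" attempt: with mke2fs defaults, block 32 is not a place where a backup superblock would normally live, so a "Bad magic number" there does not by itself say much about the state of the filesystem. The e2fsck man page gives the first backup as block 8193 for 1k blocks, 16384 for 2k blocks, and 32768 for 4k blocks. A minimal sketch of that arithmetic follows. The function name first_backup_superblock is made up for illustration, and the calculation assumes the default layout of 8 * blocksize blocks per group; a filesystem created with non-default options may differ.

```shell
# Print the block number of the first backup superblock for a given block
# size in bytes. Assumes mke2fs defaults: 8 * blocksize blocks per group,
# and a first data block of 1 for 1 KiB blocks (0 for larger block sizes).
# The function name is hypothetical, not part of e2fsprogs.
first_backup_superblock() {
    blocksize=$1
    blocks_per_group=$((8 * blocksize))
    if [ "$blocksize" -eq 1024 ]; then
        first_data_block=1
    else
        first_data_block=0
    fi
    echo $((blocks_per_group + first_data_block))
}
```

Calling it as "first_backup_superblock 4096" gives the candidate to try with e2fsck -b on a 4k-block filesystem, provided the logical volume is actually active and readable.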
Okay, for those of you who may encounter a similar problem, here's how I resolved this:
I booted off the CD and went into rescue mode ("linux rescue" at the boot prompt). I then made a backup of my fstab, edited the fstab, and removed the reference to the LVM volume (/video in my case). I then rebooted normally. Everything of course came back up just fine, without the /video partition but with LVM support. I was then able to run fsck on the disk.

First I located the backup superblocks on my system like so:

mke2fs -n /dev/VolGroup/Video

The -n option makes it simply show what it WOULD do if you asked it to create a new file system, but doesn't actually create the file system itself. From the list of backup superblocks this generated, I picked the lowest one (I have no idea if it makes a difference which one you choose; I guessed, and it worked). I then ran fsck using the -b option to specify the backup superblock to use:

fsck -b 32768 /dev/VolGroup/Video

This then went through and found all the problems in the drive and fixed them. Note that you might want to use the "yes" option (-y) to have it automatically say yes to everything. I'm paranoid and didn't, but it did mean hitting "y" about 4,983 times - good thing for fast key repeat rates. :-)

Once fsck was done fixing everything, I put back my backup copy of fstab, which included the reference to my /video mount, and rebooted. Voila! All data back and happy.

At this point I backed everything up and am investigating the cause of this problem. dmesg has some scary stuff about hard drive errors in it, so I suspect I have a hard drive on the verge of death. Never mind that they are all less than 3 months old. Grr.

In either case, I was successful in recovering 100% of my needed data. Note that I had a backup of these files, so I only had to get about 40GB out of the 900GB, just the new files that had changed since the last backup. I suspect some data may have been lost in this process, but apparently none that I needed.
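For anyone repeating the superblock step, here is a rough sketch of pulling the backup superblock numbers out of the "mke2fs -n" output instead of copying them by hand. It assumes the output contains a "Superblock backups stored on blocks:" line followed by comma-separated numbers, which is what e2fsprogs normally prints. The function names parse_backup_superblocks and suggest_fsck_commands are made up for this example. The script only prints candidate fsck commands; it never runs fsck itself.

```shell
# Read `mke2fs -n` output on stdin and print the backup superblock
# numbers, one per line. Assumes the list starts after a line containing
# "Superblock backups stored on blocks:" and ends at the next blank line.
parse_backup_superblocks() {
    awk '
        /Superblock backups stored on blocks/ { found = 1; sub(/.*blocks:/, "") }
        found {
            if ($0 ~ /^[ \t]*$/ && seen) exit
            n = split($0, parts, /[^0-9]+/)
            for (i = 1; i <= n; i++)
                if (parts[i] != "") { print parts[i]; seen = 1 }
        }
    '
}

# Print (do not run) one fsck command per backup superblock for the given
# device, so each can be reviewed and tried by hand, lowest block first.
suggest_fsck_commands() {
    dev=$1
    parse_backup_superblocks | while read -r blk; do
        printf 'fsck -b %s %s\n' "$blk" "$dev"
    done
}

# Example (run as root; mke2fs -n does not write to the device):
#   mke2fs -n /dev/VolGroup/Video | suggest_fsck_commands /dev/VolGroup/Video
```

Keep in mind that the locations mke2fs -n reports are only meaningful if it would use the same block size and layout as when the file system was originally created.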
Also, in the process of resolving this issue, someone pointed out to me that I am mixing both SATA and PATA drives in the same LVM volume group, plus multiple brands with minor size differences. LVM seems to support this, but I had several people tell me they suspect that is the cause of this failure. If anyone has anything else to add to that, let me know. I suspect a hardware failure, not a configuration problem, is the cause here. Next up: Replacing this unreliable LVM mess with a software RAID 5 solution. Gonna need some bigger drives.....