Old 01-19-2012, 07:45 AM   #1
elmaccco
LQ Newbie
 
Registered: Jan 2012
Posts: 9

Rep: Reputation: Disabled
Problem recovering LVM partitions using pvscan etc. "File descriptor 20 left open"


Hi all,

I've had a crashed RAID5 system (HP ProLiant ML530 running RHEL AS, kernel 2.6.9-67.0.7) and am now frantically trying to recover the LVM partitions. The smaller non-LVM partitions
like /boot etc. could be recovered using fsck after booting a CentOS 4.4 CD and running 'linux rescue', but not the LVM partitions:

# fdisk -l |grep 'LVM\|LBA'

/dev/cciss/c0d0p4 6643 71394 520120440 f W95 Ext'd (LBA)
/dev/cciss/c0d0p5 6643 71394 520120408+ 8e Linux LVM


# fsck -y /dev/cciss/c0d0p4
fsck 1.35 (28-Feb-2004)
WARNING: couldn't open /etc/fstab: No such file or directory
e2fsck 1.35 (28-Feb-2004)
fsck.ext2: Attempt to read block from filesystem resulted in short read while trying to open /dev/cciss/c0d0p4
Could this be a zero-length partition?


# fsck -y /dev/cciss/c0d0p5
fsck 1.35 (28-Feb-2004)
WARNING: couldn't open /etc/fstab: No such file or directory
e2fsck 1.35 (28-Feb-2004)
Couldn't find ext2 superblock, trying backup blocks...
fsck.ext2: Bad magic number in super-block while trying to open /dev/cciss/c0d0p5

The superblock could not be read or does not describe a correct ext2 filesystem. If the device is valid and it really contains an ext2 filesystem (and not swap or ufs or
something else), then the superblock is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193 <device>


Then I tried what was suggested on some mailing lists:

# lvm pvscan
File descriptor 20 left open
File descriptor 21 left open
File descriptor 22 left open
No matching physical volumes found

# lvm vgscan
File descriptor 20 left open
File descriptor 21 left open
File descriptor 22 left open
Reading all physical volumes. This may take a while...
No volume groups found

# lvm lvchange -ay /dev/Volume00/LogVol1 <--- This is where they were earlier
File descriptor 20 left open
File descriptor 21 left open
File descriptor 22 left open
Volume group "Volume00" not found

# lvm lvscan
File descriptor 20 left open
File descriptor 21 left open
File descriptor 22 left open
No volume groups found

# ls /dev/Vol*
ls: /dev/Vol*: No such file or directory

Any ideas on how to find the volume group, anyone?

Many thanks in advance!
Marcus
 
Old 01-19-2012, 09:30 AM   #2
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 3,341

Rep: Reputation: Disabled
Output from fdisk shows that /dev/cciss/c0d0p5 is the LVM physical volume (PV) on which (I assume) all your logical volumes reside:
/dev/cciss/c0d0p4 6643 71394 520120440 f W95 Ext'd (LBA)
/dev/cciss/c0d0p5 6643 71394 520120408+ 8e Linux LVM
However, what you've tried to do is run fsck directly on the LVM PV ("fsck -y /dev/cciss/c0d0p5") and on the extended partition containing it ("fsck -y /dev/cciss/c0d0p4"). This is a very bad idea: neither of these partitions contains a file system of any kind, and trying to fsck them could potentially destroy the entire LVM volume group.

Some googling revealed that the "file descriptor left open" message is probably just a warning and may be ignored. Your real problem is the message "no matching physical volumes found". If LVM doesn't recognize the PV, it won't find the volume group either, and hence cannot find the logical volume.

LVM not even seeing the PV indicates a pretty serious error condition. You should check that the fsck you ran on /dev/cciss/c0d0p4 hasn't corrupted the partition ID on /dev/cciss/c0d0p5.
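
A quick, read-only way to check is simply to re-run fdisk against the whole device and look at the type column for p5 (device names taken from your own output above):

# fdisk -l /dev/cciss/c0d0 | grep c0d0p5

If that line still ends with "8e  Linux LVM", the partition ID itself is fine.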

If the ID is correct and pvscan still can't find the physical volume, you should seriously consider restoring from backup. The data on the volume may have some serious corruption.
 
Old 01-19-2012, 09:38 AM   #3
elmaccco
LQ Newbie
 
Registered: Jan 2012
Posts: 9

Original Poster
Rep: Reputation: Disabled
Thanks for your reply Ser Olmy!

Do you know how I could check that the fsck I ran on /dev/cciss/c0d0p4 hasn't corrupted the partition ID on /dev/cciss/c0d0p5? I'm a newbie when it comes to rescuing file systems and managing LVM.

Do you really think running fsck could have had that effect? There was no message of any 'fixing' modification.
 
Old 01-19-2012, 09:58 AM   #4
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 3,341

Rep: Reputation: Disabled
LVM looks for two things to identify a PV: the partition ID and the metadata that was created by pvcreate. Reboot your server and check that "fdisk -l /dev/cciss/c0d0" shows /dev/cciss/c0d0p5 as still having partition ID 8e. I would suspect that it does.

If it does, that would mean that the LVM metadata on /dev/cciss/c0d0p5 is corrupt. I can't give any specific advice on repairing it, as I don't know what your LVM setup used to look like. Is/was /dev/cciss/c0d0p5 the only physical volume in the volume group? Do you by any chance have an /etc/lvm/archive folder on your boot drive, and if so, what does it contain?
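
If you want to check whether there is any LVM metadata left on the partition at all, you could dump the first few sectors and look for the LVM2 label. This is read-only and assumes an LVM2 physical volume (which should be the case on RHEL 4):

# dd if=/dev/cciss/c0d0p5 bs=512 count=8 2>/dev/null | strings | grep -E 'LABELONE|LVM2'

An intact PV normally has "LABELONE" and "LVM2 001" within the first few sectors; if nothing comes back, the label/metadata area has most likely been overwritten.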
 
Old 01-20-2012, 03:21 AM   #5
elmaccco
LQ Newbie
 
Registered: Jan 2012
Posts: 9

Original Poster
Rep: Reputation: Disabled
Thanks again, I really appreciate your help in this dire moment!

Yes, you're right: after a reboot, /dev/cciss/c0d0p5 still shows partition ID 8e, so I guess that means the metadata is corrupt.

There was no /etc on the smaller recovered and mounted non-LVM partitions, so I assume it must be on one of the LVM partitions.
Luckily I had an old backup of /etc taken after the LVM was set up. The /etc/lvm/archive folder contains nine files of ~2 KB each, named Volume00_0000x.vg where x ranges from 0 to 8.

Could these be used somehow to recover the metadata?

Also, do you think there are other more advanced recovery CDs that would help in this case?
 
Old 01-20-2012, 03:56 AM   #6
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 3,341

Rep: Reputation: Disabled
The .vg files can indeed be used to recover the Volume Group metadata. I found this description of the procedure; basically, you recreate the PV setup using pvcreate with the "--restorefile" parameter, and then restore the Volume Group configuration with vgcfgrestore.

I have NOT tried this myself, and before attempting something like that, I would seriously consider backing up the entire partition using, say, dd ("dd if=/dev/cciss/c0d0p5 of=my_backup_file bs=8192" should get the job done fairly quickly).
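
Just so you know roughly what to expect, the procedure boils down to something like this. This is only a sketch (again, NOT tested by me): the VG name Volume00 and the archive file name are taken from your posts, <UUID> has to be replaced with the "id" of the pv0 entry inside that archive file, and you should obviously only run this against the real partition once your dd backup is safely finished:

# grep -A3 pv0 /etc/lvm/archive/Volume00_00008.vg    <--- note the id = "..." line, that's the old PV UUID
# pvcreate --uuid <UUID> --restorefile /etc/lvm/archive/Volume00_00008.vg /dev/cciss/c0d0p5
# vgcfgrestore -f /etc/lvm/archive/Volume00_00008.vg Volume00
# vgchange -ay Volume00
# lvscan

If that works, lvscan should list /dev/Volume00/LogVol1 again, and you can try a read-only mount or fsck from there.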

Last edited by Ser Olmy; 01-20-2012 at 03:57 AM. Reason: typo
 
Old 01-20-2012, 09:28 AM   #7
elmaccco
LQ Newbie
 
Registered: Jan 2012
Posts: 9

Original Poster
Rep: Reputation: Disabled
OK, well, at least that's promising. I've been running the dd backup for a while now and estimate it will be done in another 6 hours.

Meanwhile, I have three more questions for you, if that's alright:

I checked for the pvremove/pvdisplay/pvcreate/vgcfgrestore commands and they are not part of the 'linux rescue' CD environment, i.e. I can't find them (only 'lvm') in /usr/sbin etc. As a matter of fact, I do have a RAID0 drive for disk backups (/mnt/backup) that was never corrupted. This is where I'm dd'ing the LVM partition backup at the moment, and it contains the LVM metadata backups at /mnt/backup/etc/lvm/archive. It has these LVM commands, but when I test them I get:

# /mnt/backup/usr.backup/sbin/pvdisplay
/mnt/backup/usr.backup/sbin/pvdisplay: relocation error: /mnt/backup/usr.backup/sbin/pvdisplay: symbol dm_snprintf, version Base not defined in file libdevmapper.so.1.02 with link time reference

Stupid question perhaps, but how can I a) get these commands as part of the rescue environment, or b) make use of the ones in /mnt/backup/usr.backup/sbin/?
Do you think there are other, better rescue CDs out there?

Also, if I need to use the dd backup file, should I just run "dd if=my_backup_file of=/dev/cciss/c0d0p5 bs=8192"?

Finally, after looking into /etc/lvm I noticed that the file /etc/lvm/backup/Volume00 is identical to the latest /etc/lvm/archive/Volume00_00009.vg. I assume it's more kosher to use that one when recovering the meta-data.
 
Old 01-20-2012, 10:30 AM   #8
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 3,341

Rep: Reputation: Disabled
It would seem the LVM commands on your backup drive were compiled against a different version of libdevmapper than the one on the rescue CD. I'd recommend System Rescue CD, as it includes all the relevant LVM commands and libraries.

Edit: Actually, I'd really recommend using a rescue CD with a more recent libdevmapper, as the one you're using now may have a fairly ancient version of LVM. You should NOT attempt any LVM operations using outdated tools.

It is possible to restore the image file with the dd command exactly as you suggest. And yes, the metadata backup in /etc/lvm/backup should be more recent than the one in /etc/lvm/archive, but unless you've made changes to the LVM setup recently, the files should be identical.
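
If you want to be certain, a quick diff on your backup drive will confirm it (paths assumed from your description; adjust as needed):

# diff /mnt/backup/etc/lvm/backup/Volume00 /mnt/backup/etc/lvm/archive/Volume00_00009.vg

No output means the two files are byte-for-byte identical, in which case it makes no difference which one you use when restoring the metadata.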

May I ask what happened to your hardware RAID to cause this level of data corruption? I've been using Smart Array controllers for 15+ years, and they're usually rock solid.

Last edited by Ser Olmy; 01-20-2012 at 10:40 AM. Reason: added url
 
Old 01-20-2012, 11:00 AM   #9
elmaccco
LQ Newbie
 
Registered: Jan 2012
Posts: 9

Original Poster
Rep: Reputation: Disabled
OK, I'll use that CD instead and will let you know how I'm getting on.

First one drive in the RAID5 array failed, and then another one failed during recovery. What are the odds...
Actually, another drive has failed now, and I'm not sure whether I should put in the replacement in the middle of all this. My guess is that you think I should?
 
Old 01-20-2012, 11:11 AM   #10
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 3,341

Rep: Reputation: Disabled
The odds of two drives failing at roughly the same time are actually quite high if the drives are from the same manufacturer and the same batch. If these were original HP-branded drives or so-called "RAID drives" from WD or Seagate, though, then they were most likely not from the same batch.

It is possible for a drive with multiple bad sectors to go unnoticed in a RAID array, unless you run verify/scrubbing on a regular basis. You won't notice the problem until another drive fails, and the data in what turns out to be bad sectors is needed to rebuild the array.

As for the drive that just failed, if the rebuild operation was completed when it happened, then there's no reason not to replace the drive.
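
By the way, if you want to keep an eye on the remaining drives, reasonably recent builds of smartmontools can query disks sitting behind a Smart Array controller. Something like this (the drive index 0 is just an example for the first physical disk; older smartmontools versions may lack cciss support, so this may or may not work in your rescue environment):

# smartctl -a -d cciss,0 /dev/cciss/c0d0

That prints the SMART attributes for that physical drive, which is about the only early warning you get short of the controller's own alerts.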
 
Old 01-21-2012, 12:19 PM   #11
elmaccco
LQ Newbie
 
Registered: Jan 2012
Posts: 9

Original Poster
Rep: Reputation: Disabled
It's very weird: the System Rescue CD didn't work. I downloaded the latest one (v2.4.1), verified the md5sum, and burned and verified it on a blank CD. A quick test boot of the CD on my laptop worked with no problem, but when booting the crashed server, sysrcd.dat failed to load and the tty couldn't be accessed:
http://biosc-sub2-169.ucc.ie/temp/IMG_0747.JPG
No tools of value were accessible.

All sysrcd boot options got the same error except "Alt E) Boot an existing Linux OS installed on the disk":
http://biosc-sub2-169.ucc.ie/temp/IMG_0748.JPG

I also tried KNOPPIX rescue CD and got the same tty error:
http://biosc-sub2-169.ucc.ie/temp/IMG_0749.JPG

Do you think these errors are signs of a corrupt LVM drive?

I'm recovering the newly failed RAID drive right now, but this is the message I got before I replaced the drive:
http://biosc-sub2-169.ucc.ie/temp/IMG_0746.JPG
 
Old 01-21-2012, 08:04 PM   #12
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 3,341

Rep: Reputation: Disabled
I guess the errors from the System Rescue CD and the Knoppix CD could be related to LVM. System Rescue CD can be installed to a USB stick; you might want to try that.

The error message from the controller means that the drive has indicated through SMART attributes that it is failing. If this is the drive you just replaced, it was about time. How old are these drives, anyway?
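
I don't remember off-hand how the 2.4.x images install to USB. If the ISO is a hybrid image (I'm not certain those releases are), you can simply write it to the stick; otherwise there should be an installer script on the CD itself, so check the SystemRescueCd documentation first. The dd route would look roughly like this, and it destroys everything on the stick, so double-check the device name (/dev/sdb is just an example, and the exact ISO filename may differ):

# dd if=systemrescuecd-x86-2.4.1.iso of=/dev/sdb bs=4M
# sync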
 
Old 01-23-2012, 08:27 AM   #13
elmaccco
LQ Newbie
 
Registered: Jan 2012
Posts: 9

Original Poster
Rep: Reputation: Disabled
I finally had some progress! The computer centre here at the college made the rescue CD available via network boot; this was necessary since the server didn't have USB boot as an option. I could then access all the LVM tools and successfully follow the advice at the link you gave above.

However, it might take a while: it took 3 days to fsck another non-LVM 20GB partition. This partition is 240GB, so if the trend is linear it will be February before I can access the long-sought website folder...

By the way, is it very bad to Ctrl-C a running fsck? If I did, maybe I could still mount the LVM drive and access the files before having to wait all those weeks.

The original drives are ~10 years old, so I guess they're pretty old... I have now replaced the newly failed 146GB drive with a fresh one. However, the rebuild is still going on after three days. I don't remember it ever taking this long; could something be wrong there? The RAID logical drive is ~500GB.
 
Old 01-23-2012, 03:00 PM   #14
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 3,341

Rep: Reputation: Disabled
The time it takes to fsck a file system depends on the file system in question and the number of files stored in it. For a while I had an ext2 file system residing on a 1.5 TB LVM volume, and fsck usually took about 3-4 hours. And that was just a regular maintenance fsck with no known file system errors.
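
One small thing that might make the wait more bearable: e2fsck can print a progress indicator if you ask for it. If you end up (re)starting a check, something like this should work (the device name is just an example based on your earlier posts; use whatever you're actually checking):

# e2fsck -C 0 -y /dev/Volume00/LogVol1

The -C 0 option makes e2fsck report completion progress on the console, so at least you can see how far along it is.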

AFAIK, aborting an fsck operation is not recommended, but I don't know if it could cause any damage to the file system. It really shouldn't, but I don't know nearly enough about the inner workings of fsck to make any sort of authoritative statement.

If your drives are 10 years old, they're at least 5 years overdue for replacement. Considering the price of hard drives these days (even after the Thailand floods), running a production system on old drives makes little sense.
 
Old 01-24-2012, 05:18 AM   #15
elmaccco
LQ Newbie
 
Registered: Jan 2012
Posts: 9

Original Poster
Rep: Reputation: Disabled
I googled it and it doesn't seem too dangerous:
http://serverfault.com/questions/575...errupt-an-fsck
After all, the most important thing is to recover a folder in /var/www from this old server, not necessarily to get the
entire filesystem working straight away.

So I pressed Ctrl-C:
http://biosc-sub2-169.ucc.ie/temp/IMG_0756.JPG

This is the 'df -h' and 'fdisk -l' output:
http://biosc-sub2-169.ucc.ie/temp/IMG_0757.JPG

Read-only mounting didn't work:
http://biosc-sub2-169.ucc.ie/temp/IMG_0758.JPG (sorry for the quality...)

Unless you suggest something else, I think I'll just let this fsck run through.
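
If the plain read-only mount still fails once the fsck is done, I might also try skipping the journal replay, something like this (untested on my side; /mnt/recover is just an example mount point):

# mkdir -p /mnt/recover
# mount -o ro,noload /dev/Volume00/LogVol1 /mnt/recover

From what I've read, noload tells ext3 not to replay the journal, which can sometimes let a damaged filesystem mount read-only just long enough to copy a folder like /var/www off it.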
 
  

