LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Linux - Server (http://www.linuxquestions.org/questions/linux-server-73/)
-   -   Debian Lenny - SAN - LVM fail (http://www.linuxquestions.org/questions/linux-server-73/debian-lenny-san-lvm-fail-771700/)

Vorik 11-26-2009 08:11 AM

Debian Lenny - SAN - LVM fail
 
Hi all,
(Crossposted from serverfault.com because there was no answer after several days.)

I've got a Lenny server with a SAN connection configured as the only PV of a VG named 'datavg'.

Yesterday I updated the box with Debian patches and rebooted it.

After the reboot, the machine failed to come up, complaining that it couldn't find /dev/mapper/datavg-datalv.

This is what I did:
- booted into rescue mode and commented out the mount in /etc/fstab
- rebooted into multi-user mode (the mountpoint is /data; only postgresql failed to start)
- ran vgdisplay, lvdisplay and pvdisplay to find out what had happened to the volume group (datavg was missing entirely)

After that, I noticed that the LUN was visible from Linux and that the LVM partition on it was visible as well:

Code:

# ls -la /dev/mapper/mpath0*
    brw-rw---- 1 root disk 254, 6 2009-11-23 15:48 /dev/mapper/mpath0
    brw-rw---- 1 root disk 254, 7 2009-11-23 15:48 /dev/mapper/mpath0-part1

- Then I tried pvscan to see whether it could find the PV. Unfortunately, it didn't detect the partition as a PV.
- I ran pvck on the partition, but it did not find any label:

Code:

# pvck /dev/mapper/mpath0-part1
      Could not find LVM label on /dev/mapper/mpath0-part1

- Then I wondered whether the LUN was perhaps empty, so I dumped the first few MB with dd. In the dump I could see the LVM metadata headers:

Code:

datavg {
    id = "removed-hwEK-Pt9k-Kw4F7e"
    seqno = 2
    status = ["RESIZEABLE", "READ", "WRITE"]
    extent_size = 8192
    max_lv = 0
    max_pv = 0

    physical_volumes {

        pv0 {
            id = "removed-AfF1-2hHn-TslAdx"
            device = "/dev/dm-7"

            status = ["ALLOCATABLE"]
            dev_size = 209712382
            pe_start = 384
            pe_count = 25599
        }
    }

    logical_volumes {

        datalv {
            id = "removed-yUMd-RIHG-KWMP63"
            status = ["READ", "WRITE", "VISIBLE"]
            segment_count = 1

            segment1 {
                start_extent = 0
                extent_count = 5120

                type = "striped"
                stripe_count = 1        # linear

                stripes = [
                    "pv0", 0
                ]
            }
        }
    }
}

Note that this came from the partition where pvck could not find an LVM label!
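For reference, the dump step probably looked something like the sketch below. The real input would be /dev/mapper/mpath0-part1 (and `strings` is the usual filter on a raw dump); here a scratch file with an LVM-style text header planted one sector in stands in for the SAN partition, so the pipeline can be shown end to end:

```shell
# Stand-in for the SAN partition: a scratch file with LVM-style text
# metadata written one 512-byte sector in. On the real box the input
# would be /dev/mapper/mpath0-part1.
printf 'datavg {\nseqno = 2\n}\n' \
    | dd of=/tmp/fakepv.img bs=512 seek=1 conv=notrunc 2>/dev/null
# Dump the first few MB and pull the printable metadata out of it
# (tr strips the NUL padding; "strings" does the same job on a real dump):
dd if=/tmp/fakepv.img bs=1M count=4 2>/dev/null | tr -d '\0' | grep 'datavg'
```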

- I decided to write a new LVM label to the partition and restore the parameters from the backup file:

Code:

  pvcreate --uuid removed-AfF1-2hHn-TslAdx --restorefile /etc/lvm/backup/datavg  /dev/mapper/mpath0-part1
- Then I ran vgcfgrestore -f /etc/lvm/backup/datavg datavg.
- After that, the PV appears when I issue a pvscan.
- With vgchange -ay datavg I activated the VG, and the LV became available.
- When I tried to mount the LV, mount did not find a filesystem. I attempted recovery in several ways, without success.
- After making a dd copy of the affected LV, I tried to recreate the superblocks with

Code:

mkfs.ext3 -S /dev/datavg/backupdatalv
- but the result of this cannot be mounted:

Code:

# mount /dev/datavg/backupdatalv /mnt/
    mount: Stale NFS file handle

That this can happen at all is worrying, to say the least, so I want to find out everything I can about this malfunction.

My questions:
- How can it be that the LVM label disappeared after patches and a reboot?
- Why is the filesystem not there after salvaging the PV? (Did the pvcreate command trash the data?)
- Is the ext3 filesystem in the LV still salvageable?
- Is there anything I could have done to prevent this issue?
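On question 3, one standard avenue (a sketch, not something reported as tried in the thread) is pointing e2fsck at a backup superblock on the dd copy: for a 4 KiB-block ext3 filesystem the backups sit at blocks 32768, 98304, 163840, ..., and `e2fsck -b 32768 -B 4096 /dev/datavg/backupdatalv` would try the first of them (`mkfs.ext3 -n <dev>` lists a filesystem's backup locations without writing anything). The snippet below just prints the byte offsets of those backups, which is handy for dd/hexdump inspection:

```shell
# Byte offsets of the first ext3 backup superblocks, assuming the common
# 4 KiB block size (backups at blocks 32768, 98304, 163840, ...).
BS=4096
for blk in 32768 98304 163840; do
    echo "backup superblock: block $blk, byte offset $((blk * BS))"
done
```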

Thanks in advance,
Ger.

kschmitt 11-30-2009 12:38 PM

Just re-read your question, so I'm changing my post. I'm wondering if the problem is that you tried to restore the label before running vgchange -ay.

vgchange should find the physical volumes, volume groups, etc, and add them to the local system, but I don't believe it changes them at all. The label didn't disappear, your system just didn't have it yet. I'd guess it was the attempts to restore/recreate that blew away some of the data, but again, it's a guess.

I've mounted an iSCSI volume with LVM on it from multiple boxes, re-installs, etc, and vgchange -ay was all I needed to get the volume groups on that system.

Your SAN volume, and the LVM on it, shouldn't be affected at all by an OS update, so you're not crazy for thinking this is weird.

Note, my SAN skills are still building; I'm very far from an expert! I'm in no way positive how to recover the data in your situation, but I'm hoping you post the solution (if you find one), in case I ever run into the same thing.

Good luck

Vorik 11-30-2009 01:42 PM

Quote:

Originally Posted by kschmitt (Post 3774590)
Just re-read your question, so I'm changing my post. I'm wondering if the problem is that you tried to restore the label before running vgchange -ay.

vgchange should find the physical volumes, volume groups, etc, and add them to the local system, but I don't believe it changes them at all. The label didn't disappear, your system just didn't have it yet. I'd guess it was the attempts to restore/recreate that blew away some of the data, but again, it's a guess.

I tried vgchange, pvscan, etc. I even tried pvck (pvck /dev/mapper/mpath0-part1), which reported that no label was found.

I had to write the label back to the PV manually to get it recognized (see the pvcreate command above).

Thanks again,
Ger.


All times are GMT -5. The time now is 09:28 PM.