[SOLVED] One of hard disks of the logical volume failed
I'm not sure you are going to get much help. You can try things like testdisk/photorec, but I don't think you will get much data back. That doesn't mean none, just not much, and not necessarily all of a file.
The problem is that with a linear concatenation, you lose the entire filesystem when either one fails.
The cause of such loss is due to the filesystem allocating both meta-data and data blocks scattered for optimum access. So such allocations will not put all the data on one physical volume.
Had pv0 and pv1 been RAID volumes (other than raid0...), the RAID recovery would have preserved the data.
It's really hard to give exact instructions without knowing the content of the "physical_volumes { ... }" section of that LVM backup file, but I would start with a new disk drive (750 GB or larger), make a 550 GB partition there, use dd to copy segment 1 to the new drive, zero out the rest of the partition (probably already zeros if it's a new drive), and then see what fsck can do to reconstruct that filesystem.
To copy segment 1 you need to run:
Code:
dd if={device for pv0} of={your new partition} bs=1M count=$((69199*4)) skip=$((7050*4 + 1))
For zeroing the rest of the partition (if necessary):
Code:
dd if=/dev/zero of={your new partition} bs=4194304 seek=69199
That "4194304" number is the 4 MiB extent size that pv0 appears to have from the numbers you gave.
If the pv1 drive is not totally dead, you can try to use ddrescue to recover as much data as possible rather than filling the rest of the partition with zeros. If that's the case, let me know and I can give more exact instructions.
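If it comes to that, the ddrescue step might look roughly like this (a sketch only: the device names are hypothetical, and the input offset for where segment 2 starts inside pv1 would have to come from your LVM backup file):

```shell
# Sketch only: copy what is still readable from the failing pv1 drive into the
# region of the new partition that follows the pv0 data.  Segment 1 occupies
# 69199 extents * 8192 sectors/extent = 566878208 sectors, so segment 2 data
# belongs at byte offset 566878208 * 512 in the new partition.
# You would also need --input-position={byte offset of the segment inside pv1}.
ddrescue --force --output-position=$((566878208 * 512)) \
    /dev/sdc1 /dev/sdb1 /root/pv1-rescue.map
```

The map file lets ddrescue resume an interrupted copy and retry the bad areas on later runs instead of starting over.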
Last edited by rknichols; 01-03-2016 at 01:43 PM.
Reason: Correct the dd parameters for copying segment 1
The cause of such loss is due to the filesystem allocating both meta-data and data blocks scattered for optimum access.
Fortunately, not as scattered as you might think. To avoid excessive seeking, the allocator (for ext2/3/4, at least) tries to put the data blocks for a file together in the same block group as that file's inode, and the inodes for files tend to be near the inode for the directory that contains them. Of course all bets are off when the block groups start to fill up (one reason for that 5% reserved space is to have some space available in each block group), but if "home" was originally just on one PV and later extended to a second, all of the old data on that first PV would still be there.
So, it looks like pv0 was on /dev/sda5 (~297851 MiB). That might not be /dev/sda in your rescue environment, so you'll want to use blkid to identify the partition unless it's obvious which disk is which. Also, I've changed the dd parameters for copying segment 1. They were wrong before since the pe_start offset is in units of 512-byte sectors, not 4 MiB extents. I'm pretty sure it's right now. Run the file command against the new partition to be sure: it should pick up the identity of the filesystem. That says the start point is right, and I know the "count=$((69199*4))" is right.
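That check might look like this (partition name hypothetical; substitute whatever blkid showed for your new partition):

```shell
# Read the raw device and report what is found at its start.  If the dd copy
# landed on the right boundary, this should identify the filesystem, e.g.
# "Linux rev 1.0 ext4 filesystem data, UUID=..." rather than just "data".
file -s /dev/sdb1
```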
There will be no indication from fsck about what files are lost since the directories they were in are probably gone too. Also, fsck just makes the filesystem metadata consistent. It has no way to check the content of files. I suppose one way to tell what files (the ones that still exist) have blocks in the missing area would be to create a dmsetup mapping with that whole region mapped to the error target. That might be something to try even without attempting fsck, but get that copying done first so that there is something to work with without risking the original data.
I'll do the copy as soon as I get a new hard disk. In the meantime could you please tell me how to do this?
After you create a partition on the new disk, just follow the instructions I gave back in #3:
Code:
dd if={device for pv0} of={your new partition} bs=1M count=$((69199*4)) skip=$((7050*4 + 1))
# which is probably
dd if=/dev/sda5 of=/dev/sdb1 bs=1M count=$((69199*4)) skip=$((7050*4 + 1))
Do be sure that /dev/sda and /dev/sdb are the correct disks first. I generally do "cat /proc/partitions" and look at the output for confirmation. (The sizes there are in units of 1K blocks.)
Sorry, totally misunderstood. I did some experimenting and found that the error mapping might not be very useful. Due to the I/O errors, the filesystem can't be mounted. You can get in and poke around with debugfs, but getting any info that way would be beyond tedious. I'll think about this for a while and see if I can come up with anything useful.
First, create a file /tmp/mymap with the following content:
Code:
0 566878208 linear /dev/sdb1 0
566878208 488390656 error
That 566878208 is the number of 512-byte sectors in the 69199 4MiB extents of pv0, and 488390656 is for the 59618 extents of pv1.
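Those sector counts can be double-checked with shell arithmetic, since each 4 MiB extent is 8192 sectors of 512 bytes (and 128817 − 69199 = 59618):

```shell
echo $((69199 * 8192))    # sectors in pv0's 69199 extents: 566878208
echo $((59618 * 8192))    # sectors in pv1's 59618 extents: 488390656
echo $((128817 * 8192))   # total for the LV's 128817 extents: 1055268864
```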
Now, run
Code:
dmsetup create badhome /tmp/mymap
You now have a device /dev/mapper/badhome that will return an I/O error for any reference to sectors beyond what was mapped from /dev/sdb1.
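A few dmsetup commands are useful for sanity-checking the mapping and cleaning up afterwards (the "badhome" name matches the device created above):

```shell
dmsetup table badhome    # print the active mapping: one linear and one error target
dmsetup status badhome   # confirm both segments are loaded
# Sanity check on the table: 566878208 + 488390656 = 1055268864 sectors,
# i.e. the full 128817-extent size of the original LV.
# When finished with the mapped device:
dmsetup remove badhome
```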
I have great news. Today I connected the disk to another PC and it worked; then I connected it back to the original machine and it works as well. I recreated the logical volume setup to its original state and tried mounting /home. I got an error:
Code:
[ 2404.541803] EXT4-fs (dm-4): bad geometry: block count 131908608 exceeds size of device (70859776 blocks)
My current logical volume setup:
Code:
root@lieta:/etc/lvm/archive# lvdisplay --units b
...
--- Logical volume ---
LV Path /dev/debian-vg/home
LV Name home
VG Name debian-vg
LV UUID xYzC2U-xLAo-PfTs-5mjA-EwXj-d2c1-gWDhcN
LV Write Access read/write
LV Creation host, time debian, 2015-06-22 11:47:59 +0300
LV Status available
# open 0
LV Size 540297658368 B
Current LE 128817
Segments 2
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 254:4
root@lieta:/etc/lvm/archive# ls -l /dev/debian-vg/home
lrwxrwxrwx 1 root root 7 jan 5 19:34 /dev/debian-vg/home -> ../dm-4
root@lieta:/etc/lvm/archive# ls -l /dev/dm-4
540297658368/(1024*4)==131908608. Why doesn't it mount?
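For reference, the two block counts in that error work out as follows (each 4 MiB extent is 1024 blocks of 4 KiB, so the 70859776 figure matches the 69199 extents of segment 1 exactly, which suggests only the pv0 segment is actually backing the device):

```shell
echo $((540297658368 / 4096))   # blocks the ext4 superblock expects: 131908608
echo $((69199 * 1024))          # 4 KiB blocks in segment 1 alone: 70859776
```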
I had high hopes that testdisk file recovery would help find what files were corrupted, but when asked to recover all files it recovers a mish-mash of intact files, deleted files, and partially recovered files. You can identify the partially recovered files by the size difference. So, what I suggest is:
Copy the partial filesystem to a new partition and pad with zeros to the original size (at least) as previously described.
Run [fsck] on that new partition to get a sane filesystem there.
Create a mapped device from that partition with the padded region mapped to the error target as previously described.
Make a recovery directory somewhere with enough space to hold the recovered files.
Run testdisk on the mapped device, select "Unpartitioned device", and go into "Advanced file recovery".
Type "a" to select all files, then "C" (upper case) to copy selected files. Select your recovery directory as the target. Go have lunch while it works.
After exiting testdisk, mount the new partition (the whole thing -- not the error-mapped version) read-only on /mnt/tmp. Then you can run
Code:
cd {your recovery directory}
find . -type f -exec test -f "/mnt/tmp/{}" \; -exec cmp {} "/mnt/tmp/{}" \;
That should (a) skip over any deleted files that were recovered (files that don't exist in /mnt/tmp) and (b) cause an "EOF on ..." failure message from cmp for any files that were partially recovered.
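A variant of that find command prints only the names of the partially recovered files instead of cmp's error messages (assuming GNU find, which substitutes {} even inside a longer argument, as the original command already relies on):

```shell
# Run from the recovery directory.  For each regular file that also exists
# under /mnt/tmp, compare quietly; print the name only when the contents differ.
find . -type f -exec test -f "/mnt/tmp/{}" \; \
    ! -exec cmp -s {} "/mnt/tmp/{}" \; -print
```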
As I said before, there will be no way to tell what files were totally lost. Their names exist only in the missing part of the filesystem.
That's the best I can come up with. I've already spent too much time on this, but it's been quite educational for me.