Linux LVM & Bad Superblock(s)

ckozler · 02-28-2011, 11:14 AM

Hi Guys,

This isn't so much a question about how to resolve an issue as an attempt to understand LVM better.

I understand what LVM is, does, and generally how it works and how to use it.

I had a server with a volume group (LVG_STORE) containing three physical volumes and a single logical volume (LV_OPT). Recently, though, one of these three disks failed, which led to a number of issues.

Some more background information:

- Server had a total of 4 disks
- 1 disk -- sda -- two partitions -- one for root and one for swap
- 3 disks -- sdb, sdc, and sdd (143 GB, 3 Gb/s SAS disks) in the logical volume
- sdb failed
- The RAID disk controller reported all disks OK
- Uncommenting it in /etc/fstab allowed me to mount it manually and read certain parts, but a specific directory (probably sitting on the bad block of the disk) would always cause the disk to throw an error, after which the volume could no longer be mounted

When I uncommented it from fstab and was able to boot, do some manual work on it, and copy some data, I got this error:

Code:

Feb 28 10:43:18 demo1 kernel: Aborting journal on device dm-0.
Feb 28 10:43:18 demo1 kernel: ------------[ cut here ]------------
Feb 28 10:43:18 demo1 kernel: WARNING: at fs/buffer.c:1197 mark_buffer_dirty+0x23/0x72()
Feb 28 10:43:18 demo1 kernel: Modules linked in: joydev st ide_disk ide_cd_mod vboxnetadp(N) vboxnetflt(N) vboxdrv(N) ipv6 bonding binfmt_misc fuse loop dm_mod bnx2 rtc_cmos i2c_piix4 rtc_core serio_raw pcspkr rtc_lib shpchp button i2c_core sr_mod ses pci_hotplug sg cdrom enclosure usbhid hid ff_memless ohci_hcd sd_mod crc_t10dif ehci_hcd usbcore edd ext3 mbcache jbd fan ide_pci_generic serverworks ide_core ata_generic sata_svw pata_serverworks libata dock thermal processor thermal_sys hwmon aacraid scsi_mod
Feb 28 10:43:18 demo1 kernel: Supported: No
Feb 28 10:43:18 demo1 kernel: Pid: 5014, comm: umount Tainted: G          2.6.27.19-5-default #1
Feb 28 10:43:18 demo1 kernel: 
Feb 28 10:43:18 demo1 kernel: Call Trace:
Feb 28 10:43:18 demo1 kernel:  [<ffffffff8020da29>] show_trace_log_lvl+0x41/0x58
Feb 28 10:43:18 demo1 kernel:  [<ffffffff8049a3da>] dump_stack+0x69/0x6f
Feb 28 10:43:18 demo1 kernel:  [<ffffffff8023d562>] warn_on_slowpath+0x51/0x77
Feb 28 10:43:18 demo1 kernel:  [<ffffffff802d30c4>] mark_buffer_dirty+0x23/0x72
Feb 28 10:43:18 demo1 kernel:  [<ffffffffa00e1b55>] ext3_put_super+0x54/0x1ce [ext3]
Feb 28 10:43:18 demo1 kernel:  [<ffffffff802b33bf>] generic_shutdown_super+0x60/0xee
Feb 28 10:43:18 demo1 kernel:  [<ffffffff802b345a>] kill_block_super+0xd/0x1e
Feb 28 10:43:18 demo1 kernel:  [<ffffffff802b3518>] deactivate_super+0x60/0x79
Feb 28 10:43:18 demo1 kernel:  [<ffffffff802c7ebb>] sys_umount+0x87/0x91
Feb 28 10:43:18 demo1 kernel:  [<ffffffff8020bfbb>] system_call_fastpath+0x16/0x1b
Feb 28 10:43:18 demo1 kernel:  [<00007f1d146df1c7>] 0x7f1d146df1c7
Feb 28 10:43:18 demo1 kernel: 
Feb 28 10:43:18 demo1 kernel: ---[ end trace d044caa59498ad32 ]---

My dmesg output and assorted logs are littered with:

Code:

Feb 28 10:43:18 demo1 kernel: sd 0:0:1:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
Feb 28 10:43:18 demo1 kernel: sd 0:0:1:0: [sdb] Sense Key : Hardware Error [current] 
Feb 28 10:43:18 demo1 kernel: sd 0:0:1:0: [sdb] Add. Sense: Internal target failure
Feb 28 10:43:18 demo1 kernel: end_request: I/O error, dev sdb, sector 384
Feb 28 10:43:18 demo1 kernel: Buffer I/O error on device dm-0, logical block 0
Feb 28 10:43:18 demo1 kernel: lost page write due to I/O error on dm-0

Now, I know it's pretty apparent that my drive died, but here is my question:

It was clear that sdb was the issue, but for some reason the entire logical volume could not be mounted once the disk threw an error. I tried everything from finding the backup superblocks and running e2fsck -b on /dev/mapper/LVG_STORE-LV_OPT to backing up as much data as I could before I hit "X" amount of reads, or before I hit that specific block on the disk and the error was thrown, leaving me unable to do anything else.

Why does a single failing disk cause an entire LVM volume to fail? This seems a little backwards, unless I am missing something.

Any information would help. Thank you.

syg00 · 02-28-2011, 11:37 AM

Forget about the pv's - from a filesystem perspective, the lv is the equivalent of a partition in the non-LVM world.
Once it's broken, it's broken.

There ain't any substitute for backups.
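
To make the "equivalent of a partition" point concrete, here is a minimal sketch of how a linear (concatenated) logical volume might map its sectors onto the underlying disks. It is an illustration under stated assumptions, not the actual device-mapper code: the per-disk sizes are made up, the data offset of 384 sectors is taken from the "dev sdb, sector 384" log line above, and each disk is assumed to be fully allocated to the LV in the order sdb, sdc, sdd. Real LVM works in extents and segments rather than single sectors.

```python
from bisect import bisect_right
from itertools import accumulate

# Assumed from the log line "I/O error, dev sdb, sector 384": the data area
# of each physical volume is taken to start 384 sectors into the disk.
PE_START = 384

# Hypothetical layout: (disk name, usable sectors). Sizes are illustrative only.
PVS = [("sdb", 1000), ("sdc", 1000), ("sdd", 1000)]


def map_sector(lv_sector, pvs=PVS, pe_start=PE_START):
    """Map a sector of a linear LV to (disk name, sector on that disk)."""
    # Cumulative end offsets of each disk's share of the LV.
    ends = list(accumulate(size for _, size in pvs))
    if not 0 <= lv_sector < ends[-1]:
        raise ValueError("sector outside the logical volume")
    idx = bisect_right(ends, lv_sector)   # which disk holds this sector
    start = ends[idx] - pvs[idx][1]       # first LV sector on that disk
    return pvs[idx][0], pe_start + (lv_sector - start)


if __name__ == "__main__":
    for sector in (0, 999, 1000, 2999):
        print(sector, "->", map_sector(sector))
```

Under these assumptions, the very start of the LV lands on sdb, which is consistent with the log pairing "dev sdb, sector 384" with "dm-0, logical block 0". The primary ext3 superblock sits at the start of the filesystem, so an unreadable region there affects the whole filesystem, even though most of the sectors on sdc and sdd are still fine.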

ckozler · 02-28-2011, 11:40 AM

Quote:

Originally Posted by syg00

Forget about the pv's - from a filesystem perspective, the lv is the equivalent of a partition in the non-LVM world.
Once it's broken, it's broken.

There ain't any substitute for backups.

Thank you for the response.

So is that the equivalent of saying that if a single disk goes bad in an LVM of three disks, just about the entire LVM is unusable because of that one block?

Also, the only way I was able to use the other two disks again was to delete the entire LVG_STORE volume group and then re-create it with just the two good disks. Was this incorrect, or was there another way that I may have missed?

syg00 · 02-28-2011, 11:52 AM

Theoretically you're supposed to be able to replace the failed drive(s) if you have access to the meta-data (UUID primarily).
Mirrored RAID would be a possible solution.
I don't like LVM and don't use it except under orders.

Others will hopefully give you better answers - I'll see what I can find as well.
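
As a rough illustration of the "replace the failed drive using the metadata" idea, the sketch below only builds and prints the commands one would typically review for this; it executes nothing. The device name, the UUID placeholder, and the backup path are assumptions for illustration (the thread does not give them), and `replacement_plan` is a hypothetical helper. `pvcreate --uuid ... --restorefile ...`, `vgcfgrestore -f` and `vgchange -ay` are standard LVM2 commands.

```python
import shlex


def replacement_plan(vg, new_device, missing_pv_uuid, metadata_backup):
    """Return argv lists for restoring a lost PV label and the VG metadata.

    Nothing is run here; the commands are meant to be reviewed by hand.
    """
    return [
        # Re-create the PV label on the replacement disk with the old UUID.
        ["pvcreate", "--uuid", missing_pv_uuid,
         "--restorefile", metadata_backup, new_device],
        # Restore the volume group metadata from the backup file.
        ["vgcfgrestore", "-f", metadata_backup, vg],
        # Activate the logical volumes in the volume group.
        ["vgchange", "-ay", vg],
    ]


if __name__ == "__main__":
    plan = replacement_plan(
        vg="LVG_STORE",
        new_device="/dev/sdb",                       # assumed replacement disk
        missing_pv_uuid="<uuid-of-missing-pv>",      # placeholder, not a real UUID
        metadata_backup="/etc/lvm/backup/LVG_STORE"  # assumed default backup path
    )
    for argv in plan:
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Note that this would only restore the layout of the volume group. Whatever data lived on the failed disk is not brought back, so the filesystem on the LV would still need checking and some files would be lost.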

EricTRA · 02-28-2011, 12:00 PM

Hello,

In my opinion you're confusing or mixing LVM with RAID. LVM by default doesn't provide any data redundancy or fault tolerance, as far as I know. Of course it offers features like snapshots or mirroring to provide the necessary 'backup' or redundancy. With the way you set it up, having three disks in one VG and one LV in that VG that holds all the space, without any RAID or mirroring to other physical devices, I think you're out of luck. I sincerely hope for you that I'm wrong and that someone will come along with a solution. I for one would copy that solution in my personal wiki for future use.

Kind regards,

Eric
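
The lack of redundancy can be put into numbers with a small back-of-the-envelope sketch. It assumes disks fail independently with the same probability over some period; the 3% figure is purely hypothetical and not taken from this thread. A linear LV needs every disk to survive, while a mirrored setup only loses data when both disks of a pair fail.

```python
def linear_lv_survival(p_fail, n_disks):
    """Probability that a linear LV spanning n_disks loses no disk."""
    return (1.0 - p_fail) ** n_disks


def mirrored_lv_survival(p_fail, n_pairs):
    """Probability that no mirrored pair loses both of its disks."""
    return (1.0 - p_fail ** 2) ** n_pairs


if __name__ == "__main__":
    p = 0.03  # hypothetical per-disk failure probability
    print("3 disks, linear LV:      ", linear_lv_survival(p, 3))
    print("6 disks, 3 mirrored pairs:", mirrored_lv_survival(p, 3))
```

Spanning an LV over more unmirrored disks makes it less likely to survive, not more, because each added disk is another single point of failure.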

ckozler · 02-28-2011, 12:15 PM

Quote:

Originally Posted by syg00

Theoretically you're supposed to be able to replace the failed drive(s) if you have access to the meta-data (UUID primarily).
Mirrored RAID would be a possible solution.
I don't like LVM and don't use it except under orders.

Others will hopefully give you better answers - I'll see what I can find as well.

I didn't have a spare SAS disk, so I didn't even try (you could be completely right). The box was just a demo box, and it was a good 'test' place to have an issue with LVM.

Quote:

Originally Posted by EricTRA

Hello,

In my opinion you're confusing or mixing LVM with RAID. LVM by default doesn't provide any data redundancy or fault tolerance, as far as I know. Of course it offers features like snapshots or mirroring to provide the necessary 'backup' or redundancy. With the way you set it up, having three disks in one VG and one LV in that VG that holds all the space, without any RAID or mirroring to other physical devices, I think you're out of luck. I sincerely hope for you that I'm wrong and that someone will come along with a solution. I for one would copy that solution in my personal wiki for future use.

Kind regards,

Eric

Hi Eric,

I am not confusing them-- I know the difference between RAID (hardware & software) and LVM. The LVM journal was recovered when the disk first failed and I restarted the server; the data/journal was about 3 months old (still completely usable). It wasn't until recently that the disk became worse and would error out much more quickly.

I do believe you are correct that, since I did not put any RAID behind the LVM (ideal would have been 6 disks in RAID 1, resulting in 3 usable disks served to the volume group), I am out of luck.

Yes, I have stored this in my memory bank for future use: if one physical disk fails in a volume group, the entire LVM becomes unusable. That seems correct now, because the entire logical volume is presented as a single block device to the OS (which explains why the disk would error out and unmount at a specific file/directory).

Thanks for all the responses. If anything is wrong/incorrect or I have mis-worded something, it would be great for someone to correct me.

Thanks!

EDIT: Guess I should have done this -- http://www.google.com/search?sourcei...=LVM+snapshots