Seeking help & support with recovery of a vol grp held on a degraded RAID 1 array
Good evening all.
I am looking for some support with the recovery of a corrupted or damaged volume group. This is a personal server I built a while ago as a learning exercise and useful backup machine; it is, or was, running CentOS 6. I believe I have all the data off it, with the possible exception of an SQL database that is not hugely important, but since I damaged this machine it has become a bit of a quest to get it back up and running. Learning exercise all the way.

The machine was laid out like this (please refrain from criticising the build, I am not a certified professional, and I am already adequately aware that it was not a great design):

SATA HDD 1:
  - 15 GB ext4 partition marked as boot
  - 217 GB Linux RAID partition
SATA HDD 2:
  - 15 GB swap partition
  - 217 GB Linux RAID partition
SATA HDD 3, 4, 5, 6, 7:
  - one 2 TB "whole disk" Linux RAID partition each

There are two multi-device arrays in the machine, md0 and md1:

md0 consisted of the five 2 TB disks in a RAID 5 configuration and was mounted at /home.
md1 consisted of the two 217 GB partitions in a RAID 1 configuration, and on this device I had the volume group vg_boot.

The motherboard of this machine was originally an Asus M3A32-MVP. This board has trouble booting the CentOS install medium (it requires you, IIRC, to downgrade the BIOS and then re-upgrade it after the installation). The short version is that last weekend I chucked the case and the motherboard in the bin and rebuilt the machine into a spare chassis that had a slightly better motherboard and allowed me to boot the CentOS install media. I have NOT reinstalled the md0 disks. I am working only on attempting to get the system back up and running on the two disks from the md1 device and the vg_boot volume group.
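Before the transcribed output below, one quick check I have been doing is whether md1 is actually mid-rebuild. A minimal sketch of that check, run here against a saved sample of /proc/mdstat (the sample figures are hypothetical, just matching what I transcribed) so the snippet works anywhere; on the server itself you would grep the real /proc/mdstat:

```shell
#!/bin/sh
# Check for a resync/recovery progress line, as would appear in
# /proc/mdstat while an md array is rebuilding.
# Sample file stands in for /proc/mdstat so this runs off-box.
cat > /tmp/mdstat.sample <<'EOF'
Personalities : [raid1]
md1 : active raid1 sde2[0]
      227812220 blocks super 1.1 [2/1] [U_]
      bitmap: 1/2 pages [4KB], 65536KB chunk
EOF

if grep -Eq 'resync|recovery' /tmp/mdstat.sample; then
    echo "md1 is rebuilding"
    grep -E 'resync|recovery' /tmp/mdstat.sample
else
    echo "no resync or recovery in progress"
fi
# prints: no resync or recovery in progress
```

On the live machine, `cat /proc/mdstat` plus the State line from `mdadm --detail /dev/md1` should settle it; my understanding (happy to be corrected) is that with Total Devices : 1 and the second member shown as removed, there is nothing for md1 to resync against in the first place.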
The commands I know, and have identified from my efforts thus far, yield this:

# mdadm --detail /dev/md1
/dev/md1:
        Version : 1.1
  Creation Time : Thu Dec 29 19:09:10 2011
     Raid Level : raid1
     Array Size : 227812220 (217.26 GiB 233.28 GB)
  Used Dev Size : 227812220 (217 GiB 233 GB)
   Raid Devices : 2
  Total Devices : 1
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Fri Mar 22 22:35:26 2013
          State : active, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           Name : <machine name>
           UUID : df067917:790fb242:436a8a2e:d6ac85a5
         Events : 5760

    Number   Major   Minor   RaidDevice State
       0       8       66        0      active sync   /dev/sde2
       1       0        0        1      removed

# pvdisplay
  Couldn't find device with uuid kRwuJT-VGMT-ufth-Jfcz-3m7K-WaC5-c1dEf9.
  --- Physical volume ---
  PV Name               unknown device
  VG Name               vg_boot
  PV Size               7.28 TiB / not usable 57.00 MiB
  Allocatable           yes (but full)
  PE Size               128.00 MiB
  Total PE              59616
  Free PE               0
  Allocated PE          59616
  PV UUID               kRwuJT-VGMT-ufth-Jfcz-3m7K-WaC5-c1dEf9

  --- Physical volume ---
  PV Name               /dev/md1
  VG Name               vg_boot
  PV Size               217.26 GiB / not usable 8.87 MiB
  Allocatable           yes (but full)
  PE Size               128.00 MiB
  Total PE              1738
  Free PE               0
  Allocated PE          1738
  PV UUID               Yv9gUc-KyWf-lkqU-bd2e-ExO3-asc2-fdcG0w

# vgscan -v -P
  PARTIAL MODE. Incomplete logical volumes will be processed.
    Wiping cache of LVM-capable devices
    Wiping internal VG cache
  Reading all physical volumes.
  This may take a while...
    Finding all volume groups
    Finding volume group "vg_boot"
  Couldn't find device with uuid kRwuJT-VGMT-ufth-Jfcz-3m7K-WaC5-c1dEf9.
  There are 1 physical volumes missing.
  Found volume group "vg_boot" using metadata type lvm2

# lvs -a -o +devices -P
  PARTIAL MODE. Incomplete logical volumes will be processed.
  Couldn't find device with uuid kRwuJT-VGMT-ufth-Jfcz-3m7K-WaC5-c1dEf9.
  LV      VG      Attr       LSize    Devices
  lv_home vg_boot -wi-----p    5.80t  unknown device(5465)
  lv_home vg_boot -wi-----p    5.80t  unknown device(29808)
  lv_home vg_boot -wi-----p    5.80t  /dev/md1(0)
  lv_opt  vg_boot -wi-----p  488.38g  unknown device(21994)
  lv_root vg_boot -wi-----p  217.25g  unknown device(3907)
  lv_usr  vg_boot -wi-----p  488.38g  unknown device(25901)
  lv_var  vg_boot -wi-----p  488.38g  unknown device(0)
  (The Pool, Origin, Data%, Move and Log columns were all empty.)

That is about all I can stand to transcribe off the screen in front of me. From typing all this out, it is not inconceivable that the RAID array is resynchronising itself and just needs to be left to get on with its thing, but that said I really don't know.

My aim is to get the system booting. As far as I know I have made no changes to any partition tables, backups or data; the only thing that has changed is the order and position of the drives in the SATA ports.

A final point: depending on which way round the drives are, I either get nothing at all, no GRUB, no boot, no nothing, OR I get a red "BAD PBR" message. There was some debate at the office today as to which was the better situation. I think we concluded the error was better, as it suggested that GRUB was getting up and running and then seeing something it didn't like.

I would be delighted to provide any further information about the machine and the output of any commands you would like run.

Thanks,
Ed
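EDIT: for completeness, here is the recovery sequence I have pieced together from reading around, which I have NOT run yet, echoed as a dry run. Everything in it is my assumption, not established fact: that the missing 7.28 TiB PV is the md0 array I left out of the rebuild (the size fits five 2 TB disks in RAID 5), that an LVM metadata backup survived at /etc/lvm/backup/vg_boot, and that the array would reappear as /dev/md0 once its disks go back in. Please shout if any step is wrong or dangerous.

```shell
#!/bin/sh
# Dry-run sketch of the standard "missing PV" recovery. The echo
# prefixes keep it harmless; device, UUID and backup path are my
# assumptions as described above, NOT verified on the machine.
MISSING_UUID="kRwuJT-VGMT-ufth-Jfcz-3m7K-WaC5-c1dEf9"
BACKUP="/etc/lvm/backup/vg_boot"

# 1. Recreate the PV label with its old UUID on the reassembled device.
echo pvcreate --uuid "$MISSING_UUID" --restorefile "$BACKUP" /dev/md0

# 2. Restore the volume group metadata from the backup file.
echo vgcfgrestore -f "$BACKUP" vg_boot

# 3. Activate the volume group, then fsck each LV before mounting.
echo vgchange -ay vg_boot
```

If people think the safer interim move is `vgchange -ay --partial vg_boot` to salvage what is readable from /dev/md1 alone, I am happy to try that first instead.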