Recover data from full image disk
Hello.
Long story short, I'm running a server (Fedora 20) with 4 hard disks, with RAID 10 and LVM. For some reason the RAID 10 got corrupted and I lost the LVM table. I was able to recover the RAID 10 and the LVM table, but when I try to mount one of the logical volumes I get the typical: mount: wrong fs type, bad option, bad superblock on ... etc., etc. Each disk of the raid+LVM was formatted XFS on the full device, without partitions (meaning the raid disks were formatted as: mkfs.xfs /dev/sda -- no partitions -- nothing like /dev/sda1 or /dev/sda2, no boot flag, no nothing). I tried to fix each LV with xfs_check and xfs_repair; it was useless, it didn't work. There is data inside those LVs I don't want to lose. I have tried dd if=/dev/data/logical-name of=something.img, but I am not able to mount the resulting disk image either. Using loop, when I tried to mount the whole disk image I got an error message: NTFS signature is missing. Failed to mount '/dev/loop0': Invalid argument The device '/dev/loop0' doesn't seem to have a valid NTFS. Maybe the wrong device is used? Or the whole disk instead of a partition (e.g. /dev/sda, not /dev/sda1)? Or the other way around? Is there anything I can do to recover the data inside the LVs? The raid is OK according to mdstat, and I'm able to view the VG and LVs with vgscan and pvscan, and they're active. Please help :) |
Are you mounting the filesystem - especially the loop device - with the '-t xfs' option for an XFS file system? Although if the disk itself refuses to mount, it seems unlikely that the loop device would mount either. Also, are you sure the LVM metadata you restored was the latest version? Did you get it from the disks, or from the backup under /etc?
|
Can't hurt to try testdisk on a distro that supports LVM and XFS at the level you have.
Odd, such a disaster. |
I did try several times to mount the img file using "-t xfs"; unfortunately I wasn't able to, I got the "Unknown filesystem type, etc., etc." message.
If I use testdisk (I am not very clever with this tool :) ) and select my corrupted LVM, I'm able to see my old data there (2 virtual machines) and I'm able to list the files inside those 2 virtual machines. But how do I save an image of these 2 virtual machines from inside my corrupted LVM? I only see an option to copy the files inside my 2 virtual machines, but I want to copy or save the whole virtual machine as an .img or something. Does testdisk allow this, or am I only able to save, file by file, whatever testdisk finds inside my corrupted LVM? |
I'm not really sure that I understand the situation. You're saying that full disks were formatted with mkfs.xfs, but that you have LVM on top of RAID (and RAID obviously on top of the disks). From this I'm assuming what you mean is that the full disks are used as RAID members (instead of using RAID partitions), and the RAID device would then likely be used as the physical volume for LVM.
You're also talking about virtual machines inside the LVM (and seeing files inside them). I can only assume you mean logical volumes inside the physical volumes. If this is the situation, it's pretty similar to what I had - except I used partitions for raid, and my raid was level 6 instead of 10.

If you've gotten far enough that you managed to recover the logical volumes, yet you can't actually mount them, then it doesn't sound very good to me. If you can see the files with testdisk, and can copy them per-file, then I'd try to recover a couple of files like that, just to see if they come out fine or if they are corrupt. Preferably use files that are at least a few megabytes in size so they take up several blocks on disk - to see that they are consistent across multiple blocks.

If the files come out fine, then it seems that the actual data on the disks is still fine - which of course is good. I'd assume then that either something is wrong with the partition's superblock (which makes it unrecognizable for mount), or the LVM was recovered with outdated metadata (which might set the logical volumes to the wrong addresses). I'm sure there are other possible explanations, but those two come to mind first. If the files themselves turn out corrupt, or are unrecoverable, then something in the partition's structure itself is likely wrong. Raid 10 is mirrored disks striped together, so striping the mirrors in the wrong order might cause something like that. In that case switching the stripes to the right order might correct things (then again, if it's initially right and you swap it to the wrong order, bad things might happen). I'm not sure what else to suggest. You could try taking a dd dump of the top-level raid device - for example, if the raid device is md1, doing: Code:
# dd if=/dev/md1 of=/some/path/MD1_DUMP.img bs=1M count=100 |
It is likely that you will have to carve data out of the disk using testdisk/photorec or foremost.
Remember to use ddrescue to image a bad drive as dd isn't designed to handle errors well. |
One might use dd or ddrescue to copy all the data off to an external disk for recovery. It won't fix anything as such, but it gives you a copy to work on from a remote system.
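That imaging step might look like the sketch below. The device and mount-point names in the comments are assumptions, and the runnable part only demonstrates the image-and-verify idea on scratch files:

```shell
# On a drive with read errors you'd use ddrescue with a mapfile, so an
# interrupted run can resume and bad sectors get retried (hypothetical paths):
#   ddrescue -d /dev/sdX /mnt/external/sdX.img /mnt/external/sdX.map
# On a healthy source, plain dd is enough. Demonstrated here on temp files:
src=$(mktemp); img=$(mktemp)
head -c 65536 /dev/urandom > "$src"       # stand-in for the source device
dd if="$src" of="$img" bs=4k 2>/dev/null  # take the image
cmp -s "$src" "$img" && echo "image verified"   # prints: image verified
```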
I think I recall that testdisk offers a basic way to choose a path where you want to save data off to. It may be possible to save data out of the corrupted filesystem, but I'd only do that for files I didn't care about. Double-check the testdisk docs for usage. |
Just one thing... I assumed at first what you wrote on the original post was some kind of mistake or oversight:
Quote:
If that is the case, then in all likelihood the best you can do is try to recover bits and pieces of the data with recovery tools (such as the testdisk mentioned earlier in the thread). However, I'm not entirely sure of the situation. Can you repeat EXACTLY what you have done, every step of the way, with every detail? Depending on where and how your data actually was, it may or may not be possible to recover it. What were the exact commands used to partition the raids and LVM, to format the disks, to mount them, and to try to recover them with testdisk? |
What he said.
If you followed a page on the web, post a link so we can see what you did - or were directed to do. As stated, we need to know the exact details. And was this a working environment that "went bad"? What changed? |
Excuse me for my bad English :)
I'll try to explain the best I can. I didn't create the raid, LVM, formatting, etc. on this server; it's a server here at the company I work for. It was working fine, but suddenly one morning the raid was corrupted (I don't know the reason). I'm 100% sure it wasn't someone logging in and doing something nasty; I think it was some hardware problem or the like. Well, as far as I know, the thing was initially made like this:

1 - Raid 10 with 4 disks
2 - LVM on top of that
3 - Volume Group
4 - 4 Logical Volumes
5 - Each Logical Volume formatted as XFS

(each logical volume is a disk; there are 4 disks in the raid, and I have 4 logical volumes). Sorry if I expressed myself as saying they formatted the whole raid disk with XFS; it wasn't that. They formatted each logical volume as XFS. They didn't make partitions on each disk; they just created an LV on each disk and formatted each as XFS (again, sorry for my bad English). The thing is, I'm able to dump an image of the corrupted logical volume with dd, but I am not able to mount it even with loop; I get the bad superblock error message. Linux does recognize the raid disks, and does recognize the LVM, but doesn't recognize that those logical volumes are XFS partitions (like when you run cfdisk /dev/disk and see your partitions with their filesystem type shown). In my case, if I run cfdisk on the logical volume, I see the partition with filesystem type "Linux" instead of "XFS" (I only ran cfdisk to look; I haven't formatted, written, or anything). The filesystem type is lost. I would like to recover the filesystem type without losing data. I don't know if this is possible, but in case someone knows :) I will be happy. |
Quote:
What do you get when you run Code:
hexdump -C /dev/mapper/<volgroup_name>-<lvol-name> | head -24 |
Ok, the next question is: what exactly did you do to recover the RAID and LVM? For example, are you certain that the RAID was recovered with the right disks in each mirror, the stripes restored in the right order, the right chunk size, the correct RAID metadata version, etc.? And as for LVM, are you certain that the LVM metadata was restored from the latest version?
You said you tried using xfs_repair, and I believe it should look for backup superblocks. If it can't find them, or if those too are corrupted, then I think there's likely something fundamentally wrong with the filesystem. There's 'xfs_db' that might be of help in trying to further analyze what's wrong with the system - but it goes beyond my knowledge. Posting the output you get from 'xfs_check' and 'xfs_repair -n' on each of the logical volumes might also help for people to get a better understanding of what has gone wrong. |
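For intuition, the secondary-superblock search that xfs_repair performs is essentially a scan for the magic string. A toy version of that scan (scratch file with planted magic, not the real on-disk layout):

```shell
# Plant the XFS magic "XFSB" at two offsets in a scratch image, then locate
# every candidate with grep. xfs_repair does this far more carefully (it
# validates whole superblocks); this only illustrates the idea.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=1k count=64 2>/dev/null
printf 'XFSB' | dd of="$img" bs=1 seek=0     conv=notrunc 2>/dev/null
printf 'XFSB' | dd of="$img" bs=1 seek=32768 conv=notrunc 2>/dev/null
grep -abo 'XFSB' "$img"   # prints the byte offset of each match
```

With a real image you could scan for `XFSB` the same way and compare the candidate offsets against the expected allocation-group spacing.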
Quote:
[root@storage-batch ~]# hexdump -C /dev/mapper/data-lab_templates | head -24
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001b0  00 00 00 00 00 00 00 00  f7 fa 55 3d 00 00 00 20  |..........U=... |
000001c0  21 00 83 2a ac fe 00 08  00 00 00 f8 7f 0c 00 00  |!..*............|
000001d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|
00000200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00002000  58 46 53 42 00 00 10 00  00 00 00 00 01 90 00 00  |XFSB............|
00002010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00002020  13 d8 d7 ac 53 7b 4f 48  86 56 56 ba 11 15 ce 35  |....S{OH.VV....5|
00002030  00 00 00 00 01 00 00 04  00 00 00 00 00 00 00 80  |................|
00002040  00 00 00 00 00 00 00 81  00 00 00 00 00 00 00 82  |................|
00002050  00 00 00 01 00 64 00 00  00 00 00 04 00 00 00 00  |.....d..........|
00002060  00 00 32 00 b4 b4 02 00  01 00 00 10 00 00 00 00  |..2.............|

Sorry about the cfdisk CLI; I thought that with cfdisk I would be able to see the filesystem type of the LV (even though I'm sure it's XFS, I just wanted to see whether the system recognizes it as XFS). I didn't write, edit, change, or format anything under cfdisk, just looked :)

Nogitsune: unfortunately, I can't answer you about what was done to recover the raid and LVM; I didn't do it :( My boss did something to recover them. Honestly I don't know how he recovered the raid and the LVM, but whatever he did seemed to work, since the raid and LVM came back.

This is what I get with xfs_check:

[root@storage-batch ~]# xfs_check /dev/mapper/data-lab_templates
xfs_check: /dev/mapper/data-lab_templates is not a valid XFS filesystem (unexpected SB magic number 0x00000000)
xfs_check: WARNING - filesystem uses v1 dirs, limited functionality provided.
xfs_check: read failed: Invalid argument
cache_node_purge: refcount was 1, not zero (node=0x15c3950)
xfs_check: cannot read root inode (22)
bad superblock magic number 0, giving up

And with xfs_repair:

[root@storage-batch ~]# xfs_repair /dev/mapper/data-lab_templates
Phase 1 - find and verify superblock...
bad primary superblock - bad magic number !!!
attempting to find secondary superblock...
....................................................................................................
................................................................found candidate secondary superblock...
unable to verify superblock, continuing...
[etc.]
...Sorry, could not find valid secondary superblock
Exiting now.

When I run testdisk, I am able to see all the data inside this corrupted LV, but I am not able to dump it out. I'm afraid that if I choose to "write" what testdisk found, maybe I'll lose everything; that's why I haven't done it. Any help with this XFS LVM would be highly appreciated :) |
Quote:
Code:
Device Boot      Start         End      Blocks   Id  System

That hexdump shows a DOS partition table in the LV's first sector (note the 55 aa signature at the end of sector 0, and a partition entry at 0x1c0), plus the XFS superblock ("XFSB") at offset 0x2000, i.e. 8192 bytes in. Try setting up a read-only loop device at that offset and see whether the filesystem is recognized there: Code:
losetup -r -o 8192 -f --show /dev/mapper/data-lab_templates |
Hello rknichols.
You were right. Indeed there is a partition inside the logical volume:

[root@storage-batch prueba]# fdisk /dev/mapper/data-lab_templates
Welcome to fdisk (util-linux 2.23.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Command (m for help): p

Disk /dev/mapper/data-lab_templates: 107.4 GB, 107374182400 bytes, 209715200 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 524288 bytes / 1048576 bytes
Disk label type: dos
Disk identifier: 0x3d55faf7

Device                           Boot  Start  End        Blocks     Id  System
/dev/mapper/data-lab_templates1        2048   209715199  104856576  83  Linux

Command (m for help): q

Since I was unaware of this, I asked my boss about it. He said he made it because the Linux "mount" command requires a partition table definition to work??? However, he said he didn't format the LV, he just wrote the partition table, and he insists the old data is still there. I did try what you suggested:

[root@storage-batch prueba]# losetup -r -o 8192 -f --show /dev/mapper/data-lab_templates
/dev/loop0
[root@storage-batch dev]# xfs_check /dev/loop0
xfs_check: error - read only 0 of 512 bytes
ERROR: The filesystem has valuable metadata changes in a log which needs to be replayed.  Mount the filesystem to replay the log, and unmount it before re-running xfs_check.  If you are unable to mount the filesystem, then use the xfs_repair -L option to destroy the log and attempt a repair.  Note that destroying the log may cause corruption -- please attempt a mount of the filesystem before doing this.
So I ran losetup -d /dev/loop0 and tried again (this time trying to mount it):

[root@storage-batch prueba]# losetup -r -o 8192 -f --show /dev/mapper/data-lab_templates
/dev/loop0
[root@storage-batch dev]# mount /dev/loop0 -o loop /prueba
mount: /dev/loop0 is write-protected, mounting read-only
mount: cannot mount /dev/loop0 read-only

And in dmesg:

[850196.127770] loop0: rw=32, want=209715200, limit=209715184
[850196.127846] XFS (loop0): Mounting Filesystem
[850196.135166] XFS (loop0): recovery required on read-only device.
[850196.135229] XFS (loop0): write access unavailable, cannot proceed.
[850196.135284] XFS (loop0): log mount/recovery failed: error 30
[850196.135492] XFS (loop0): log mount failed
[850211.171322] attempt to access beyond end of device

I have 3 more LVs in the same situation (they are inside the same VG). In the other 3 LVs I didn't see any partition inside, but I am not able to mount them as XFS either (I get the same bad superblock results with xfs_repair and xfs_check).
This is one of the other corrupted LVs under XFS:

[root@storage-batch /]# hexdump -C /dev/mapper/data-lab_vmimages | head -24
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001b0  00 00 00 00 00 00 00 00  93 25 59 88 00 00 00 00  |.........%Y.....|
000001c0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|
00000200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00002000  58 46 53 42 00 00 10 00  00 00 00 00 10 00 00 00  |XFSB............|
00002010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00002020  0d a7 59 f1 f1 d6 4a af  8c ab 2e 5e 1f 79 95 1f  |..Y...J....^.y..|
00002030  00 00 00 00 04 00 00 04  00 00 00 00 00 00 00 80  |................|
00002040  00 00 00 00 00 00 00 81  00 00 00 00 00 00 00 82  |................|
00002050  00 00 00 01 01 f4 00 00  00 00 00 09 00 00 00 00  |................|
00002060  00 00 fa 00 b4 b4 02 00  01 00 00 10 00 00 00 00  |................|
00002070  00 00 00 00 00 00 00 00  0c 09 08 04 19 00 00 19  |................|
00002080  00 00 00 00 00 00 15 40  00 00 00 00 00 00 09 fb  |.......@........|
00002090  00 00 00 00 01 de 79 94  00 00 00 00 00 00 00 00  |......y.........| |
Quote:
Quote:
The thing that bothers me about those messages is that "attempt to access beyond end of device." That suggests that the whole LV structure is misaligned and the filesystem extends beyond the current end of the LV. I'm going to make a guess at what might have happened. If the old LVM settings were restored to the raw disk and not to the RAID member, everything would be offset by the amount that RAID had used as its data_offset to reserve space for the RAID superblock and bitmap, and 8K is a possibility for that. It would be useful to see the contents of file /etc/lvm/backup/data and a hex dump of the first 1024 bytes from one of the disks. Between RAID 10 and xfs, I'm getting pretty far outside my experience level here, so I'm not sure how much more I can help. |
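That offset guess can be put in numbers. Everything below is an assumption for illustration (the 8192-byte data_offset in particular):

```shell
# If the old PV sat on the md device, whose data begins data_offset bytes
# into each raw disk, then restoring the LVM metadata against the raw disk
# shifts every LV start by exactly that amount -- and the XFS superblock,
# which belongs at byte 0 of the LV, shows up data_offset bytes in instead.
data_offset=8192                          # assumed RAID superblock/bitmap reservation
sb_expected=0                             # where mount looks for "XFSB"
sb_actual=$((sb_expected + data_offset))
echo "superblock found at byte $sb_actual of the LV"   # matches the XFSB at 0x2000
```

This is also exactly why `losetup -o 8192` makes the filesystem visible again: the loop device re-applies the missing offset by hand.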
rknichols
You've helped me understand a lot (I am extremely noob at raid stuff). When all this damage happened, I suggested to my boss that we delete the whole raid+LVM and recreate everything from zero. He said no, and whatever he did, he got the raid 10 + LVM references back (though we were still unable to mount those LVs). But like you said, doing it this way may have misaligned everything, and sooner or later I bet the raid is going to fail again, because they forced something without doing it cleanly. I'm pasting the contents of /etc/lvm/backup/data (kind of long): Code:
[root@storage-batch ~]# cat /etc/lvm/backup/data

The "lsblk" output for the raid containing the Data VG is: Code:
sdc 8:32 0 1,8T 0 disk Code:
root@storage-batch ~]# hexdump -C /dev/sde | head -24 Code:
08000a40  3d 20 5b 0a 22 70 76 30  22 2c 20 30 0a 5d 0a 7d  |= [."pv0", 0.].}|

I know this is a whole mess... but there are a few virtual machines inside that corrupted LV that the company doesn't want to lose (in case something can still be done). I am extremely noob at hardware errors, so I really appreciate any explanation :) since it helps me explain to my boss why this was a mess from the beginning. |
Would you please edit your last post and change the [QUOTE]...[/QUOTE] tags to [CODE]...[/CODE] tags so that formatting is preserved.
|
Done :)
|
OK, the title of this thread mentions "full image disk." Do you indeed have backup images of these drives? I'm really hoping the answer is, "Yes," because I have to play with fire a bit here to proceed.
Make two copies of that /etc/lvm/backup/data file. I'll call them data.bak1 and data.bak2. A simple edit can effectively shift all the LVs by 8192 bytes. If that is the only problem, everything will come back like magic. In data.bak2, change the line Code:
pe_start = 2048

to Code:
pe_start = 10240

Then restore the edited metadata with Code:
vgcfgrestore -v -f data.bak2

and check the result with Code:
blkid /dev/mapper/data-*

The LVM change can be easily undone by Code:
vgcfgrestore -v -f data.bak1 |
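Mechanically, the data.bak2 edit is a one-line substitution. A sketch on a stand-in file (the real file is /etc/lvm/backup/data, and only the copies should be edited, never the original):

```shell
# Toy fragment of an LVM backup file containing the line to change
f=$(mktemp)
printf 'pe_start = 2048\n' > "$f"
# The proposed edit for data.bak2 (data.bak1 stays untouched as the undo copy):
sed 's/pe_start = 2048/pe_start = 10240/' "$f"   # prints: pe_start = 10240
```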
I've always used mdadm to handle the raid, and when I use LVM, I've used it on the raid device from mdadm. If I understood things correctly, this is actually a raid10 that was made straight from LVM. That's something I've never dabbled in, so I don't really know what to expect - in particular, I have no idea how LVM handles the headers and metadata for its own raid volumes. We're going way beyond my area of knowledge here, so I'll basically have to bow out.
But if the XFSB header is misaligned inside the LVM, then I can think of a few possibilities:

1) The XFS system was always there; it was made into that partition that was wrapped inside the logical volume. In this case the 8192-offset loop might have seen the real partition. But my mind refuses to wrap around the idea of how exactly you could mount such a partition.

2) The XFS system was always there, but the LVM itself is now misaligned for some reason - maybe because the physical volume itself is misaligned. This would make sense if what rknichols suggests is true, but since raid 10 is a stripe of mirrors, looking at the content of a single disk wouldn't give you much of anything meaningful. The LVM in particular (and as such the XFS partition) wouldn't hold anything useful if looked at within the context of any single disk. The lsblk listing suggests the current setup does recognize the disks as being in raid10, though, so I still don't really get it. The metadata shows /dev/md127 as the physical volume, which suggests that it was at least originally placed on the raid device.

The hexdump you gave from the disk shows part of the LVM metadata at position 08000a40. The sequence number of that metadata block should be just a little earlier in the dump. I'm curious what sequence number that particular block has, whether it's the only metadata block stored on the device, and if not, what is around it on the device. Also, whether the other disks (sdc, sdd, sdf) have pieces of the LVM metadata on them. Piecing those together should give some idea of how and where the metadata was restored - if it was restored to the raid device, then it should be striped across the hard drives and mirrored between two of them (two hard drives showing identical pieces of metadata, and the pairs of disks holding alternating blocks of it).

If, on the other hand, it was restored onto a physical hard drive, then it should be present on only one drive, with no mirror of it (the other drives might then hold pieces of the original 'pre-corruption' metadata - then again, I'm not sure what a raid resync would have done in that case). Also, if you do the restore with the altered offset for the metadata, perhaps you could do something with xfs_db in a non-destructive way to check on the status of the filesystem. It's a command I've never used myself, though, so I can't give much advice there. |
The hexdump from the start of the disk has a RAID superblock magic number at offset 0x1000 followed by a "version-1" identifier, so at that offset it's metadata version 1.2. That indicates that the disk was first set up as a RAID member, and then the RAID device was used as an LVM PV. That's how the OP described it, and it's consistent with the device hint in the LVM metadata backup file. The creation time in that RAID metadata converts as "Thu Oct 17 02:43:27 UTC 2013," almost exactly 1 year ago. Either that is truly the year-old data, or perhaps this is all an artifact of the initial recovery effort with the year set wrong in the system clock (At this point, I'm not taking anything on faith.), and not how the system was previously set up at all. I would really need to see some of the older files in /etc/lvm/backup/ to get an idea of what might have been there before.
I too have no experience with xfs, so I would just be stumbling around in the dark with xfs_db. On my own, I might get somewhere, but I certainly can't give advice to anyone else. |
Hello guys.
Thanks a lot for all your explanations and help here.

rknichols: Right now I don't have a current backup of the whole raid+LVM, but I can make one to an external disk before making any change to the LVM data file. It's kind of risky, since the change to the data file is going to affect the whole volume group. If I delete the partition that my boss created in one of the corrupted LVs, wouldn't that realign this particular LV in some way?

Nogitsune: I think (and maybe I'm wrong) that the LVM got misaligned when my boss created the partition inside one of the corrupted LVs. The other 2 corrupted LVs don't show anything about attempts to use space beyond the device or similar when I try to mount them, though I do get the same errors about the bad superblock, unknown fs, etc. The system does recognize the raid 10 as valid, and does recognize the volume group and logical volumes: Code:
[root@storage-batch ~]# vgscan |
Oh, and no, XFS didn't let me analyze the disk without needing to write to it :( even though I found on the internet that it was supposed to be possible to mount an XFS partition read-only and use xfs_repair on it; in my case it was impossible.
Code:
If xfs_repair failed in phase 2 or later, follow these steps: |
Quote:
FWIW, I find this relationship of timestamps to be an almost unbelievable coincidence:

RAID creation time: Oct 17 02:43:27 UTC 2013

Exactly one year plus 10 minutes 30 seconds??? No, that's not the original RAID superblock. I don't know what was done to "recover" the RAID array, but it looks like the system clock was set to the wrong year, and it's entirely possible that the RAID setup is the real cause of the problem here. I strongly suspect that all the LVs are misaligned, and that is why the xfs superblocks aren't being found. You can check that by running "hexdump -C" on the other LVs and looking for the xfs magic number (ASCII characters "XFSB") at offset 8192 (0x2000). If it's there, my suggestion to change the pe_start value in the LVM header should fix everything. |
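A small helper for that check might look like this. The /dev/mapper glob is the assumed target; the runnable part uses a scratch file as a stand-in for an LV:

```shell
# Report the 4 bytes found 8192 (0x2000) into a volume; on the server you
# would loop this over /dev/mapper/data-* instead of a temp file.
check_magic() { dd if="$1" bs=1 skip=8192 count=4 2>/dev/null; }
lv=$(mktemp)                                   # stand-in for one LV
dd if=/dev/zero of="$lv" bs=1k count=16 2>/dev/null
printf 'XFSB' | dd of="$lv" bs=1 seek=8192 conv=notrunc 2>/dev/null
sig=$(check_magic "$lv")
echo "$lv: $sig"   # "XFSB" here would mean this LV is shifted like the others
```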
Thanks rknichols.
You are right about searching for the XFS magic key on the corrupted LVs: Code:
[root@storage-batch ~]# hexdump -C /dev/mapper/data-lab_templates | head -24 Code:
root@storage-batch ~]# hexdump -C /dev/mapper/data-lab_vmimages | head -24

Thanks, guys, for all your help. |
Keep in mind that you may still find some minor corruption from the partition tables that were intended (misguidedly) to go at the beginning of an LV but, due to the misalignment, actually went into an extent of the LV that immediately preceded that LV. I've mapped out the VG, and found 3 instances that might be of concern:
In each case, the overwrite is a single 512-byte sector. Partition tables written to any of the other LVs will not have hurt anything since those LVs don't have anything important 8192 bytes back from their proper starting location. |
It sounds like you are going through what I went through years ago when I took over things that someone else set up.
I learned my lesson: always have a separate, working data backup. A raid setup does not prevent data loss. What happened to me was a software raid using mdadm, set up across multiple drives in a storage array that did not support raid. Our area was prone to power fluctuations, and this specific hardware that I took over got corrupted because of that, and I went through the same thing you are going through. After I got it working, it corrupted again within a month. My guess is something similar is happening with you and your hardware. I got the same errors as you: the volume could be seen in Linux, but I would get that superblock error when mounting. My filesystem at the time was XFS, and I used the xfs_repair utility to recover things, but I got a lot of lost inodes, which resulted in folder names and file names being lost. The files and data were there, but I had to go through lots of folders and files and decipher what was what... my ron.txt file in my ron folder came back as 34534562345/674142355626... bunches of random numbers. Not fun. Some folder and file names did come back, though.

From what I remember, you can set things up with your raid and volumes a few different ways. You can format each disk as XFS, then volume-group them together, then create an XFS file system on top of everything, which is then mounted in Linux. Or, with each disk not formatted and not partitioned, you let the volume group manager partition each disk with whatever the format code is, then they get raided together with the mdadm software, then you create the XFS file system on top of all that, which is mounted. That's the best I can explain it from what I remember, so I would try to figure out exactly how things were set up previously... if that wasn't already said; I didn't read this thread in detail.
With the file system being XFS, what I think happens is that the primary superblock gets corrupted; it holds the information about all the other superblocks and allocation tables and is what is read when using the mount command. The good news, at least, is that you can use xfs_repair -n to see what it reports. Good luck. http://docs.cray.com/books/S-2377-22...bhdyq9i26.html http://linux.die.net/man/8/xfs_repair |
-- edit --
If you haven't mounted the XFS yet with the changed logical volume alignment, please don't. Stop what you're doing now. Sorry, I may be totally off with everything I wrote here, and indeed it's possible that mounting it that way might restore everything. But if by chance what I'm thinking actually happened, then it's probably also possible that the mount would totally trash the partition. So I'm encouraging a bit more investigation into this, and if at all possible, a full backup. The problem is that IF the raid stripes are misaligned, then it's possible that it shows the whole partition with pieces aligned the wrong way... and if XFS repair runs amok through it, I have no idea what that would result in. Basically, what the partition should look like: Code:
--------XFS-1234567890ABCDEFGHIJKLMNOPQR------------

and what it might look like if the stripes are assembled in the wrong order: Code:
--------XFS-12--56789034CDEFGHABKLMNOPIJ------QR----

-- end edit --

Quote:
Or am I reading too much into this? Quote:
My solution in a similar case was to arrange a new set of disks, dd everything straight disk-to-disk onto the new disks, and then use those new disks to restore the data - leaving the originals untouched, so that if I messed something up I could always just redo the disk-to-disk dd. The new 'fake partition table' would then have taken the spot that used to be the end of the old LVM metadata - which would be the only reason it didn't corrupt the XFS header. Of course, the reason it was put there in the first place would probably be that the mounting of the logical volume failed - so the same error that prevented you from getting to the data might also have saved it afterwards. I don't know if that makes you incredibly lucky or incredibly unlucky. Maybe both.

Here are some important questions: are you absolutely certain that the disks are put into the current raid in exactly the same order and configuration as they were previously (which disks are mirrored with which ones, and which order the mirrors are striped in)? Are you absolutely certain the old raid was made with the same metadata version as the current one (the new 1.2 metadata, not the old 0.9 metadata that was standard years ago)? Are you absolutely certain that the raid was made with the same chunk size (currently 512) as it was originally? If any single one of these assumptions is wrong, it can result in disaster. |
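The stripe-order worry can be shown with a toy model: 4-byte "chunks" standing in for 512k ones, and made-up letter data:

```shell
# Striping AAAABBBBCCCCDDDD across two "disks" in 4-byte chunks:
d0="AAAACCCC"; d1="BBBBDDDD"
c() { printf %s "$1" | cut -c"$2"; }   # helper: extract a chunk by column range
# Reassembling the same two disks in the right vs. the wrong stripe order:
good="$(c "$d0" 1-4)$(c "$d1" 1-4)$(c "$d0" 5-8)$(c "$d1" 5-8)"
bad="$(c "$d1" 1-4)$(c "$d0" 1-4)$(c "$d1" 5-8)$(c "$d0" 5-8)"
echo "right order: $good"   # right order: AAAABBBBCCCCDDDD
echo "wrong order: $bad"    # wrong order: BBBBAAAADDDDCCCC
```

The wrong-order result still looks superficially like data, which is exactly why a repair tool running over it could do unpredictable damage.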
Quote:
Quote:
Quote:
Here is a suggestion. Go ahead and adjust the LVM pe_start value, but don't try to mount or check anything. Now use dd to copy an image of just one of the LVs to another device. Start with one of the smaller ones, like prod_portables or lab_templates (just 100GB each). Then try to mount and check that new image. That will leave the original data safe, but tell you what would have happened had you tried that on the original. That's a lot quicker than imaging the whole 3.6TB VG. |
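That test-on-a-copy workflow might look like the sketch below. The device and path names in the comments are assumptions, and the runnable part only demonstrates the copy step on scratch files so nothing real is touched:

```shell
# On the server (hypothetical names), after adjusting pe_start:
#   dd if=/dev/mapper/data-lab_templates of=/backup/lab_templates.img bs=1M
#   mount -o ro,loop /backup/lab_templates.img /mnt/test   # experiments hit the copy
# Demonstrated here on temp files:
lv=$(mktemp); copy=$(mktemp)
head -c 32768 /dev/urandom > "$lv"        # stand-in for the 100GB LV
dd if="$lv" of="$copy" bs=1M 2>/dev/null  # image the LV
cmp -s "$lv" "$copy" && echo "safe copy ready"
```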
|
Hello guys,
From now on, my boss is in on this too :) (I sent him the URL of this thread; he read it and agrees with all the comments). We haven't done anything yet, since we're talking about a production server; on working days it needs to be up and running, except weekends. For the moment, we're thinking of recreating some of your suggestions on a dd image, as rknichols suggests; it's the safest way. It's a task that's going to take a few days, since we cannot overload the server too much during working hours (and we know how dd eats CPU). Probably a few of you already want to know how all this is going to end (me too); I'll keep you updated on all the steps we're going to go through over the following days.

The XFS wiki page indicates that it is possible to mount an XFS partition read-only and use the XFS tools to try to repair it, but for some reason I am unable to mount it read-only. So maybe the wiki was talking about a very old version of XFS, or this XFS was compiled with some attribute or parameter that disallows read-only mounts. This sucks, since other filesystems do allow mounting read-only; I imagine that if the root partition were XFS and something nasty happened, you couldn't boot the system into single-user mode? It doesn't matter now what the real cause is; I guess we need to work with what we have.

Nogitsune: Unfortunately, I can't tell whether the current chunk size of the disks in the raid is the same as the original. But I can tell you that about 2 weeks before this raid got "mysteriously" damaged, one of the 4 disks failed. My co-worker removed the failed disk and let the raid work with 3 disks, and a few days later inserted a new disk into the raid. The raid was working fine with the 4 disks, and then suddenly one morning there was no raid, no LVM, nothing. We still don't know what happened or what caused this.
|
I can't really say how your raid implementation handles the spare disk. If it were the kind of software raid I use myself, I'd have to apply the mdadm command to add the spare to the array, and then the reconstruction would do its magic and all would be fine (I use raid on disk partitions, so I'd first need to partition the new disk and then add the partition to the raid instead of the whole disk). However, in your case it's possible the whole thing worked automatically, or maybe the co-worker in question ran the commands needed to add it. I suppose it's also possible that the disk was never actually added to the array... and then 10 days later another disk from the same mirror fell off the raid and caused the whole thing to fail. I couldn't really say.
Either way, it seems the raid is up and running right now, and hopefully there is no severe corruption of the data. What I'm mostly worried about is the possibility of the following scenario:

1) For one reason or another, the raid failed and wouldn't reassemble properly. To restore it, the decision was to do something akin to 'mdadm --create' - basically to recreate the whole raid from the original disks, writing a new header etc. If this is done exactly right, the raid will be restored and you'll be able to access your original data with no further losses. This is basically what I did for my own failed raid6, and at the end of the day I got everything important restored. But...

2) Something went wrong. What I feel is most likely is that, as rknichols pointed out, the raid is for some reason misaligned by 8k. And I'm worried about what this does to the original data. Striped raid on two disks works by writing 'chunk-size' amounts of data (512k in this case), alternating between disk 0 and disk 1. So for each 1M of data, the first half is written to disk 0 and the second half to disk 1. Now, ignoring the raid header, assume the original raid was aligned so that the data on each disk starts at the 8k mark. Chunks would be written at locations (0+8k) = 8k, (512k+8k) = 520k, (1024k+8k) = 1032k and so on. To read a long strip of data, you'd assemble it from disk0: 8k-520k + disk1: 8k-520k + disk0: 520k-1032k + disk1: 520k-1032k and so forth. Now, if the raid was recreated without that 8k gap at the start of each disk (as it seems), the original data is still placed in those same chunks (8k-520k, 520k-1032k and so on), but the new raid will think the data is instead in chunks 0-512k, 512k-1024k and so forth.

3) The lvm was then recreated on top of this misaligned raid. This is where we would be now. For the most part the data would seem to be 8k misaligned, and shifting the logical volumes by 8k would cause the majority of it to line up again.
However, because of the error in the raid beneath the LVM, one 8k piece within each 512k of data would still be shuffled. You might be able to mount the filesystem now, since the XFS header would be in the correct place (originally, I believe this 8k misalignment is the reason you could not mount the filesystem, not even read-only). If you mounted this 8k-shifted filesystem and ran a file check on it, I believe one of two things would happen: either the repair would determine that it can't make sense of the data and flat-out fail, or it would attempt to repair the structure and probably corrupt the whole partition.

How could this have happened? I don't know for certain. What rknichols suggested was that something changed in the raid settings that caused the header to be 8k shorter than it used to be. I don't know enough about raid headers to say anything about that - I don't know if their size changes, or what would cause it. Maybe a different metadata version? A change between 1.0 and 1.2, or something? I don't know. What I was suspecting myself was that originally the disks were partitioned, with a single partition starting at the 8k point on the disk, and the raid was created on those partitions (e.g. /dev/sdc1, /dev/sdd1 and so forth). Currently the raid is made from the whole disks (/dev/sdc, /dev/sdd and so forth), whereas the other raid array seems to be made from partitions (/dev/sda1, /dev/sdb1), so this kind of error would seem feasible - IF the raid was recreated from scratch, which, as rknichols pointed out, seems likely based on the timestamps (assuming the year on the server was originally set to 2013 instead of 2014 by mistake). At this point I can't say anything for certain - but this is exactly why I'm suggesting you hold off on doing ANY kind of writes to those disks until you know exactly what's going on.
Using dd to only read from the disks, and then doing whatever tests on those images, seems like the safest approach at this point.

-- edit -- One more thing: assuming a disk failed and was replaced properly, and then a few days later the whole raid failed, there's a possibility you're dealing with something other than regular disk failures. A faulty controller, bad memory chips, a failing power supply, irregular/spiking electricity, or some other issue like that - something that keeps popping the disks out. Again, difficult to pinpoint and verify, but something to keep an eye out for at least. |
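The misalignment scenario described above can be reproduced with plain files standing in for the disks; all sizes and paths below are illustrative, and nothing touches a real device:

```shell
#!/bin/sh
# Toy model of the scenario: two files stand in for the two disks of one
# stripe, chunk size 512k, with the original data laid down 8k into each
# disk. Reassembling from offset 0 (as a recreated array would) drags 8k
# of foreign bytes into every chunk.
set -e
cd "$(mktemp -d)"

# 512k of 'A' (chunk 0) and 512k of 'B' (chunk 1).
head -c 524288 /dev/zero | tr '\0' 'A' > chunk0
head -c 524288 /dev/zero | tr '\0' 'B' > chunk1

# Write each chunk to its "disk" starting at the old 8k alignment.
dd if=/dev/zero of=disk0 bs=1k count=520 2>/dev/null
dd if=/dev/zero of=disk1 bs=1k count=520 2>/dev/null
dd if=chunk0 of=disk0 bs=1k seek=8 conv=notrunc 2>/dev/null
dd if=chunk1 of=disk1 bs=1k seek=8 conv=notrunc 2>/dev/null

# Correct reassembly skips the 8k gap: the stripe starts with real data.
dd if=disk0 of=good bs=1k skip=8 2>/dev/null
head -c 1 good; echo               # prints: A

# A recreated array reads from offset 0: 8k of zeros lands where the
# filesystem superblock used to be, so nothing can find it.
dd if=disk0 of=bad bs=1k count=512 2>/dev/null
head -c 1 bad | od -An -tx1        # shows a zero byte (00)
```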
I used xfs_check and xfs_repair with -n a few days ago - not with the LV mounted, of course - but the results were negative :( They can't find a valid superblock anywhere, which is consistent with what you said: if the filesystem inside the LV is misaligned, any attempt to scan it is useless, since it's searching in a place that probably doesn't belong to XFS. I did try before to mount a dd image of one of the corrupted LVs read-only, to at least get my data back... I wasn't able to. I don't remember the exact error, but I wasn't able to.
I thought an inconsistent state of a disk meant metadata that was in the buffer but not yet fully written to disk, or in the process of being written when a power failure or crash happened. If that's true (and correct me if I'm wrong), the filesystem can still recover from the journal. But in this particular case (the corrupted LVs), since everything is misaligned and scattered everywhere, is there a chance the XFS tools can recover it with the metadata in this state? I have a Slackware server with one free partition, and I'm going to intentionally create an XFS partition on it and change the start and end offsets using sfdisk (trying to create a scenario similar to the corrupted LVs). Afterwards I'll try to mount it read-only and see if XFS lets me. :) Will share the results after... I still haven't done anything on the production server, but I can run some tests on other servers before a final move is made on the production one. |
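A similar shifted-superblock experiment can be run without sfdisk or a spare partition, by building an XFS image in a plain file and loop-mounting it with a deliberately wrong offset. Paths below are illustrative; mkfs.xfs and root privileges for mount are assumed available:

```shell
#!/bin/sh
# Make a small XFS filesystem inside a plain file.
truncate -s 512M /var/tmp/xfstest.img
mkfs.xfs -q /var/tmp/xfstest.img
mkdir -p /mnt/xfstest

# Correct offset: mounts fine, including read-only.
mount -t xfs -o ro,loop /var/tmp/xfstest.img /mnt/xfstest
umount /mnt/xfstest

# Shifted by 8k: the superblock is no longer at offset 0 of what the
# kernel sees, so XFS refuses even a read-only mount -- the same
# "wrong fs type, bad superblock" failure as on the damaged LVs.
mount -t xfs -o ro,loop,offset=8192 /var/tmp/xfstest.img /mnt/xfstest \
    || echo "mount failed, as expected"
```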
What rknichols suggested - shifting the pe alignment and then dd'ing a small 100G partition into a file, and mounting that file with a loop device - sounds like a good test to try first. If the raid stripes are not misaligned, then all the data might magically just appear, and you could copy it out of that loopback mount without problems. If the raid stripes are wrong, you might still be able to mount the loop device (because it will find the XFS header), but the data will be either partially or entirely corrupted, and xfs_repair will probably be like taking a blender to an unhatched chick - you can't put it back together afterwards. |
It's not as good a test as copying a 100GB LV for testing, but as you said, that's hitting the server pretty hard. (It's not really the I/O that's the problem, but AFAIK there is no way to limit the amount of buffer cache that gets used for that operation, and that forces a lot of useful stuff for other processes out of the cache.) |
Code:
dd bs=512k iflag=direct oflag=direct if=/dev/path/to/LV of=/path/to/LV.img

If it's a server you don't want to interrupt to install more disks and such, and if you have a fast enough internal network (1G at least), then you could pull the image over to another Linux server through the network using nc (netcat), sometime when the network is free - on a weekend, maybe. |
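A rough sketch of that netcat transfer; the hostname "backup-host" and port 12345 are made up, and /dev/path/to/LV stands for the real LV path as above:

```shell
# Receiving Linux box (hostname and port are illustrative; some netcat
# builds take "nc -l 12345" without the -p flag):
nc -l -p 12345 > LV.img

# Sending side: stream the LV straight into the network socket.
# Direct I/O keeps the server's page cache from being flushed out.
dd bs=512k iflag=direct if=/dev/path/to/LV | nc backup-host 12345

# Afterwards, run sha256sum on the LV (sender) and on LV.img (receiver)
# and compare the two digests to verify the transfer.
```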
Will keep you updated in a few more hours. |
Code:
cp /etc/lvm/backup/data /etc/lvm/backup/data1
I edited the data1 file:
Code:
pe_start = 10240
Then I ran:
Code:
[root@storage-batch backup]# vgcfgrestore -v -f data1 data
Code:
[root@storage-batch backup]# vgcfgrestore -v -f data data
When I instead run it without the VG name, I get this message on stdout:
Code:
[root@storage-batch backup]# vgcfgrestore -v -f data1
Please specify a *single* volume group to restore.
Code:
[root@storage-batch backup]# vgcfgrestore -l data |
Code:
pe_start = 2064
Sorry about that. |
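For reference, the corrected sequence put together as one sketch. The VG name "data" comes from the thread; the sed line assumes the file's current pe_start is 2048 sectors (the common 1 MiB default), so check the file before editing:

```shell
# Work on a copy of the LVM metadata backup, never the live file.
cp /etc/lvm/backup/data /etc/lvm/backup/data1

# Bump pe_start by 16 sectors (8 KiB in 512-byte sectors): 2048 -> 2064.
# Assumes the current value really is 2048; inspect the file first.
sed -i 's/pe_start = 2048/pe_start = 2064/' /etc/lvm/backup/data1

# vgcfgrestore takes both the file (-f) and the VG name as arguments;
# omitting "data" produces the "specify a *single* volume group" error.
vgcfgrestore -v -f /etc/lvm/backup/data1 data

# To back out later, restore from the untouched original backup:
vgcfgrestore -v -f /etc/lvm/backup/data data
```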
Hello guys,
Last night the "raid-check" daemon ran, and this morning one more LV inside the data VG was corrupted :( I tried to mount it, since it was working nicely yesterday, and I couldn't... I get an unknown-fs message. I tried to run testdisk on this newly corrupted LV, to no avail; it doesn't find any type of filesystem inside the LV. I spoke with my boss, and he decided that it's better to delete everything and create the raid and the VG again. So, I want to thank you all for your help, but this raid+lvm is definitely extremely corrupted, so it's better to back up what still works and create it all again; it's for the best. :) You were all very helpful, but sometimes it's better to fix things by deleting everything than to try to keep running on something that's damaged. :) |
Oh well, at least we tried. Better luck next time (which, hopefully, you will never experience).
|
Sorry to hear it turned out that way. I wish you the best, too.
|