Old 10-20-2014, 02:42 PM   #16
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: CentOS
Posts: 4,446

Rep: Reputation: 2035

Quote:
Originally Posted by littleball View Post
I asked my boss about it; he said he made it since the Linux "mount" command requires a partition table definition to work??? However, he said he didn't format the LV, he just wrote the partition table, but he insists the old data is still there....
That would have overwritten the XFS superblock if the LVM configuration were right.
Quote:
I did try what you suggested:

[root@storage-batch prueba]# losetup -r -o 8192 -f --show /dev/mapper/data-lab_templates
/dev/loop0
[root@storage-batch dev]# xfs_check /dev/loop0
xfs_check: error - read only 0 of 512 bytes
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed. Mount the filesystem to replay the log, and unmount it before
re-running xfs_check. If you are unable to mount the filesystem, then use
the xfs_repair -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.

So, I ran losetup -d /dev/loop0 and tried again (this time I tried to mount it):
[root@storage-batch prueba]# losetup -r -o 8192 -f --show /dev/mapper/data-lab_templates
/dev/loop0
[root@storage-batch dev]# mount /dev/loop0 -o loop /prueba
mount: /dev/loop0 is write-protected, mounting read-only
mount: cannot mount /dev/loop0 read-only


And in dmesg:

[850196.127770] loop0: rw=32, want=209715200, limit=209715184
[850196.127846] XFS (loop0): Mounting Filesystem
[850196.135166] XFS (loop0): recovery required on read-only device.
[850196.135229] XFS (loop0): write access unavailable, cannot proceed.
[850196.135284] XFS (loop0): log mount/recovery failed: error 30
[850196.135492] XFS (loop0): log mount failed
[850211.171322] attempt to access beyond end of device
You really didn't want the "-o loop" on that mount command. I believe it's harmless, though, since it would just try to map a new loop device to the existing one. I just hate tools that won't let you examine or verify a filesystem without writing to it. You might see if "xfs_info /dev/loop0" will give anything useful.

The thing that bothers me about those messages is that "attempt to access beyond end of device." That suggests that the whole LV structure is misaligned and the filesystem extends beyond the current end of the LV.

I'm going to make a guess at what might have happened. If the old LVM settings were restored to the raw disk and not to the RAID member, everything would be offset by the amount that RAID had used as its data_offset to reserve space for the RAID superblock and bitmap, and 8K is a possibility for that. It would be useful to see the contents of file /etc/lvm/backup/data and a hex dump of the first 1024 bytes from one of the disks.
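If mdadm still recognizes the members, the space it reserved for its superblock/bitmap and that hex dump are both easy to gather without writing anything (a sketch; pick any member and adjust the device name):
Code:
# reserved space before the data area on a 1.2-superblock member
mdadm --examine /dev/sdc | grep -i 'data offset'

# first 1024 bytes of the raw disk, to compare against the LVM label/metadata
dd if=/dev/sdc bs=512 count=2 2>/dev/null | hexdump -C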

Between RAID 10 and xfs, I'm getting pretty far outside my experience level here, so I'm not sure how much more I can help.
 
1 member found this post helpful.
Old 10-20-2014, 03:19 PM   #17
littleball
Member
 
Registered: Jan 2011
Distribution: Slackware, Red Hat Enterprise
Posts: 47

Original Poster
Rep: Reputation: 8
rknichols

You've helped me understand a lot (I am an extreme noob in RAID stuff). When all this damage happened, I suggested to my boss that we delete the whole RAID+LVM and create everything again from zero. He said no, and whatever he did, he got the RAID 10 + LVM references back (since we were unable to mount those LVs), but like you said, doing it this way has misaligned everything, and sooner or later I bet the RAID is going to fail again, because they're forcing something without doing it cleanly.

I'm pasting the contents of /etc/lvm/backup/data (kind of long):

Code:
[root@storage-batch ~]# cat /etc/lvm/backup/data 
# Generated by LVM2 version 2.02.98(2) (2012-10-15): Fri Oct 17 02:53:57 2014

contents = "Text Format Volume Group"
version = 1

description = "Created *after* executing '/usr/sbin/lvremove -f data/prod_corporativos-snap'"

creation_host = "storage-batch" # Linux storage-batch 3.12.5-200.fc19.x86_64 #1 SMP Tue Dec 17 22:21:14 UTC 2013 x86_64
creation_time = 1413525237      # Fri Oct 17 02:53:57 2014

data {
        id = "GpVhPS-1oyi-YjDe-Slmf-my67-1eB8-mjaDtT"
        seqno = 1344
        format = "lvm2" # informational
        status = ["RESIZEABLE", "READ", "WRITE"]
        flags = []
        extent_size = 8192              # 4 Megabytes
        max_lv = 0
        max_pv = 0
        metadata_copies = 0

        physical_volumes {

                pv0 {
                        id = "6jurGs-cu74-U1lw-abrw-vay7-qSyu-wasbhD"
                        device = "/dev/md127"   # Hint only

                        status = ["ALLOCATABLE"]
                        flags = []
                        dev_size = 7813531632   # 3.63846 Terabytes
                        pe_start = 2048
                        pe_count = 953799       # 3.63845 Terabytes
                }
        }

        logical_volumes {

                prod_corporativos {
                        id = "F5HF9G-A37q-NoSr-AHZM-ylq3-h0hc-rl12sv"
                        status = ["READ", "WRITE", "VISIBLE"]
                        flags = []
                        creation_host = "storage"
                        creation_time = 1381979562      # 2013-10-17 00:12:42 -0300
                        segment_count = 1

                        segment1 {
                                start_extent = 0
                                extent_count = 128000   # 500 Gigabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv0", 128000
                                ]
                        }
                }

                lab_vmimages {
                        id = "r3a0ia-sCd7-S8DI-dNa5-t9J6-CLuc-DPuyPJ"
                        status = ["READ", "WRITE", "VISIBLE"]
                        flags = []
                        creation_host = "storage"
                        creation_time = 1384277848      # 2013-11-12 14:37:28 -0300
                        segment_count = 3

                        segment1 {
                                start_extent = 0
                                extent_count = 25600    # 100 Gigabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv0", 533504
                                ]
                        }
                        segment2 {
                                start_extent = 25600
                                extent_count = 102400   # 400 Gigabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv0", 584704
                                ]
                        }
                        segment3 {
                                start_extent = 128000
                                extent_count = 134144   # 524 Gigabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv0", 789504
                                ]
                        }
                }

                prod_portables {
                        id = "zfSkyG-cWXu-SMCh-fkNS-dKcj-XspN-NPkmGs"
                        status = ["READ", "WRITE", "VISIBLE"]
                        flags = []
                        creation_host = "storage-batch"
                        creation_time = 1402930936      # 2014-06-16 11:02:16 -0400
                        segment_count = 1

                        segment1 {
                                start_extent = 0
                                extent_count = 25600    # 100 Gigabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv0", 0
                                ]
                        }
                }

                lab_templates {
                        id = "8W1X92-azb4-BDcF-80wj-Rzwp-B7Lq-HXS05Z"
                        status = ["READ", "WRITE", "VISIBLE"]
                        flags = []
                        creation_host = "storage-batch"
                        creation_time = 1402947863      # 2014-06-16 15:44:23 -0400
                        segment_count = 1

                        segment1 {
                                start_extent = 0
                                extent_count = 25600    # 100 Gigabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv0", 923648
                                ]
                        }
                }

                prod_vmimages-batch {
                        id = "5t2LYA-pVox-Zmru-QsqF-C1EI-F0Eo-jLOyr4"
                        status = ["READ", "WRITE", "VISIBLE"]
                        flags = []
                        creation_host = "storage-batch"
                        creation_time = 1404154247      # 2014-06-30 14:50:47 -0400
                        segment_count = 2

                        segment1 {
                                start_extent = 0
                                extent_count = 102400   # 400 Gigabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv0", 25600
                                ]
                        }
                        segment2 {
                                start_extent = 102400
                                extent_count = 25600    # 100 Gigabytes

                                type = "striped"
                                stripe_count = 1        # linear

                                stripes = [
                                        "pv0", 687104
                                ]
                        }
                }
        }
}
There are 1 "extra" LV inside the VG, which were inactive before all the damage happened. We only use 4 LV from the whole VG.

"lsblk" output for the raid containing Data VG is:

Code:
sdc                             8:32   0  1,8T  0 disk   
└─md127                         9:127  0  3,7T  0 raid10 
  ├─data-prod_corporativos    253:2    0  500G  0 lvm    
  ├─data-lab_vmimages         253:3    0    1T  0 lvm    
  ├─data-prod_portables       253:4    0  100G  0 lvm    
  ├─data-lab_templates        253:5    0  100G  0 lvm    
  └─data-prod_vmimages--batch 253:6    0  500G  0 lvm    /datos/glusterfs/prod_vmimages-batch/brick1
sdd                             8:48   0  1,8T  0 disk   
└─md127                         9:127  0  3,7T  0 raid10 
  ├─data-prod_corporativos    253:2    0  500G  0 lvm    
  ├─data-lab_vmimages         253:3    0    1T  0 lvm    
  ├─data-prod_portables       253:4    0  100G  0 lvm    
  ├─data-lab_templates        253:5    0  100G  0 lvm    
  └─data-prod_vmimages--batch 253:6    0  500G  0 lvm    /datos/glusterfs/prod_vmimages-batch/brick1
sde                             8:64   0  1,8T  0 disk   
└─md127                         9:127  0  3,7T  0 raid10 
  ├─data-prod_corporativos    253:2    0  500G  0 lvm    
  ├─data-lab_vmimages         253:3    0    1T  0 lvm    
  ├─data-prod_portables       253:4    0  100G  0 lvm    
  ├─data-lab_templates        253:5    0  100G  0 lvm    
  └─data-prod_vmimages--batch 253:6    0  500G  0 lvm    /datos/glusterfs/prod_vmimages-batch/brick1
sdf                             8:80   0  1,8T  0 disk   
└─md127                         9:127  0  3,7T  0 raid10 
  ├─data-prod_corporativos    253:2    0  500G  0 lvm    
  ├─data-lab_vmimages         253:3    0    1T  0 lvm    
  ├─data-prod_portables       253:4    0  100G  0 lvm    
  ├─data-lab_templates        253:5    0  100G  0 lvm    
  └─data-prod_vmimages--batch 253:6    0  500G  0 lvm    /datos/glusterfs/prod_vmimages-batch/brick1
sdg                             8:96   0  1,8T  0 disk   
├─sdg1                          8:97   0  200G  0 part   
└─sdg2                          8:98   0  1,6T  0 part   /datos/backup
And this is a hexdump of the start of one of the RAID disks:

Code:
root@storage-batch ~]# hexdump -C /dev/sde | head -24 
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001000  fc 4e 2b a9 01 00 00 00  00 00 00 00 00 00 00 00  |.N+.............|
00001010  e6 20 ac ff 62 8d a2 1e  96 08 80 b7 3b 7b c4 87  |. ..b.......;{..|
00001020  73 74 6f 72 61 67 65 3a  31 00 00 00 00 00 00 00  |storage:1.......|
00001030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00001040  cf 4e 5f 52 00 00 00 00  0a 00 00 00 02 01 00 00  |.N_R............|
00001050  00 84 dc e8 00 00 00 00  00 04 00 00 04 00 00 00  |................|
00001060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001080  00 00 04 00 00 00 00 00  b0 88 dc e8 00 00 00 00  |................|
00001090  08 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000010a0  02 00 00 00 00 00 00 00  25 a9 56 18 13 07 c9 d5  |........%.V.....|
000010b0  15 4c 04 19 b4 2c 2c b6  00 00 00 00 00 00 00 00  |.L...,,.........|
000010c0  77 6d 45 54 00 00 00 00  c0 28 00 00 00 00 00 00  |wmET.....(......|
000010d0  00 00 00 00 00 00 00 00  37 7d 0b e9 80 00 00 00  |........7}......|
000010e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001100  00 00 01 00 02 00 03 00  fe ff fe ff fe ff fe ff  |................|
00001110  fe ff fe ff fe ff fe ff  fe ff fe ff fe ff fe ff  |................|
*
00001200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
08000000  2c 20 39 34 39 32 34 38  0a 5d 0a 7d 0a 7d 0a 0a  |, 949248.].}.}..|
And this is a hexdump from a little farther in:

Code:
08000a40  3d 20 5b 0a 22 70 76 30  22 2c 20 30 0a 5d 0a 7d  |= [."pv0", 0.].}|
08000a50  0a 7d 0a 0a 6c 61 62 5f  74 65 6d 70 6c 61 74 65  |.}..lab_template|
08000a60  73 20 7b 0a 69 64 20 3d  20 22 38 57 31 58 39 32  |s {.id = "8W1X92|
08000a70  2d 61 7a 62 34 2d 42 44  63 46 2d 38 30 77 6a 2d  |-azb4-BDcF-80wj-|
08000a80  52 7a 77 70 2d 42 37 4c  71 2d 48 58 53 30 35 5a  |Rzwp-B7Lq-HXS05Z|
08000a90  22 0a 73 74 61 74 75 73  20 3d 20 5b 22 52 45 41  |".status = ["REA|
08000aa0  44 22 2c 20 22 57 52 49  54 45 22 2c 20 22 56 49  |D", "WRITE", "VI|
08000ab0  53 49 42 4c 45 22 5d 0a  66 6c 61 67 73 20 3d 20  |SIBLE"].flags = |
08000ac0  5b 5d 0a 63 72 65 61 74  69 6f 6e 5f 68 6f 73 74  |[].creation_host|
08000ad0  20 3d 20 22 73 74 6f 72  61 67 65 2d 62 61 74 63  | = "storage-batc|
08000ae0  68 22 0a 63 72 65 61 74  69 6f 6e 5f 74 69 6d 65  |h".creation_time|
08000af0  20 3d 20 31 34 30 32 39  34 37 38 36 33 0a 73 65  | = 1402947863.se|
08000b00  67 6d 65 6e 74 5f 63 6f  75 6e 74 20 3d 20 31 0a  |gment_count = 1.|
Since everything is misaligned here, could I by any chance lose data or damage part or all of the LVM on the RAID if, for example, I write data recovered with testdisk back to one LV? Or will testdisk affect only that LV and leave the rest of the LVs alone?

I know this is a whole mess... but there are a few virtual machines inside that corrupted LV that the company doesn't want to lose (in case something can still be done). I am an extreme noob with hardware errors, so I really appreciate any explanation, since it helps me explain to my boss why he made a mess from the beginning.

Last edited by littleball; 10-20-2014 at 04:11 PM.
 
Old 10-20-2014, 03:55 PM   #18
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: CentOS
Posts: 4,446

Rep: Reputation: 2035
Would you please edit your last post and change the [QUOTE]...[/QUOTE] tags to [CODE]...[/CODE] tags so that the formatting is preserved?
 
Old 10-20-2014, 04:12 PM   #19
littleball
Member
 
Registered: Jan 2011
Distribution: Slackware, Red Hat Enterprise
Posts: 47

Original Poster
Rep: Reputation: 8
Done
 
Old 10-20-2014, 04:28 PM   #20
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: CentOS
Posts: 4,446

Rep: Reputation: 2035
OK, the title of this thread mentions "full image disk." Do you indeed have backup images of these drives? I'm really hoping the answer is, "Yes," because I have to play with fire a bit here to proceed.

Make two copies of that /etc/lvm/backup/data file. I'll call them data.bak1 and data.bak2. A simple edit can effectively shift all the LVs by 8192 bytes (pe_start is in 512-byte sectors, so that's a shift of 16 sectors). If that is the only problem, everything will come back like magic. In data.bak2, change the line
Code:
                        pe_start = 2048
to
Code:
                        pe_start = 2064
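If it helps, making the copies and the one-line edit can be scripted - just a sketch, using the value from above:
Code:
cp /etc/lvm/backup/data data.bak1
cp /etc/lvm/backup/data data.bak2
# adjust pe_start only in the copy that will be restored
sed -i 's/pe_start = 2048/pe_start = 2064/' data.bak2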
Then, say a prayer and run
Code:
vgcfgrestore -v -f data.bak2 data
and see what happens. You should then be able to run
Code:
blkid /dev/mapper/data-*
and see your filesystems. Now, it would be nice if xfs allowed you to check them without writing to them, but it looks like the only choice is to try to mount one read/write (that's why I hope you have a backup).
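Depending on the kernel, xfs may also accept a read-only "norecovery" mount that skips the log replay entirely - worth a try before anything that writes, though I haven't verified it on this setup:
Code:
mount -o ro,norecovery /dev/mapper/data-lab_templates /prueba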

The LVM change can be easily undone by
Code:
vgcfgrestore -v -f data.bak1 data
Whatever the xfs mount attempt does to the contents might not be so easily undone.
 
Old 10-20-2014, 06:14 PM   #21
Nogitsune
Member
 
Registered: Oct 2014
Posts: 33

Rep: Reputation: Disabled
I've always used mdadm to handle the RAID, and when I use LVM, I've used it on top of the RAID device from mdadm. If I understood things correctly, this is actually a raid10 that was made straight from LVM. That's something I've never dabbled in, so I don't really know what to expect - in particular, I have no idea how LVM handles the headers and metadata for its own RAID volumes. We're going way beyond my area of knowledge here, so I'll basically have to bow out.

But if the XFSB header is misaligned inside the LV.. then I can think of a few possibilities:

1) The XFS filesystem was always there; it was made inside that partition that was wrapped inside the logical volume. In this case the loop device offset by 8192 might have seen the real partition. But my mind refuses to wrap around how exactly you could mount such a partition.

2) The XFS filesystem was always there, but the LVM itself is now misaligned for some reason, maybe because the physical volume itself is misaligned. This would make sense if what rknichols suggests is true, but since RAID 10 is a stripe of mirrors, looking at the content of a single disk wouldn't give you anything very meaningful. The LVM in particular (and as such the XFS partition) wouldn't hold anything useful when looked at within the context of any single disk.

The lsblk shows a listing that suggests the current setup does recognize the disks as being in raid10 though, so I still don't really get it. The metadata shows /dev/md127 as the physical volume, which suggests that it was at least originally placed on the raid device.

The hexdump you gave from the disk shows part of the LVM metadata at position 08000a40. The sequence number of that metadata block should be just a little earlier in the dump. I'm curious as to what sequence number that particular block has, whether it's the only metadata block stored on the device, and if not, what is around it on the device. Also, whether the other disks (sdc, sdd, sdf) have pieces of the LVM metadata on them. Piecing those together should give some idea of how and where the metadata was restored - if it was restored to the RAID device, then it should be striped across the hard drives and mirrored between two of them (two hard drives showing identical pieces of metadata, and the pairs of disks having alternating blocks of it).

If, on the other hand, it was restored to a physical hard drive, then it should be present on only one drive, and there should be no mirror of it (the other drives might then hold pieces of the original 'pre-corruption' metadata; then again, I'm not sure what the RAID resync would have done in that case).

Also, if you do the restore with the altered offset for the metadata, perhaps you could do something with xfs_db in a non-destructive way to check on the status of the filesystem. It's a command I've never used myself, though, so I can't give much in the way of advice there.
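From its man page it at least has a read-only mode, so something like this (untested, just a sketch, assuming the read-only loop device from earlier is set up again) shouldn't touch the device:
Code:
xfs_db -r -c "sb 0" -c "print" /dev/loop0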

Last edited by Nogitsune; 10-20-2014 at 06:18 PM.
 
Old 10-20-2014, 07:33 PM   #22
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: CentOS
Posts: 4,446

Rep: Reputation: 2035
The hexdump from the start of the disk has a RAID superblock magic number at offset 0x1000 followed by a "version-1" identifier, so at that offset it's metadata version 1.2. That indicates that the disk was first set up as a RAID member, and then the RAID device was used as an LVM PV. That's how the OP described it, and it's consistent with the device hint in the LVM metadata backup file. The creation time in that RAID metadata converts as "Thu Oct 17 02:43:27 UTC 2013," almost exactly 1 year ago. Either that is truly the year-old data, or perhaps this is all an artifact of the initial recovery effort with the year set wrong in the system clock (At this point, I'm not taking anything on faith.), and not how the system was previously set up at all. I would really need to see some of the older files in /etc/lvm/backup/ to get an idea of what might have been there before.
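For reference, that creation time can be read straight from the dump: the bytes "cf 4e 5f 52" at offset 0x1040 are the superblock ctime, a little-endian Unix timestamp (conversion sketch, assuming GNU date):
Code:
# decode the ctime field (bytes cf 4e 5f 52 at 0x1040, little-endian)
date -u -d @$((0x525F4ECF))     # prints: Thu Oct 17 02:43:27 UTC 2013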

I too have no experience with xfs, so I would just be stumbling around in the dark with xfs_db. On my own, I might get somewhere, but I certainly can't give advice to anyone else.

Last edited by rknichols; 10-20-2014 at 07:35 PM. Reason: Typos
 
Old 10-21-2014, 06:42 AM   #23
Nogitsune
Member
 
Registered: Oct 2014
Posts: 33

Rep: Reputation: Disabled

Quote:
Originally Posted by rknichols View Post
The hexdump from the start of the disk has a RAID superblock magic number at offset 0x1000 followed by a "version-1" identifier, so at that offset it's metadata version 1.2. That indicates that the disk was first set up as a RAID member, and then the RAID device was used as an LVM PV. That's how the OP described it, and it's consistent with the device hint in the LVM metadata backup file. The creation time in that RAID metadata converts as "Thu Oct 17 02:43:27 UTC 2013," almost exactly 1 year ago. Either that is truly the year-old data, or perhaps this is all an artifact of the initial recovery effort with the year set wrong in the system clock (At this point, I'm not taking anything on faith.), and not how the system was previously set up at all. I would really need to see some of the older files in /etc/lvm/backup/ to get an idea of what might have been there before.

I too have no experience with xfs, so I would just be stumbling around in the dark with xfs_db. On my own, I might get somewhere, but I certainly can't give advice to anyone else.
Quote:
[root@storage-batch ~]# cat /etc/lvm/backup/data
# Generated by LVM2 version 2.02.98(2) (2012-10-15): Fri Oct 17 02:53:57 2014

contents = "Text Format Volume Group"
version = 1

description = "Created *after* executing '/usr/sbin/lvremove -f data/prod_corporativos-snap'"

creation_host = "storage-batch" # Linux storage-batch 3.12.5-200.fc19.x86_64 #1 SMP Tue Dec 17 22:21:14 UTC 2013 x86_64
creation_time = 1413525237 # Fri Oct 17 02:53:57 2014
creation_host matches the shell prompt, the 2013 date in the creation_host comment is just the kernel build timestamp, and the dates on the creation_time and LVM2 lines look consistent, so I'd say (tentatively) that the dates are probably right.
 
Old 10-21-2014, 06:59 AM   #24
littleball
Member
 
Registered: Jan 2011
Distribution: Slackware, Red Hat Enterprise
Posts: 47

Original Poster
Rep: Reputation: 8
Hello guys.

Thanks a lot for all your explanation and help here.

rknichols:
Right now I don't have a current backup of the whole RAID+LVM, but I can make one to an external disk before I make any kind of change to the LVM data file. It's kind of risky, since the change to the LVM data file is going to affect the whole volume group.

If I delete the partition that my boss created in one of the corrupted LVs, wouldn't this realign that particular LV in some way?

Nogitsune
I think (and maybe I'm wrong) that the LVM got misaligned when my boss created the partition inside one of the corrupted LVs, since the other 2 corrupted LVs don't show anything about "attempt to access beyond end of device" or anything similar when I try to mount them. I do get the same errors, though, about the bad superblock, unknown fs, etc., etc. The system does recognize the RAID 10 as valid, and does recognize the volume group and logical volumes:

Code:
[root@storage-batch ~]# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "os" using metadata type lvm2
  Found volume group "data" using metadata type lvm2
[root@storage-batch ~]# lvscan
  ACTIVE            '/dev/os/swap' [7,78 GiB] inherit
  ACTIVE            '/dev/os/root' [66,23 GiB] inherit
  ACTIVE            '/dev/data/prod_corporativos' [500,00 GiB] inherit
  ACTIVE            '/dev/data/lab_vmimages' [1,00 TiB] inherit
  ACTIVE            '/dev/data/prod_portables' [100,00 GiB] inherit
  ACTIVE            '/dev/data/lab_templates' [100,00 GiB] inherit
  ACTIVE            '/dev/data/prod_vmimages-batch' [500,00 GiB] inherit
[root@storage-batch ~]# cat /proc/mdstat 
Personalities : [raid1] [raid10] 
md127 : active raid10 sdf[3] sde[2] sdd[1] sdc[0]
      3906765824 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
      
md0 : active raid1 sdb1[0] sda1[2]
      78058368 blocks super 1.2 [2/2] [UU]

Last edited by littleball; 10-21-2014 at 07:20 AM.
 
Old 10-21-2014, 07:14 AM   #25
littleball
Member
 
Registered: Jan 2011
Distribution: Slackware, Red Hat Enterprise
Posts: 47

Original Poster
Rep: Reputation: 8
Ohhh.. and no, XFS didn't let me analyze the disk without needing to write to it, even though I found out on the internet that it is supposed to be possible to mount an xfs partition read-only and use xfs_repair on it.. in my case, it was impossible.

Code:
If xfs_repair failed in phase 2 or later, follow these steps:

    Mount the file system using mount -r (read-only).

    Make a file system backup with xfsdump.
Maybe on this server XFS has some flag, parameter, or attribute compiled into the kernel that doesn't allow this.
 
Old 10-21-2014, 10:22 AM   #26
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: CentOS
Posts: 4,446

Rep: Reputation: 2035
Quote:
Originally Posted by littleball View Post
Right now I don't have a current backup of the whole RAID+LVM, but I can make one to an external disk before I make any kind of change to the LVM data file. It's kind of risky, since the change to the LVM data file is going to affect the whole volume group.

If I delete the partition that my boss created in one of the corrupted LVs, wouldn't this realign that particular LV in some way?
The change to the LVM data won't affect anything but the LVM header, and it's completely reversible. No problem there. It's allowing write access to the xfs filesystem that has the potential for destruction. It is totally safe to make the LVM change and then run "blkid /dev/mapper/data-*" to see if the filesystem headers are now found. Creating that partition table in the LV would overwrite an xfs superblock (if it were in its proper location), but it cannot move one. The LV was already misaligned before that partition table was written, and deleting that partition isn't going to magically shift it back. (Deleting a partition doesn't delete the partition table -- it just zeros one table entry.)

FWIW, I find this relationship of timestamps to be an almost unbelievable coincidence:
RAID creation time: Oct 17 02:43:27 UTC 2013

LVM creation time: Oct 17 02:53:57 2014
Exactly one year plus 10 minutes 30 seconds??? No, that's not the original RAID superblock. I don't know what was done to "recover" the RAID array, but it looks like the system clock was set to the wrong year, and it's entirely possible that the RAID setup is the real cause of the problem here.

I strongly suspect that all the LVs are misaligned, and that is why the xfs superblocks aren't being found. You can check that by running "hexdump -C" on the other LVs and looking for the xfs magic number (ASCII characters "XFSB") at offset 8192 (0x2000). If it's there, my suggestion to change the pe_start value in the LVM header should fix everything.
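One quick way to do that for all of them at once (read-only; just a sketch):
Code:
for lv in /dev/mapper/data-*; do
    echo "== $lv"
    # read 4 KiB starting at byte 8192 and show the first 16 bytes
    dd if="$lv" bs=4096 skip=2 count=1 2>/dev/null | hexdump -C | head -n 1
done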
 
Old 10-21-2014, 10:58 AM   #27
littleball
Member
 
Registered: Jan 2011
Distribution: Slackware, Red Hat Enterprise
Posts: 47

Original Poster
Rep: Reputation: 8
Thanks rknichols.

You were right about searching for the XFS magic number on the corrupted LVs:

Code:
[root@storage-batch ~]# hexdump -C /dev/mapper/data-lab_templates | head -24 
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001b0  00 00 00 00 00 00 00 00  f7 fa 55 3d 00 00 00 20  |..........U=... |
000001c0  21 00 83 2a ac fe 00 08  00 00 00 f8 7f 0c 00 00  |!..*............|
000001d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|
00000200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00002000  58 46 53 42 00 00 10 00  00 00 00 00 01 90 00 00  |XFSB............|
00002010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00002020  13 d8 d7 ac 53 7b 4f 48  86 56 56 ba 11 15 ce 35  |....S{OH.VV....5|
Another corrupted LV:

Code:
root@storage-batch ~]# hexdump -C /dev/mapper/data-lab_vmimages | head -24 
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001b0  00 00 00 00 00 00 00 00  93 25 59 88 00 00 00 00  |.........%Y.....|
000001c0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|
00000200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00002000  58 46 53 42 00 00 10 00  00 00 00 00 10 00 00 00  |XFSB............|
00002010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00002020  0d a7 59 f1 f1 d6 4a af  8c ab 2e 5e 1f 79 95 1f  |..Y...J....^.y..|
They both start at offset 8192. Since this is a production server, let me send this URL to my boss and see what he decides we should do. I truly hope I can come back and post good news.

Thanks guys, for all your help.
 
Old 10-21-2014, 12:39 PM   #28
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: CentOS
Posts: 4,446

Rep: Reputation: 2035
Keep in mind that you may still find some minor corruption from the partition tables that were intended (misguidedly) to go at the beginning of an LV but, due to the misalignment, actually went into an extent of the LV that immediately preceded that LV. I've mapped out the VG, and found 3 instances that might be of concern:
  1. prod_portables at offset 107374123008 (0x18ffff1800) has the table written to prod_vmimages-batch,
  2. prod_vmimages-batch at offset 429496516608 (0x63fffcc000) has the table written to prod_corporativos,
  3. lab_vmimages at offset 562640439296 (0x82fffbc800) has the table written to lab_templates.
For prod_portables and lab_vmimages, those overwrites are near the end of the LV. For prod_vmimages-batch, it's about 100 GB back from the end of the LV.

In each case, the overwrite is a single 512-byte sector.

Partition tables written to any of the other LVs will not have hurt anything since those LVs don't have anything important 8192 bytes back from their proper starting location.
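For anyone following the mapping, the segment addresses come straight out of the backup file - pe_start, extent_size and the stripe start extents are all in 512-byte sectors. A sketch for lab_templates segment1:
Code:
# PV byte offset where the segment starts = (pe_start + start_PE * extent_size) * 512
echo $(( (2048 + 923648 * 8192) * 512 ))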
 
Old 10-21-2014, 01:58 PM   #29
ron7000
Member
 
Registered: Nov 2007
Location: CT
Posts: 248

Rep: Reputation: 26
It sounds like you are going through what I went through years ago when I took over things that someone else set up.
I learned my lesson: always have a separate, working data backup. A RAID setup does not prevent data loss.
What happened to me was that a software RAID using mdadm was set up across multiple drives in a storage array that did not support RAID.
Our area was prone to power fluctuations, this specific hardware that I took over got corrupted because of that, and I went through the same thing you are going through. After I got it working, it corrupted again within a month. My guess is something similar is happening with you and your hardware.
I got the same errors as you: the volume could be seen in Linux, but I would get that superblock error when mounting.
My filesystem at the time was XFS, and I used the xfs_repair utility to recover things, but I got a lot of lost inodes, which resulted in folder names and file names being lost. The files and data were there, but I had to go through lots of folders and files and decipher what was what... my ron.txt file in my ron folder came back as 34534562345/674142355626... a bunch of random numbers, not fun. Some folder and file names did come back, though.

From what I remember, you can set things up with your RAID and volume a few different ways. You can format each disk as xfs, then volume-group them together, then create an xfs file system on top of everything that is then mounted in Linux. Or, with each disk not formatted and not partitioned, you let the volume group manager partition each disk with whatever the format code is, then they get raided together with the software mdadm, and then you create the xfs file system on top of all that, which is mounted. That's the best I can explain it from what I remember, so I would try to figure out exactly how things were set up previously... if that wasn't already said; I didn't read this thread in detail. With the file system being xfs, what I think happens is that the primary superblock gets corrupted, which holds the information about all the other superblocks and allocation tables and is what is read when using the mount command. The good news at least is you can use xfs_repair -n to see what it reports without modifying anything. Good luck.
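The no-modify run is just this (a sketch - point it at whatever device the filesystem actually turns out to live on):
Code:
xfs_repair -n /dev/mapper/data-lab_templates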

http://docs.cray.com/books/S-2377-22...bhdyq9i26.html
http://linux.die.net/man/8/xfs_repair
 
Old 10-21-2014, 05:20 PM   #30
Nogitsune
Member
 
Registered: Oct 2014
Posts: 33

Rep: Reputation: Disabled
-- edit --
If you haven't mounted the XFS yet with the changed logical volume alignment, please don't. Stop what you're doing now.

Sorry, I may be totally off with everything I wrote here, and indeed it's possible that mounting it that way might restore everything. But if by chance what I'm thinking actually happened.. then it's probably also possible that the mount would totally trash the partition. So I'm encouraging a bit more investigation into this, and if at all possible, a full backup.

The problem is that IF the RAID stripes are misaligned, then it's possible that it shows the whole partition with pieces aligned the wrong way.. and if XFS repair runs amok through it, I really have no idea what it would result in.

Basically, the partition content that should look like:
Code:
--------XFS-1234567890ABCDEFGHIJKLMNOPQR------------
Could instead look like:
Code:
--------XFS-12--56789034CDEFGHABKLMNOPIJ------QR----
All the data would still be there, but shuffled around. If you then try to run repair on a piece of the disk that starts from 'XFS', no good will come of it.
-- end edit --

Quote:
Originally Posted by rknichols View Post
FWIW, I find this relationship of timestamps to be an almost unbelievable coincidence:
RAID creation time: Oct 17 02:43:27 UTC 2013

LVM creation time: Oct 17 02:53:57 2014
Exactly one year plus 10 minutes 30 seconds??? No, that's not the original RAID superblock. I don't know what was done to "recover" the RAID array, but it looks like the system clock was set to the wrong year, and it's entirely possible that the RAID setup is the real cause of the problem here.

I strongly suspect that all the LVs are misaligned, and that is why the xfs superblocks aren't being found. You can check that by running "hexdump -C" on the other LVs and looking for the xfs magic number (ASCII characters "XFSB") at offset 8192 (0x2000). If it's there, my suggestion to change the pe_start value in the LVM header should fix everything.
Aah. Totally missed the month there; I was just reading the 2013 from the LVM header, and that one pointed to December. Recovering the RAID would then likely have been done by just recreating the array. As long as you do it with exactly the same parameters as before, it works fine. On the other hand, if you mess anything up - wrong chunk size, wrong order of disks - and let it slip into running the resync, it's all gone.

What I still don't understand, then, is why the LVM alignment would be messed up. If there's extra padding between the LVM start and the start of XFS, and we assume this XFS header is the original one.. wouldn't that mean that the LVM blocks are starting from a spot earlier than they used to? If they were restored from an earlier backup, does that mean that the PV itself starts earlier than it used to? If the PV is the RAID device, then that would make the RAID itself misaligned, and with RAID 10 wouldn't that also mean that the stripes themselves are wrongly aligned? That would make all the data on the volume basically garbage. Luckily, though, the level 0 part can't run a resync, and the mirror portion should still find both disks identical.. so I guess a resync wouldn't hopelessly shuffle the data like it would on level 5 or 6. What exactly could cause this, though? Maybe if the RAID was originally made on partitions and was now made straight on the disks.. that would push the RAID itself back by the length of the partition table.

Or am I reading too much into this?

Quote:
md127 : active raid10 sdf[3] sde[2] sdd[1] sdc[0]
3906765824 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]

md0 : active raid1 sdb1[0] sda1[2]
78058368 blocks super 1.2 [2/2] [UU]
Maybe I'm not so far off the mark this time..? md0 is made from partitions, but md127 is in fact made straight from the disks. If that's the case, I don't think you'll recover the data just by bumping the LVM forward, because the underlying RAID itself is probably striped from the wrong segments. If (and that's still a big IF) that's really the case, you'd have to completely recreate the RAID, this time on actual partitions on the disks, and then restore the PV/LVM back on top of it. But that's not something you want to do on a whim, without backups.

My solution in a similar case was to arrange a new set of disks, dd everything straight disk-to-disk onto the new disks, and then use those new disks to restore the data - leaving the originals untouched, so that if I messed something up I could always just redo the disk-to-disk dd.

The new 'fake partition table' would then have landed in the spot that used to be the end of the old LVM metadata - which would be the only reason it didn't corrupt the XFS header. Of course, the reason it was put there in the first place was probably that mounting the logical volume failed - so the same error that prevented you from getting to the data might also have saved it afterwards. I don't know if that would make you incredibly lucky or incredibly unlucky. Maybe both.

Here are some important questions: are you absolutely certain that the disks are put into the current RAID in exactly the same order and configuration as they were previously (which disks are mirrored with which ones, which order the mirrors are striped in)? Are you absolutely certain the old RAID was made with the same metadata version as the current one (the new 1.2 metadata, not the old 0.9 metadata that was standard years ago)? Are you absolutely certain that the RAID is made with the same chunk size (currently 512K) as it was made with originally?

If any single one of these assumptions is wrong, it can result in disaster.
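The current array's parameters can at least be read off each member without touching anything (a sketch; the field names are those printed by mdadm --examine for 1.2-superblock members):
Code:
for d in /dev/sd{c,d,e,f}; do
    echo "== $d"
    mdadm --examine "$d" | grep -E 'Version|Array UUID|Raid Level|Chunk Size|Data Offset|Device Role'
done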

Last edited by Nogitsune; 10-21-2014 at 06:21 PM.
 
  

