LinuxQuestions.org


trusst 05-22-2014 07:37 PM

How to recover from fatal: invalid metadata block (GFS2)
 
Hi,

I'm looking for advice on how to recover from what looks like GFS2 filesystem corruption. We have an EqualLogic PS6500E connected to a RHEL 5.2 host (Dell PowerEdge 2950), and all the volumes are configured into one logical device. Yesterday we lost the connection to this storage device. It cannot be unmounted, and if I try to access it I get "input/output error." The error I see in the messages log is below. I connected directly to the EqualLogic device through its management software, and the device itself seems fine: no failed disks or any other obvious problems. What should be my course of action in this case?

Thank you for your help.

Alex.

May 21 16:19:24 emil kernel: GFS2: fsid=dm-57.0: fatal: invalid metadata block
May 21 16:19:24 emil kernel: GFS2: fsid=dm-57.0: bh = 368448653 (magic number)
May 21 16:19:24 emil kernel: GFS2: fsid=dm-57.0: function = gfs2_meta_indirect_buffer, file = fs/gfs2/meta_io.c, line = 334
May 21 16:19:24 emil kernel: GFS2: fsid=dm-57.0: about to withdraw this file system
May 21 16:19:24 emil kernel: GFS2: fsid=dm-57.0: telling LM to withdraw
May 21 16:19:24 emil kernel: GFS2: fsid=dm-57.0: withdrawn
May 21 16:19:24 emil kernel:
May 21 16:19:24 emil kernel: Call Trace:
May 21 16:19:24 emil kernel: [<ffffffff885c1526>] :gfs2:gfs2_lm_withdraw+0xc1/0xd0
May 21 16:19:24 emil kernel: [<ffffffff80063ae7>] __wait_on_bit+0x60/0x6e
May 21 16:19:24 emil kernel: [<ffffffff80015008>] sync_buffer+0x0/0x3f
May 21 16:19:24 emil kernel: [<ffffffff80063b61>] out_of_line_wait_on_bit+0x6c/0x78
May 21 16:19:24 emil kernel: [<ffffffff8009db4f>] wake_bit_function+0x0/0x23
May 21 16:19:24 emil kernel: [<ffffffff8001a370>] submit_bh+0x10a/0x111
May 21 16:19:24 emil kernel: [<ffffffff885d4697>] :gfs2:gfs2_meta_check_ii+0x2c/0x38
May 21 16:19:24 emil kernel: [<ffffffff885c4de4>] :gfs2:gfs2_meta_indirect_buffer+0x104/0x160
May 21 16:19:24 emil kernel: [<ffffffff885bfcff>] :gfs2:gfs2_inode_refresh+0x22/0x2cf
May 21 16:19:24 emil kernel: [<ffffffff885c7829>] :gfs2:gfs2_get_dentry+0x14c/0x203
May 21 16:19:24 emil kernel: [<ffffffff80031736>] sock_common_recvmsg+0x2d/0x43
May 21 16:19:24 emil kernel: [<ffffffff885bea7e>] :gfs2:gfs2_glock_nq_num+0x3b/0x68
May 21 16:19:24 emil kernel: [<ffffffff8866c366>] :exportfs:find_exported_dentry+0x43/0x47b
May 21 16:19:24 emil kernel: [<ffffffff8867971e>] :nfsd:nfsd_acceptable+0x0/0xd8
May 21 16:19:24 emil kernel: [<ffffffff8867d5e3>] :nfsd:exp_get_by_name+0x5b/0x71
May 21 16:19:24 emil kernel: [<ffffffff8867dbd2>] :nfsd:exp_find_key+0x89/0x9c
May 21 16:19:24 emil kernel: [<ffffffff885c75e0>] :gfs2:gfs2_decode_fh+0xa9/0xae
May 21 16:19:24 emil kernel: [<ffffffff88679a90>] :nfsd:fh_verify+0x29a/0x4be
May 21 16:19:24 emil kernel: [<ffffffff8867ab14>] :nfsd:nfsd_open+0x1f/0x17f
May 21 16:19:24 emil kernel: [<ffffffff8867ae3c>] :nfsd:nfsd_write+0x89/0xd5
May 21 16:19:24 emil kernel: [<ffffffff88681986>] :nfsd:nfsd3_proc_write+0xea/0x109
May 21 16:19:24 emil kernel: [<ffffffff886771db>] :nfsd:nfsd_dispatch+0xd8/0x1d6
May 21 16:19:24 emil kernel: [<ffffffff884c048b>] :sunrpc:svc_process+0x454/0x71b
May 21 16:19:24 emil kernel: [<ffffffff800646f5>] __down_read+0x12/0x92
May 21 16:19:24 emil kernel: [<ffffffff886775a1>] :nfsd:nfsd+0x0/0x2cb
May 21 16:19:24 emil kernel: [<ffffffff88677746>] :nfsd:nfsd+0x1a5/0x2cb
May 21 16:19:24 emil kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
May 21 16:19:24 emil kernel: [<ffffffff886775a1>] :nfsd:nfsd+0x0/0x2cb
May 21 16:19:24 emil kernel: [<ffffffff886775a1>] :nfsd:nfsd+0x0/0x2cb
May 21 16:19:24 emil kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
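
For anyone triaging a similar log, here is a minimal Python sketch that pulls the key fields out of a withdraw message like the one above: the fsid, the failing block number, the source location, and the kernel modules that appear in the call trace. The regular expressions are assumptions based only on this one log sample, and `summarize_withdraw` is a hypothetical helper name, not part of any GFS2 tooling.

```python
import re

# Patterns assumed from the single log sample in this thread; other kernel
# versions may format these messages differently.
GFS2_LINE = re.compile(r"GFS2: fsid=(?P<fsid>\S+): (?P<msg>.*)$")
BLOCK = re.compile(r"bh = (?P<block>\d+) \((?P<check>[^)]*)\)")
LOCATION = re.compile(
    r"function = (?P<function>\w+), file = (?P<file>\S+), line = (?P<line>\d+)"
)
# Module-qualified stack frames look like "[<addr>] :nfsd:nfsd_write+0x89/0xd5".
FRAME = re.compile(r"\[<[0-9a-f]+>\] :(?P<module>\w+):(?P<symbol>\w+)\+")


def summarize_withdraw(log_text):
    """Collect the main facts of a GFS2 withdraw from syslog text (hypothetical helper)."""
    summary = {"modules": []}
    for line in log_text.splitlines():
        frame = FRAME.search(line)
        if frame:
            # Record each module once, in the order it first appears in the trace.
            if frame.group("module") not in summary["modules"]:
                summary["modules"].append(frame.group("module"))
            continue
        gfs2 = GFS2_LINE.search(line)
        if not gfs2:
            continue
        summary["fsid"] = gfs2.group("fsid")
        msg = gfs2.group("msg").strip()
        block = BLOCK.search(msg)
        location = LOCATION.search(msg)
        if msg.startswith("fatal:"):
            summary["error"] = msg[len("fatal:"):].strip()
        elif block:
            summary["block"] = int(block.group("block"))
            summary["failed_check"] = block.group("check")
        elif location:
            summary["function"] = location.group("function")
            summary["file"] = location.group("file")
            summary["line"] = int(location.group("line"))
        elif msg == "withdrawn":
            summary["withdrawn"] = True
    return summary
```

Run over the full log above, the module list comes out as gfs2, exportfs, nfsd, sunrpc, which suggests the bad block was hit while the host was serving an NFS write (nfsd3_proc_write appears in the trace).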

unSpawn 05-23-2014 07:22 AM

Quote:

Originally Posted by trusst (Post 5175513)
What should be my course of action in this case?

The problem is twofold: 1) corruption may have occurred at any time before the warning was logged, and 2) you cannot run fsck.gfs2 until the file system(s) have been properly unmounted. Since you run RHEL, I'd strongly suggest you check your entitlements and then contact Red Hat Support to find out which options are available to you. I'd imagine that would start with migrating the existing volumes that aren't corrupt to another SAN and restoring the rest from backups, before you can safely diagnose the cause of the problem.
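
To illustrate the "unmount before fsck" point, here is a minimal Python sketch that refuses to start a check while the device is still listed in /proc/mounts, and otherwise runs `fsck.gfs2 -n`, which answers "no" to every question so nothing is modified. The helper names and the device path in the test data are hypothetical. It is written for a current Python 3, not the interpreter shipped with RHEL 5, and it only looks at the local node: GFS2 is a cluster file system and must be unmounted on every node before fsck.gfs2 is run.

```python
import subprocess


def is_mounted(device, mounts_text):
    """Return True if `device` is a mount source in /proc/mounts-style text.

    The comparison is an exact string match, so pass the device path exactly
    as the kernel lists it (for example a /dev/mapper/... name).
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == device:
            return True
    return False


def readonly_check(device, mounts_path="/proc/mounts"):
    """Run a read-only fsck.gfs2 pass, but only if the device is not mounted here."""
    with open(mounts_path) as f:
        if is_mounted(device, f.read()):
            raise RuntimeError(f"{device} is still mounted; unmount it first")
    # -n: assume "no" to all questions, so the file system is not changed.
    return subprocess.run(["fsck.gfs2", "-n", device]).returncode
```

A read-only pass like this only reports what fsck.gfs2 would want to fix; whether to let it make repairs is a decision better made after talking to support and securing backups.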


All times are GMT -5.