Old 05-21-2016, 03:50 PM   #1
anonymouser
LQ Newbie
 
Registered: Feb 2009
Posts: 4

Rep: Reputation: 0
OCFS2 - Bad magic number


Hi LQ friends,

I have a problem with our OCFS2 cluster that I couldn't solve by myself.
In short, I have an OCFS2 cluster with 3 nodes and a shared storage LUN. I mapped the LUN to all 3 nodes, split it into 2 partitions, formatted them as OCFS2 filesystems, and mounted them successfully. The system had been running fine for nearly 2 years, but today partition 1 suddenly became inaccessible and I had to reboot one node. After the reboot, partition 2 mounted fine, but partition 1 cannot be mounted.
The error is below:

Code:
# mount -t ocfs2 /dev/mapper/mpath3p1 /test
mount.ocfs2: Bad magic number in inode while trying to determine heartbeat information

# fsck.ocfs2  /dev/mapper/mpath3p1 
fsck.ocfs2 1.6.3
fsck.ocfs2: Bad magic number in inode while initializing the DLM

# fsck.ocfs2 -r 2 /dev/mapper/mpath3p1 
fsck.ocfs2 1.6.3
[RECOVER_BACKUP_SUPERBLOCK] Recover superblock information from backup block#1048576? <n> y
fsck.ocfs2: Bad magic number in inode while initializing the DLM

# parted /dev/mapper/mpath3
GNU Parted 1.8.1
Using /dev/mapper/mpath3
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) print                                                            

Model: Linux device-mapper (dm)
Disk /dev/mapper/mpath3: 20.0TB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system  Name     Flags
 1      17.4kB  10.2TB  10.2TB               primary       
 2      10.2TB  20.0TB  9749GB               primary

Usually a bad magic number means the superblock is corrupted. I have handled several similar cases before, and they were solved quickly by restoring from a backup superblock. But this case is different: I cannot fix the problem by simply replacing the superblock, so I'm out of ideas.
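For the record, OCFS2 keeps up to six backup superblocks (at roughly the 1 GB, 4 GB, 16 GB, 64 GB, 256 GB and 1 TB marks), selected with fsck.ocfs2 -r 1 through -r 6. A sketch of trying the remaining slots besides #2:

Code:
# Try each remaining backup superblock slot; on a 10 TB partition all
# six should exist.  Answer 'y' at the recovery prompt and stop at the
# first slot that succeeds.
for n in 1 3 4 5 6; do
    fsck.ocfs2 -r $n /dev/mapper/mpath3p1 && break
done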

Please take a look and suggest how I can solve this problem. I need to recover the data; that is the most important goal now.
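A safe, read-only first look I can try (a sketch, assuming debugfs.ocfs2 from ocfs2-tools is installed; "stats" dumps the superblock without writing anything):

Code:
# debugfs.ocfs2 opens the device read-only and can sometimes read
# metadata even when mount fails:
debugfs.ocfs2 -R "stats" /dev/mapper/mpath3p1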

Thanks in advance.
 
Old 05-23-2016, 09:12 AM   #2
MensaWater
LQ Guru
 
Registered: May 2005
Location: Atlanta Georgia USA
Distribution: Redhat (RHEL), CentOS, Fedora, CoreOS, Debian, FreeBSD, HP-UX, Solaris, SCO
Posts: 7,831
Blog Entries: 15

Rep: Reputation: 1669
/dev/mapper/mpath3 implies you're using Linux multipathing with "friendly names".

If you run "multipath -l -v2" it should show you the component disks of that multipath device. Have you checked one or more of those components?

What are the underlying component disks? Typically this would be a disk array with multiple paths (usually Fibre Channel SCSI, but maybe iSCSI or something else). Is the underlying disk array OK? Is the underlying component a LUN (e.g. RAID1, RAID5, etc.) in a disk array? If so, what is the status of that LUN?
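Something along these lines (a sketch; sdX is a placeholder, take the real component path from the multipath output):

Code:
# Show the multipath topology; each component path should report
# "active ready", not "failed" or "faulty":
multipath -l -v2

# Read a bit of data straight from one underlying path (placeholder
# sdX) to confirm the blocks are reachable below the multipath layer:
dd if=/dev/sdX of=/dev/null bs=1M count=10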
 
Old 05-24-2016, 02:47 AM   #3
anonymouser
LQ Newbie
 
Registered: Feb 2009
Posts: 4

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by MensaWater
If you run "multipath -l -v2" it should show you the component disks of that multipath device. Have you checked one or more of those components?
Hi MensaWater,
Thanks for your reply.
I checked the underlying storage first. It is an FC array with one 20 TB LUN, accessed by the server over 4 paths (2 FC ports on the server HBAs and 2 FC ports per storage controller). All paths are OK and the array reports no warnings. The LUN is built on top of a RAID5 disk group, and the second partition, which resides on the same LUN, is still fine.

I think something in the OCFS2 filesystem itself has gone wrong. The error seems to occur at the filesystem level and affects only the 1st partition; the partition table is intact (it is the same parted output I posted above).
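One low-level check worth recording (a sketch, assuming the default 4 KB filesystem block size): the OCFS2 superblock lives in the third filesystem block and starts with the signature string OCFSV2, so a raw dump shows whether it has been overwritten.

Code:
# With a 4 KB block size the superblock sits at byte offset 8192
# (block 2).  A healthy volume shows "OCFSV2" at the start:
dd if=/dev/mapper/mpath3p1 bs=4096 skip=2 count=1 2>/dev/null | hexdump -C | head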

 
Old 05-24-2016, 11:16 AM   #4
MensaWater
LQ Guru
 
Registered: May 2005
Location: Atlanta Georgia USA
Distribution: Redhat (RHEL), CentOS, Fedora, CoreOS, Debian, FreeBSD, HP-UX, Solaris, SCO
Posts: 7,831
Blog Entries: 15

Rep: Reputation: 1669
Were it me, I'd do a web search for:
"fsck.ocfs2: Bad magic number in inode while initializing the DLM"

There seem to be a fair number of hits for that.

I haven't seen this error myself, and we haven't used OCFS2 in a while (we use ASM these days). One thought that occurred to me, though: generally speaking, you can't run fsck on a mounted filesystem. Since this is a cluster, and I assume the volume is still mounted on your other 2 servers, I wonder whether those other mounts are what's blocking the fsck. I don't know whether fsck.ocfs2 allows a check while the volume is mounted on other nodes.
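A quick way to test that theory (a sketch; mounted.ocfs2 ships with ocfs2-tools):

Code:
# List which cluster nodes currently have the volume mounted;
# fsck.ocfs2 needs it unmounted everywhere:
mounted.ocfs2 -f /dev/mapper/mpath3p1

# Then unmount it on every node that still holds it before re-running
# the fsck:
umount /test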
 
  



LinuxQuestions.org > Forums > Linux Forums > Linux - Server

All times are GMT -5. The time now is 03:39 AM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration