LinuxQuestions.org
Old 03-03-2014, 09:52 AM   #1
gpriyad
LQ Newbie
 
Registered: Mar 2014
Posts: 2

Rep: Reputation: Disabled
Unable to Configure a 2-Node Fault Resistant Cluster in RHEL 6


Hi,

I am having trouble setting up a Red Hat cluster with RHCS 6.

I have set up the cluster and it shows the running nodes. I am using the luci interface to check the status of the nodes in the cluster.

My cluster configuration is as follows:

<?xml version="1.0"?>
<cluster config_version="63" name="ExCluster">
  <clusternodes>
    <clusternode name="Node-1" nodeid="1">
      <fence>
        <method name="Method-1">
          <device name="fence_dev" port="0"/>
        </method>
      </fence>
      <unfence>
        <device action="on" name="fence_dev" port="0"/>
      </unfence>
    </clusternode>
    <clusternode name="Node-2" nodeid="2">
      <fence>
        <method name="Method-1">
          <device name="fence_dev" port="1"/>
        </method>
      </fence>
      <unfence>
        <device action="on" name="fence_dev" port="1"/>
      </unfence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_brocade" ipaddr="192.168.2.166" login="admin" name="fence_dev" passwd="xyz"/>
  </fencedevices>
  <rm>
    <failoverdomains>
      <failoverdomain name="Domain-1" restricted="1">
        <failoverdomainnode name="Node-1"/>
        <failoverdomainnode name="Node-2"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <fs device="/dev/mapper/mpath1p1" force_fsck="1" fsid="49353" fstype="ext3" mountpoint="/mnt/d000" name="fs0" quick_status="1"/>
      <fs device="/dev/mapper/mpath2p1" force_fsck="1" fsid="61535" fstype="ext3" mountpoint="/mnt/d001" name="fs1" quick_status="1"/>
      <fs device="/dev/mapper/mpath3p1" force_fsck="1" fsid="41445" fstype="ext3" mountpoint="/mnt/d002" name="fs2" quick_status="1"/>
      <script file="/etc/init.d/script-1" name="Scr-1"/>
    </resources>
    <service domain="Domain-1" exclusive="1" name="Service-1" recovery="relocate">
      <fs ref="fs0">
        <script ref="Scr-1"/>
      </fs>
    </service>
  </rm>
</cluster>
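
As a quick sanity check on the configuration above, here is a minimal Python sketch (not part of the original setup) that lists which fence and unfence device instances each node references and flags any device name that is not defined under fencedevices. The names CLUSTER_CONF and fence_summary are hypothetical, and the embedded XML is a trimmed copy of the posted config containing only the fencing-related parts.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the posted cluster.conf (fencing-related parts only).
CLUSTER_CONF = """<cluster config_version="63" name="ExCluster">
  <clusternodes>
    <clusternode name="Node-1" nodeid="1">
      <fence>
        <method name="Method-1">
          <device name="fence_dev" port="0"/>
        </method>
      </fence>
      <unfence>
        <device action="on" name="fence_dev" port="0"/>
      </unfence>
    </clusternode>
    <clusternode name="Node-2" nodeid="2">
      <fence>
        <method name="Method-1">
          <device name="fence_dev" port="1"/>
        </method>
      </fence>
      <unfence>
        <device action="on" name="fence_dev" port="1"/>
      </unfence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_brocade" ipaddr="192.168.2.166" login="admin" name="fence_dev" passwd="xyz"/>
  </fencedevices>
</cluster>"""


def fence_summary(conf_text):
    """Return (per-node fence/unfence instances, undefined device names)."""
    root = ET.fromstring(conf_text)

    # Device names declared in the <fencedevices> section.
    defined = {d.get("name") for d in root.findall("./fencedevices/fencedevice")}

    summary = {}
    for node in root.findall("./clusternodes/clusternode"):
        fence = [(d.get("name"), d.get("port"))
                 for d in node.findall("./fence/method/device")]
        unfence = [(d.get("name"), d.get("port"))
                   for d in node.findall("./unfence/device")]
        summary[node.get("name")] = {"fence": fence, "unfence": unfence}

    # Device names referenced by any node but never declared.
    used = {name
            for entry in summary.values()
            for key in ("fence", "unfence")
            for name, _port in entry[key]}
    return summary, sorted(used - defined)


summary, undefined = fence_summary(CLUSTER_CONF)
for node_name, entry in summary.items():
    print(node_name, entry)
print("undefined devices:", undefined)
```

For the posted config this reports one instance of fence_dev per node (port 0 for Node-1, port 1 for Node-2) and no undefined device names, so a single fence device with one port per node is at least expressible in cluster.conf. It says nothing about whether the switch accepts the commands.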

Following are the problems I am facing:

Problem-1
The main problem is that whenever I restart the service with the command "service cman restart", I get the following error:

Stopping cluster:
Leaving fence domain... [ OK ]
Stopping gfs_controld... [ OK ]
Stopping dlm_controld... [ OK ]
Stopping fenced... [ OK ]
Stopping cman... [ OK ]
Unloading kernel modules... [ OK ]
Unmounting configfs... [ OK ]
Starting cluster:
Checking if cluster has been disabled at boot... [ OK ]
Checking Network Manager... [ OK ]
Global setup... [ OK ]
Loading kernel modules... [ OK ]
Mounting configfs... [ OK ]
Starting cman... [ OK ]
Waiting for quorum... [ OK ]
Starting fenced... [ OK ]
Starting dlm_controld... [ OK ]
Tuning DLM kernel config... [ OK ]
Starting gfs_controld... [ OK ]
Unfencing self... unfence rhcs1.systec.local failed
[FAILED]
Stopping cluster:
Leaving fence domain... [ OK ]
Stopping gfs_controld... [ OK ]
Stopping dlm_controld... [ OK ]
Stopping fenced... [ OK ]
Stopping cman... [ OK ]
Waiting for corosync to shutdown: [ OK ]
Unloading kernel modules... [ OK ]
Unmounting configfs... [ OK ]

Problem-2
Also, Node-1 goes offline from the cluster with the message "Not a cluster member".

Problem-3
I am not able to add other instances to my fence device.

Could anyone please tell me the solution for the above issues?

Thanks in advance.
 
Old 03-03-2014, 10:35 PM   #2
Marios Zindilis
LQ Newbie
 
Registered: Feb 2012
Location: Limassol, Cyprus
Posts: 8

Rep: Reputation: Disabled
Since your problem is with fencing (judging from where the first error message appears), try disabling it temporarily to see whether the cluster starts normally without it. It's worth mentioning that you have only one fencing device, whereas in a cluster with N nodes you would normally have N fencing devices. In your case you would need two.
 
Old 03-04-2014, 12:40 AM   #3
gpriyad
LQ Newbie
 
Registered: Mar 2014
Posts: 2

Original Poster
Rep: Reputation: Disabled
Hi Marios,

Thanks for your quick reply.

Yes, after disabling the fence device, the cluster starts normally without it.

But as per my requirement, I can use only one fence device (a Brocade switch). The moment I add one port of this fence device as an instance to the fence method, it does not allow me to add the other port of the fence device.

I have attached an architecture diagram of my setup for your reference.

Also, the status of the service group is displayed as "disabled", and whenever I try to start it I am not able to do so.


Kindly look into the above issue and help me resolve it.


Thanks in advance.
Attachment: Cluster Architecture(RHCS-6).jpg (47.8 KB)
 
Old 03-04-2014, 10:07 AM   #4
Marios Zindilis
LQ Newbie
 
Registered: Feb 2012
Location: Limassol, Cyprus
Posts: 8

Rep: Reputation: Disabled
The error "Unfencing self... unfence rhcs1.systec.local failed" shows that there is something wrong with your fence/unfence configuration. Is the name "rhcs1.systec.local" correct? Are the credentials correct? Is the fencing agent for Brocade sending the proper commands?
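
To make the name question concrete, here is a small Python sketch, offered only as an illustration: it pulls the node name out of the failure line and checks whether it appears among the clusternode names in the posted cluster.conf. The helper names unfence_target and name_in_config are hypothetical. A mismatch does not prove anything by itself, since the config may simply have been edited before posting.

```python
import re


def unfence_target(log_line):
    """Extract the node name from an 'unfence <name> failed' message, or None."""
    match = re.search(r"unfence (\S+) failed", log_line)
    return match.group(1) if match else None


def name_in_config(target, node_names):
    """True if the reported node name is one of the clusternode names."""
    return target in node_names


# Failure line copied from the first post.
line = "Unfencing self... unfence rhcs1.systec.local failed"
# clusternode names copied from the posted cluster.conf.
config_names = ["Node-1", "Node-2"]

target = unfence_target(line)
print(target, "listed in cluster.conf:", name_in_config(target, config_names))
```

With the values from this thread, the name in the error message is not one of the clusternode names shown in the posted config, which is worth double-checking against the real file on the node.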

You need to check for errors that appear in the files /var/log/cluster/*.log and /var/log/messages, and experiment with settings until you get it right. Keep in mind that setting up a cluster is not a simple task: you need to experiment a lot and break the system many times before you succeed.
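
One way to narrow down those logs is to filter them for fence-related lines. The following is a minimal Python sketch under stated assumptions: the keyword list is a guess at useful search terms, and fence_lines, scan_files and default_paths are hypothetical helper names. Reading /var/log/messages usually requires root.

```python
import glob

# Assumed search terms; adjust to whatever the logs actually contain.
KEYWORDS = ("fence", "unfence")


def fence_lines(lines, keywords=KEYWORDS):
    """Return the lines that mention any keyword (case-insensitive)."""
    hits = []
    for line in lines:
        lowered = line.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append(line.rstrip("\n"))
    return hits


def scan_files(paths):
    """Map each readable path to its fence-related lines; skip unreadable files."""
    results = {}
    for path in paths:
        try:
            with open(path, errors="replace") as handle:
                results[path] = fence_lines(handle)
        except OSError:
            # Missing file or insufficient permissions: ignore and move on.
            continue
    return results


def default_paths():
    """The log locations mentioned above."""
    return sorted(glob.glob("/var/log/cluster/*.log")) + ["/var/log/messages"]
```

On a cluster node, something like `for path, hits in scan_files(default_paths()).items(): print(path, *hits, sep="\n")` would print the matching lines per file.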
 
  


All times are GMT -5.