Hello,
I am trying to set up a two-node cluster on RHEL 5.5.
Quote:
[root@IBRMAPPPSV02 etc]# uname -a
Linux IBRMAPPPSV02 2.6.18-194.el5 #1 SMP Tue Mar 16 21:52:39 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
[root@IBRMAPPPSV02 etc]#
I have followed these steps:
1. Hosts files done.
2. Quorum disk: allocated a LUN and ran mkqdisk -c /dev/sdd1 -l brmquorum
3. Installed luci/ricci. Luci is running on one of the cluster nodes.
4. Created the cluster.conf file as follows.
PHP Code:
<?xml version="1.0"?>
<cluster alias="BRMCLUSTER" config_version="10" name="BRMCLUSTER">
  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="IBRMAPPPSV02" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="Manual02" nodename="IBRMAPPPSV02"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="IBRMAPPPSV01" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="Manual01" nodename="IBRMAPPPSV01"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="3"/>
  <fencedevices>
    <fencedevice agent="fence_manual" name="Manual01"/>
    <fencedevice agent="fence_manual" name="Manual02"/>
  </fencedevices>
  <rm>
    <failoverdomains>
      <failoverdomain name="BRMFAIL" nofailback="1" ordered="1" restricted="1">
        <failoverdomainnode name="IBRMAPPPSV02" priority="2"/>
        <failoverdomainnode name="IBRMAPPPSV01" priority="1"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <ip address="10.10.192.61" monitor_link="1"/>
    </resources>
    <service autostart="1" domain="BRMFAIL" exclusive="1" name="BRMSERVICE" recovery="relocate"/>
  </rm>
  <quorumd device="/dev/sdd1" interval="1" min_score="1" tko="3" votes="1">
    <heuristic interval="1" program="/usr/share/cluster/check_eth_link.sh bond0" score="1"/>
  </quorumd>
</cluster>
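To make the vote counts in this configuration explicit, here is a minimal Python sketch of the arithmetic. It assumes cman computes the quorum threshold as floor(expected_votes / 2) + 1, which is my understanding and not something stated in the file itself; the function names are made up for illustration.

```python
# Sketch of the vote arithmetic implied by the cluster.conf above.
# Assumption: quorum threshold = floor(expected_votes / 2) + 1.

EXPECTED_VOTES = 3  # <cman expected_votes="3"/>
NODE_VOTES = 1      # votes="1" on each <clusternode>
QDISK_VOTES = 1     # votes="1" on <quorumd>


def quorum_threshold(expected_votes: int) -> int:
    """Votes required before the cluster counts as quorate (assumed rule)."""
    return expected_votes // 2 + 1


def is_quorate(nodes_seen: int, qdisk_seen: bool) -> bool:
    """True if the votes visible to a node reach the threshold."""
    votes = nodes_seen * NODE_VOTES + (QDISK_VOTES if qdisk_seen else 0)
    return votes >= quorum_threshold(EXPECTED_VOTES)


if __name__ == "__main__":
    print("threshold:", quorum_threshold(EXPECTED_VOTES))
    print("one node + quorum disk:", is_quorate(1, True))
    print("one node alone:", is_quorate(1, False))
```

If that rule is right, a node that sees only itself needs the quorum disk vote as well to become quorate.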
Now I have some problems.
On the first node, clustat shows this:
Quote:
[root@IBRMAPPPSV01 ~]# clustat
Cluster Status for BRMCLUSTER @ Tue May 21 09:47:04 2013
Member Status: Quorate
 Member Name        ID   Status
 ------ ----        ---- ------
 IBRMAPPPSV02          1 Offline
 IBRMAPPPSV01          2 Online, Local
 /dev/sdd1             0 Online, Quorum Disk
But on the other node, clustat shows this:
Quote:
[root@IBRMAPPPSV02 ~]# clustat
Cluster Status for BRMCLUSTER @ Tue May 21 10:06:30 2013
Member Status: Inquorate
 Member Name        ID   Status
 ------ ----        ---- ------
 IBRMAPPPSV02          1 Online, Local
 IBRMAPPPSV01          2 Offline
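To compare the two views side by side, I put together a small Python sketch that parses the member tables from both clustat outputs and lists what one node sees that the other does not. The parsing is a rough assumption about the table layout (name, numeric ID, then status), and the helper names are hypothetical.

```python
# The member tables from the two clustat outputs quoted above.
VIEW_NODE01 = """\
Member Name ID Status
------ ---- ---- ------
IBRMAPPPSV02 1 Offline
IBRMAPPPSV01 2 Online, Local
/dev/sdd1 0 Online, Quorum Disk
"""

VIEW_NODE02 = """\
Member Name ID Status
------ ---- ---- ------
IBRMAPPPSV02 1 Online, Local
IBRMAPPPSV01 2 Offline
"""


def parse_members(clustat_text):
    """Return {member_name: status} for rows below the dashed separator."""
    members = {}
    past_separator = False
    for line in clustat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith("---"):
            past_separator = True
            continue
        # A member row is: name, numeric ID, then the status words.
        if past_separator and len(fields) >= 3 and fields[1].isdigit():
            members[fields[0]] = " ".join(fields[2:])
    return members


def only_in_first(first, second):
    """Member names listed in the first view but missing from the second."""
    return sorted(set(first) - set(second))


if __name__ == "__main__":
    node01 = parse_members(VIEW_NODE01)
    node02 = parse_members(VIEW_NODE02)
    print("missing on IBRMAPPPSV02:", only_in_first(node01, node02))
    print("each node sees the other as:",
          node01["IBRMAPPPSV02"], "/", node02["IBRMAPPPSV01"])
```

Run on the outputs above, it shows that each node reports the other as Offline and that only IBRMAPPPSV01 lists the quorum disk /dev/sdd1.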
The first node logs no errors. The second node, however, logs the following continuously:
Quote:
May 21 10:07:16 IBRMAPPPSV02 ccsd[14668]: Cluster is not quorate. Refusing connection.
May 21 10:07:16 IBRMAPPPSV02 ccsd[14668]: Error while processing connect: Connection refused
May 21 10:07:17 IBRMAPPPSV02 ccsd[14668]: Cluster is not quorate. Refusing connection.
May 21 10:07:17 IBRMAPPPSV02 ccsd[14668]: Error while processing connect: Connection refused
May 21 10:07:17 IBRMAPPPSV02 ccsd[14668]: Cluster is not quorate. Refusing connection.
On the problem node, when I run service cman start, it hangs while starting fenced. It stays hung, never finishes starting, and prevents the node from being shut down.
When I run service clvmd start on the problem node, I get this:
Quote:
May 21 09:53:27 IBRMAPPPSV02 kernel: dlm: no local IP address has been set
May 21 09:53:27 IBRMAPPPSV02 kernel: dlm: cannot start dlm lowcomms -107
May 21 09:53:27 IBRMAPPPSV02 clvmd: Unable to create lockspace for CLVM: Transport endpoint is not connected
Can someone please help me and point out where I have gone wrong?
Lastly,
Quote:
openais[29491]: [TOTEM] position [0] member 10.10.192.45
Why is openais using the interface configured in ifcfg-eth2, when that is an unused interface?
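One check I can run myself is sketched below in Python. It rests on an assumption I have not confirmed: that cman/openais binds to whichever interface holds the address the cluster node name resolves to. The sketch resolves a node name and compares the result with the TOTEM member address from the log; the function names are hypothetical, and the resolver is passed in so the comparison can be tried without touching DNS or /etc/hosts.

```python
import socket


def resolve_ipv4(hostname, resolver=socket.gethostbyname):
    """Return the IPv4 address the hostname resolves to, or None on failure."""
    try:
        return resolver(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def name_matches_totem(hostname, totem_address, resolver=socket.gethostbyname):
    """True if the node name resolves to the address reported by TOTEM."""
    return resolve_ipv4(hostname, resolver) == totem_address


if __name__ == "__main__":
    totem_address = "10.10.192.45"  # from the openais log line above
    for node in ("IBRMAPPPSV01", "IBRMAPPPSV02"):
        print(node, "->", resolve_ipv4(node),
              "matches TOTEM:", name_matches_totem(node, totem_address))
```

If one of the node names resolves to 10.10.192.45, that would point at the hosts entries rather than at openais itself.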
Best regards