Cluster Suite - No Failover
Hi,
I have a two-node cluster with the cluster.conf shown below. (I'm aware that I don't have any fence devices, and that I need them. I'm just trying to get something to work in test for now.)

1. If I move the service via "clusvcadm -r MQ_HA -m gateway-ifdev-mq2" then it fails over fine.
2. If I "shutdown -h" one of the nodes, then it fails over to the other node.
3. But if I power off one of the nodes, nothing happens. It does not attempt to start the failover.

The messages from /var/log/messages are below.

Very grateful for anyone that can spot why I do not get a failover starting with this cluster.conf config.

Avery

----- cluster.conf -------------
<?xml version="1.0"?>
<cluster alias="MQ_HA_IFDEV" config_version="57" name="MQ_HA_IFDEV">
  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="GATEWAY-IFDEV-MQ1" nodeid="1" votes="1">
      <fence/>
    </clusternode>
    <clusternode name="GATEWAY-IFDEV-MQ2" nodeid="2" votes="1">
      <fence/>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1"/>
  <fencedevices/>
  <rm>
    <failoverdomains>
      <failoverdomain name="MQ_HA_Fail_Domain" ordered="0" restricted="1">
        <failoverdomainnode name="GATEWAY-IFDEV-MQ1" priority="1"/>
        <failoverdomainnode name="GATEWAY-IFDEV-MQ2" priority="1"/>
      </failoverdomain>
    </failoverdomains>
    <resources>
      <ip address="172.16.8.196" monitor_link="1"/>
      <netfs export="/MQHA/QM1/data" force_unmount="1" fstype="nfs" host="GATEWAY-IFDEV-WAS1" mountpoint="/MQHA/QM1/data" name="MQ_HA_Mount_data" options=""/>
      <netfs export="/MQHA/QM1/log" force_unmount="1" fstype="nfs" host="GATEWAY-IFDEV-WAS1" mountpoint="/MQHA/QM1/log" name="MQ_HA_Mount_log" options=""/>
      <script file="/MQHA/bin/mqOCF_Script_QM1.sh" name="mqOCF_Script_QM1"/>
    </resources>
    <service autostart="0" name="MQ_HA" recovery="relocate">
      <netfs ref="MQ_HA_Mount_data">
        <netfs ref="MQ_HA_Mount_log">
          <script ref="mqOCF_Script_QM1">
            <ip ref="172.16.8.196"/>
          </script>
        </netfs>
      </netfs>
    </service>
  </rm>
</cluster>
---------- /var/log/messages --------------
Jul 14 11:29:59 GATEWAY-IFDEV-MQ2 openais[3274]: [TOTEM] The token was lost in the OPERATIONAL state.
Jul 14 11:29:59 GATEWAY-IFDEV-MQ2 openais[3274]: [TOTEM] Receive multicast socket recv buffer size (288000 bytes).
Jul 14 11:29:59 GATEWAY-IFDEV-MQ2 openais[3274]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes).
Jul 14 11:29:59 GATEWAY-IFDEV-MQ2 openais[3274]: [TOTEM] entering GATHER state from 2.
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [TOTEM] entering GATHER state from 0.
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [TOTEM] Creating commit token because I am the rep.
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [TOTEM] Saving state aru 1ac high seq received 1ac
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [TOTEM] Storing new sequence id for ring 78
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [TOTEM] entering COMMIT state.
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [TOTEM] entering RECOVERY state.
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [TOTEM] position [0] member 172.16.8.149:
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [TOTEM] previous ring seq 116 rep 172.16.8.148
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [TOTEM] aru 1ac high delivered 1ac received flag 1
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [TOTEM] Did not need to originate any messages in recovery.
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [TOTEM] Sending initial ORF token
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 kernel: dlm: closing connection to node 1
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [CLM ] CLM CONFIGURATION CHANGE
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [CLM ] New Configuration:
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [CLM ] r(0) ip(172.16.8.149)
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [CLM ] Members Left:
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [CLM ] r(0) ip(172.16.8.148)
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [CLM ] Members Joined:
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [CLM ] CLM CONFIGURATION CHANGE
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [CLM ] New Configuration:
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [CLM ] r(0) ip(172.16.8.149)
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [CLM ] Members Left:
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [CLM ] Members Joined:
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [SYNC ] This node is within the primary component and will provide service.
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [TOTEM] entering OPERATIONAL state.
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [CLM ] got nodejoin message 172.16.8.149
Jul 14 11:30:04 GATEWAY-IFDEV-MQ2 openais[3274]: [CPG ] got joinlist message from node 2
It looks like your inter-cluster communication is not working, so the nodes cannot 'sense' each other. Did you enable multicast on the switch? openais requires multicast for inter-cluster communication.
Thanks aquaregia, it is working now. I created a hack of the fence_ilo script to 'pretend' that I have HP iLO devices.
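
For anyone hitting the same symptom: the posted cluster.conf has an empty <fence/> for each node and an empty <fencedevices/>. When a node is powered off rather than shut down cleanly, the surviving node has to fence it before rgmanager will recover its services, so with nothing to fence with, the relocation never starts. A minimal sketch of what the fencing sections might look like with real HP iLO devices follows. The device names (ilo-mq1, ilo-mq2), the iLO hostnames, and the login and password values are hypothetical placeholders, and the attribute names (hostname, login, passwd) are assumed from the fence_ilo agent as shipped with this generation of Cluster Suite, so check them against the fence_ilo man page for your release.

```xml
<!-- Sketch only: replaces the <clusternodes> and <fencedevices/> sections
     of the cluster.conf above. All device names, hostnames and
     credentials below are hypothetical placeholders. -->
<clusternodes>
  <clusternode name="GATEWAY-IFDEV-MQ1" nodeid="1" votes="1">
    <fence>
      <method name="1">
        <!-- refers to the fencedevice with the same name below -->
        <device name="ilo-mq1"/>
      </method>
    </fence>
  </clusternode>
  <clusternode name="GATEWAY-IFDEV-MQ2" nodeid="2" votes="1">
    <fence>
      <method name="1">
        <device name="ilo-mq2"/>
      </method>
    </fence>
  </clusternode>
</clusternodes>
<fencedevices>
  <!-- one iLO per node; attribute names are an assumption, see lead-in -->
  <fencedevice agent="fence_ilo" name="ilo-mq1" hostname="ilo-mq1.example.com" login="fenceuser" passwd="changeme"/>
  <fencedevice agent="fence_ilo" name="ilo-mq2" hostname="ilo-mq2.example.com" login="fenceuser" passwd="changeme"/>
</fencedevices>
```

Remember to increase config_version in the <cluster> element whenever cluster.conf is edited, so that the new configuration is picked up by both nodes.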