Red Hat Cluster Problem
I have a problem with fencing on my two-node Red Hat 5.6 cluster.
The problem is that at random times the cluster de-clusters and then shuts the system down and powers it off. From what I can tell, this happens during the fencing process. Here is a sample from the log; the other server's log just says the system is going down at the same timestamp:

Mar 15 16:41:13 sys1-cmdb2 openais[3955]: [TOTEM] entering GATHER state from 11.
Mar 15 16:41:13 sys1-cmdb2 openais[3955]: [TOTEM] Creating commit token because I am the rep.
Mar 15 16:41:13 sys1-cmdb2 openais[3955]: [TOTEM] entering RECOVERY state.
Mar 15 16:41:13 sys1-cmdb2 openais[3955]: [TOTEM] position [0] member 10.10.10.2:
Mar 15 16:41:13 sys1-cmdb2 openais[3955]: [TOTEM] previous ring seq 188 rep 10.10.10.1
Mar 15 16:41:13 sys1-cmdb2 openais[3955]: [TOTEM] aru 631 high delivered 631 received flag 1
Mar 15 16:41:13 sys1-cmdb2 openais[3955]: [TOTEM] Did not need to originate any messages in recovery.
Mar 15 16:41:13 sys1-cmdb2 openais[3955]: [TOTEM] Sending initial ORF token
Mar 15 16:41:13 sys1-cmdb2 openais[3955]: [CLM ] CLM CONFIGURATION CHANGE
Mar 15 16:41:13 sys1-cmdb2 openais[3955]: [CLM ] New Configuration:
Mar 15 16:41:13 sys1-cmdb2 openais[3955]: [CLM ] r(0) ip(10.10.10.2)
Mar 15 16:41:13 sys1-cmdb2 kernel: dlm: closing connection to node 1
Mar 15 16:41:13 sys1-cmdb2 fenced[3974]: sys1-cmdb1. not a cluster member after 0 sec post_fail_delay
Mar 15 16:41:13 sys1-cmdb2 openais[3955]: [CLM ] Members Left:
Mar 15 16:41:13 sys1-cmdb2 fenced[3974]: fencing node "sys1-cmdb1."
Mar 15 16:41:13 sys1-cmdb2 openais[3955]: [CLM ] r(0) ip(10.10.10.1)
Mar 15 16:41:13 sys1-cmdb2 openais[3955]: [CLM ] Members Joined:
Mar 15 16:41:14 sys1-cmdb2 openais[3955]: [CLM ] CLM CONFIGURATION CHANGE
Mar 15 16:41:14 sys1-cmdb2 openais[3955]: [CLM ] New Configuration:
Mar 15 16:41:14 sys1-cmdb2 openais[3955]: [CLM ] r(0) ip(10.10.10.2)
Mar 15 16:41:14 sys1-cmdb2 openais[3955]: [CLM ] Members Left:
Mar 15 16:41:14 sys1-cmdb2 openais[3955]: [CLM ] Members Joined:
Mar 15 16:41:14 sys1-cmdb2 openais[3955]: [SYNC ] This node is within the primary component and will provide service.
Mar 15 16:41:14 sys1-cmdb2 openais[3955]: [TOTEM] entering OPERATIONAL state.
Mar 15 16:41:14 sys1-cmdb2 openais[3955]: [CLM ] got nodejoin message 10.10.10.2
Mar 15 16:41:14 sys1-cmdb2 openais[3955]: [CPG ] got joinlist message from node 2
Mar 15 16:41:14 sys1-cmdb2 openais[3955]: [TOTEM] entering GATHER state from 9.
Mar 15 16:41:14 sys1-cmdb2 openais[3955]: [TOTEM] Storing new sequence id for ring c4
Mar 15 16:41:14 sys1-cmdb2 openais[3955]: [TOTEM] entering COMMIT state.
Mar 15 16:41:14 sys1-cmdb2 openais[3955]: [TOTEM] entering RECOVERY state.
Mar 15 16:41:14 sys1-cmdb2 openais[3955]: [TOTEM] position [0] member 10.10.10.1:
Mar 15 16:41:14 sys1-cmdb2 openais[3955]: [TOTEM] previous ring seq 192 rep 10.10.10.1
Mar 15 16:41:14 sys1-cmdb2 openais[3955]: [TOTEM] aru 11 high delivered 11 received flag 1
Mar 15 16:41:14 sys1-cmdb2 openais[3955]: [TOTEM] position [1] member 10.10.10.2:
Mar 15 16:41:14 sys1-cmdb2 gfs_controld[3986]: cluster is down, exiting
Mar 15 16:41:14 sys1-cmdb2 clurgmgrd[10489]: <warning> #67: Shutting down uncleanly
Mar 15 16:41:14 sys1-cmdb2 dlm_controld[3980]: cluster is down, exiting
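When a log excerpt like this gets pasted as one long run of text, it can help to split it back into entries and pull out only the fenced messages, so the fencing decisions can be lined up against the membership changes. The following is only a rough Python sketch: the function names (split_entries, parse, fencing_events) are made up for illustration, and it assumes the classic syslog "Mon DD HH:MM:SS host process[pid]: message" layout seen in the excerpt.

```python
import re

# Classic syslog timestamp, e.g. "Mar 15 16:41:13 " (assumed format, taken from the excerpt).
STAMP = re.compile(r"[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} ")

# One syslog entry: timestamp, host, process name, optional [pid], then the message.
ENTRY = re.compile(
    r"^(?P<time>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<proc>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?: (?P<msg>.*)$"
)


def split_entries(text):
    """Split flattened log text into entries, one per timestamp found."""
    starts = [m.start() for m in STAMP.finditer(text)]
    ends = starts[1:] + [len(text)]
    return [text[s:e].strip() for s, e in zip(starts, ends)]


def parse(line):
    """Return the fields of one entry as a dict, or None if it does not match."""
    m = ENTRY.match(line)
    return m.groupdict() if m else None


def fencing_events(text):
    """Return (time, message) pairs for entries logged by the fenced daemon."""
    events = []
    for line in split_entries(text):
        rec = parse(line)
        if rec and rec["proc"] == "fenced":
            events.append((rec["time"], rec["msg"]))
    return events


if __name__ == "__main__":
    # Hypothetical usage: a few entries copied from the excerpt, joined on one line.
    sample = (
        "Mar 15 16:41:13 sys1-cmdb2 kernel: dlm: closing connection to node 1 "
        "Mar 15 16:41:13 sys1-cmdb2 fenced[3974]: sys1-cmdb1. not a cluster "
        "member after 0 sec post_fail_delay "
        "Mar 15 16:41:13 sys1-cmdb2 openais[3955]: [CLM ] Members Left: "
        'Mar 15 16:41:13 sys1-cmdb2 fenced[3974]: fencing node "sys1-cmdb1."'
    )
    for when, msg in fencing_events(sample):
        print(when, msg)
```

Changing the process name in the filter from "fenced" to "openais" would list the membership transitions in the same way.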
It might be overheating.
Since you are using Red Hat and paying for it, you can open a case with Red Hat support: https://www.redhat.com/wapps/sso/log...port/cases/new (you need to log in first).