RHEL4 cluster
We have a three-node RHEL 4 cluster that exists solely to share several GFS file systems. We recently had to reboot one of the three nodes, and now I cannot get it to rejoin the cluster. It hangs for a long time while starting cman, then eventually fails and moves on, but the node never joins the cluster. I see the following in the logs:
Feb 7 15:05:13 lakeside kernel: CMAN: Waiting to join or form a Linux-cluster
Feb 7 15:05:13 lakeside ccsd[4801]: Connected to cluster infrastruture via: CMAN/SM Plugin v1.1.7.6
Feb 7 15:05:13 lakeside ccsd[4801]: Initial status:: Inquorate
cman_tool status on one of the working nodes shows (I removed the IP):
Protocol version: 5.0.1
Config version: 158
Cluster name: webfarm
Cluster ID: 13957
Cluster Member: Yes
Membership state: Cluster-Member
Nodes: 2
Expected_votes: 2
Total_votes: 2
Quorum: 2
Active subsystems: 64
Node name: ironstone
Node ID: 1
Node addresses: x.x.x.x
cman_tool status on the node that won't join shows:
[root@lakeside log]# cman_tool status
Protocol version: 5.0.1
Config version: 158
Cluster name: webfarm
Cluster ID: 13957
Cluster Member: No
Membership state: Joining
cman_tool nodes on a working node shows:
Node  Votes  Exp  Sts  Name
   1      1    2    M  ironstone
   2      1    3    X  lakeside
   3      1    2    M  dawson
Why does the broken node expect 3 votes? Is this causing the issue? Any way to have it expect 2 like the others?
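For reference, here is the quorum arithmetic as I understand it. This is only a sketch: it assumes quorum is a simple majority of the expected votes, and the `quorum` helper is my own illustration, not anything taken from cman itself. Under that assumption, expecting 2 or 3 votes gives the same quorum of 2, which matches the "Quorum: 2" line from the working node above.

```python
def quorum(expected_votes: int) -> int:
    """Votes needed for quorum, ASSUMING a simple-majority rule.

    Hypothetical helper for illustration only; the actual cman
    calculation may differ (for example in two-node mode).
    """
    return expected_votes // 2 + 1


if __name__ == "__main__":
    # Compare the value the working nodes report (2) with the value
    # shown for the node that will not join (3).
    for expected in (2, 3):
        print(f"expected_votes={expected} -> quorum={quorum(expected)}")
```

If that assumption holds, the differing expected-votes value would not change the quorum threshold by itself.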
We're retiring this cluster in the next month, but would really like to get it working. Is there any way to get things working again without rebooting the other nodes?