eantoranz
10-11-2012 02:52 PM
Problems setting up corosync/pacemaker to provide a virtual IP
Hi!
I'm giving corosync/pacemaker a try (after giving heartbeat/pacemaker a try). I want to do something as simple as providing a virtual IP.
I have built and installed corosync and pacemaker from source (prefix for both is /usr/local/ha) and would now like to start the services and then do the pacemaker configuration. I'm working on a VM running Ubuntu 10.04, which is why I'm building from source: to get the latest version of both.
If I start corosync, everything looks normal (though I'm not sure how to confirm the node is up, other than by seeing the multicast messages on the network):
Code:
Oct 11 15:12:39 ha3 corosync[3719]: [MAIN ] Corosync Cluster Engine ('1.4.4'): started and ready to provide service.
Oct 11 15:12:39 ha3 corosync[3719]: [MAIN ] Corosync built-in features: nss
Oct 11 15:12:39 ha3 corosync[3719]: [MAIN ] Successfully read main configuration file '/usr/local/ha/etc/corosync/corosync.conf'.
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Token Timeout (5000 ms) retransmit timeout (247 ms)
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] token hold (187 ms) retransmits before loss (20 retrans)
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] join (1000 ms) send_join (0 ms) consensus (7500 ms) merge (200 ms)
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] downcheck (1000 ms) fail to recv const (2500 msgs)
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] seqno unchanged const (30 rotations) Maximum network MTU 1402
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] window size per rotation (50 messages) maximum messages per rotation (20 messages)
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] missed count const (5 messages)
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] send threads (0 threads)
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] RRP token expired timeout (247 ms)
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] RRP token problem counter (2000 ms)
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] RRP threshold (10 problem count)
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] RRP multicast threshold (100 problem count)
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] RRP automatic recovery check timeout (1000 ms)
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] RRP mode set to none.
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] heartbeat_failures_allowed (0)
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] max_network_delay (50 ms)
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Initializing transport (UDP/IP Multicast).
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Oct 11 15:12:39 ha3 corosync[3719]: [IPC ] you are using ipc api v2
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Receive multicast socket recv buffer size (225280 bytes).
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Transmit multicast socket send buffer size (225280 bytes).
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] The network interface [192.168.55.13] is now up.
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Created or loaded sequence id c.192.168.55.13 for this ring.
Oct 11 15:12:39 ha3 corosync[3719]: [pcmk ] Logging: Initialized pcmk_startup
Oct 11 15:12:39 ha3 corosync[3719]: [SERV ] Service engine loaded: Pacemaker Cluster Manager 1.1.8
Oct 11 15:12:39 ha3 corosync[3719]: [SERV ] Service engine loaded: corosync extended virtual synchrony service
Oct 11 15:12:39 ha3 corosync[3719]: [SERV ] Service engine loaded: corosync configuration service
Oct 11 15:12:39 ha3 corosync[3719]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01
Oct 11 15:12:39 ha3 corosync[3719]: [SERV ] Service engine loaded: corosync cluster config database access v1.01
Oct 11 15:12:39 ha3 corosync[3719]: [SERV ] Service engine loaded: corosync profile loading service
Oct 11 15:12:39 ha3 corosync[3719]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1
Oct 11 15:12:39 ha3 corosync[3719]: [MAIN ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine.
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] entering GATHER state from 15.
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Creating commit token because I am the rep.
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Saving state aru 0 high seq received 0
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Storing new sequence id for ring 10
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] entering COMMIT state.
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] got commit token
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] entering RECOVERY state.
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] position [0] member 192.168.55.13:
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] previous ring seq c rep 192.168.55.13
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] aru 0 high delivered 0 received flag 1
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Did not need to originate any messages in recovery.
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] got commit token
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Sending initial ORF token
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 0, aru 0
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] install seq 0 aru 0 high seq received 0
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 1, aru 0
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] install seq 0 aru 0 high seq received 0
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 2, aru 0
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] install seq 0 aru 0 high seq received 0
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 3, aru 0
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] install seq 0 aru 0 high seq received 0
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] retrans flag count 4 token aru 0 install seq 0 aru 0 0
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Resetting old ring state
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] recovery to regular 1-0
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering to app 1 to 0
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] This node is within the primary component and will provide service.
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] entering OPERATIONAL state.
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering 0 to 1
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering MCAST message with seq 1 to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] confchg entries 1
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Barrier Start Received From 221751488
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Barrier completion status for nodeid 221751488 = 1.
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Synchronization barrier completed
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Synchronization actions starting for (dummy CLM service)
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering 1 to 2
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering MCAST message with seq 2 to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] releasing messages up to and including 1
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering 2 to 3
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering MCAST message with seq 3 to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] confchg entries 1
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Barrier Start Received From 221751488
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Barrier completion status for nodeid 221751488 = 1.
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Synchronization barrier completed
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Committing synchronization for (dummy CLM service)
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Synchronization actions starting for (dummy AMF service)
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] releasing messages up to and including 2
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] releasing messages up to and including 3
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering 3 to 4
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering MCAST message with seq 4 to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] confchg entries 1
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Barrier Start Received From 221751488
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Barrier completion status for nodeid 221751488 = 1.
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Synchronization barrier completed
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Committing synchronization for (dummy AMF service)
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Synchronization actions starting for (dummy CKPT service)
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] releasing messages up to and including 4
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering 4 to 5
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering MCAST message with seq 5 to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] confchg entries 1
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Barrier Start Received From 221751488
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Barrier completion status for nodeid 221751488 = 1.
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Synchronization barrier completed
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Committing synchronization for (dummy CKPT service)
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Synchronization actions starting for (dummy EVT service)
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering 5 to 6
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering MCAST message with seq 6 to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] releasing messages up to and including 5
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering 6 to 7
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering MCAST message with seq 7 to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] releasing messages up to and including 6
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] releasing messages up to and including 7
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering 7 to 8
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering MCAST message with seq 8 to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] confchg entries 1
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Barrier Start Received From 221751488
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Barrier completion status for nodeid 221751488 = 1.
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Synchronization barrier completed
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Committing synchronization for (dummy EVT service)
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Synchronization actions starting for (corosync cluster closed process group service v1.01)
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering 8 to a
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering MCAST message with seq 9 to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering MCAST message with seq a to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]: [CPG ] comparing: sender r(0) ip(192.168.55.13) ; members(old:0 left:0)
Oct 11 15:12:39 ha3 corosync[3719]: [CPG ] chosen downlist: sender r(0) ip(192.168.55.13) ; members(old:0 left:0)
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] releasing messages up to and including 8
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering a to b
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering MCAST message with seq b to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] confchg entries 1
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Barrier Start Received From 221751488
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Barrier completion status for nodeid 221751488 = 1.
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Synchronization barrier completed
Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] Committing synchronization for (corosync cluster closed process group service v1.01)
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] releasing messages up to and including a
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering b to c
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] Delivering MCAST message with seq c to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]: [MAIN ] Completed service synchronization, ready to provide service.
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] releasing messages up to and including b
Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] releasing messages up to and including c
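Since the only check I have so far is reading the log, here is a minimal Python sketch that scans a log for the lines that look like startup milestones. The three marker strings are copied from the output above; treating them as proof of a healthy start is my assumption, and `MARKERS` and `missing_markers` are names I made up for illustration.

```python
# Phrases copied from the corosync 1.4.4 log above. Assumption: their
# presence means the node formed a ring and finished synchronizing.
MARKERS = (
    "entering OPERATIONAL state",
    "This node is within the primary component and will provide service",
    "Completed service synchronization, ready to provide service",
)


def missing_markers(log_text):
    """Return the marker phrases that do not appear anywhere in log_text."""
    return [marker for marker in MARKERS if marker not in log_text]


# Three lines taken verbatim from the log above.
sample = "\n".join([
    "Oct 11 15:12:39 ha3 corosync[3719]: [SYNC ] This node is within the primary component and will provide service.",
    "Oct 11 15:12:39 ha3 corosync[3719]: [TOTEM ] entering OPERATIONAL state.",
    "Oct 11 15:12:39 ha3 corosync[3719]: [MAIN ] Completed service synchronization, ready to provide service.",
])

print(missing_markers(sample))  # []
```

An empty list means all three phrases were found. It says nothing about whether other nodes joined, since this log only shows a single member (192.168.55.13).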
Then, when I start pacemaker I see this in corosync's log:
Code:
Oct 11 15:14:09 ha3 crmd[3967]: error: crmd_ais_dispatch: Recieving messages from a node we think is dead: ha3[221751488]
Oct 11 15:14:09 ha3 crmd[3967]: error: do_log: FSA: Input I_ERROR from check_dead_member() received in state S_STARTING
Oct 11 15:14:09 ha3 crmd[3967]: warning: do_state_transition: State transition S_STARTING -> S_RECOVERY [ input=I_ERROR cause=C_FSA_INTERNAL origin=check_dead_member ]
Oct 11 15:14:09 ha3 crmd[3967]: error: do_recover: Action A_RECOVER (0000000001000000) not supported
Oct 11 15:14:09 ha3 crmd[3967]: error: do_started: Start cancelled... S_RECOVERY
Oct 11 15:14:09 ha3 crmd[3967]: error: do_log: FSA: Input I_TERMINATE from do_recover() received in state S_RECOVERY
Oct 11 15:14:09 ha3 crmd[3967]: notice: terminate_cs_connection: Disconnecting from Corosync
Oct 11 15:14:09 ha3 crmd[3967]: error: do_exit: Could not recover from internal error
Oct 11 15:14:09 ha3 pacemakerd[3748]: error: pcmk_child_exit: Child process crmd exited (pid=3967, rc=2)
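One thing I noticed: the node crmd thinks is dead, ha3[221751488], appears to be this very node. A small Python sketch of the check, assuming the nodeid was auto-generated from the ring address (no nodeid is set explicitly in my config, as far as I know) and that the integer holds the IPv4 address bytes in little-endian order, as on an x86 host. `nodeid_to_ip` is just a helper name for illustration.

```python
import socket
import struct


def nodeid_to_ip(nodeid):
    """Decode a 32-bit nodeid into dotted-quad form.

    Assumption: the nodeid is the IPv4 address in network byte order,
    read back as a little-endian unsigned integer.
    """
    return socket.inet_ntoa(struct.pack("<I", nodeid))


print(nodeid_to_ip(221751488))  # 192.168.55.13
```

That matches the interface corosync reported as up (192.168.55.13), so crmd seems to be rejecting messages from the local node rather than from some other member.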