LinuxQuestions.org
Old 10-11-2012, 02:52 PM   #1
eantoranz
Senior Member
 
Registered: Apr 2003
Location: Colombia
Distribution: Kubuntu, Debian, Knoppix
Posts: 1,982
Blog Entries: 1

Rep: Reputation: 83
Problems setting up corosync/pacemaker to provide a virtual IP


Hi!

I'm giving corosync/pacemaker a try (after giving heartbeat/pacemaker a try). I want to do something as simple as providing a virtual IP.

I have built and installed corosync and pacemaker from source (the prefix for both is /usr/local/ha), and now I would like to start the services so that I can then do the pacemaker configuration. I'm working on a VM running Ubuntu 10.04, which is why I'm building from source: I want the latest version of both.

If I start corosync, everything looks normal (though I'm not sure how to confirm that the node is up, other than by seeing the multicast messages on the network):

Code:
Oct 11 15:12:39 ha3 corosync[3719]:   [MAIN  ] Corosync Cluster Engine ('1.4.4'): started and ready to provide service.
Oct 11 15:12:39 ha3 corosync[3719]:   [MAIN  ] Corosync built-in features: nss
Oct 11 15:12:39 ha3 corosync[3719]:   [MAIN  ] Successfully read main configuration file '/usr/local/ha/etc/corosync/corosync.conf'.
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Token Timeout (5000 ms) retransmit timeout (247 ms)
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] token hold (187 ms) retransmits before loss (20 retrans)
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] join (1000 ms) send_join (0 ms) consensus (7500 ms) merge (200 ms)
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] downcheck (1000 ms) fail to recv const (2500 msgs)
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] seqno unchanged const (30 rotations) Maximum network MTU 1402
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] window size per rotation (50 messages) maximum messages per rotation (20 messages)
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] missed count const (5 messages)
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] send threads (0 threads)
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] RRP token expired timeout (247 ms)
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] RRP token problem counter (2000 ms)
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] RRP threshold (10 problem count)
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] RRP multicast threshold (100 problem count)
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] RRP automatic recovery check timeout (1000 ms)
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] RRP mode set to none.
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] heartbeat_failures_allowed (0)
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] max_network_delay (50 ms)
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Initializing transport (UDP/IP Multicast).
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Oct 11 15:12:39 ha3 corosync[3719]:   [IPC   ] you are using ipc api v2
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Receive multicast socket recv buffer size (225280 bytes).
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Transmit multicast socket send buffer size (225280 bytes).
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] The network interface [192.168.55.13] is now up.
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Created or loaded sequence id c.192.168.55.13 for this ring.
Oct 11 15:12:39 ha3 corosync[3719]:   [pcmk  ] Logging: Initialized pcmk_startup
Oct 11 15:12:39 ha3 corosync[3719]:   [SERV  ] Service engine loaded: Pacemaker Cluster Manager 1.1.8
Oct 11 15:12:39 ha3 corosync[3719]:   [SERV  ] Service engine loaded: corosync extended virtual synchrony service
Oct 11 15:12:39 ha3 corosync[3719]:   [SERV  ] Service engine loaded: corosync configuration service
Oct 11 15:12:39 ha3 corosync[3719]:   [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01
Oct 11 15:12:39 ha3 corosync[3719]:   [SERV  ] Service engine loaded: corosync cluster config database access v1.01
Oct 11 15:12:39 ha3 corosync[3719]:   [SERV  ] Service engine loaded: corosync profile loading service
Oct 11 15:12:39 ha3 corosync[3719]:   [SERV  ] Service engine loaded: corosync cluster quorum service v0.1
Oct 11 15:12:39 ha3 corosync[3719]:   [MAIN  ] Compatibility mode set to whitetank.  Using V1 and V2 of the synchronization engine.
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] entering GATHER state from 15.
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Creating commit token because I am the rep.
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Saving state aru 0 high seq received 0
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Storing new sequence id for ring 10
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] entering COMMIT state.
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] got commit token
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] entering RECOVERY state.
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] position [0] member 192.168.55.13:
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] previous ring seq c rep 192.168.55.13
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] aru 0 high delivered 0 received flag 1
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Did not need to originate any messages in recovery.
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] got commit token
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Sending initial ORF token
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 0, aru 0
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] install seq 0 aru 0 high seq received 0
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 1, aru 0
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] install seq 0 aru 0 high seq received 0
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 2, aru 0
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] install seq 0 aru 0 high seq received 0
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] token retrans flag is 0 my set retrans flag0 retrans queue empty 1 count 3, aru 0
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] install seq 0 aru 0 high seq received 0
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] retrans flag count 4 token aru 0 install seq 0 aru 0 0
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Resetting old ring state
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] recovery to regular 1-0
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering to app 1 to 0
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] This node is within the primary component and will provide service.
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] entering OPERATIONAL state.
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering 0 to 1
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering MCAST message with seq 1 to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] confchg entries 1
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Barrier Start Received From 221751488
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Barrier completion status for nodeid 221751488 = 1. 
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Synchronization barrier completed
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Synchronization actions starting for (dummy CLM service)
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering 1 to 2
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering MCAST message with seq 2 to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] releasing messages up to and including 1
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering 2 to 3
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering MCAST message with seq 3 to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] confchg entries 1
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Barrier Start Received From 221751488
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Barrier completion status for nodeid 221751488 = 1. 
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Synchronization barrier completed
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Committing synchronization for (dummy CLM service)
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Synchronization actions starting for (dummy AMF service)
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] releasing messages up to and including 2
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] releasing messages up to and including 3
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering 3 to 4
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering MCAST message with seq 4 to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] confchg entries 1
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Barrier Start Received From 221751488
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Barrier completion status for nodeid 221751488 = 1. 
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Synchronization barrier completed
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Committing synchronization for (dummy AMF service)
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Synchronization actions starting for (dummy CKPT service)
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] releasing messages up to and including 4
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering 4 to 5
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering MCAST message with seq 5 to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] confchg entries 1
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Barrier Start Received From 221751488
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Barrier completion status for nodeid 221751488 = 1. 
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Synchronization barrier completed
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Committing synchronization for (dummy CKPT service)
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Synchronization actions starting for (dummy EVT service)
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering 5 to 6
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering MCAST message with seq 6 to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] releasing messages up to and including 5
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering 6 to 7
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering MCAST message with seq 7 to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] releasing messages up to and including 6
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] releasing messages up to and including 7
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering 7 to 8
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering MCAST message with seq 8 to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] confchg entries 1
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Barrier Start Received From 221751488
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Barrier completion status for nodeid 221751488 = 1. 
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Synchronization barrier completed
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Committing synchronization for (dummy EVT service)
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Synchronization actions starting for (corosync cluster closed process group service v1.01)
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering 8 to a
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering MCAST message with seq 9 to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering MCAST message with seq a to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]:   [CPG   ] comparing: sender r(0) ip(192.168.55.13) ; members(old:0 left:0)
Oct 11 15:12:39 ha3 corosync[3719]:   [CPG   ] chosen downlist: sender r(0) ip(192.168.55.13) ; members(old:0 left:0)
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] releasing messages up to and including 8
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering a to b
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering MCAST message with seq b to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] confchg entries 1
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Barrier Start Received From 221751488
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Barrier completion status for nodeid 221751488 = 1. 
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Synchronization barrier completed
Oct 11 15:12:39 ha3 corosync[3719]:   [SYNC  ] Committing synchronization for (corosync cluster closed process group service v1.01)
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] mcasted message added to pending queue
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] releasing messages up to and including a
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering b to c
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] Delivering MCAST message with seq c to pending delivery queue
Oct 11 15:12:39 ha3 corosync[3719]:   [MAIN  ] Completed service synchronization, ready to provide service.
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] releasing messages up to and including b
Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] releasing messages up to and including c
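One rough way to answer the "is the node up?" question from the log alone is to look for the two lines that mark the end of startup. This is only a minimal Python sketch: it assumes the marker strings (copied from the excerpt above) are stable across corosync versions, and `READY_MARKERS` and `corosync_looks_ready` are names made up for illustration.

```python
# Sketch: decide whether corosync reached a usable state, based only on
# marker strings that appear in the syslog excerpt above.
READY_MARKERS = (
    "entering OPERATIONAL state.",
    "Completed service synchronization, ready to provide service.",
)


def corosync_looks_ready(log_lines):
    """Return True if every marker appears in at least one corosync log line."""
    corosync_lines = [line for line in log_lines if " corosync[" in line]
    return all(
        any(marker in line for line in corosync_lines)
        for marker in READY_MARKERS
    )


# Two lines taken verbatim from the log above.
sample = [
    "Oct 11 15:12:39 ha3 corosync[3719]:   [TOTEM ] entering OPERATIONAL state.",
    "Oct 11 15:12:39 ha3 corosync[3719]:   [MAIN  ] Completed service synchronization, ready to provide service.",
]
print(corosync_looks_ready(sample))
```

Both markers are present in the log above, so by this (admittedly crude) test corosync itself came up on ha3.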
Then, when I start pacemaker I see this in corosync's log:
Code:
Oct 11 15:14:09 ha3 crmd[3967]:    error: crmd_ais_dispatch: Recieving messages from a node we think is dead: ha3[221751488]
Oct 11 15:14:09 ha3 crmd[3967]:    error: do_log: FSA: Input I_ERROR from check_dead_member() received in state S_STARTING
Oct 11 15:14:09 ha3 crmd[3967]:  warning: do_state_transition: State transition S_STARTING -> S_RECOVERY [ input=I_ERROR cause=C_FSA_INTERNAL origin=check_dead_member ]
Oct 11 15:14:09 ha3 crmd[3967]:    error: do_recover: Action A_RECOVER (0000000001000000) not supported
Oct 11 15:14:09 ha3 crmd[3967]:    error: do_started: Start cancelled... S_RECOVERY
Oct 11 15:14:09 ha3 crmd[3967]:    error: do_log: FSA: Input I_TERMINATE from do_recover() received in state S_RECOVERY
Oct 11 15:14:09 ha3 crmd[3967]:   notice: terminate_cs_connection: Disconnecting from Corosync
Oct 11 15:14:09 ha3 crmd[3967]:    error: do_exit: Could not recover from internal error
Oct 11 15:14:09 ha3 pacemakerd[3748]:    error: pcmk_child_exit: Child process crmd exited (pid=3967, rc=2)
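To make the first error easier to read: the node that crmd "thinks is dead" is the very host that is logging the message, and its ID is the same 221751488 that shows up in the corosync SYNC lines. A small Python sketch that pulls this out of the line, assuming the message format stays as quoted ("Recieving" is spelled as in the log); `DEAD_NODE_RE` and `dead_node_from_line` are hypothetical names:

```python
import re

# Pattern for the crmd message quoted above; "Recieving" is spelled as in the log.
DEAD_NODE_RE = re.compile(
    r"Recieving messages from a node we think is dead: "
    r"(?P<name>[^\[\s]+)\[(?P<id>\d+)\]"
)


def dead_node_from_line(line):
    """Return (node_name, node_id) from the crmd error line, or None."""
    match = DEAD_NODE_RE.search(line)
    if match is None:
        return None
    return match.group("name"), int(match.group("id"))


line = (
    "Oct 11 15:14:09 ha3 crmd[3967]:    error: crmd_ais_dispatch: "
    "Recieving messages from a node we think is dead: ha3[221751488]"
)
local_host = line.split()[3]  # fourth syslog field is the reporting host
node = dead_node_from_line(line)
print(node, node[0] == local_host)
```

So crmd is rejecting messages from the local node itself, not from some other cluster member.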
 
Old 10-11-2012, 02:54 PM   #2
eantoranz
By the way, ha3 is the node where I'm running the test... and it's the only node I'm running at the moment.
 
Old 10-11-2012, 02:55 PM   #3
eantoranz
corosync.conf:

Code:
totem {
 
        version: 2
 
        # How long before declaring a token lost (ms)
        token:          5000
 
        # How many token retransmits before forming a new configuration
        token_retransmits_before_loss_const: 20
 
        # How long to wait for join messages in the membership protocol (ms)
        join:           1000
 
        # How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)
        consensus:      7500
 
        # Turn off the virtual synchrony filter
        vsftype:        none
 
        # Number of messages that may be sent by one processor on receipt of the token
        max_messages:   20
 
        # Disable encryption
        secauth:        off
 
        # How many threads to use for encryption/decryption
        threads:        0
 
        # Limit generated nodeids to 31-bits (positive signed integers)
        clear_node_high_bit: yes
 
        # Optionally assign a fixed node id (integer)
        # nodeid:         1234
 
        interface {
                ringnumber: 0
 
                # The following three values need to be set based on your environment
                bindnetaddr: 192.168.55.13
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
 }
 
 logging {
        fileline: off
        to_syslog: yes
        to_stderr: no
        syslog_facility: daemon
        debug: on
        timestamp: on
 }
 
 amf {
        mode: disabled
 }
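As a sanity check that the totem settings above are the ones actually in effect, the derived values in the startup log (retransmit timeout 247 ms, token hold 187 ms) can be reproduced from `token: 5000` and `token_retransmits_before_loss_const: 20`. The formulas below are an assumption based on my reading of how corosync 1.x derives these values, and `HZ = 100` is assumed as well; they happen to match the logged numbers, but treat this as a sketch rather than the definitive calculation. `derived_totem_timeouts` is a made-up name.

```python
HZ = 100  # assumption: tick rate used in the hold-timeout formula


def derived_totem_timeouts(token_ms, retransmits_before_loss):
    """Return (retransmit_timeout_ms, token_hold_ms) under the assumed formulas."""
    # Assumed: retransmit timeout = token / (retransmits + 0.2), truncated.
    retransmit_ms = int(token_ms / (retransmits_before_loss + 0.2))
    # Assumed: hold timeout = 80% of retransmit timeout minus one tick, truncated.
    hold_ms = int(retransmit_ms * 0.8 - (1000 // HZ))
    return retransmit_ms, hold_ms


# Values from the totem section above.
print(derived_totem_timeouts(5000, 20))
```

With the values from this corosync.conf the result is (247, 187), which agrees with the TOTEM lines in the startup log.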
/usr/local/ha/etc/corosync/service.d/pcmk:
Code:
service {
        # Load the Pacemaker Cluster Resource Manager
        name: pacemaker
        ver:  1
}
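A note on where the node ID 221751488 in the logs comes from: since `nodeid` is commented out in corosync.conf, the ID appears to be generated from the ring 0 address 192.168.55.13. The Python sketch below assumes the four address bytes are read as a little-endian 32-bit integer (as on an x86 host) and that `clear_node_high_bit: yes` simply masks off the top bit; `nodeid_from_ipv4` is a hypothetical helper, not a corosync API.

```python
import socket
import struct


def nodeid_from_ipv4(ip, clear_high_bit=True):
    """Derive a node ID from a dotted-quad IPv4 address (assumed scheme)."""
    # Assumption: the 4 address bytes are read as a little-endian 32-bit integer.
    (value,) = struct.unpack("<I", socket.inet_aton(ip))
    if clear_high_bit:
        # Mirrors clear_node_high_bit: keep the ID a positive 31-bit value.
        value &= 0x7FFFFFFF
    return value


print(nodeid_from_ipv4("192.168.55.13"))  # 221751488
```

The result matches the ID in both the corosync SYNC lines and the crmd error, so corosync and pacemaker at least agree on which node ha3 is.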
 
Old 10-11-2012, 03:33 PM   #4
eantoranz
By the way, when running crm_mon:

Code:
$ sudo ../../sbin/crm_mon -1
[sudo] password for cps: 
Last updated: Thu Oct 11 16:01:49 2012
Last change: Thu Oct 11 11:36:56 2012
Current DC: NONE
0 Nodes configured, unknown expected votes
0 Resources configured.
 
  


All times are GMT -5.

Main Menu
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
identi.ca: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration