Failover
Hi all,
Thanks to everyone who replied to my earlier posts, especially those who helped me solve my problems. My latest problem: I am trying to set up a failover server for a qmail server. One server is the primary and the other is the secondary. When the primary goes down, I want the secondary to take over automatically, without users noticing the failure. My choice for this was heartbeat, but I have tried everything I could and it is not working. Does anyone have an idea what I can use for failover, or how I can use heartbeat with the qmail server? Also, does anyone have a script that can replicate some directories, as well as MySQL, to a secondary qmail server? Thank you all for your assistance. Regards, Emmanuel |
Hello,
Can you post your ha.cf and IP configuration? Also, are you using ldirectord or Pacemaker in combination with HeartBeat to offer the virtual service? Post the config of whatever you're using too. Have you checked your logs (syslog) for errors regarding HeartBeat? If so, what do they say? I use Unison File Synchronizer on my servers to synchronize any changes I make. What version of MySQL are you using? Most likely you can set it up on both servers to replicate, in either a master-slave or a master-master setup. Kind regards, Eric |
ha.cf file:
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 30
ping 209.191.122.70
udpport 694
baud 38400
warntime 10
initdead 120
bcast eth0
node mail.domain.com
node mail2.domain.com
auto_failback on

haresources:
mail10.teledataict.com 0.0.0.0 mirror httpd qmail mysqld webmin

I am not using anything apart from heartbeat. I would appreciate it if you could point me to where I can find a sample Unison synchronizer configuration. |
Hi,
If you state you're not using anything apart from HeartBeat, what do you mean by that? You'll need to have at least one virtual server configured in order to offer it from the cluster. HeartBeat only monitors the status of a server, service, IP and so on. Basically if you only have HeartBeat configured you have the primary layer to build upon. If one server goes down the other will take over. But since you're not offering any virtual services, there is nothing else to see in my opinion. Did you install HeartBeat with Pacemaker, with ldirectord or with another resource manager? I imagine you're using ldirectord since you refer to haresources. Is that one line all you have in the haresources file? Doesn't seem correct to me since you're referring to 0.0.0.0 as IP while that should be the virtual IP you configure to offer the cluster service. If you're using ldirectord please post your ldirectord.cf file. Here you can find a great manual for Unison File Synchronizer. And here's a simple example of a configuration file: Code:
# Roots of the synchronization

There are many more possibilities; have a look at the manual pointed to above. Kind regards, Eric |
I only used yum to install heartbeat and created those .cf files myself.
In order to achieve my aim, which is setting up two mail servers with one serving as primary and the other as secondary, what do I need? |
Heartbeat only takes care of the layer on which the other services are offered, so basically it just checks whether the other node(s) are alive, keeps the virtual IP available, and so on. For a setup like the one you want, you'll also need a cluster resource manager in which you define your virtual services (http, mail, and so on). The previous load balancer/resource manager was ldirectord; now it's Pacemaker, which has lots of features. The two sites you need to visit to find all the necessary information are Linux-HA and Pacemaker. Kind regards, Eric |
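To make the "checks whether the other node(s) are alive" part concrete, here is a minimal Python sketch of a heartbeat-style liveness check. It is a toy model and not Heartbeat's actual implementation; the class name `PeerMonitor` and its methods are made up for illustration. The default timings are the `warntime 10` and `deadtime 30` values from the ha.cf posted earlier in this thread.

```python
from dataclasses import dataclass


@dataclass
class PeerMonitor:
    """Toy model of a heartbeat-style liveness check (hypothetical, for illustration)."""

    warntime: float = 10.0   # seconds of silence before a "late heartbeat" warning
    deadtime: float = 30.0   # seconds of silence before the peer is declared dead
    last_seen: float = 0.0   # timestamp of the most recent heartbeat received

    def beat(self, now: float) -> None:
        """Record that a heartbeat packet arrived at time `now`."""
        self.last_seen = now

    def status(self, now: float) -> str:
        """Classify the peer based on how long it has been silent."""
        silence = now - self.last_seen
        if silence >= self.deadtime:
            return "dead"    # this is the point where a takeover would be triggered
        if silence >= self.warntime:
            return "late"
        return "alive"


monitor = PeerMonitor()
monitor.beat(now=0.0)
print(monitor.status(now=2.0))    # one keepalive interval of silence
print(monitor.status(now=12.0))   # past warntime
print(monitor.status(now=31.0))   # past deadtime
```

Everything beyond this decision (moving the virtual IP, starting qmail, and so on) is the job of the resource layer, which is why Heartbeat alone does not give you a failing-over mail service.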
Thank you so much. I will check and give you feedback on what happens next.
|
I am having trouble installing Pacemaker. It tells me that pacemaker conflicts with heartbeat. So, for now, I want to use ldirectord for the clustering. My question is: do I need three machines to do this, where one serves as the virtual server, one as master and one as slave?
Sorry for my ignorance. |
Don't ever apologize for trying and being willing to learn! There's no need for it; we all have to start somewhere. I imagine you got the HeartBeat package that comes with ldirectord included, and that might be the reason why Pacemaker is complaining. I'm not sure about that, though, since I have yet to start with Pacemaker myself (migrating from ldirectord). You can set up a perfectly good testing environment with two machines; it all depends on your choices and the configuration you set up. You can have, for example, two identical nodes with the same config and OS. Both of them will be configured with HeartBeat and ldirectord and will be able to listen on the virtual IP. The task of HeartBeat is to monitor that 'virtual layer', so that if the first server goes down, the second one automatically gets the virtual layer activated. Only one of the two machines can have the virtual IP assigned at a time, and ldirectord will run on that same server. Hope that makes it a bit clearer. If you need more info, just ask and I'll try to explain in more detail (or with the configuration I have set up). Kind regards, Eric |
Thank you very much for your reply.
Actually, I removed heartbeat and ldirectord from the system before installing Pacemaker. What I am thinking is that maybe the version of heartbeat in the EPEL repo is different from the one in the other repo, and both pacemaker and heartbeat are supposed to come from the EPEL repo together. This is getting interesting, though. I am doing this project on two CentOS machines running the qmail email server. I hope my earlier post explains everything I want to do. I would be happy if you could give me a sample of your setup from A to Z with some detailed explanation, so that I can follow it and achieve what I am after. Happy SFD in advance!!!! |
I have configured heartbeat and ldirectord. Failover seems to be working for the web server, but not for the mail server. When I send mail from Yahoo, for example, the slave is supposed to receive the mail on behalf of the master while it is down, but that is not happening. Instead, the mail hangs until the master is back up and only then gets delivered. I can open the webmail and log in fine; it is just that the mails are not dropping.

My ldirectord.cf on both machines:

checktimeout=30
checkinterval=2
autoreload=yes
logfile="/var/log/ldirectord.log"
quiescent=no

virtual=41.211.31.60:80
        fallback=127.0.0.1:80
        real=192.168.2.3:80 gate
        real=192.168.2.2:80 gate
        service=http
        httpmethod=GET
        receive="webserverisworking"
        persistent=100
        scheduler=lblc
        protocol=tcp
        checktype=negotiate

virtual=41.211.31.60:25
        real=192.168.2.3:25 gate
        real=192.168.2.2:25 gate
        service=smtp
        httpmethod=GET
        receive="mailserverisworking"
        persistent=100
        scheduler=lblc
        protocol=tcp
        checktype=negotiate

virtual=41.211.31.60:110
        real=192.168.2.3:110 gate
        real=192.168.2.2:110 gate
        service=pop
        httpmethod=GET
        receive="mailserverisworking"
        persistent=100
        scheduler=lblc
        protocol=tcp
        checktype=negotiate

virtual=41.211.31.60:143
        real=192.168.2.3:143 gate
        real=192.168.2.2:143 gate
        service=imap
        httpmethod=GET
        receive="mailserverisworking"
        persistent=100
        scheduler=lblc
        protocol=tcp
        checktype=negotiate

haresources:

mail10.teledataict.com ldirectord::ldirectord.cf LVSSyncDaemonSwap::master IPaddr2::41.211.31.60/25/eth1/41.211.31.127 qmailctl

ha.cf:

debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 30
ping 209.191.122.70
udpport 694
baud 38400
warntime 10
initdead 120
bcast eth0
node mail10.teledataict.com
node mail2.dot.com.gh
auto_failback on
respawn hacluster /usr/lib/heartbeat/ipfail

ifconfig (Master):

eth0      Link encap:Ethernet  HWaddr 00:1D:09:10:C0:5F
          inet addr:192.168.2.3  Bcast:192.168.2.255  Mask:255.255.255.0
          inet6 addr: fe80::21d:9ff:fe10:c05f/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:68 errors:0 dropped:0 overruns:0 frame:0
          TX packets:84 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:10549 (10.3 KiB)  TX bytes:20849 (20.3 KiB)
          Interrupt:16 Memory:dfbf0000-dfc00000

eth1      Link encap:Ethernet  HWaddr 00:15:E9:43:C3:C1
          inet addr:41.211.31.60  Bcast:41.211.31.127  Mask:255.255.255.128
          inet6 addr: fe80::215:e9ff:fe43:c3c1/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:195421996 errors:0 dropped:0 overruns:0 frame:0
          TX packets:21251583 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:3866922044 (3.6 GiB)  TX bytes:3410977679 (3.1 GiB)
          Interrupt:22 Base address:0xcf00

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:212613 errors:0 dropped:0 overruns:0 frame:0
          TX packets:212613 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:34993565 (33.3 MiB)  TX bytes:34993565 (33.3 MiB)

ifconfig (Slave):

eth0      Link encap:Ethernet  HWaddr 00:0D:88:F4:87:DD
          inet6 addr: fe80::20d:88ff:fef4:87dd/64 Scope:Link
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:2929547 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2931094 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:225039794 (214.6 MiB)  TX bytes:227110561 (216.5 MiB)
          Interrupt:22 Base address:0x8f00

eth1      Link encap:Ethernet  HWaddr 00:1D:09:31:3F:76
          inet addr:41.211.4.108  Bcast:41.211.4.111  Mask:255.255.255.248
          inet6 addr: fe80::21d:9ff:fe31:3f76/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1972567 errors:0 dropped:0 overruns:0 frame:0
          TX packets:98015 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:193390299 (184.4 MiB)  TX bytes:16102523 (15.3 MiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:58148 errors:0 dropped:0 overruns:0 frame:0
          TX packets:58148 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:16702337 (15.9 MiB)  TX bytes:16702337 (15.9 MiB)

peth1     Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
          inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
          UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
          RX packets:2605794 errors:15929 dropped:0 overruns:0 frame:674
          TX packets:170634 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:294586487 (280.9 MiB)  TX bytes:29368602 (28.0 MiB)
          Interrupt:16 Memory:dfbf0000-dfc00000

vif0.1    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
          inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
          UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
          RX packets:170442 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2593744 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:28476730 (27.1 MiB)  TX bytes:282699585 (269.6 MiB)

virbr0    Link encap:Ethernet  HWaddr 00:00:00:00:00:00
          inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
          inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:139 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:29821 (29.1 KiB)

xenbr1    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF
          UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
          RX packets:2252932 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:159853838 (152.4 MiB)  TX bytes:0 (0.0 b)

Can anyone help, please? |
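One thing that can be checked directly from the posted ifconfig output is whether the virtual IP belongs to the subnet configured on each node's eth1. The Python sketch below does only that arithmetic with the standard `ipaddress` module; the helper name `same_subnet` is made up for illustration, and the addresses and netmasks are copied from the output above. Whether a mismatch actually matters here depends on the upstream routing, which the thread does not show, so treat this as a diagnostic aid and not a diagnosis.

```python
import ipaddress


def same_subnet(vip: str, addr: str, netmask: str) -> bool:
    """Return True if `vip` falls inside the network that `addr`/`netmask` belongs to."""
    # strict=False lets us pass a host address and get its enclosing network back
    network = ipaddress.ip_network(f"{addr}/{netmask}", strict=False)
    return ipaddress.ip_address(vip) in network


VIP = "41.211.31.60"  # virtual IP from haresources / ldirectord.cf

# eth1 address and netmask of each node, taken from the posted ifconfig output
nodes = {
    "master eth1": ("41.211.31.60", "255.255.255.128"),
    "slave eth1": ("41.211.4.108", "255.255.255.248"),
}

for name, (addr, mask) in nodes.items():
    print(name, same_subnet(VIP, addr, mask))
```

With the values above, the check reports that the virtual IP lies in the master's eth1 network (41.211.31.0/25) but not in the slave's (41.211.4.104/29), which may be worth ruling out before looking further at the qmail side.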
Hi,
I decided not to use heartbeat and ldirectord for the failover any more because of the difficulties I was facing. I am now using keepalived and it is fantastic. The only problem I have with it, and need help with, is this: when I unplug the network cable from the master, the slave detects that the interface is down, but no takeover takes place. When I stop the keepalived service on the master or shut the machine down, however, the slave does take over and acts as the master until I start keepalived or the machine again. My configuration is as follows:

Master: /etc/keepalived/keepalived.conf

# Configuration File for Keepalived

# Global Configuration
global_defs {
    notification_email {
        emmanuel.buamah@teledataict.com
    }
    notification_email_from keepalived@mail10.teledataict.com
    smtp_server localhost
    smtp_connect_timeout 30
    router_id LVS_MASTER    # string identifying the machine
}

# describe virtual service ip
vrrp_instance VI_1 {
    # initial state
    state MASTER
    interface eth0
    # arbitrary unique number 0..255
    # used to differentiate multiple instances of vrrpd
    virtual_router_id 1
    # for electing MASTER, highest priority wins.
    # to be MASTER, make 50 more than other machines.
    priority 100
    authentication {
        auth_type PASS
        auth_pass 1q2w3e
    }
    virtual_ipaddress {
        41.211.31.60/25 dev eth1
    }
    # Invoked to master transition
    notify_master "/etc/keepalived/bypass_ipvs.sh del 41.211.31.60"
    # Invoked to slave transition
    notify_backup "/etc/keepalived/bypass_ipvs.sh add 41.211.31.60"
    # Invoked to fault transition
    notify_fault "/etc/keepalived/bypass_ipvs.sh add 41.211.31.60"
}

# describe virtual mail server
virtual_server 41.211.31.60 25 {
    delay_loop 15
    lb_algo rr
    lb_kind DR
    persistence_timeout 50
    protocol TCP
    real_server 192.168.2.3 25 {
        TCP_CHECK {
            connect_timeout 3
        }
    }
}

# describe virtual web server
virtual_server 41.211.31.60 80 {
    delay_loop 30
    lb_algo rr
    lb_kind DR
    persistence_timeout 50
    protocol TCP
    real_server 192.168.2.3 80 {
        HTTP_GET {
            url {
                path /var/www/html/index.html
                digest 9c740d4a0e350371229915359a614d63
            }
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 2
        }
    }
    real_server 192.168.2.2 80 {
        HTTP_GET {
            url {
                path /var/www/html/index.html
                digest 98049d2520fe2ae1cfefc647b9e7e95d
            }
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 2
        }
    }
}

And on the slave:

# Configuration File for Keepalived

# Global Configuration
global_defs {
    notification_email {
        emmanuel.buamah@teledataict.com
    }
    notification_email_from keepalived@mail2.dot.com.gh
    smtp_server localhost
    smtp_connect_timeout 30
    router_id LVS_MASTER    # string identifying the machine
}

# describe virtual service ip
vrrp_instance VI_1 {
    # initial state
    state BACKUP
    interface eth0
    # arbitrary unique number 0..255
    # used to differentiate multiple instances of vrrpd
    virtual_router_id 1
    # for electing MASTER, highest priority wins.
    # to be MASTER, make 50 more than other machines.
    priority 50
    authentication {
        auth_type PASS
        auth_pass 1q2w3e
    }
    virtual_ipaddress {
        41.211.31.60/25 dev eth1
    }
    # Invoked to master transition
    notify_master "/etc/keepalived/bypass_ipvs.sh del 41.211.31.60"
    # Invoked to slave transition
    notify_backup "/etc/keepalived/bypass_ipvs.sh add 41.211.31.60"
    # Invoked to fault transition
    notify_fault "/etc/keepalived/bypass_ipvs.sh add 41.211.31.60"
}

# describe virtual mail server
virtual_server 41.211.31.60 25 {
    delay_loop 15
    lb_algo rr
    lb_kind DR
    persistence_timeout 50
    protocol TCP
    real_server 192.168.2.3 25 {
        TCP_CHECK {
            connect_timeout 3
        }
    }
    real_server 192.168.2.2 25 {
        TCP_CHECK {
            connect_timeout 3
        }
    }
}

# describe virtual web server
virtual_server 41.211.31.60 80 {
    delay_loop 30
    lb_algo rr
    lb_kind DR
    persistence_timeout 50
    protocol TCP
    real_server 192.168.2.3 80 {
        HTTP_GET {
            url {
                path /var/www/html/index.html
                digest 9c740d4a0e350371229915359a614d63
            }
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 2
        }
    }
    real_server 192.168.2.2 80 {
        HTTP_GET {
            url {
                path /var/www/html/index.html
                digest 98049d2520fe2ae1cfefc647b9e7e95d
            }
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 2
        }
    }
}

Can anyone help? |
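The comment "for electing MASTER, highest priority wins" in the configuration can be sketched in a few lines of Python. This is a simplified model and not keepalived's code: the function `elect_master` and the short node names are made up for illustration, and tie-breaking between equal priorities is left out. The priorities 100 and 50 come from the two configurations above. The key assumption in the model is that a node only takes part in the election while its VRRP advertisements are actually reaching its peer.

```python
from typing import Dict, Optional, Set


def elect_master(priorities: Dict[str, int], advertising: Set[str]) -> Optional[str]:
    """Pick the node with the highest priority among those whose adverts are being heard.

    Simplified model: equal priorities are not tie-broken here.
    """
    candidates = {node: prio for node, prio in priorities.items() if node in advertising}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Priorities from the posted keepalived.conf files (node names shortened, hypothetical)
priorities = {"mail10": 100, "mail2": 50}

# Both nodes advertising: the higher priority holds the virtual IP
print(elect_master(priorities, {"mail10", "mail2"}))

# keepalived stopped on mail10 (or the machine shut down): only mail2 is advertising
print(elect_master(priorities, {"mail2"}))
```

In this model the backup only wins once the master's adverts stop arriving. In the posted configuration the instance runs on `interface eth0` while the virtual IP is placed on `eth1`, so it may be worth checking which cable was pulled and whether adverts were still flowing afterwards. keepalived also has a `track_interface` block inside `vrrp_instance` for tying the instance state to additional interfaces; whether that resolves this particular case has not been tested here.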