Linux - Enterprise: This forum is for all items relating to using Linux in the Enterprise.
RHEL4 U3 (kernel version 2.6.9-34)
Cluster Suite (CMan kernel/headers 2.6.9-43.8)
GFS shared storage
I have created a simple two-node cluster running the Apache httpd server. On a normal startup the virtual IP is in place and the Apache daemon is running on the 'owning' server. However, whenever I fail over, the floating IP doesn't get bound to the standby server, and the Apache daemon never starts on that standby server.
I was also having deadlocks between CMan and RGManager. I found that these were due to a known and (sort of) fixed bug in RHEL4 U3 and CMan, so I upgraded them to the following:
RHEL4 U3 kernel version 2.6.9-34.0.1
CMan kernel/headers 2.6.9-43.8.3
The lockups stopped but the initial problem persists (floating IP and service not relocated).
As root on the failover server:
Run a cron script that checks whether the primary server responds.
[ i.e. ping the primary's IP address with an echo request, and set your primary to reply to echo requests from the failover server's IP ]
If there is no reply, the script runs apachectl start, sends an email alert about the server failure, and exits.
Just configure the failover server to have the floating IP in Apache's httpd.conf. Apache only runs there when the primary is down, so having the IP assigned in the conf isn't an issue.
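The steps above could be sketched roughly as follows. This is only a minimal illustration, not a tested production script: the address 192.0.2.10, the alert recipient, the function names and the "run" argument are all placeholders I made up, and it assumes the Linux iputils ping plus a working mail command on the failover box.

```shell
#!/bin/sh
# Sketch of the cron-driven failover check described above.
# Every value and name here is a placeholder (assumption), not from the thread.
PRIMARY_IP="${PRIMARY_IP:-192.0.2.10}"   # hypothetical address of the primary
ALERT_TO="${ALERT_TO:-root}"             # hypothetical alert recipient

# Return 0 if the primary answers at least one of three echo requests.
primary_is_up() {
    ping -c 3 -W 2 "$PRIMARY_IP" >/dev/null 2>&1
}

# Start the local Apache instance.
start_apache() {
    apachectl start
}

# Mail a short alert about the failure.
send_alert() {
    echo "Primary $PRIMARY_IP is not answering ping; starting Apache on $(hostname)." |
        mail -s "Web failover triggered" "$ALERT_TO"
}

# One pass of the check: do nothing while the primary responds,
# otherwise start Apache and send the alert.
check_and_failover() {
    if primary_is_up; then
        return 0
    fi
    start_apache
    send_alert
}

# Only act when called with "run", so the functions can be sourced and tried out.
if [ "${1:-}" = "run" ]; then
    check_and_failover
fi
```

Saved under a hypothetical path such as /usr/local/sbin/failover_check.sh, it could be called from root's crontab with a line like `* * * * * /usr/local/sbin/failover_check.sh run`. As written it sends a mail on every pass while the primary is down, so you would probably want a lock or state file before relying on it.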
Thanks - you're basically saying "write our own clustering app"? I actually considered this but we have to use RHCS.
Anyway once I updated the kernel, rgmanager and cman, I tried again and now I find that the services aren't listed at all!
I rolled back the kernel and CMan versions to the following, while keeping the RG Manager version at rgmanager-1.9.54-1:
RHEL4 U3 (kernel version 2.6.9-34)
Cluster Suite (CMan kernel/headers 2.6.9-43.8)
This combination lets me view and manage the services, but the failover problem persists...
I've never used RHCS; I've just used a shell script and a cron job. It's completely reliable and simple to do.
As a bonus, adding any needed functionality to the shell script is simple, such as alerting the tech on shift that the primary server failed.
But it does sound like your RHCS config is most likely the issue, since RHCS is designed to perform the same tasks as the simple shell script / cron job method.