LinuxQuestions.org
05-22-2008, 01:23 PM   #1
ufmale
Member
 
Registered: Feb 2007
Posts: 385

Rep: Reputation: 30
heartbeat cluster help


I am experimenting with a cluster of two Linux machines (nlb0 and nlb2) using heartbeat for the httpd service. Each machine has two Ethernet cards: eth0 connects to the public network, and eth1 connects to a local router (10.0.0.1 for nlb0, 10.0.0.2 for nlb2).

I ran into a problem that I do not understand.
When I start the heartbeat service on nlb0, the machine creates an alias IP, eth0:0 192.168.0.200, and starts the httpd service.
I can access the web service from another computer at that IP (http://192.168.0.200).

When I then start the heartbeat service on nlb2, nlb2 takes over the IP eth0:0 192.168.0.200 as soon as heartbeat is up.
I can still reach the web service on nlb0, but only via http://nlb0/index.html; http://192.168.0.200 is now served by nlb2.

I am confused because nlb0 has not gone down. I thought nlb2 would take the IP only when nlb0 is down.

I experimented further by restarting the heartbeat service on nlb0.
Once heartbeat started, nlb0 took back the IP 192.168.0.200, even though nlb2 had not gone down either.

Can anyone help me understand the concept? My settings are below.

haresources (same on both nlb0 and nlb2)
Code:
nlb0  192.168.0.200 httpd

ha.cf (nlb2 has the same file, except for "ucast eth1 10.0.0.1")
Code:
debugfile /var/log/ha-debug
logfile /var/log/ha-log
keepalive 2
deadtime 20
warntime 10
initdead 80
udpport 694
ucast eth1 10.0.0.2
auto_failback on
node    nlb0
node    nlb2
#
ping 192.168.0.54
respawn hacluster /usr/lib/heartbeat/ipfail
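
To make the timing directives in this ha.cf concrete, here is a minimal Python sketch of how they are usually read: keepalive is the interval between heartbeats, warntime is when a late heartbeat is logged, deadtime is when a silent peer is declared dead, and initdead replaces deadtime while a node is starting up. The function names (parse_timings, peer_status) are hypothetical, and the classification is a simplified model for illustration, not heartbeat's actual code. It assumes plain integer values in seconds, as in the file above.

```python
# Illustrative model only; not taken from heartbeat's source.
HA_CF = """\
keepalive 2
deadtime 20
warntime 10
initdead 80
"""

def parse_timings(text):
    """Return {directive: seconds} for the four timing directives."""
    wanted = {"keepalive", "deadtime", "warntime", "initdead"}
    timings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        fields = line.split()
        if fields[0] in wanted and len(fields) >= 2:
            timings[fields[0]] = int(fields[1])  # assumes whole seconds
    return timings

def peer_status(silence_seconds, timings, starting_up=False):
    """Classify a peer by how long it has been silent (simplified)."""
    limit = timings["initdead"] if starting_up else timings["deadtime"]
    if silence_seconds >= limit:
        return "dead"
    if silence_seconds >= timings["warntime"]:
        return "late"
    return "alive"

timings = parse_timings(HA_CF)
print(peer_status(25, timings))                     # past deadtime
print(peer_status(25, timings, starting_up=True))   # still inside initdead
```

Under this model, a node that hears nothing from its peer over the ucast link for the whole initdead window would treat the peer as dead even if the peer is actually running, so checking that heartbeats really cross eth1 in both directions may be worth doing.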
 
05-23-2008, 10:45 AM   #2
p_s_shah
Member
 
Registered: Mar 2005
Location: India
Distribution: RHEL 3/4, Solaris 8/9/10, Fedora 4/8, Redhat Linux 9
Posts: 228
Blog Entries: 1

Rep: Reputation: 34
As far as I understand, HA should work as follows.

I think you have configured nlb0 as the primary node.

So, as long as nlb0 is working, it should be the node responsible for HTTP requests.
Note that nlb2 is also running Heartbeat at the same time, in passive mode.

As soon as nlb0 goes down, nlb2 should take over the 192.168.0.200 IP and serve HTTP requests.

Since you have set "auto_failback on" in your config file, whenever nlb0 comes back up it will automatically take the IP back from nlb2.

I think the way you are testing HA is wrong. You are stopping and starting the Heartbeat service, in which case Heartbeat cannot communicate with the other node.

If you want to test the setup, do it the following way:
1. Make sure the Heartbeat service is running on both servers the whole time.
2. Start monitoring the logs on both servers: tail -f /var/log/messages
Heartbeat checks connectivity over the 10.0.0.? link, so bring down 10.0.0.1 (nlb0) and check the logs on nlb2. They should show that the IP takeover has happened.
3. Now bring 10.0.0.1 (nlb0) back up and check the logs on both servers again.

I hope this helps.
If anything is unclear, please reply.
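
The takeover and failback behavior described in this post can be sketched as a small decision function. This is a simplified model written for illustration, not heartbeat's implementation. The function name resource_owner and its parameters are hypothetical; "preferred" stands for the node named in haresources (nlb0 in this thread).

```python
def resource_owner(preferred, current_owner, alive, auto_failback):
    """Return the node that should hold the IP and httpd, or None.

    preferred     -- node named in haresources
    current_owner -- node holding the resources now, or None
    alive         -- set of nodes this node believes are alive
    auto_failback -- True for "auto_failback on", False for "off"
    """
    if not alive:
        return None
    if current_owner not in alive:
        # Owner is gone (or nothing is owned yet): pick the preferred
        # node if it is alive, otherwise any surviving node.
        return preferred if preferred in alive else sorted(alive)[0]
    if auto_failback and preferred in alive:
        # The preferred node reclaims its resources when it is back.
        return preferred
    return current_owner

both = {"nlb0", "nlb2"}
print(resource_owner("nlb0", "nlb0", both, True))      # normal operation
print(resource_owner("nlb0", "nlb0", {"nlb2"}, True))  # nlb0 considered down
print(resource_owner("nlb0", "nlb2", both, True))      # nlb0 back, failback on
print(resource_owner("nlb0", "nlb2", both, False))     # nlb0 back, failback off
```

Note that "alive" is each node's own view. In this model, a node that cannot hear its peer behaves exactly as if the peer were down.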
 
05-23-2008, 04:56 PM   #3
ufmale
Member
 
Registered: Feb 2007
Posts: 385

Original Poster
Rep: Reputation: 30
I think taking down 10.0.0.? is kind of unrealistic. What I did instead was take down the network service on nlb0, i.e. service network restart.

Once I did, nlb2 picked up the httpd service. However, even though nlb0 came back (after the network service restarted), nlb0 never took the httpd service back from nlb2.

By the way, I changed auto_failback to off in nlb2's ha.cf.
 
05-23-2008, 07:16 PM   #4
p_s_shah
Member
 
Registered: Mar 2005
Location: India
Distribution: RHEL 3/4, Solaris 8/9/10, Fedora 4/8, Redhat Linux 9
Posts: 228
Blog Entries: 1

Rep: Reputation: 34
First, in my experience, the files should be identical on both servers.

Second, if you set auto_failback off, nlb0 won't take over HTTP requests again automatically; you have to move the IP address back manually.
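
One way to check that the two ha.cf files agree is to compare them while ignoring the ucast line, which the original post says differs between the nodes on purpose. The following Python sketch assumes both files have been copied to one machine and read into strings. The sample texts are shortened from the thread, and the helper names (comparable_lines, config_differences) are hypothetical.

```python
NLB0_CF = """\
keepalive 2
deadtime 20
ucast eth1 10.0.0.2
auto_failback on
node    nlb0
node    nlb2
"""

NLB2_CF = """\
keepalive 2
deadtime 20
ucast eth1 10.0.0.1
auto_failback off
node    nlb0
node    nlb2
"""

def comparable_lines(ha_cf_text):
    """Drop comments, blank lines, and ucast lines; collapse whitespace."""
    lines = []
    for raw in ha_cf_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if fields[0] == "ucast":
            continue  # expected to point at the peer, so it differs per node
        lines.append(" ".join(fields))
    return lines

def config_differences(text_a, text_b):
    """Return (lines only in a, lines only in b) after normalization."""
    a, b = comparable_lines(text_a), comparable_lines(text_b)
    only_a = [line for line in a if line not in b]
    only_b = [line for line in b if line not in a]
    return only_a, only_b

only_nlb0, only_nlb2 = config_differences(NLB0_CF, NLB2_CF)
print("only on nlb0:", only_nlb0)
print("only on nlb2:", only_nlb2)
```

With the change described in post #3, this would flag the auto_failback line as the only unintended difference between the two nodes.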
 
  


All times are GMT -5.
