LinuxQuestions.org
Old 09-30-2008, 09:41 AM   #1
kenneho
Member
 
Registered: May 2003
Location: Oslo, Norway
Distribution: Ubuntu, Red Hat Enterprise Linux
Posts: 655

Rep: Reputation: 40
Fence question with Red Hat Cluster Suite


Hello.


I'm testing Red Hat Cluster Suite for high availability management on one of my clusters. For now I'm managing only Apache in the cluster, just so that I can get to know the clustering software.

Anyways: The nodes are not set up with any shared storage, and will remain so in the future, so I've simply omitted setting up fencing. My understanding is that fencing is used for protecting shared resources, such as shared filesystems.

Having omitted fencing, I'm running into a lot of problems related to it. For example, I did an "ifdown eth0" on node A, which caused node B to take over the httpd service. But then node B started trying to fence node A (why and how I'm not sure), resulting in numerous "fencing node A...fence failed" messages in /var/log/messages.

Does anyone know if fencing is required for a setup like mine? And why do I get a lot of fencing errors even when I don't have any fencing defined for the cluster?


Regards,
kenneho
 
Old 09-30-2008, 10:21 AM   #2
brianmcgee
Member
 
Registered: Jun 2007
Location: Munich, Germany
Distribution: RHEL, CentOS, Fedora, SLES (...)
Posts: 399

Rep: Reputation: 38
The cluster communication uses a heartbeat to detect if all nodes are alive.

If you forcibly disconnect a node (e.g. by removing the network link), the cluster notices this and wants to make sure that the failed node is really dead. It therefore tries to fence the node to get back to a known state.

That behaviour is useful not only when you configure GFS; it is also necessary for reliable application failover when using rgmanager.

If an application hangs, fencing the cluster node makes sure that the application can be switched over to another node without complications.
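The heartbeat-then-fence sequence described above can be sketched as a toy model. This is only an illustration of the idea, not how Red Hat Cluster Suite is actually implemented: the names `Node`, `check_peer` and `try_fence`, and the 10-second timeout, are all hypothetical. It shows why a peer with no fence method configured would keep ending up in a "fence failed" state once its heartbeat is missed.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    last_heartbeat: float  # time (in seconds) the last heartbeat was received
    # Hypothetical fence methods: callables that return True on success.
    fence_methods: list = field(default_factory=list)


def missed_heartbeat(node, now, timeout):
    """A peer is suspected dead if no heartbeat arrived within `timeout` seconds."""
    return now - node.last_heartbeat > timeout


def try_fence(node):
    """Try each configured fence method in order; stop at the first success.

    With no methods configured there is nothing to try, so fencing fails.
    """
    for method in node.fence_methods:
        if method(node.name):
            return True
    return False


def check_peer(node, now, timeout=10.0):
    """Return the state of a peer as seen by the surviving node."""
    if not missed_heartbeat(node, now, timeout):
        return "alive"
    return "fenced" if try_fence(node) else "fence failed"


# Node A stopped sending heartbeats at t=0 and has no fence method configured.
node_a = Node("nodeA", last_heartbeat=0.0)
print(check_peer(node_a, now=5.0))   # still within the timeout
print(check_peer(node_a, now=30.0))  # heartbeat missed, nothing to fence with
```

In this model, giving the node at least one working fence method (for example `Node("nodeA", 0.0, [lambda name: True])`) is what lets the survivor reach the "fenced" state and return to a known cluster state.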

In your case, keepalived [1] may be a better choice. There, the failing node simply gets removed from the load balancing.

[1] http://www.keepalived.org/
 
Old 10-02-2008, 10:03 AM   #3
kenneho
Member
 
Registered: May 2003
Location: Oslo, Norway
Distribution: Ubuntu, Red Hat Enterprise Linux
Posts: 655

Original Poster
Rep: Reputation: 40
Thank you for your quick outline.

Regarding my scenario, do you know what node B is actually trying to do when it tries to fence node A? I know that I can define different fencing methods, like reboot and so on, but as I've not defined any fencing methods I'm not sure what node B is trying to do.

I did a failover test in which I ran "ifdown eth0" on node A. As this was the only link from node A to the rest of the world (including node B), node B wasn't really able to fence node A. But node A restarted all by itself. Is this the default behavior: do nodes that lose contact with the cluster reboot themselves?

And one last thing: I did a "shutdown -h now" on node A, and it started the shutdown process. But when it came to bringing down the cluster software, it hung on "Stopping fencing...". To kill the machine I had to press the power button. Why did the node hang on this?

Phew, this was a lot of info and questions, but I would very much appreciate some input on this. I need to get a better understanding of how fencing actually works, and I haven't found any good resources on this particular subject.


Regards,
kenneho
 
  

