LinuxQuestions.org
Old 02-15-2012, 10:34 AM   #1
sree.m
Member
 
Registered: Feb 2012
Posts: 56

Rep: Reputation: Disabled
Red Hat cluster malfunctioning


Hi experts,

I am new to this forum. I am here because of an issue with a production server cluster that has me completely unsettled. I am new to Red Hat Cluster as well.

This is a 2-node cluster; both nodes run RHEL 5.3.

The system was running fine until last week, when one of the nodes (node2) suddenly went offline.

All the cluster-related services were hung and the server could not be rebooted. I had to kill the rgmanager service to reboot it. The server then came back up in cluster mode, but that took the other node (node1) offline.

All I could understand from this was that the cluster was unable to keep both nodes online simultaneously. The same thing happened when I rebooted node1: node2 was killed as node1 came back up.

I have now kept node2 down so that the production application installed on this server can keep running.

I look forward to your replies, as this is a serious issue for me in a production environment.

The logs from node1 at the time node2 booted into the cluster are pasted here for reference.
MESSAGE FILE OUTPUT
---------------------

Feb 2 15:06:39 htbapp1 openais[3840]: [SYNC ] This node is within the primary component and will provide service.
Feb 2 15:06:39 htbapp1 kernel: Intel(R) Xeon(R) CPU E5520 @ 2.27GHz stepping 05
Feb 2 15:06:39 htbapp1 openais[3840]: [TOTEM] entering OPERATIONAL state.
Feb 2 15:06:39 htbapp1 kernel: Brought up 8 CPUs
Feb 2 15:06:39 htbapp1 openais[3840]: [MAIN ] Killing node htbapp2.ksebnet.com because it has rejoined the cluster with existing state
Feb 2 15:06:39 htbapp1 kernel: testing NMI watchdog ... OK.
Feb 2 15:06:40 htbapp1 kernel: time.c: Using 14.318180 MHz WALL HPET GTOD HPET/TSC timer.
Feb 2 15:06:40 htbapp1 kernel: time.c: Detected 2266.835 MHz processor.
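
The line that matters in this excerpt is the openais [MAIN ] message, in which htbapp1 reports killing htbapp2 because it rejoined the cluster with existing state. A minimal sketch for pulling such lines out of a syslog file is below. The function name find_node_kills is hypothetical, and /var/log/messages as the default path is an assumption about the syslog setup.

```shell
#!/bin/sh
# Print openais lines in which one node reports killing another.
# $1: log file to scan (defaults to /var/log/messages, an assumed location).
find_node_kills() {
    grep -E 'openais\[[0-9]+\]: \[MAIN *\] Killing node ' "${1:-/var/log/messages}"
}
```

Calling `find_node_kills /var/log/messages` on each node should show which node issued the kill and when, which helps match the event against reboot times.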


Thanks in advance
Sree
 
Old 02-15-2012, 11:41 AM   #2
TenTenths
Senior Member
 
Registered: Aug 2011
Location: Dublin
Distribution: Centos 5 / 6
Posts: 1,244

Rep: Reputation: 394
Contact Red Hat; support is what you're paying them for.
 
Old 02-16-2012, 01:08 AM   #3
sree.m
Member
 
Registered: Feb 2012
Posts: 56

Original Poster
Rep: Reputation: Disabled
My contract expired last month.
 
Old 02-16-2012, 04:03 AM   #4
John VV
Guru
 
Registered: Aug 2005
Posts: 12,107

Rep: Reputation: 1590
See post 9 in your other thread:
https://www.linuxquestions.org/quest...7/#post4603818
 
Old 02-20-2012, 01:19 AM   #5
sree.m
Member
 
Registered: Feb 2012
Posts: 56

Original Poster
Rep: Reputation: Disabled
Does anyone have any clue about this issue?

Sree
 
Old 02-20-2012, 10:02 AM   #6
rhbegin
Member
 
Registered: Oct 2003
Location: Arkansas, NWA
Distribution: Fedora/CentOS/SL6
Posts: 381

Rep: Reputation: 23
Is this an httpd cluster?
 
Old 02-20-2012, 11:13 PM   #7
sree.m
Member
 
Registered: Feb 2012
Posts: 56

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by rhbegin View Post
Is this an httpd cluster?
Nope. The application running on this server is JBoss, and it is a production system.

It doesn't matter which application is running in the cluster. What worries me is that the cluster processes will not run on both nodes simultaneously.

sree
 
Old 02-21-2012, 01:34 PM   #8
rhbegin
Member
 
Registered: Oct 2003
Location: Arkansas, NWA
Distribution: Fedora/CentOS/SL6
Posts: 381

Rep: Reputation: 23
I have set up a JBoss server on RHEL 5 x86_64, but that was a couple of years ago. If I remember correctly, it was challenging because the setup was pretty complex.

Do you have support with Red Hat? They offer JBoss support. When I first started down this path I had to use Red Hat support, since it was new to me and to the company.
 
Old 02-22-2012, 05:51 AM   #9
sree.m
Member
 
Registered: Feb 2012
Posts: 56

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by rhbegin View Post
I have set up a JBoss server on RHEL 5 x86_64, but that was a couple of years ago. If I remember correctly, it was challenging because the setup was pretty complex.

Do you have support with Red Hat? They offer JBoss support. When I first started down this path I had to use Red Hat support, since it was new to me and to the company.
The reason I posted this thread here is that our RHEL support expired last November and this problem occurred last month, so I had to seek help from the Linux experts here. This seems to be a cluster bug, and I have no idea how to get rid of it.

sree
 
Old 02-22-2012, 10:05 AM   #10
rhbegin
Member
 
Registered: Oct 2003
Location: Arkansas, NWA
Distribution: Fedora/CentOS/SL6
Posts: 381

Rep: Reputation: 23
If it is a bug, could you migrate to CentOS with your existing configs? There it is possible to download updates.

That way you could work towards a resolution if you cannot download updates on RHEL. Just something to throw out there.

As with any software clustering suite, it can be very complex, and you may have to break down and purchase support if it is a production system. You have to weigh the cost of being down against paying for one year of support to get help with it.
 
Old 02-23-2012, 04:28 AM   #11
sree.m
Member
 
Registered: Feb 2012
Posts: 56

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by rhbegin View Post
If it is a bug, could you migrate to CentOS with your existing configs? There it is possible to download updates.

That way you could work towards a resolution if you cannot download updates on RHEL. Just something to throw out there.

As with any software clustering suite, it can be very complex, and you may have to break down and purchase support if it is a production system. You have to weigh the cost of being down against paying for one year of support to get help with it.

Since this is a production system, I cannot switch operating systems. I may be able to convince my manager to renew support, but I wonder whether I could get the right resolution here.
 
Old 04-18-2012, 01:03 AM   #12
sree.m
Member
 
Registered: Feb 2012
Posts: 56

Original Poster
Rep: Reputation: Disabled
Hi guys,

This issue has been resolved! The culprit was the "acpid" (power management) daemon, which is not supposed to be running on cluster nodes and caused them to malfunction. The cluster started working perfectly once acpid was stopped from starting at boot.
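
For anyone landing here later, here is a minimal sketch of what stopping acpid at startup could look like on a RHEL 5 node with SysV init. The post above does not give the exact commands used, so this is an assumption based on the standard `service` and `chkconfig` tools. The helper names `run` and `disable_acpid` and the `DRY_RUN` variable are hypothetical.

```shell
#!/bin/sh
# Sketch: stop acpid now and keep it from starting at boot (RHEL 5, SysV init).
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

disable_acpid() {
    run service acpid stop        # stop the running daemon
    run chkconfig acpid off       # disable it in the default runlevels
    run chkconfig --list acpid    # show the resulting per-runlevel state
}
```

Setting `DRY_RUN=1` before calling `disable_acpid` previews the commands. Run it for real as root, on one node at a time, and confirm that `chkconfig --list acpid` reports "off" for the runlevels the node boots into.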

Many thanks for all your efforts and help.

Regards,
Sree
 
  

