I'm having a bit of trouble setting up a Linux cluster service and am hoping you can point me in the right direction. I'm a software developer and don't have much background with Linux. Here are the details:
I have two laptops running Scientific Linux (basically Red Hat, as I understand it) and have set them up as a 2-node failover cluster. I got that working using Conga (luci and ricci) and rgmanager.
My goal is to have the cluster manage and monitor the failover of a process, specifically a Java program. I created a script to start the program, and it runs fine on its own. The program is meant to run continuously, so the script never exits. In the cluster, I created a service and added the script as a resource. I'm having trouble getting the service to run correctly, though. I have tried two approaches, with varying results:
1. Ran the service with the script running in the foreground. The script seems to start the process, but the service appears to be stuck. If I refresh the page, it shows the service as running, but the cluster is then unable to stop the service or relocate it.
2. Ran the service with the script running in the background. The service starts up fine, but the cluster reruns the script every 10-20 seconds or so, so I end up with 5 or 6 processes running within the first minute.
So my question is: can the cluster be used to monitor my application's process? Ideally, if the process stops, the cluster would try to restart or relocate it. If the cluster can be used this way, are there any ideas about what I'm setting up wrong?
Also, is it the default behavior of a cluster service to rerun scripts that finish executing? I've been through the documentation but can't find anything useful on how a service with a script resource should be used, or on the options available when configuring one.
The problem we had with one of our clusters was that the service script had no status function built into it.
The cluster service would hang because it checked the status and never got an answer.
We added a small function to return the status, and this let the cluster check the status of the service script.
If you run a command like "service network status", it reports the state of the network service.
To check your application, you can run the following as root:
# service yourservicename status
Does it return anything?
If it returns something, you probably have other issues. If it doesn't, look at the status portion of your script if it has one; if not, it needs to be created. At a minimum, every service under cluster control has to have a start, a stop, and a status function.
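To make the start/stop/status requirement concrete, here is a minimal sketch of such a wrapper script. The command line, the PID file path, and the function names are placeholders I made up, not anything from the original poster's setup, and the exit codes follow the usual LSB init-script convention (0 for running, 3 for not running), which is what I'd expect a cluster script resource to rely on. Treat it as a starting point rather than a tested cluster configuration.

```shell
#!/bin/sh
# Hypothetical wrapper for a long-running program managed as a cluster
# "script" resource. APP_CMD and PIDFILE are placeholders, not real paths.
APP_CMD="${APP_CMD:-java -jar /opt/myapp/myapp.jar}"
PIDFILE="${PIDFILE:-/var/run/myapp.pid}"

is_running() {
    # True only if the PID file exists and names a live process.
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

app_start() {
    # Already running: report success without launching a second copy.
    if is_running; then return 0; fi
    # Launch in the background so "start" returns promptly.
    nohup $APP_CMD >/dev/null 2>&1 &
    echo $! > "$PIDFILE"
}

app_stop() {
    if is_running; then
        kill "$(cat "$PIDFILE")"
    fi
    rm -f "$PIDFILE"
    return 0    # stopping an already stopped service counts as success
}

app_status() {
    if is_running; then return 0; fi
    return 3    # LSB convention: program is not running
}

main() {
    case "$1" in
        start)   app_start ;;
        stop)    app_stop ;;
        status)  app_status ;;
        restart) app_stop; app_start ;;
        *)       echo "Usage: $0 {start|stop|status|restart}" >&2; return 2 ;;
    esac
}

# Dispatch only when an action was given, so the file can also be sourced.
if [ "$#" -gt 0 ]; then
    main "$1"
fi
```

Saved under a hypothetical name such as myapp.sh, it can be checked by hand with "./myapp.sh start", then "./myapp.sh status; echo $?", before handing it to the cluster. Note that a script which ignores its argument and always launches the program would start another copy every time it is called with "status". A production version would probably also wait for the process to exit in stop.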
I'll look into the service status; maybe it's similar for a cluster service and a regular service. Perhaps some of my confusion in setting this up is that I created a "cluster service" through Conga. There is no actual system service for my application on the two machines I'm running, so the command you gave me doesn't recognize the service I created in the cluster. I think I'm close to a workaround for my initial problem, but I have run into another one, and perhaps you can help.
The application I'm running is supposed to listen on port 2280. Before the machines were clustered, this worked fine. Now that they're clustered, I can no longer find the port when the application is running. I've been using
netstat -an | grep 2280
and it returns nothing for that port. My application is set to bind to localhost by default. Any idea how the cluster might cause this to fail? I eventually need the cluster to manage a virtual IP, but for now the application simply fails to listen at all, even on the machine's original IP.
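As a small aid for the port check above, here is a sketch of a filter that prints only the local addresses actually listening on a given port. The function name listening_on is made up for illustration, and it assumes the usual "netstat -tln" column layout (local address in column 4, state in column 6). It helps tell apart a socket bound to 127.0.0.1, which is reachable only from the same machine, from one bound to 0.0.0.0 or a specific IP.

```shell
#!/bin/sh
# listening_on PORT: read netstat-style lines on stdin and print the local
# address of every socket in LISTEN state on exactly that port.
# Hypothetical helper; assumes "netstat -tln" column order.
listening_on() {
    awk -v p="$1" '$6 == "LISTEN" && $4 ~ (":" p "$") { print $4 }'
}
```

It would be used as "netstat -tln | listening_on 2280". No output means nothing is listening on that port; an exact match on the port also avoids false hits that a plain grep for 2280 could pick up, such as port 22801.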