LinuxQuestions.org


LostPrincess 01-11-2012 10:29 AM

Linux cluster service- trouble running script
 
Hey guys,

I'm having a bit of trouble setting up a Linux cluster service and am hoping you can point me in the right direction. I'm a software developer and don't have much background with Linux. So here are the details:

Setup
I have two laptops running Scientific Linux (basically Red Hat, from my understanding) and set them up as a 2-node failover cluster. I got that working using Conga (luci and ricci) together with rgmanager.

My goal is to have the cluster manage and monitor the failover of a process, specifically a Java program. I created a script to start the program, and that runs fine on its own. The program is meant to run continuously, so the script never exits. Using the cluster, I created a service and added the script as a resource. I'm having trouble running the service correctly, though. I have tried two approaches, with varying results:

1. Simply ran the service with the script running in the foreground. The script seems to start the process, but the service seems to get stuck. If I refresh the page, it shows the service as running, but the cluster is then unable to stop the service. It is also unable to relocate the service.

2. Ran the service with the script running in the background. The service starts up fine, but it reruns the script every 10-20 seconds or so, so I end up with 5 or 6 processes running within the first minute.


So my question is: can the cluster be used to monitor the process of my application? Ideally, if the process stops, the cluster would try to restart or relocate it. If the cluster can be used, are there any ideas about what I'm setting up wrong?

Also, is it the default behavior of a cluster service to rerun scripts that finish executing? I've been through the documentation but can't find anything useful on how a service with a script resource should be used, or on any of the options available when configuring these.

Thanks for any help you can give!

Dean Guilberry 01-17-2012 08:22 AM

I had a similar situation
 
The problem we had with one of our clusters was that the service script had no status function built into it.
The cluster service would get confused because it would check the status and never get an answer.
We wrote a small function to return the status, and this let the cluster services check the status of the service script.
For example, a command like service network status returns something about the network.

To check your application, you can run (as root):
# service yourservicename status

Does it return anything?
If it returns something, you probably have other issues. If it doesn't, look at the status portion of your script if it has one; if not, it needs to be created. At minimum, all services under cluster control have to have start, stop, and status functions.
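
To make the start/stop/status idea concrete, here is a minimal sketch of a wrapper script for a long-running Java program. The PID file path, the jar path, and the function layout are assumptions for illustration only, not anything from the original setup. As far as I understand it, the cluster calls a script resource with start, stop, or status as the first argument and judges the result by the exit code, so a script that ignores its argument and simply launches the program each time could plausibly produce the repeated launches described in the first post.

```shell
#!/bin/sh
# Hypothetical wrapper for a long-running Java program. The paths and the
# command below are placeholders, not values from the thread.
PIDFILE="${PIDFILE:-/var/run/myapp.pid}"
APP_CMD="${APP_CMD:-java -jar /opt/myapp/myapp.jar}"

status() {
    # Exit 0 if the recorded PID is alive, 3 (LSB "not running") otherwise.
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null && return 0
    return 3
}

start() {
    # Starting an already-running service counts as success, so repeated
    # start calls do not pile up extra processes.
    status && return 0
    # Run in the background so that start returns instead of blocking.
    nohup $APP_CMD >/dev/null 2>&1 &
    echo $! > "$PIDFILE"
}

stop() {
    # Stopping an already-stopped service also counts as success.
    if status; then
        kill "$(cat "$PIDFILE")" 2>/dev/null
    fi
    rm -f "$PIDFILE"
    return 0
}

main() {
    case "$1" in
        start|stop|status) "$1" ;;
        *) echo "Usage: $0 {start|stop|status}" >&2; return 2 ;;
    esac
}

# Dispatch only when an action is given; with no arguments the file just
# defines the functions.
[ $# -eq 0 ] || main "$@"
```

Saved as, say, /etc/init.d/myapp (a made-up name), it can be checked by hand with "service myapp start" followed by "service myapp status; echo $?" before pointing the cluster's script resource at it. Note that this sketch removes the PID file right after sending the signal and does not wait for the JVM to exit, which a production script would probably want to do.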


I hope this helps.

LostPrincess 01-17-2012 09:33 AM

Thanks Dean,

I'll look into the service status; maybe it's similar for a cluster and a regular service. Perhaps some of my confusion in setting this up is that I created a "cluster service" through Conga. There is no actual service for my application on the two machines I'm running, so the command you gave me doesn't recognize the service I created in the cluster. I think I'm close to a workaround for my initial problem, but I have run into another, and perhaps you can help.

The application I'm running is supposed to listen on a port, 2280. Before the machines were clustered, this ran fine. Now that they're clustered, I can no longer find the port when the application is running. I've been using

netstat -an | grep 2280

and it hasn't returned any port. My application had been set to run on localhost by default. Any idea how the cluster may cause this to fail? I eventually need the cluster to manage a virtual IP, but for now it simply fails to run at all on the machine's original IP.

Thanks again for your help!


All times are GMT -5.