I run a couple of programs continuously that take a lot of command-line parameters to start up, so instead of trying to remember all of them every time I want to start a program, I just put it all in a couple of bash scripts. Makes it real easy.
Anyway, the problem I'm having is that occasionally the programs will crash (Segmentation fault). There's currently no fix out there for the crash, and the crash is pretty mild - the program just stops running, and I have to log into my linux box and restart it manually. Then it runs for a few more days before crashing again.
Anyway, I was wondering - is there some way to write a script that will detect when a program is no longer running, and automatically restart it? I'm still learning the ins and outs of bash scripting, so explain it to me slowly please.
The ps command will tell you if a program is still running. For example, let's say the program is called 'foo'. To see if the program is running, you can:
ps -Afl | grep -w '[f]oo'
This will return a line containing information about the 'foo' program, or nothing at all if the foo program is not found. The brackets in '[f]oo' stop grep from matching its own entry in the process list (which a plain 'grep foo' would do), and -w matches foo only as a whole word, so a helper script with a name like fooCheck is not counted either.
For a bash script, it would be useful to have this information in a variable. Backticks (or the equivalent $(...) form) in bash allow you to take the output of a command and place it into a variable. So for example:
isFooRunning=`ps -Afl | grep -w '[f]oo'`
The variable isFooRunning will now contain either the output from the ps command, or an empty string. You can then test the result:
Code:
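#!/bin/bash
# '[f]oo' keeps grep from matching itself; -w ignores names like fooCheck
isFooRunning=`ps -Afl | grep -w '[f]oo'`
if [ -z "$isFooRunning" ]
then
    # foo is not running: log the event and restart it
    logger -t foo.fail "foo failed - auto restart"
    foo &
fi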
The above will check to see if foo is running and, if not, restart it (and generate a log entry in /var/log/messages). The '&' at the end of the foo command runs foo in the background, allowing the script to continue execution.
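One caveat with matching on the program name is that grep can still pick up unrelated processes. A variation that tracks the PID instead could look roughly like this. It is only a sketch, not what the script above does: is_alive, start_tracked and ensure_running are made-up names, and the PID file path is whatever you choose. It also assumes the PID has not been reused by some other process in the meantime.

```shell
#!/bin/bash
# Sketch: track a program by PID file instead of grepping the ps output.
# All function names here are hypothetical.

# is_alive PIDFILE: succeed if the PID stored in PIDFILE is a live process
is_alive() {
    local pidfile=$1 pid=""
    [ -r "$pidfile" ] || return 1
    read -r pid < "$pidfile"
    # kill -0 sends no signal; it only checks that we could signal the PID
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# start_tracked PIDFILE COMMAND [ARGS...]: start in background, record PID
start_tracked() {
    local pidfile=$1
    shift
    "$@" &
    echo "$!" > "$pidfile"
}

# ensure_running PIDFILE COMMAND [ARGS...]: restart only if not alive
ensure_running() {
    local pidfile=$1
    if ! is_alive "$pidfile"; then
        echo "not running, (re)starting: ${*:2}" >&2
        start_tracked "$@"
    fi
}
```

With one PID file per instance, a cron job could call something like ensure_running "$HOME/foo1.pid" foo followed by that instance's own parameters, once for each instance.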
You can cron the above under the userid that should run 'foo', to check every minute, by adding a line similar to the following to the crontab (with the 'crontab -e' command):
Code:
* * * * * /path/to/fooCheck
Wow! Thanks for the help! However now I have a couple more questions...
About the cron jobs - how do I go about adding this script to my user's crontab? I'm running FC3, so I'm assuming there's also a GUI utility that will make this easier than doing it from the command line, but I'm perfectly capable of either.
Also, is there some sort of string manipulation "library" for bash scripts? I ask because I run multiple instances of the same program, just with different command line parameters, so for example if one instance of the program crashes but the other is still running, I will need to be able to detect (from the 'ps -Afl | grep foo' output) which one is running through substring matching.
As mentioned above, issue the 'crontab -e' command on the command line, and add a line like the sample above. If you'd like a GUI crontab editor, you can try this one. Using a command line text editor like vi isn't too bad for the crontab. Just keep in mind that the fields are:
Code:
field          allowed values
-----          --------------
minute         0-59
hour           0-23
day of month   1-31
month          1-12 (or names)
day of week    0-7 (0 or 7 is Sun, or use names)
command
An asterisk ("*") means "all" for the date/time fields, so the sample above:
* * * * * /path/to/fooCheck
means run the command every minute of every hour of every day of every month, on every day of the week.
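If you would rather add the line without opening an editor, something like the following might do it. This is only a sketch: with_cron_line is a made-up helper name, and /path/to/fooCheck is the placeholder path from the sample above. It reads the current crontab on stdin and prints it back with the new line appended, unless that exact line is already there.

```shell
#!/bin/bash
# with_cron_line LINE: copy a crontab from stdin to stdout, appending LINE
# unless an identical line is already present. (Hypothetical helper name.)
with_cron_line() {
    local line=$1 existing
    existing=$(cat)
    # Print the existing entries unchanged, if there are any
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    # -F: fixed string, -x: whole line, -q: quiet; "--" protects odd patterns
    if ! printf '%s\n' "$existing" | grep -Fxq -- "$line"; then
        printf '%s\n' "$line"
    fi
}

# Intended use (not run here, since it would change your real crontab):
#   crontab -l 2>/dev/null | with_cron_line '* * * * * /path/to/fooCheck' | crontab -
```

Running the pipeline twice should leave only one copy of the line, so it is safe to put in a setup script.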
Quote:
Also, is there some sort of string manipulation "library" for bash scripts? I ask because I run multiple instances of the same program, just with different command line parameters, so for example if one instance of the program crashes but the other is still running, I will need to be able to detect (from the 'ps -Afl | grep foo' output) which one is running through substring matching.
Thanks so much for your help! :D
The easiest way is probably to use awk to parse out specific parameters of interest. For example:
Code:
# Take the first matching process; '[f]oo' keeps grep from matching itself
isFooRunning=`ps -Afl | grep -w '[f]oo' | head -n 1`
# Now get the first three parameters to foo
parm1=`echo "$isFooRunning" | awk '{print $16}'`
parm2=`echo "$isFooRunning" | awk '{print $17}'`
parm3=`echo "$isFooRunning" | awk '{print $18}'`
In the above, the awk command is parsing the 'words' (white space delimited strings); the 15th word in the ps command output is the name of the program (foo), and the following words are the parameters.
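For several instances that differ only in their parameters, the same awk idea can be applied to every line instead of just one. The following is a rough sketch under the same assumption as above (the program name is the 15th word of each 'ps -Afl' line); args_of and has_instance are made-up names, and since awk splits on white space, parameters that themselves contain spaces will not survive intact.

```shell
#!/bin/bash
# args_of: read 'ps -Afl' lines on stdin and print, one line per process,
# every word after the 15th (i.e. the parameters, per the layout above).
args_of() {
    awk '{
        out = ""
        for (i = 16; i <= NF; i++)
            out = out (i > 16 ? " " : "") $i
        print out
    }'
}

# has_instance PARAMS: succeed if some process on stdin has exactly PARAMS
has_instance() {
    args_of | grep -Fxq -- "$1"
}

# Intended use (the parameters shown are placeholders):
#   if ! ps -Afl | grep -w '[f]oo' | has_instance '--port 8001'; then
#       foo --port 8001 &
#   fi
```

Each instance then gets its own check, keyed on the exact parameter string it was started with.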
Both are excellent suggestions - thanks macemoneta and Vgui!
Vgui, I'm intrigued by your suggestion. Would that actually work? And what does the $? stand for? Your suggestion is very elegant, with no need to alter the crontab, but the crontab is a tried-and-true method for this sort of thing, one which would be fairly easy to implement.
That was just off the top of my head, but after some revision there is a minor change you would need to make. Something like the following works (if you have xmessage you can copy it, run it, and see for yourself):
Code:
#!/bin/bash
# test.sh
xmessage Test
if [ $? -ne 0 ]; then
    echo "Program quit uncleanly, restarting"
    bash test.sh
fi
Now, how this works is fairly simple. It will run whatever program you want, in this case xmessage. If xmessage is killed or crashes (try 'killall xmessage'), the script will restart it. If you exit the program normally, or press Ctrl+C in the same terminal (which interrupts the script as well), nothing happens.
This works because $? holds the exit status of the last command that ran, so here it is the exit status of xmessage. If the program follows standard conventions, 0 means a clean exit; anything else means something went wrong and the program quit, so it is restarted.
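To see what $? actually contains after a crash, a small sketch like this may help. describe_status is a hypothetical helper, and it assumes the usual shell convention that a process killed by signal N is reported as status 128+N, so a segmentation fault (signal 11) shows up as 139.

```shell
#!/bin/bash
# describe_status STATUS: print a short description of an exit status.
# (Hypothetical helper; relies on the 128+N convention for signals.)
describe_status() {
    local status=$1
    if [ "$status" -eq 0 ]; then
        echo "clean exit"
    elif [ "$status" -gt 128 ]; then
        echo "killed by signal $((status - 128))"
    else
        echo "exited with error $status"
    fi
}

# Intended use, right after the program returns:
#   xmessage Test
#   describe_status $?
```

Note that $? is overwritten by every command, so it has to be read (or saved into a variable) immediately after the program you care about.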
The only downside is that you need to keep it running in its own terminal (which is what happens when we leave the '&' off after 'xmessage Test').
Hopefully that helps and explains it a bit better. You could probably expand on it or try one of the other suggestions here too.
I just tried something else, and that is running the whole script in a separate process with
Code:
bash test.sh &
That works, and frees up your terminal, but may give you problems when you actually try to shut down the program, because you can't Ctrl+C it once it is detached from your terminal.
You also might need to refine it to exit the previous script before running the new one (I thought it did this automatically) otherwise you end up with a bunch of "bash test.sh" processes running.
Gives you some good options and stuff to experiment with though.
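I should have just tested more before posting, since I just found out something new again. I would recommend rerunning the script with exec, so the new copy replaces the old one instead of piling up underneath it. (A plain 'exit' after the rerun command is not enough: the old copy still sits there waiting until the new one finishes.)
Code:
#!/bin/bash
# test.sh
xmessage Test
if [ $? -ne 0 ]; then
    echo "Program quit uncleanly, restarting"
    # exec replaces this shell with the new run, so no extra process is left behind
    exec bash test.sh
fi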
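An alternative to having the script re-run itself is a plain loop, which never has more than one copy of the script around. The following is just a sketch of that idea, not something tested against the real program: run_until_clean is a made-up name, and the restart limit and the RESTART_DELAY variable are assumptions added so a program that fails instantly does not spin forever.

```shell
#!/bin/bash
# run_until_clean MAX_RESTARTS COMMAND [ARGS...]
# Run COMMAND in the foreground; restart it whenever it exits non-zero,
# up to MAX_RESTARTS times. (Hypothetical helper.)
run_until_clean() {
    local max_restarts=$1 restarts=0 status
    shift
    while true; do
        status=0
        "$@" || status=$?      # remember the exit status of this run
        if [ "$status" -eq 0 ]; then
            return 0           # clean exit: stop supervising
        fi
        if [ "$restarts" -ge "$max_restarts" ]; then
            echo "giving up after $restarts restarts (last status $status)" >&2
            return "$status"
        fi
        restarts=$((restarts + 1))
        echo "exit status $status, restart number $restarts" >&2
        sleep "${RESTART_DELAY:-1}"   # pause before restarting (seconds)
    done
}

# Intended use:
#   run_until_clean 100 xmessage Test
```

As with the original script, this has to stay running for as long as the program does, so the same trade-off about backgrounding it with '&' applies.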