Help with "wait" or "timeout" in bash
Hello,
I've read the man pages for wait and timeout, but I'm new and don't quite understand how to use them in my application. I have a set of batch scripts that call each other, and I want to kill a child (grandchild?) process if it runs for too long. Then, after all processes have finished, I want to echo that the script is finished. So, here's my structure. The top-level script is called "master":
Code:
echo "START TIME\t\t\t\t\t\t" $(date +%T) > ../output/timelog
Code:
./run_1 &
Now, each run_* batch script is something like this:
Code:
cd ../target/1

By running master, multiple master_2 scripts are called in series. Each master_2 script calls some run_* scripts in parallel, and each run_* script calls some child scripts in series, which in turn call some youngestchild programs in series. The youngestchild program typically takes a few minutes, but if there are errors it can run for hours.

So, I would like to use the wait (or timeout) command in one of the upper-level scripts (preferably master_2) to tell the system to kill any child/grandchild/great-grandchild process that lasts longer than 10 minutes. I've tried to implement this by changing calls to "youngestchild" from "./youngestchild" to "timeout 10m youngestchild", but I have a lot of calls, and it would be very tedious to add this to every file. Also, if I ever needed to change the timeout from 10 minutes to 20 minutes, it would be a huge hassle.

I know the series/parallel structure is strange, but it's necessary to make this computation run quickly. Thanks!
Hi jeffy_weffy!
One simple thing to do would be to change the master to be this:
Code:
echo "START TIME\t\t\t\t\t\t" $(date +%T) > ../output/timelog
Code:
1,$s/\.\/youngestchild/timeout \$TIME_LIMIT .\/youngestchild/
Then do this:
Code:
mkdir NEW

Instead of using a fixed timeout, a shell environment variable holds the timeout, which can then be set in the master script. It's "exported" into the shell environment, so its value can be accessed by programs run from the shell. When I followed that procedure with one file named run_1, I got this output from the diff:
Code:
1c1

I realize that the "youngestchild" programs might not actually be named "youngestchild"; they might well each have a different name. The actual approach might have to be modified to do this same general type of thing with different program/script names. But hopefully this has served to show that it shouldn't be very difficult to do what you want.

The purpose of the NEW directory is to keep the changed versions of the files separate from the originals until you're sure the changed versions are exactly what is needed. Then the changed files can be used in place of the originals. But I'd suggest keeping the originals safe, just in case you need to refer back to them.

If the "youngestchild" programs have some pattern to their names, the pattern could be used in place of the specific name I used in my example. If the names are actually very different from one another, an awk program could be substituted for sed, such that the awk program looks for any of a list of program names to change.

Likewise, even though I used a single pattern in the for loop to select the filenames in which to place the timeout commands, multiple patterns or a list of filenames could be used instead, if what you showed as run_1, run_2, etc. actually have completely distinct names that don't match any single pattern.

Hope this helps.
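For reference, here is a minimal, self-contained sketch of the whole procedure, under several assumptions: the stand-in run_1 and youngestchild scripts, the temporary demo directory, the log file name run_1.log, and the 1-second limit are all invented for illustration (the real master would presumably export something like 10m), and timeout is assumed to be the GNU coreutils version. It is not the exact script from the posts above, whose code is only partly shown. It rewrites each run_* file into NEW with sed, exports TIME_LIMIT once, and uses wait before reporting that everything has finished.

```shell
#!/bin/bash
# Sketch only. run_1, youngestchild, NEW and TIME_LIMIT follow the names used
# in this thread; everything else here is a hypothetical stand-in.

demo=$(mktemp -d) || exit 1
cd "$demo" || exit 1

# Stand-in for a "youngestchild" program that hangs far longer than the limit.
cat > youngestchild <<'EOF'
#!/bin/bash
exec sleep 30
EOF

# Stand-in for one run_* script that calls it without any timeout.
cat > run_1 <<'EOF'
#!/bin/bash
./youngestchild
echo "youngestchild exit status: $?"
EOF
chmod +x youngestchild run_1

# Write a modified copy of every run_* script into NEW/, leaving the
# originals untouched. The single quotes keep $TIME_LIMIT literal, so it is
# expanded later, when the rewritten script actually runs.
mkdir NEW
for f in run_*; do
    sed 's|\./youngestchild|timeout "$TIME_LIMIT" ./youngestchild|' "$f" > "NEW/$f"
    chmod +x "NEW/$f"
    diff "$f" "NEW/$f"    # show exactly which lines changed
done

# In the master: set the limit once and export it so that every descendant
# process inherits it. Changing the limit later means editing only this line.
export TIME_LIMIT=1s

# Start the run in the background, then block until all background jobs exit.
NEW/run_1 > run_1.log &
wait

result=$(cat run_1.log)
echo "$result"
echo "all runs finished"
```

With GNU timeout, a command that is killed for exceeding the limit is reported with exit status 124, so the log line above should read "youngestchild exit status: 124". If the programs go by several names, one possible alternative to the awk approach is an extended regular expression with alternation, for example sed -E 's#\./(prog_a|prog_b)#timeout "$TIME_LIMIT" ./\1#' (prog_a and prog_b are hypothetical names).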