Global variable updating from multithreaded jobs
So there is probably a much better way to do this but I am not seeing it.
My end goal is to have a shell script run multiple jobs in parallel, but limit the total number of jobs. I am hoping to avoid relying on searching for processes, as it seems silly to not be able to count within the script. So here is a sample to illustrate:

------------------
#!/bin/bash

function setvariables {
    testarray=("test 1" "test 2" "test 3" "test 4" "test 5")
    testvar=0
}

function main {
    for t in "${testarray[@]}"
    do
        while [ $testvar -gt 2 ]
        do
            sleep 1
            echo while loop shows $testvar
        done
        (testing $t; testvar=$(($testvar - 1)); echo $testvar) &
        testvar=$(($testvar + 1))
        sleep 1
    done
    wait
}

function testing() {
    echo $1
    sleep 3
}

old_IFS=$IFS
IFS=$'\n'
setvariables
main
IFS=${old_IFS}
exit 0
------------------

My hope was this would print one string from the array every second until $testvar was incremented to 3, then sit in the while loop until $testvar dropped back below 3. The problem is getting the variable decremented only AFTER one of the backgrounded jobs finishes, i.e. "(testing $t; testvar=$(($testvar - 1)); echo $testvar) &". With this test code, the output shows that the $testvar set in main increases to 3 and never drops, while the subshell for each job decrements its own copy, based on the parent's value at the moment it was forked. I appreciate any ideas out there! -Will |
Hi Ztole!
I can't tell from what you've posted which of this you're already familiar with, so pardon me if I am telling you things you already know. I've put together three simple bash scripts, which I hope will illustrate various concepts that might be relevant or helpful to what you're doing. The code for the scripts follows. prog1.bash: Code:
#!/bin/bash prog2.bash: Code:
#!/bin/bash prog3.bash: Code:
#!/bin/bash

When I run ./prog1.bash the output looks like this: Code:
prog1: The value of a_variable, after I set to it 1 = 1
If you are new to Linux, you might want to keep things simple, and perhaps keep track of the number of processes your BASH program has running using BASH's built-in jobs command. As a trivial example, if I run three sleep commands in the background, telling each to "sleep" for a different number of seconds, I can see them using the jobs command: Code:
> sleep 180 &

If you don't mind getting more deeply into Linux, there are ways processes can talk to one another, so it would be possible to have one program monitor another, etc. If you can give us some idea of what sorts of BASH capabilities you're familiar with, maybe we can give you a better idea of how to approach what you're trying to do, using concepts with which you are comfortable. For example, were you aware of the issue with the use of the export command, and the use of the environment? HTH. |
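The jobs-counting idea described above can be sketched as a minimal runnable script (the sleep commands are just placeholders for real work, not anyone's actual jobs):

```shell
#!/bin/bash
# Minimal sketch: count this shell's own background jobs
# with the "jobs" builtin, instead of searching the process table.

sleep 180 &
sleep 120 &
sleep 60 &

jobs                       # lists the three background sleeps
echo "running jobs: $(jobs -p | wc -l)"   # jobs -p prints one PID per job
```

Because `jobs` only reports jobs started by the current shell, the count is unaffected by unrelated processes on the system.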
Thanks for the reply Rigor! I definitely follow what you have demonstrated, which is the crux of my problem. I have been working on a few alternatives that rely on tracking either the PID or the job count, but it just feels overly complicated. I was hoping there was a way to wait for a return value, or to set one explicitly, instead of tracking the jobs. In the long run there may be no difference, but I am imagining something such as the system being run out of resources (by someone else's out-of-control code, of course!) causing the script to proceed in error simply due to an invalid response.
Also, it is hard to quickly demonstrate knowledge, but I will say I am an experienced systems engineer who has written MANY scripts for automation purposes. That being said, I am certainly NOT a coder/developer. I am fortunate to be on vacation this week, so if I am slow to reply to anything, please forgive me :) Thanks! -Will |
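On the point about waiting for a return value: bash's `wait` builtin can in fact block on a specific background PID and hand back that job's exit status. A minimal sketch (the subshell exiting 7 is a made-up stand-in for a real job):

```shell
#!/bin/bash
# Sketch: wait on one specific background job and read its exit status.

(sleep 1; exit 7) &    # stand-in background job with a known exit code
pid=$!                 # $! is the PID of the most recent background job

wait "$pid"            # blocks until that particular job finishes
status=$?              # $? is now the job's exit status (7 here)
echo "job $pid exited with status $status"
```

This gives a per-job return value without scanning the process table, though it still blocks on one chosen job rather than on "whichever finishes first".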
Hi Will,
Normally, when someone asks a question involving BASH, I expect that BASH is what they prefer to use for whatever they are doing. If they knew, say, PERL, and wanted to use PERL, they would have asked the question in terms of PERL. So if they ask about something in terms of BASH, and I have anything to contribute on the subject, I'll try to answer in terms of BASH.

I could be missing something here, but very roughly, I suspect it might be fair to say that if I listed a few languages that can monitor/control processes, ordered from least direct/simple/complete control available in the language to most, the list would be BASH, GAWK, PERL, C. In a Unix or Unix-like environment, C tends to have most of the process monitoring/controlling capabilities that the Kernel does. As a result, of those languages, it would be my first choice for any detailed process monitoring/controlling.

So I wasn't asking you to demonstrate knowledge. I was wondering which features of BASH you might be comfortable with, or whether you are in a position to accomplish your goal using some language other than BASH script. If you are just talking about executing programs, and limiting the number of programs executing, you could do something simple in BASH, such as: Code:
job_list=( './prog_a.bash' './prog_b.bash' './prog_c.bash' './prog_d.bash' './prog_e.bash' './prog_f.bash' './prog_g.bash' './prog_h.bash' ) ; It now sounds almost as if you're trying to write something like an at Daemon. If you wanted almost anything beyond simply limiting the number of programs running, you might get into trouble, or at least deep Kludge, pretty quickly with BASH. For example, if you wanted the programs run by the BASH script to be killed off, when the BASH script is killed off, you might add one or more traps to the BASH script. Something like this code: Code:
running_job_sets=`jobs | awk -F'[\\\]\\\[]' ' { printf "%%%s " , $2 ; } END { print "" ; } '` ;

But run it from a BASH script running in the background, and you'll tend to kill only the programs run directly by the BASH script. If you run BASH in the background, you can pass an argument to BASH itself telling it to run interactively, and even try to connect BASH to pseudo-ttys to make BASH "think" it's interactive. You can also try to make use of other Inter-Process Communication facilities in Linux from your BASH script. Yet in a sense, with those approaches you're trying to bolt facilities onto your BASH environment that are already directly available in C.

If you are worried about the system being run out of resources, you might want to look into placing resource limits on the jobs that you will be running. Unless your goal is ultimately to monitor resource usage and react accordingly, rather than impose limits; there are facilities in C for monitoring the resource usage of child processes.

Although I have repeatedly been impressed with how quickly some language processors can run programs written in "scripting languages", a similarly written compiled program running native code is still hard to beat. Imagine a so-called "Rabbit Job" that starts quickly spawning one process after another. If by the time you've gotten a particular PID and tried to kill off one process, it's gone and a different process with a different PID is in its place, the program trying to control that situation naturally needs to be pretty speedy. That's another thing to recommend C if you are dealing with monitoring of processes that might not be "well behaved". Process states can change rather quickly, so you don't want your program chasing its own tail; if it doesn't react sufficiently fast, it can effectively create its own "race conditions".
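The trap idea mentioned above can be sketched roughly like this (this is a minimal illustration, not the original snippet; the sleep commands stand in for real worker jobs):

```shell
#!/bin/bash
# Sketch: kill this script's own background jobs when the
# script exits normally or is interrupted/terminated.

cleanup() {
    # jobs -p lists the PIDs of background jobs started by this shell
    kill $(jobs -p) 2>/dev/null
}
trap cleanup EXIT INT TERM

sleep 300 &   # stand-ins for real worker jobs
sleep 300 &

# ... main work here; when the script ends, cleanup fires ...
```

As noted, this only reaches jobs started directly by the script itself; grandchild processes spawned by those jobs are not in its job table.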
If I still don't have a good idea of what you're trying to accomplish, maybe you could give us some additional details, to help us, help you better. HTH. |
Hi Rigor, good info and I appreciate the help. I ended up relying on a job count which, to be honest, I saw here: http://stackoverflow.com/questions/1...oncurrent-jobs. This feels backwards to me, but I just got back from vacation and now I am on a time crunch, ha!
Here is the base I added: Code:

for path in "${sourcefolders[@]}"
do
    lognum=$((lognum + 1))
    mkdir -p "$path/large_files"
    joblist=($(jobs -p))
    while (( ${#joblist[*]} >= 20 ))
    do
        sleep 5
        joblist=($(jobs -p))
    done
    filelist $path $lognum &
done
wait

This is ultimately what I landed on to traverse a file system, find files over 15GB in size, and move them to a new folder based on their original location. Source directory paths have been changed to protect their identity :) Code:
#!/bin/bash |
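For what it's worth, newer bash versions (4.3 and later) have `wait -n`, which blocks until any single background job finishes, so the sleep-and-recount loop can be avoided. A minimal sketch of the same throttling pattern; `filelist` and `sourcefolders` are the names from the post above, standing in for the real function and array:

```shell
#!/bin/bash
# Sketch: throttle background jobs with wait -n (requires bash 4.3+).

max_jobs=20

for path in "${sourcefolders[@]}"
do
    # While at the limit, block until any one running job exits.
    while (( $(jobs -rp | wc -l) >= max_jobs ))
    do
        wait -n
    done
    filelist "$path" &
done
wait   # let the remaining jobs finish
```

Unlike polling every 5 seconds, `wait -n` wakes up the instant a slot frees, so a new job starts as soon as one finishes.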