LinuxQuestions.org (/questions/)
-   Programming (https://www.linuxquestions.org/questions/programming-9/)
-   -   How can make this single threaded program as multi threaded ? (https://www.linuxquestions.org/questions/programming-9/how-can-make-this-single-threaded-program-as-multi-threaded-4175552629/)

praveen.vp 09-04-2015 12:35 PM

How can I make this single-threaded program multi-threaded?
 
I have this working single-threaded code which I want to convert to multi-threaded.

The script below takes the number of threads as its first argument, stored in the variable userThread. It should then run with the number of threads specified on the command line.

The code reads an input ImageURLs file containing single-column values like

100049/65994/640x480/ALTMA_EXCLUSIVE_ACERO_2013_11.JPG
100049/65994/640x480/ALTMA_EXCLUSIVE_ACERO_2013_12.JPG
100049/65994/640x480/ALTMA_EXCLUSIVE_ACERO_2013_13.JPG

and tries to upload each one to a location, for every size declared in the "IMAGE_SIZES" variable.

I want to know how I can make this a multi-threaded program.

Code:


#!/bin/bash

. /etc/profile
. /root/.bash_profile

userThread=$1    # number of threads, passed as the first argument
IMAGE_SIZES=640x480,400x300,160x120,80x60,220x170,110x85,220x140
IMAGES_HOME="/inventory/"
BUCKET="s3://assets.domain-name.com"
local_path="/websites/admin.domain-name.com/ROOT/uploads/inventory/"

for size in $(echo $IMAGE_SIZES | tr "," " ")
do
        echo "Now checking images directory of $size"

        while read image; do
                DealerMain=$(awk -F/ '{ print $1 }' <<<"${image}")    # e.g. 100049
                DealerSub=$(awk -F/ '{ print $2 }' <<<"${image}")     # e.g. 65994
                imagename=${image##*/}                                # file name only

                s3cmd put --recursive --force "$local_path$DealerMain/$DealerSub/$size/$imagename" "$BUCKET$IMAGES_HOME$DealerMain/$DealerSub/$size/"
        done < /root/cronjobs/ImageURLs.txt
done

echo "Completed !"

Thanks in advance.

Regards
Praveen

suicidaleggroll 09-04-2015 12:43 PM

Stick an "&" on the end of a command to put it in the background. You can then keep tabs on the number of background jobs with "jobs | wc -l".

Stick an if-statement right before your command to check the current number of backgrounded jobs; if it's less than your limit, run the command, otherwise wait a few seconds and check again.
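
A minimal sketch of that pattern (the MAX value, the seq loop, and the sleep stand-in are placeholders for illustration, not from this thread):
Code:

MAX=4
for i in $(seq 1 20); do
        # a while loop (not just a single if) keeps waiting until a slot frees up
        while [ "$(jobs | wc -l)" -ge "$MAX" ]; do
                sleep 1
        done
        sleep 2 &        # stand-in for the real command, run in the background
done
wait                     # let the last batch of jobs finish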

praveen.vp 09-04-2015 01:53 PM

Thanks, Mr. suicidaleggroll,

But I am still not clear on your answer. Where does the "&" fit in my code above?

Where can I apply the number-of-threads loop? Something like:

# start threads
for i in $(seq 1 $userThread)
do

<code>

done
# sleep 1 second
sleep 1

firstfire 09-04-2015 01:59 PM

Hi.

xargs can run multiple processes in parallel with the -P MAX-PROCS option, something like this:
Code:

cat ImageURLs.txt | xargs -P5 -n1 script-to-upload-one-image.sh

The -n1 option tells xargs to pass only one argument from standard input to each invocation of the script.
You can play with the following example:
Code:

$ printf "s\n" $(seq 5) | xargs -P5 -n1 echo
1
5
4
3
2

Changing to -n2, we get something like this:
Code:

$ printf "%s\n" $(seq 5) | xargs -P5 -n2 echo
3 4
1 2
5
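
For the use case in this thread, script-to-upload-one-image.sh is just a placeholder name; a sketch of what it could contain, adapted from the loop in the first post (variables and s3cmd flags copied from there, the $1 argument supplied by xargs):
Code:

#!/bin/bash
# Hypothetical script-to-upload-one-image.sh: upload the one image passed
# as $1 in every size. Variables copied from the original script.
image=$1
IMAGE_SIZES=640x480,400x300,160x120,80x60,220x170,110x85,220x140
IMAGES_HOME="/inventory/"
BUCKET="s3://assets.domain-name.com"
local_path="/websites/admin.domain-name.com/ROOT/uploads/inventory/"

DealerMain=$(awk -F/ '{ print $1 }' <<<"$image")
DealerSub=$(awk -F/ '{ print $2 }' <<<"$image")
imagename=${image##*/}

for size in $(echo $IMAGE_SIZES | tr "," " "); do
        s3cmd put --recursive --force "$local_path$DealerMain/$DealerSub/$size/$imagename" "$BUCKET$IMAGES_HOME$DealerMain/$DealerSub/$size/"
done

The main script then reduces to one xargs call, with the thread count taken from its first argument:
Code:

xargs -P"$userThread" -n1 ./script-to-upload-one-image.sh < /root/cronjobs/ImageURLs.txt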

Another cool utility for running scripts/programs in parallel is, well, GNU parallel. I highly recommend it if you need to run something in parallel on remote hosts.
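
A minimal illustration with GNU parallel (assuming it is installed; {} is replaced by one line of input, and -j sets the number of simultaneous jobs, like -P for xargs):
Code:

parallel -j5 ./script-to-upload-one-image.sh {} < ImageURLs.txt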

suicidaleggroll 09-04-2015 02:16 PM

Quote:

Originally Posted by praveen.vp (Post 5415892)
Thanks, Mr. suicidaleggroll,

But I am still not clear on your answer. Where does the "&" fit in my code above?

Where can I apply the number-of-threads loop? Something like:

# start threads
for i in $(seq 1 $userThread)
do

<code>

done
# sleep 1 second
sleep 1

In my example there is no loop over threads.

Sticking a "&" on the end of a command runs it in the background, which means bash doesn't wait for the command to exit before moving on to the next line in the script. You would run your s3cmd command in the background, since that's the one you want to parallelize, yes? But if you just stuck a "&" on the end of your s3cmd command and called it good, your script would loop through VERY quickly and launch ALL of them in the background in a very short amount of time. Then the script would exit, and you'd have hundreds (or however many there are) s3cmd processes all running, fighting each other for resources.

That's a good way to waste resources and piss off server admins, so you need one additional step. You put a while statement (I said "if" above, it should be while) right before the call to s3cmd which checks the current number of backgrounded jobs. You would use it as a "rate-limiter" of sorts, to ensure you have fewer than N jobs running simultaneously, and it would just patiently wait for the backgrounded jobs to complete before launching more. Your userThread variable would control how many jobs are allowed to be backgrounded simultaneously. Something like
Code:

while [[ $(jobs | wc -l) -ge $userThread ]]; do   # too many jobs already running?
  sleep 1                                         # wait for a slot to free up
done
s3cmd put --recursive --force "$local_path$DealerMain/$DealerSub/$size/$imagename" "$BUCKET$IMAGES_HOME$DealerMain/$DealerSub/$size/" &
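
One detail worth adding here (my note, not from the post): once the loops are done, a final wait keeps the script from printing its completion message while backgrounded uploads are still running:
Code:

wait                # block until all remaining background jobs exit
echo "Completed !"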


