[SOLVED] Bash shell array processing slower than expected
I have a bash script on Solaris to identify the last login date for an application. It extracts an ID from the first array and searches for it in the second array, using two nested while/do/done loops.
The script works as designed, but is much slower than I anticipated. I have commented out portions to try to identify what makes it slow but have not identified anything. I would have thought that since everything is in memory it would run more quickly than it does (about three hours where there are 917 entries in idarray and 558 entries in loginarray). The two arrays are unfortunately not in the same sequence.
Thanks in advance...
Code:
while [ $idarraycounter -lt $idarrayrows ]
do
    id=`echo ${idarray[$idarraycounter]}`
    loginarraycounter=0
    lastlogindate=""
    while [ $loginarraycounter -lt $loginarrayrows ]
    do
        loginid=`echo ${loginarray[$loginarraycounter]} | cut -f1 -d","`
        loginidlower=`echo $loginid | tr '[:upper:]' '[:lower:]'`
        loginidupper=`echo $loginid | tr '[:lower:]' '[:upper:]'`
        if [[ "$loginid" = "$id" || "$loginidlower" = "$id" || "$loginidupper" = "$id" ]]
        then
            lastlogindate=`echo ${loginarray[$loginarraycounter]} | cut -f2 -d","`
            loginarraycounter=$loginarrayrows
        else
            ((loginarraycounter++))
        fi
    done
    # Write ID,lastlogindate
    # Note - if ID is not found (i.e. no logins), $lastlogindate will still be blank
    echo "$id,$lastlogindate"
    ((idarraycounter++))
done
Your script looks pretty good, but you are spawning a lot of processes while looping over these arrays. Try eliminating some of them to speed it up. For example, use a case-insensitive compare instead of spawning the two 'tr's for the upper- and lower-case compares. There is an example here: http://www.linuxquestions.org/questi...sitive-676101/
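To illustrate the case-insensitive compare suggested above, here is a minimal sketch using bash's nocasematch option. It assumes bash 3.1 or newer, which may not be what the Solaris box has installed, and ids_match is a hypothetical helper name that does not appear in the original script. Note that it matches any mix of upper and lower case, which is slightly broader than the original test (exact, all-lower, or all-upper).

```shell
#!/usr/bin/env bash
# Sketch: case-insensitive string compare with no external processes.
# Assumes bash >= 3.1 (nocasematch). ids_match is a hypothetical helper.

ids_match() {
    local a=$1 b=$2 rc
    shopt -s nocasematch      # make [[ == ]] ignore case
    [[ "$a" == "$b" ]]        # quoted right side: literal compare, no globbing
    rc=$?
    shopt -u nocasematch      # turn it off again so other tests are unaffected
    return $rc
}

if ids_match "JSmith" "jsmith"; then
    echo "match"
fi
```

In the inner loop this would replace both tr calls and the three-way test with a single builtin comparison.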
You may also want to split loginarray into two separate arrays, one holding field 1 and the other field 2; this would eliminate the cut in the inner loop.
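The split could be done once, before the loops, with parameter expansion rather than cut. This is only a sketch: the sample entries are made up, loginids and logindates are hypothetical array names, and the += array append assumes bash 3.1 or newer.

```shell
#!/usr/bin/env bash
# Sketch: split "id,date" entries into two parallel arrays up front,
# using parameter expansion instead of `echo | cut`.
# The sample data below is invented for illustration.
loginarray=("JSMITH,2012-03-01" "adoe,2012-02-14,extra")

loginids=()
logindates=()
for entry in "${loginarray[@]}"; do
    loginids+=("${entry%%,*}")     # field 1: text before the first comma
    rest=${entry#*,}               # drop field 1 and its comma
    logindates+=("${rest%%,*}")    # field 2: up to the next comma, if any
done

echo "${loginids[1]} ${logindates[1]}"
```

The inner loop can then compare against ${loginids[$loginarraycounter]} and read ${logindates[$loginarraycounter]} directly, with no subshells.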
The time still seems quite long for what you are doing. I generally move to a better language when there is a need for arrays in a bash script.
You invoke at least 8 additional processes inside the double loop, which is not really efficient.
Try doing the same with a single awk or perl script, and you will see the difference.
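As a rough sketch of the single-awk approach: the function below reads a login CSV first, then a list of IDs, and prints "id,lastlogindate" for each ID. The name lookup_logins and both file layouts ("id,date" per line, and one ID per line) are assumptions, not something taken from the original script. It lowercases both sides, so it matches any case mix. On Solaris the default awk may be too old for this, in which case nawk or /usr/xpg4/bin/awk may be needed.

```shell
#!/usr/bin/env bash
# Sketch: one awk process replaces both nested loops.
# lookup_logins is a hypothetical helper.
#   $1 = login CSV, assumed "id,date" per line
#   $2 = ID list, assumed one ID per line
# Caveat: if the login CSV is empty, NR == FNR also holds for the ID file.
lookup_logins() {
    awk -F',' '
        # First file: remember the date for each ID, keyed in lower case.
        # Keep the first occurrence, since the original loop stops at the first match.
        NR == FNR {
            k = tolower($1)
            if (!(k in seen)) seen[k] = $2
            next
        }
        # Second file: print the ID and its date (blank if the ID never logged in).
        {
            k = tolower($1)
            print $1 "," ((k in seen) ? seen[k] : "")
        }
    ' "$1" "$2"
}
```

It would be called as lookup_logins logins.csv ids.txt, where both file names are placeholders. Because the login data is loaded into a lookup table once, each ID costs one table lookup instead of a scan of the whole login list.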
My go-to tool for something like this would be Perl. It is made for things like this, and would eliminate all external calls.
There are some Bash tricks that would reduce external calls and speed up your processing, but I really think porting to Perl would give an order-of-magnitude improvement.
where /u01/export/reports/login-ids-and-dates-master.csv is the file that I used to load the inner array. I thought catting the file 900 times would be slow, but the script now runs in less than one minute.