LinuxQuestions.org - [SOLVED] how to speed up this script

how to speed up this script

Hi

My script is running very slowly. It basically compares 2 output files. Please suggest. URGENT!!!

#!/bin/sh

compareLOGandDBfunction()
{

export file1="DBcountOP"
export file2="logcountOP"
rm -f comp_res2.txt
rm -f temp2.txt
rm -f result2.txt
match=0
export comp_res="comp_res2.txt"
export temp="temp2.txt"
export result="compareLOGandDB_OP.txt"

while read FILE1_LINE ; do
#reading the output file from database and the first string is stored in the variable LINE

file1_tablename="$(echo $FILE1_LINE | cut -d ' ' -f1)"
#the 1st field is stored i.e,tablename

file1_count="$(echo $FILE1_LINE | cut -d ' ' -f2)"
#the 2nd field is stored i.e,the count of rows for the table

while read FILE2_LINE ; do
#reading the OP file from log and stores the 1st string

file2_tablename="$(echo $FILE2_LINE | cut -d ' ' -f1)"
#the 1st field is stored i.e,tablename

file2_count="$(echo $FILE2_LINE | cut -d ' ' -f2)"
#the 2nd field is stored i.e,the count of rows for the table

if [ "$file1_tablename" = "$file2_tablename" ] && [ "$file1_count" = "$file2_count" ]
#start of 1st if block

then
echo table has matched $file1_tablename
echo count has matched $file2_count
echo $file1_tablename " " $file1_count " " $file2_count "\t" Matched | tee -a $temp $result

fi #end of 1st if block

if [ "$file1_tablename" = "$file2_tablename" ] && [ "$file1_count" != "$file2_count" ]
#start of 2nd if
#checking for match b/w tables but a mismatch b/w the rowcount from logfile and the DB output file

then
echo table has matched $file1_tablename
echo count has not matched $file2_count
echo $file1_tablename " " $file1_count " " $file2_count "\t" Count MissMatch | tee -a $temp $result
fi
#end of 2nd if block

done < $file2

done < $file1

echo "*******************************************************"

#the following 2 blocks check for tables that exists only in either of the logfile or the database

while read FILE2_LINE ; do
#reading the logfile line after line

file2_tablename="$(echo $FILE2_LINE | cut -d ' ' -f1)"

file2_count="$(echo $FILE2_LINE | cut -d ' ' -f2)"
echo table name is $file2_tablename
echo count is $file2_count
#count the lines in $temp that start with this table name
match=`grep "^$file2_tablename " $temp | wc -l`
if [ $match -eq 0 ]
then
echo $FILE2_LINE not present in $file1 | tee -a $comp_res $result

fi
#checks for a table that may exist in log but not in the database

done < $file2

echo "******************************************************"

while read FILE1_LINE ; do
#reading the output file from the database

file1_tablename="$(echo $FILE1_LINE | cut -d ' ' -f1)"
file1_count="$(echo $FILE1_LINE | cut -d ' ' -f2)"
echo table name is $file1_tablename
echo count is $file1_count
#count the lines in $result that start with this table name
match=`grep "^$file1_tablename " $result | wc -l`
if [ $match -eq 0 ]
then
echo $FILE1_LINE not present in $file2 | tee -a $comp_res $result
fi
#checks for a table that might exist in database but not in the log
done < $file1

Quote:

Originally Posted by smritisingh03 (Post 4215667)

Please suggest.URGENT!!!

Hi,

this is only urgent to you. We are all volunteers here and we will help when we have the time and when we want to. So please do not use terms like urgent in your posts.

Well, the first part with the 2 nested while loops looks like a complicated way to do a simple diff. Is there any reason this tool cannot work for you?
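As a rough sketch of that idea (not the original poster's code): assuming both files hold one whitespace-separated "tablename count" pair per line, a single awk pass can replace the nested loops, which start several `echo | cut` processes for every pair of lines. The function name `compare_counts` and the wording of the output lines are made up for illustration.

```shell
#!/bin/sh
# Sketch: compare two "tablename count" files in one awk pass.
# compare_counts is a hypothetical helper, not part of the original script.
compare_counts() {
    db_file=$1
    log_file=$2
    awk '
        # First file (DB output): remember the count of every table.
        NR == FNR { db[$1] = $2; next }
        # Second file (log output): compare with the remembered count.
        {
            seen[$1] = 1
            if (!($1 in db))
                print $1, "not present in DB file"
            else if ((db[$1] "") == ($2 ""))      # compare as strings
                print $1, db[$1], $2, "Matched"
            else
                print $1, db[$1], $2, "Count Mismatch"
        }
        # Tables that were in the DB output but never seen in the log.
        END {
            for (t in db)
                if (!(t in seen))
                    print t, "not present in log file"
        }
    ' "$db_file" "$log_file"
}
```

It could be called as `compare_counts DBcountOP logcountOP | tee compareLOGandDB_OP.txt`. Note that the `NR == FNR` idiom assumes the first file is not empty, and the order of the lines printed in the END block is not guaranteed.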

Hi there... By the way, I know someone would answer my query only if he has time; in fact, all of us are aware of this! URGENT doesn't mean that I need attention to this code even if no one has time. It is just to increase its priority. I hope you understand this and would be a bit courteous.

some simple rules ...

Quote:

Originally Posted by smritisingh03 (Post 4215930)

URGENT doesn't mean that I need attention ...

Marking your post as URGENT is still considered rude.

Quote:

it is just to increase its priority ...

This is where you are mistaken. You need to understand that you have no means to prioritize your thread. I know that there are other sites that actually do have dedicated sub-forums for urgent requests. LQ, however, does not implement such a feature (yet). And 'urgent' requests tend to get fewer responses. So if someone points out to you that 'urgent' requests are frowned upon, then it is in your best interest.

Two other points of notice:
1)

Quote:

Proper spelling, capitalization and punctuation is also highly appreciated by most members. Text-speak language is hard to read and it gets annoying after a while.

2)
When you are posting code samples then you should enclose them in code-tags. This way your code stays properly formatted and is easier to read. See my signature on how to use code-tags.

I hope you understand that by abiding by these simple rules your threads will get a better response here on LQ.

smritisingh03,

From a reader's point of view, your script is long and uses a lot of unknown files. It’s difficult to guess what it should do. I suppose you know that. But I don’t know.

Moreover your script is invalid. The function begins:

Code:

compareLOGandDBfunction()
{

but never ends.

I suggest you put the following code at the beginning of the script:

Code:

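started=$(date +"%s")

# Print the seconds elapsed since the previous call (uses bash syntax, so run with bash rather than plain sh)
function works {
    elapsed=$(expr $(date +"%s") - $started)
    started=$(expr $started + $elapsed)
    echo -e "\033[1m$elapsed sec.\033[0m"
    echo
}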

Then put the works command in selected places to print the time elapsed since the previous checkpoint (for the first call, since the start of the script), for example:

Code:

.
.
.
done < $file2
works
.
.
.

The other method is to put at the beginning of the script the following code:

Code:

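declare -i number=0

# Print the next sequence number in bold (bash only; the integer attribute makes the assignment arithmetic)
function number {
    number=$number+1
    echo
    echo -e "\033[1m$number.\033[0m"
}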

Then precede the commands you suspect of running too slowly with the sequence number; time, for example:

Code:

.
.
.
number; time while read FILE1_LINE ; do
.
.
.

Thank you so much... I could find out where the problem is. Actually, this script alone is working fine. I used your code for the script which is querying the DB, and there the culprit was :-).

Now another question... I have a script which is doing a "select * from tab;" and writing the output into a file x2.

Then, in the second part, I am using this file x2 as my input, reading it with a while loop and querying each table for its row count with "select count(*) from tablename".

There are possibly n*1000 tables in the DB, and that is the reason why it is taking so long!

Does anyone here have a clue as to how I can improve this?

Actually, I want to write a function for the SQL part and then use a variable, maybe "maxparellel=5", and then call the SQL function in a while loop and execute it using "&". The moment it has started 5 rows in parallel, I give the wait command.

Please let me know if this is a good workaround, and please help me with this. I need to know how to go ahead!
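The batching idea described above could be sketched roughly as follows. This is only an illustration under assumptions: `count_rows` is a hypothetical placeholder for the real SQL call (the thread does not show it), `run_in_batches` and the `.cnt` output files are made-up names, and table names are assumed to be usable as file names (no `/` in them).

```shell
#!/bin/sh
# Sketch: start at most $maxparallel background jobs, then wait for the batch.

# Hypothetical stand-in for the real query. Replace the body with the call
# to your SQL client that runs "select count(*) from $1".
count_rows() {
    echo "$1 0"
}

run_in_batches() {
    list_file=$1            # file with one table name per line (e.g. x2)
    out_dir=$2              # existing directory for the per-table results
    maxparallel=${3:-5}     # jobs per batch, 5 by default
    running=0
    while read -r tablename rest; do
        [ -n "$tablename" ] || continue     # skip empty lines
        # Each job writes to its own file so the outputs cannot interleave.
        count_rows "$tablename" > "$out_dir/$tablename.cnt" &
        running=$((running + 1))
        if [ "$running" -ge "$maxparallel" ]; then
            wait            # block until the whole batch has finished
            running=0
        fi
    done < "$list_file"
    wait                    # collect the last, possibly partial, batch
}
```

A call such as `mkdir -p counts && run_in_batches x2 counts 5 && cat counts/*.cnt > logcountOP` would then gather the results. One trade-off of this simple scheme is that every batch waits for its slowest query before the next batch starts.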