LinuxQuestions.org


zokken 04-12-2011 02:38 PM

Optimizing a Bash Script
 
I wrote a little Bash script which accepts two values -- sender address and recipient address -- and finds any instances of mail sent from sender to recipient in /var/log/maillog. The script works fine, but it seems to be very inefficient. I'm just wondering how I can optimize it.

Code:

#!/bin/bash

SENDER="$1"
RECIPIENT="$2"

if [ "$#" -ne 2 ]; then
  echo "Usage: findmail SENDER RECIPIENT"
  exit 1
fi

# grab the message IDs of all messages where the SENDER is matched
MSG_ID=$(grep "from=<$SENDER" /var/log/maillog | cut -f4 -d":" | grep -v NOQUEUE)
for i in $MSG_ID; do
  # for each message ID where SENDER is matched, see if RECIPIENT is also matched
  if grep $i /var/log/maillog | grep "to=<$RECIPIENT" > /dev/null; then
      # if the recipient is also matched, grep out all instances of the message ID
      echo FROM $SENDER TO $RECIPIENT, ID $i
      echo ----------
      zgrep $i /var/log/maillog
      echo ----------
  fi
done

Example:

Code:

$ findmail user1@domain.com user2@domain.com
FROM user1@domain.com TO user2@domain.com, ID 172E6594405
----------
Apr 12 15:14:51 smtp postfix/smtpd[15323]: 172E6594405: client=exprod8mx205.postini.com[64.18.3.105]
Apr 12 15:14:51 smtp postfix/cleanup[17490]: 172E6594405: message-id=<BANLkTinNS83+CkGsj+Xa_20xz1uwCr9Yeg@mail.gmail.com>
Apr 12 15:14:51 smtp postfix/qmgr[12461]: 172E6594405: from=<user1@domain.com>, size=24236, nrcpt=1 (queue active)
Apr 12 15:14:52 smtp postfix/smtp[18668]: 172E6594405: to=<user2@domain.com>, relay=127.0.0.1[127.0.0.1]:10024, delay=1, delays=0.85/0/0/0.16, dsn=2.0.0, status=sent (250 2.0.0 Ok, id=17731-04, from MTA([127.0.0.1]:10025): 250 2.0.0 Ok: queued as 02B3A5943F1)
Apr 12 15:14:52 smtp postfix/qmgr[12461]: 172E6594405: removed
----------
FROM user1@domain.com TO user2@domain.com, ID 02B3A5943F1
----------
Apr 12 15:14:52 smtp postfix/smtpd[3717]: 02B3A5943F1: client=localhost.localdomain[127.0.0.1]
Apr 12 15:14:52 smtp postfix/cleanup[18960]: 02B3A5943F1: message-id=<BANLkTinNS83+CkGsj+Xa_20xz1uwCr9Yeg@mail.gmail.com>
Apr 12 15:14:52 smtp postfix/qmgr[12461]: 02B3A5943F1: from=<user1@domain.com>, size=24912, nrcpt=1 (queue active)
Apr 12 15:14:52 smtp postfix/smtp[18668]: 172E6594405: to=<user2@domain.com>, relay=127.0.0.1[127.0.0.1]:10024, delay=1, delays=0.85/0/0/0.16, dsn=2.0.0, status=sent (250 2.0.0 Ok, id=17731-04, from MTA([127.0.0.1]:10025): 250 2.0.0 Ok: queued as 02B3A5943F1)
Apr 12 15:14:52 smtp postfix/lmtp[3698]: 02B3A5943F1: to=<user2@domain.com>, relay=imap.domain.com[1.2.3.4]:7025, delay=0.13, delays=0.01/0.01/0/0.11, dsn=2.1.5, status=sent (250 2.1.5 Delivery OK)
Apr 12 15:14:52 smtp postfix/qmgr[12461]: 02B3A5943F1: removed

Again, it seems to work fine, but the script greps through /var/log/maillog three times -- first to find the sender; second to see if it's also going to the desired recipient; finally to view all instances of the message ID in the logs. This seems redundant and inefficient.

Any ideas on how to improve it?

Ramurd 04-12-2011 03:52 PM

I don't currently have a good maillog to test with, but you could use a single grep that captures several lines of context, store the result in a variable, and then grep that for the recipient. That way you run through /var/log/maillog fewer times.

In fact, you're running through /var/log/maillog more than three times: once to find every line where the sender matches, then once per message ID found to check whether the recipient also matches, and finally once more per matching ID with zgrep, which is only needed if the file is compressed.

You could go somewhere along these lines:
Code:

msg_id=$(grep -A 5 "from=<${SENDER}" /var/log/maillog | grep -B 5 -A 5 "to=<$RECIPIENT" | head -n 1 | cut -d ':' -f 4 | grep -v NOQUEUE)
for i in $msg_id; do
  echo FROM $SENDER TO $RECIPIENT, ID $i
  zgrep $i /var/log/maillog
done

The code above is not very optimized, but it considerably reduces the number of times you go through /var/log/maillog, since you only look at messages where both sender and recipient match. It may not be completely correct, but I think you can fiddle with it to get things right. (I gave the second grep both -B and -A so that the block produced by the first grep -A is still complete at that point; you can then take the first line with head and cut the message ID field out of it.)
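
The -A/-B context approach relies on a message's from= and to= lines sitting within five lines of each other, which a busy log will not always guarantee. As a rough sketch of another way to cut down the passes, the helper below reads the log exactly twice no matter how many messages match: once for the sender's queue IDs and once for the recipient's, keeping only the IDs found in both. The function name matching_ids, the temporary file, and the optional third log-path argument are illustrative assumptions, not part of the original script. It also assumes the queue ID is the fourth colon-separated field, as in the log sample earlier in the thread.

```shell
#!/bin/bash
# Hypothetical helper (name and arguments are assumptions for illustration).
# Prints every queue ID that has both a from=<SENDER> line and a
# to=<RECIPIENT> line in the log. The body runs in a subshell, so the
# variables below do not leak into the caller.
matching_ids() (
  sender=$1
  recipient=$2
  log=${3:-/var/log/maillog}

  tmp=$(mktemp) || exit 1

  # Pass 1: queue IDs of all messages from the sender.
  # Field 4 (colon-separated) is " QUEUEID"; tr strips the leading space.
  grep -F "from=<$sender>" "$log" | cut -d: -f4 | tr -d ' ' |
    grep -v NOQUEUE | sort -u > "$tmp"

  # Pass 2: queue IDs of all messages to the recipient, keeping only
  # those that exactly match (-x) one of the fixed strings (-F) from pass 1.
  grep -F "to=<$recipient>" "$log" | cut -d: -f4 | tr -d ' ' |
    grep -Fx -f "$tmp" | sort -u

  rm -f "$tmp"
)
```

Calling matching_ids user1@domain.com user2@domain.com prints one queue ID per line, so its output could replace the msg_id assignment that feeds the for loop.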

kurumi 04-12-2011 07:57 PM

Quote:

Code:

MSG_ID=$(grep "from=<$SENDER" /var/log/maillog | cut -f4 -d":" | grep -v NOQUEUE)
for i in $MSG_ID; do
  # for each message ID where SENDER is matched, see if RECIPIENT is also matched
  if grep $i /var/log/maillog | grep "to=<$RECIPIENT" > /dev/null; then
      # if the recipient is also matched, grep out all instances of the message ID
      echo FROM $SENDER TO $RECIPIENT, ID $i
      echo ----------
      zgrep $i /var/log/maillog
      echo ----------
  fi
done


The above lines can be reduced to a single process:
Code:

awk -F":" -v sender="$SENDER" -v recp="$RECIPIENT" '/from</ && $0~sender && !/NOQUEUE/ {
  msgid=$4
}
$0~msgid && $0~/to</ && $0~recp {
  ....
}
' /var/log/maillog
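
One possible way to fill in that idea as a complete single pass is sketched below. It is not the snippet above completed, but a swapped-in variant: it buffers the lines of each queue ID and prints the block once qmgr logs "removed" for that ID, so it does not depend on the to= line following its from= line directly. The function name findmail_awk and the optional log-path argument are assumptions for illustration. It also assumes short, uppercase-hex queue IDs in the sixth whitespace-separated field, as in the sample log, and it matches addresses literally with index() rather than as regular expressions. Unlike the zgrep in the original script, it will not show lines that merely mention an ID, such as "queued as ...".

```shell
#!/bin/bash
# Hypothetical single-pass version (names are assumptions for illustration).
# The body runs in a subshell so its variables stay local.
findmail_awk() (
  sender=$1
  recipient=$2
  log=${3:-/var/log/maillog}

  awk -v sender="$sender" -v recp="$recipient" '
    # Print the buffered lines of one queue ID if both addresses were seen.
    function show(id) {
      if (from_ok[id] && to_ok[id])
        printf "FROM %s TO %s, ID %s\n----------\n%s----------\n",
               sender, recp, id, lines[id]
    }

    # Skip anything whose 6th field is not "HEXID:" (this drops NOQUEUE too).
    $6 !~ /^[0-9A-F]+:$/ { next }

    # Strip the trailing colon and buffer the line under its queue ID.
    {
      id = substr($6, 1, length($6) - 1)
      lines[id] = lines[id] $0 "\n"
    }

    # Literal substring tests, so dots in addresses are not wildcards.
    index($0, "from=<" sender ">") { from_ok[id] = 1 }
    index($0, "to=<" recp ">")     { to_ok[id] = 1 }

    # The message has left the queue: report it and free its buffer.
    $7 == "removed" {
      show(id)
      delete lines[id]; delete from_ok[id]; delete to_ok[id]
    }

    # Report messages that were still queued when the log ended.
    END { for (id in lines) show(id) }
  ' "$log"
)
```

Running findmail_awk user1@domain.com user2@domain.com reads the log once and prints each matching message in roughly the same FROM/TO layout as the original script.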


zokken 04-13-2011 12:29 AM

Thanks for the replies. I guess I should finally invest some more time in learning awk. :)


All times are GMT -5.