LinuxQuestions.org
Old 10-31-2013, 02:11 PM   #1
socalheel
Member
 
Registered: Oct 2012
Location: Raleigh, NC
Distribution: CentOS / RHEL
Posts: 158

Rep: Reputation: 3
text manipulating ... the more efficient way to grab a string


What's the most efficient way to grab B39391FF67D from this line when I grep a file?

Code:
Oct 31 14:06:04 mailserver02 postfix/smtp[6737]: B39391FF67D: to=<spamyamy@yahoo.com>, relay=hostname.filter[123.123.123.123]:25, delay=1.9, delays=0.06/0/0.45/1.4, dsn=2.0.0, status=sent (250 Thanks)
I normally chain awk, cut, and sed commands in a one-liner, but I suspect there is a more efficient way to do this.

My method is:
Code:
grep -i spam /var/log/maillog | grep "Oct 31 14" | awk '{print $6}' | sed -e 's/://'

Last edited by socalheel; 10-31-2013 at 03:04 PM.
 
Old 10-31-2013, 02:43 PM   #2
szboardstretcher
Senior Member
 
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,278

Rep: Reputation: 1694
I don't see anything wrong with what you are doing, but if you want a shorter command:

Code:
grep -i spam /var/log/maillog | grep "Oct 31 14" | cut -d ":" -f4
 
2 members found this post helpful.
Old 10-31-2013, 02:45 PM   #3
socalheel
Member
 
Registered: Oct 2012
Location: Raleigh, NC
Distribution: CentOS / RHEL
Posts: 158

Original Poster
Rep: Reputation: 3
Ah, maybe that was a bad example, but there are cases where I have all three of awk, cut, and sed in the same line, and I'm not sure whether there's a better way to extract what I need.

Let me gather up a better example.
 
Old 10-31-2013, 02:49 PM   #4
socalheel
Member
 
Registered: Oct 2012
Location: Raleigh, NC
Distribution: CentOS / RHEL
Posts: 158

Original Poster
Rep: Reputation: 3
Oh, I just noticed that your cut -f 4 -d ":" command gives me a space in front of my number, and I still have to use sed to remove it ...

Is that correct?
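For what it's worth, the leading space comes from the log format itself: every colon in the line is followed by a space, and cut keeps everything between two delimiters. A minimal Python sketch of the same split, using the sample line from the first post (the variable names are only illustrative):

```python
# Sample log line from the first post.
line = ("Oct 31 14:06:04 mailserver02 postfix/smtp[6737]: B39391FF67D: "
        "to=<spamyamy@yahoo.com>, relay=hostname.filter[123.123.123.123]:25, "
        "delay=1.9, delays=0.06/0/0.45/1.4, dsn=2.0.0, status=sent (250 Thanks)")

# Splitting on ":" mirrors `cut -d ":" -f4`; index 3 is the fourth field.
raw_id = line.split(":")[3]

# The space that follows "smtp[6737]:" ends up at the start of the field.
queue_id = raw_id.strip()

print(repr(raw_id))
print(repr(queue_id))
```

So yes, with ":" as the only delimiter some extra trimming step is needed, whichever tool does the splitting.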
 
Old 10-31-2013, 02:59 PM   #5
szboardstretcher
Senior Member
 
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,278

Rep: Reputation: 1694
I would use your original command. It's nice.

Do you have any reason for wanting to tune this? Usually we only do that when we have to search billions of records and such.
 
Old 10-31-2013, 03:02 PM   #6
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
It seems you have two search criteria: spam and Oct 31 14. If both are found, you want the B39391FF67D string.

Assuming that the layout of such a line is always the same (i.e. $6 is always the wanted field), have a look at this:
Code:
awk '/Oct 31 14/ && /spam/ { gsub(/:/,"") ; print $6 }' /var/log/maillog
B39391FF67D
If you need a case-insensitive search (GNU Awk only...):
Code:
awk 'BEGIN{IGNORECASE=1}/oCt 31 14/ && /SpAm/ { gsub(/:/,"") ; print $6 }' /var/log/maillog
B39391FF67D
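For comparison, the same two-condition filter can be sketched in Python. This is only an illustration: find_queue_ids is a made-up helper name, and it assumes, like the awk version, that the queue ID is always the sixth whitespace-separated field.

```python
def find_queue_ids(lines, date_prefix, needle):
    """Yield the sixth field (minus its trailing colon) of matching lines."""
    needle = needle.lower()
    for line in lines:
        # Both criteria must hold, like /Oct 31 14/ && /spam/ in awk.
        # The date is matched as-is; the search word ignores case.
        if date_prefix in line and needle in line.lower():
            fields = line.split()
            if len(fields) >= 6:
                yield fields[5].rstrip(":")


# Sample log line from the first post.
sample = [
    "Oct 31 14:06:04 mailserver02 postfix/smtp[6737]: B39391FF67D: "
    "to=<spamyamy@yahoo.com>, relay=hostname.filter[123.123.123.123]:25, "
    "delay=1.9, delays=0.06/0/0.45/1.4, dsn=2.0.0, status=sent (250 Thanks)",
]
ids = list(find_queue_ids(sample, "Oct 31 14", "SPAM"))
print(ids)
```

Because the helper accepts any iterable of lines, an open file object can be passed in place of the sample list.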
 
2 members found this post helpful.
Old 10-31-2013, 03:03 PM   #7
socalheel
Member
 
Registered: Oct 2012
Location: Raleigh, NC
Distribution: CentOS / RHEL
Posts: 158

Original Poster
Rep: Reputation: 3
Well, I have a script that runs every hour to grep our maillog for a certain entry, and if that entry is present, it does a few other things and then emails out an alert.

I know it's not too resource-intensive, but I like to minimize every little thing I can so all these little "resource grabbers" don't grow into something that would cause a headache later.
 
Old 10-31-2013, 03:04 PM   #8
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
Consider:
Code:
awk -F":" '{print $4}'
Daniel B. Martin
 
2 members found this post helpful.
Old 10-31-2013, 03:06 PM   #9
socalheel
Member
 
Registered: Oct 2012
Location: Raleigh, NC
Distribution: CentOS / RHEL
Posts: 158

Original Poster
Rep: Reputation: 3
for example, i want this line

Oct 31 14:34:17 mailserver02 postfix/smtp[7009]: 3C9341FF9D8: to=<spamyamy@yahoo.com>, relay=outbounds8.obsmtp.com[64.18.7.12]:25, delay=4.5, delays=0.08/0/0.46/3.9, dsn=2.0.0, status=sent (250 Thanks)

to only come back with
3C9341FF9D8 to=spamyamy@yahoo.com

The way I strip it down is rather ugly, and I'm not sure all of it is necessary. Here is how I do it:

Code:
grep $MAILID /var/log/maillog | egrep "from=|to=" | egrep -v "osj" | awk '{print $6,$7}' | sed -e 's/,//g' | sed -e 's/://g' | sed -e 's/>//g' | sed -e 's/<//g';done

Last edited by socalheel; 10-31-2013 at 03:07 PM.
 
Old 10-31-2013, 03:11 PM   #10
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
Quote:
Originally Posted by szboardstretcher View Post
I don't see anything wrong with what you are doing, but if you want a shorter command:

Code:
grep -i spam /var/log/maillog | grep "Oct 31 14" | cut -d ":" -f4
You end up with a leading space. Either of these removes it:

Code:
awk -F\: '/^Oct 31 14.*spam.*/{gsub(/ /,"",$4);print $4}' /var/log/maillog
or
Code:
grep "Oct 31 14.*spam.*" /var/log/maillog | cut -d\: -f4 | sed 's/ //'

To feed shell variables into awk:
Code:
Date="Oct 31"
Hour="14"
String="spam"
awk -F\: '/'"${Date} ${Hour}.*${String}.*"'/{gsub(/ /,"",$4);print $4}' /var/log/maillog
Alternatively, as a function:
Code:
#!/bin/bash
GetSpamID () {
Date="$1 $2"
Hour="$3"
String="$4"
awk -F\: '/'"${Date} ${Hour}.*${String}.*"'/{gsub(/ /,"",$4);print $4}' /var/log/maillog
}

#
GetSpamID Oct 31 14 spam
It probably makes sense to break it down further into month, day, and hour.

Or use this form:

Code:
GetSpamID () {
Prefix="$1"
String="$2"
awk -F\: '/'"${Prefix}.*${String}.*"'/{gsub(/ /,"",$4);print $4}' /var/log/maillog
}

#
GetSpamID "Oct 31 14" spam
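One thing to keep in mind with the shell versions: the variables are pasted straight into the awk regex, so a "." in them acts as a wildcard and a "/" ends the pattern early. A rough Python counterpart that treats both arguments literally might look like this. The name get_spam_id and the choice to anchor the prefix at the start of the line are assumptions for the sketch, and the sample lines are the two log lines quoted in this thread.

```python
import re


def get_spam_id(lines, prefix, string):
    """Return queue IDs of lines that start with prefix and contain string."""
    # re.escape keeps characters such as "." or "/" literal.
    pattern = re.compile("^" + re.escape(prefix) + ".*" + re.escape(string))
    ids = []
    for line in lines:
        if pattern.search(line):
            parts = line.split(":")
            if len(parts) > 3:
                # Fourth colon-separated field, without the leading space.
                ids.append(parts[3].strip())
    return ids


sample = [
    "Oct 31 14:06:04 mailserver02 postfix/smtp[6737]: B39391FF67D: "
    "to=<spamyamy@yahoo.com>, relay=hostname.filter[123.123.123.123]:25, "
    "delay=1.9, delays=0.06/0/0.45/1.4, dsn=2.0.0, status=sent (250 Thanks)",
    "Oct 31 14:34:17 mailserver02 postfix/smtp[7009]: 3C9341FF9D8: "
    "to=<spamyamy@yahoo.com>, relay=outbounds8.obsmtp.com[64.18.7.12]:25, "
    "delay=4.5, delays=0.08/0/0.46/3.9, dsn=2.0.0, status=sent (250 Thanks)",
]
print(get_spam_id(sample, "Oct 31 14", "spam"))
print(get_spam_id(sample, "Oct 31 14:34", "spam"))
```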
 
2 members found this post helpful.
Old 10-31-2013, 03:13 PM   #11
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
Quote:
Originally Posted by socalheel View Post
for example, i want this line

Oct 31 14:34:17 mailserver02 postfix/smtp[7009]: 3C9341FF9D8: to=<spamyamy@yahoo.com>, relay=outbounds8.obsmtp.com[64.18.7.12]:25, delay=4.5, delays=0.08/0/0.46/3.9, dsn=2.0.0, status=sent (250 Thanks)

to only come back with
3C9341FF9D8 to=spamyamy@yahoo.com

The way I strip it down is rather ugly, and I'm not sure all of it is necessary. Here is how I do it:

Code:
grep $MAILID /var/log/maillog | egrep "from=|to=" | egrep -v "osj" | awk '{print $6,$7}' | sed -e 's/,//g' | sed -e 's/://g' | sed -e 's/>//g' | sed -e 's/<//g';done
Using a modified version of my previously posted command:
Code:
awk '/Oct 31 14/ && /spam/ { gsub(/[:,<>]/,"") ; print $6, $7 }' /var/log/maillog
B39391FF67D to=spamyamy@yahoo.com
 
2 members found this post helpful.
Old 10-31-2013, 03:24 PM   #12
Habitual
LQ Veteran
 
Registered: Jan 2011
Location: Abingdon, VA
Distribution: Catalina
Posts: 9,374
Blog Entries: 37

Rep: Reputation: Disabled
What a really Great Question!

my insanity is apparent when I grep this | grep -v that | cut -d | sed
what a mess.
 
1 members found this post helpful.
Old 10-31-2013, 03:40 PM   #13
szboardstretcher
Senior Member
 
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,278

Rep: Reputation: 1694
Alrighty, well, no one has mentioned Python yet:

Code:
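import re

# Iterating over the file object reads one line at a time.
with open('maillog', 'r') as f:
    for line in f:
        # Skip lines mentioning "osj"; keep only from=/to= lines.
        if re.search('osj', line) or not re.search('from=|to=', line):
            continue
        # Strip the punctuation, then print the queue ID and address fields.
        fields = re.sub('[:<>,]', '', line).split()
        print(fields[5], fields[6])
Iterating over the file object reads it lazily, one line at a time, so it should work fine on big files.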
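A possible variation is to let one compiled pattern do both the matching and the extraction, which avoids relying on field positions. This is a sketch under assumptions: the ENTRY pattern and the summarize name are hypothetical, and the pattern expects queue IDs made of uppercase hex digits, as in the log lines shown in this thread.

```python
import re

# Assumed shape: "]: QUEUEID: to=<address>" or "]: QUEUEID: from=<address>".
ENTRY = re.compile(r"\]: (?P<qid>[0-9A-F]+): (?P<kind>from|to)=<(?P<addr>[^>]*)>")


def summarize(lines):
    """Yield 'QUEUEID to=address' for from=/to= lines that lack 'osj'."""
    for line in lines:
        if "osj" in line:
            continue
        match = ENTRY.search(line)
        if match:
            yield "{qid} {kind}={addr}".format(**match.groupdict())


# Sample log line from post #9.
sample = [
    "Oct 31 14:34:17 mailserver02 postfix/smtp[7009]: 3C9341FF9D8: "
    "to=<spamyamy@yahoo.com>, relay=outbounds8.obsmtp.com[64.18.7.12]:25, "
    "delay=4.5, delays=0.08/0/0.46/3.9, dsn=2.0.0, status=sent (250 Thanks)",
]
results = list(summarize(sample))
print(results)
```

Since summarize is a generator over any iterable of lines, passing it an open file object keeps the line-at-a-time reading.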

Last edited by szboardstretcher; 10-31-2013 at 03:45 PM.
 
2 members found this post helpful.
Old 10-31-2013, 04:44 PM   #14
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881

Rep: Reputation: 660
With this InFile ...
Code:
Oct 30 14:34:17 mailserver02 postfix/smtp[7009]: 3C9341FF9D8: to=<bogus@yahoo.com>, relay=outbounds8.obsmtp.com[64.18.7.12]:25, delay=4.5, delays=0.08/0/0.46/3.9, dsn=2.0.0, status=sent (250 Thanks)
Oct 31 14:34:17 mailserver02 postfix/smtp[7009]: 3C9341FF9D8: to=<spamyamy@yahoo.com>, relay=outbounds8.obsmtp.com[64.18.7.12]:25, delay=4.5, delays=0.08/0/0.46/3.9, dsn=2.0.0, status=sent (250 Thanks)
Nov 01 14:34:17 mailserver02 postfix/smtp[7009]: 3C9341FF9D8: to=<dontwant@yahoo.com>, relay=outbounds8.obsmtp.com[64.18.7.12]:25, delay=4.5, delays=0.08/0/0.46/3.9, dsn=2.0.0, status=sent (250 Thanks)
... this awk ...
Code:
awk 'BEGIN{FS=":|,"} /^Oct 31 14/ {print $4$5}' $InFile >$OutFile
... produced this OutFile ...
Code:
 3C9341FF9D8 to=<spamyamy@yahoo.com>
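The output above still carries a leading space and the angle brackets. As a sketch only, here is the same colon-or-comma split in Python with the extra cleanup applied; the variable names are illustrative and the line is the Oct 31 entry from the InFile.

```python
import re

line = ("Oct 31 14:34:17 mailserver02 postfix/smtp[7009]: 3C9341FF9D8: "
        "to=<spamyamy@yahoo.com>, relay=outbounds8.obsmtp.com[64.18.7.12]:25, "
        "delay=4.5, delays=0.08/0/0.46/3.9, dsn=2.0.0, status=sent (250 Thanks)")

# Same separators as FS=":|," in the awk command.
fields = re.split(r"[:,]", line)

# awk's $4 and $5 are indexes 3 and 4 here.
queue_id = fields[3].strip()
recipient = fields[4].strip().replace("<", "").replace(">", "")

result = queue_id + " " + recipient
print(result)
```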
Daniel B. Martin

Last edited by danielbmartin; 10-31-2013 at 04:48 PM. Reason: Elaborate the InFile to make a better test
 
1 members found this post helpful.
Old 10-31-2013, 07:04 PM   #15
socalheel
Member
 
Registered: Oct 2012
Location: Raleigh, NC
Distribution: CentOS / RHEL
Posts: 158

Original Poster
Rep: Reputation: 3
Man, you guys are absolutely amazing ... all these different ways to get the same result, and each one teaches me something as well.

You rock ... thank you.
 
  

