10-31-2013, 02:11 PM
#1
Member
Registered: Oct 2012
Location: Raleigh, NC
Distribution: CentOS / RHEL
Posts: 158
Rep:
text manipulating ... the more efficient way to grab a string
what's the most efficient way to grab B39391FF67D from this line when i grep a file?
Code:
Oct 31 14:06:04 mailserver02 postfix/smtp[6737]: B39391FF67D: to=<spamyamy@yahoo.com>, relay=hostname.filter[123.123.123.123]:25, delay=1.9, delays=0.06/0/0.45/1.4, dsn=2.0.0, status=sent (250 Thanks)
how i normally do it is to chain awk, cut, and sed commands in a one-liner, but i know there is a more efficient way to do this.
my method is
Code:
grep -i spam /var/log/maillog | grep "Oct 31 14" | awk '{print $6}' | sed -e 's/://'
Last edited by socalheel; 10-31-2013 at 03:04 PM .
10-31-2013, 02:43 PM
#2
Senior Member
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,278
I don't see anything wrong with what you are doing, but if you want a smaller command:
Code:
grep -i spam /var/log/maillog | grep "Oct 31 14" | cut -d ":" -f4
2 members found this post helpful.
10-31-2013, 02:45 PM
#3
Member
Registered: Oct 2012
Location: Raleigh, NC
Distribution: CentOS / RHEL
Posts: 158
Original Poster
Rep:
ah maybe that was a bad example, but there are instances where i have all three awk/cut/sed in the same line and i'm not sure if there's a better way to extract what i need.
let me gather up a better example.
10-31-2013, 02:49 PM
#4
Member
Registered: Oct 2012
Location: Raleigh, NC
Distribution: CentOS / RHEL
Posts: 158
Original Poster
Rep:
oh i just noticed with your cut -f 4 -d ":" command, that gives me a space in front of my number and i still have to use sed to remove it ...
is that correct?
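For what it's worth, here is a minimal sketch of one way around the leading space: split on single spaces instead of colons, then drop the trailing colon with tr. The helper name queue_id_by_space and the sample line are only for illustration. It assumes a two-digit day of the month, since syslog pads single-digit days with an extra space and that shifts cut's field numbers.

```shell
# Hypothetical helper: reads log lines on stdin, prints field 6 without its colon.
queue_id_by_space() {
    cut -d ' ' -f 6 | tr -d ':'
}

# Sample line taken from the first post.
line='Oct 31 14:06:04 mailserver02 postfix/smtp[6737]: B39391FF67D: to=<spamyamy@yahoo.com>, relay=hostname.filter[123.123.123.123]:25, delay=1.9, delays=0.06/0/0.45/1.4, dsn=2.0.0, status=sent (250 Thanks)'
printf '%s\n' "$line" | queue_id_by_space
```

Whether this beats cut plus sed is debatable; it is still two processes, just without the regex.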
10-31-2013, 02:59 PM
#5
Senior Member
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,278
I would use your original command. It's nice.
Do you have any reason for looking to tune this? Usually we only do that if we have to search billions of records and such.
10-31-2013, 03:02 PM
#6
LQ Veteran
Registered: Sep 2003
Posts: 10,532
It seems you want to have 2 search criteria: spam and Oct 31 14. If both are found, you want the B39391FF67D string.
Assuming that the layout of such a line is always the same (i.e. $6 is always the wanted field), have a look at this:
Code:
awk '/Oct 31 14/ && /spam/ { gsub(/:/,"") ; print $6 }' /var/log/maillog
B39391FF67D
If you need a case-insensitive search (GNU Awk only...):
Code:
awk 'BEGIN{IGNORECASE=1}/oCt 31 14/ && /SpAm/ { gsub(/:/,"") ; print $6 }' /var/log/maillog
B39391FF67D
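IGNORECASE is a gawk extension. As a rough sketch of a portable alternative, lower-casing each line with tolower() (part of POSIX awk) and matching against lower-case patterns should give the same result for this log layout. The function name find_id_nocase and the sample line are made up for the example, and sub() here strips the colon from $6 only instead of from the whole line.

```shell
# Hypothetical helper: reads log lines on stdin.
# Patterns must be written in lower case because they are matched against tolower($0).
find_id_nocase() {
    awk '{ lc = tolower($0) }
         lc ~ /oct 31 14/ && lc ~ /spam/ { sub(/:$/, "", $6); print $6 }'
}

# Sample line with mixed-case text to show the match ignores case.
line='Oct 31 14:06:04 mailserver02 postfix/smtp[6737]: B39391FF67D: to=<SPAMyamy@yahoo.com>, status=sent (250 Thanks)'
printf '%s\n' "$line" | find_id_nocase
```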
2 members found this post helpful.
10-31-2013, 03:03 PM
#7
Member
Registered: Oct 2012
Location: Raleigh, NC
Distribution: CentOS / RHEL
Posts: 158
Original Poster
Rep:
well i have a script that runs every hour to grep our maillog for a certain entry, and if that entry is present, do a few other things then email out an alert.
i know it's not too resource intensive, but i like to minimize every little thing i can so all these little "resource grabbers" don't grow into something that would cause a headache later.
10-31-2013, 03:04 PM
#8
Senior Member
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881
Consider:
Code:
awk -F":" '{print $4}'
Daniel B. Martin
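A possible variation, sketched on the assumption that the separators of interest are always a colon followed by a space: with ': ' as the field separator, the ID becomes field 2 and there is no leading space left to trim. The colons inside the timestamp and the ":25" port are not followed by a space, so they do not split the line. The wrapper name queue_id_colon_space is only for this example.

```shell
# Hypothetical helper: reads log lines on stdin and prints the second
# field when fields are separated by colon-plus-space.
queue_id_colon_space() {
    awk -F': ' '{ print $2 }'
}

line='Oct 31 14:06:04 mailserver02 postfix/smtp[6737]: B39391FF67D: to=<spamyamy@yahoo.com>, relay=hostname.filter[123.123.123.123]:25, delay=1.9, delays=0.06/0/0.45/1.4, dsn=2.0.0, status=sent (250 Thanks)'
printf '%s\n' "$line" | queue_id_colon_space
```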
2 members found this post helpful.
10-31-2013, 03:06 PM
#9
Member
Registered: Oct 2012
Location: Raleigh, NC
Distribution: CentOS / RHEL
Posts: 158
Original Poster
Rep:
for example, i want this line
Oct 31 14:34:17 mailserver02 postfix/smtp[7009]: 3C9341FF9D8: to=<spamyamy@yahoo.com>, relay=outbounds8.obsmtp.com[64.18.7.12]:25, delay=4.5, delays=0.08/0/0.46/3.9, dsn=2.0.0, status=sent (250 Thanks)
to only come back with
3C9341FF9D8 to=spamyamy@yahoo.com
and how i get that stripped down is rather ugly, and i'm not sure it's necessary. here is how i get it:
Code:
grep $MAILID /var/log/maillog | egrep "from=|to=" | egrep -v "osj" | awk '{print $6,$7}' | sed -e 's/,//g' | sed -e 's/://g' | sed -e 's/>//g' | sed -e 's/<//g';done
Last edited by socalheel; 10-31-2013 at 03:07 PM .
10-31-2013, 03:11 PM
#10
Senior Member
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683
Quote:
Originally Posted by
szboardstretcher
I don't see anything wrong with what you are doing, but if you want a smaller command:
Code:
grep -i spam /var/log/maillog | grep "Oct 31 14" | cut -d ":" -f4
you end up with a 'leading' space
Code:
awk -F\: '/^Oct 31 14.*spam.*/{gsub(/ /,"",$4);print $4}' /var/log/maillog
or
Code:
grep "Oct 31 14.*spam.*" /var/log/maillog | cut -d\: -f4 | sed 's/ //'
to 'feed' awk,
Code:
Date="Oct 31"
Hour="14"
String="spam"
awk -F\: '/'"${Date} ${Hour}.*${String}.*"'/{gsub(/ /,"",$4);print $4}' /var/log/maillog
alternatively, as a function
Code:
#!/bin/bash
GetSpamID () {
    Date="$1 $2"
    Hour="$3"
    String="$4"
    awk -F\: '/'"${Date} ${Hour}.*${String}.*"'/{gsub(/ /,"",$4);print $4}' /var/log/maillog
}
#
GetSpamID Oct 31 14 spam
probably makes sense to further break it down to month day, hour
or, this form
Code:
GetSpamID () {
    Prefix="$1"
    String="$2"
    awk -F\: '/'"${Prefix}.*${String}.*"'/{gsub(/ /,"",$4);print $4}' /var/log/maillog
}
#
GetSpamID "Oct 31 14" spam
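Another way to feed awk, sketched here as an alternative to splicing shell variables into the program text: pass them in with -v and compare with index(), which does plain substring matching, so characters like "." or "[" in the search string are not treated as regex syntax. This is a different technique from the regex versions above. The name get_spam_id and reading from stdin are choices made for the example, and it accepts the search string anywhere on the line.

```shell
# Hypothetical helper: $1 = timestamp prefix, $2 = search string; log lines on stdin.
get_spam_id() {
    awk -F: -v prefix="$1" -v needle="$2" '
        # index() == 1 means the line starts with the prefix;
        # a non-zero index() means the needle occurs somewhere in the line.
        index($0, prefix) == 1 && index($0, needle) {
            gsub(/ /, "", $4)
            print $4
        }'
}

line='Oct 31 14:06:04 mailserver02 postfix/smtp[6737]: B39391FF67D: to=<spamyamy@yahoo.com>, relay=hostname.filter[123.123.123.123]:25, delay=1.9, delays=0.06/0/0.45/1.4, dsn=2.0.0, status=sent (250 Thanks)'
printf '%s\n' "$line" | get_spam_id "Oct 31 14" spam
```

One caveat: awk still interprets backslash escapes in values given with -v, so a search string containing backslashes would need extra care.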
2 members found this post helpful.
10-31-2013, 03:13 PM
#11
LQ Veteran
Registered: Sep 2003
Posts: 10,532
Quote:
Originally Posted by
socalheel
for example, i want this line
Oct 31 14:34:17 mailserver02 postfix/smtp[7009]: 3C9341FF9D8: to=<spamyamy@yahoo.com>, relay=outbounds8.obsmtp.com[64.18.7.12]:25, delay=4.5, delays=0.08/0/0.46/3.9, dsn=2.0.0, status=sent (250 Thanks)
to only come back with
3C9341FF9D8 to=spamyamy@yahoo.com
and how i get that stripped down is rather ugly, and i'm not sure it's necessary. here is how i get it:
Code:
grep $MAILID /var/log/maillog | egrep "from=|to=" | egrep -v "osj" | awk '{print $6,$7}' | sed -e 's/,//g' | sed -e 's/://g' | sed -e 's/>//g' | sed -e 's/<//g';done
Using a modified version of my previously posted command:
Code:
awk '/Oct 31 14/ && /spam/ { gsub(/[:,<>]/,"") ; print $6, $7 }' /var/log/maillog
B39391FF67D to=spamyamy@yahoo.com
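For comparison, a sed-only sketch of the same extraction, which also keeps the "osj" exclusion and the from=/to= filter from the quoted pipeline. It assumes a sed that accepts -E for extended regular expressions (older GNU sed spells it -r) and that the queue ID is made of digits and upper-case hex letters, as in the samples in this thread. The function name id_and_address is hypothetical.

```shell
# Hypothetical helper: reads log lines on stdin.
# Drops lines containing "osj", then prints "<queue id> from=/to=<address>"
# with the angle brackets removed.
id_and_address() {
    sed -nE -e '/osj/d' \
            -e 's/^.*\]: ([0-9A-F]+): ((from|to)=)<([^>]*)>.*$/\1 \2\4/p'
}

line='Oct 31 14:34:17 mailserver02 postfix/smtp[7009]: 3C9341FF9D8: to=<spamyamy@yahoo.com>, relay=outbounds8.obsmtp.com[64.18.7.12]:25, delay=4.5, delays=0.08/0/0.46/3.9, dsn=2.0.0, status=sent (250 Thanks)'
printf '%s\n' "$line" | id_and_address
```

Because the pattern anchors on the "]: " that follows the process ID, it does not depend on counting whitespace-separated fields.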
2 members found this post helpful.
10-31-2013, 03:24 PM
#12
LQ Veteran
Registered: Jan 2011
Location: Abingdon, VA
Distribution: Catalina
Posts: 9,374
Rep:
What a really Great Question!
my insanity is apparent when I grep this | grep -v that | cut -d | sed
what a mess.
1 member found this post helpful.
10-31-2013, 03:40 PM
#13
Senior Member
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,278
Alrighty, well, no one has mentioned Python yet:
Code:
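import re

# Read the log lazily, one line at a time.
with open('maillog') as f:
    for line in f:
        # Skip "osj" lines; keep only lines that carry from= or to=.
        if not re.search('osj', line) and re.search('from=|to=', line):
            # Strip the punctuation, then print the 6th and 7th fields.
            clean = re.sub('[:<>,]', '', line)
            fields = clean.split()
            print(fields[5], fields[6])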
Iterating over the file object reads one line at a time, so it should work fine on big files.
Last edited by szboardstretcher; 10-31-2013 at 03:45 PM .
2 members found this post helpful.
10-31-2013, 04:44 PM
#14
Senior Member
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Mint 17.3
Posts: 1,881
With this InFile ...
Code:
Oct 30 14:34:17 mailserver02 postfix/smtp[7009]: 3C9341FF9D8: to=<bogus@yahoo.com>, relay=outbounds8.obsmtp.com[64.18.7.12]:25, delay=4.5, delays=0.08/0/0.46/3.9, dsn=2.0.0, status=sent (250 Thanks)
Oct 31 14:34:17 mailserver02 postfix/smtp[7009]: 3C9341FF9D8: to=<spamyamy@yahoo.com>, relay=outbounds8.obsmtp.com[64.18.7.12]:25, delay=4.5, delays=0.08/0/0.46/3.9, dsn=2.0.0, status=sent (250 Thanks)
Nov 01 14:34:17 mailserver02 postfix/smtp[7009]: 3C9341FF9D8: to=<dontwant@yahoo.com>, relay=outbounds8.obsmtp.com[64.18.7.12]:25, delay=4.5, delays=0.08/0/0.46/3.9, dsn=2.0.0, status=sent (250 Thanks)
... this awk ...
Code:
awk 'BEGIN{FS=":|,"} /^Oct 31 14/ {print $4$5}' $InFile >$OutFile
... produced this OutFile ...
Code:
3C9341FF9D8 to=<spamyamy@yahoo.com>
Daniel B. Martin
Last edited by danielbmartin; 10-31-2013 at 04:48 PM .
Reason: Elaborate the InFile to make a better test
1 member found this post helpful.
10-31-2013, 07:04 PM
#15
Member
Registered: Oct 2012
Location: Raleigh, NC
Distribution: CentOS / RHEL
Posts: 158
Original Poster
Rep:
man you guys are absolutely amazing ... all these different ways to get the same result, and each one teaches me something as well.
you rock ... thank you.