LinuxQuestions.org
Old 01-28-2008, 11:00 PM   #1
matyu
LQ Newbie
 
Registered: Jun 2006
Posts: 5

Rep: Reputation: 0
bash script to use sed to filter multiple patterns from apache access logs


Hi,

I would like to write a script that filters multiple pattern strings out of the apache access log and writes the result to a new output file. Below is my old script:

sed -e '/keepalive.html/d' -e '/Yahoo! Slurp/d' -e '/help.yahoo.com\/help\/us\/ysearch\/slurp/d' \
    -e '/msnbot-media/d' -e '/MSRBOT/d' -e '/Googlebot/d' -e '/+c.root+c.l\[l\-1\].arrow+/d' \
    /usr/log/apache2/access.log > /tmp/www4-test/access-test

But I want to improve the script by putting all those patterns in a file, so the script can read them from there.

Any suggestions or examples of how to do that?

Thank you very much.
 
Old 01-29-2008, 12:35 AM   #2
angrybanana
Member
 
Registered: Oct 2003
Distribution: Archlinux
Posts: 147

Rep: Reputation: 21
Why not use grep?
Code:
grep -vf patternfile /usr/log/apache2/access.log > /tmp/www4-test/access-test
 
Old 02-06-2008, 08:18 AM   #3
archtoad6
Senior Member
 
Registered: Oct 2004
Location: Houston, TX (usa)
Distribution: MEPIS, Debian, Knoppix,
Posts: 4,727
Blog Entries: 15

Rep: Reputation: 234
Excellent suggestion to use a grep pattern file. If you can make all the patterns "fixed" -- i.e. plain strings, not regexes -- then you could add the "-F" option. With "-F" every line of the file is matched literally, so the patterns must not carry any backslash escapes.
1st put the patterns in 'patternfile' & check its contents:
Code:
$ cat patternfile
keepalive.html
Yahoo! Slurp
help.yahoo.com/help/us/ysearch/slurp
msnbot-media
MSRBOT
Googlebot
+c.root+c.l[l-1].arrow+
Then run the code:
Code:
PATTERNFILE='patternfile'
LOGFILE='/usr/log/apache2/access.log'
TESTFILE='/tmp/www4-test/access-test'

grep -vFf "$PATTERNFILE" "$LOGFILE" > "$TESTFILE"

Last edited by archtoad6; 02-06-2008 at 08:29 AM.
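
For anyone who wants to try the grep approach without touching a real log, here is a minimal self-contained sketch. The temp directory, the three sample log lines and the two patterns are made up for illustration; only the grep options come from the posts above.

```shell
#!/bin/sh
# Hypothetical sample data, only for trying out grep -vFf.
WORKDIR=$(mktemp -d)
PATTERNFILE="$WORKDIR/patternfile"
LOGFILE="$WORKDIR/access.log"
TESTFILE="$WORKDIR/access-test"

# One fixed string per line; avoid blank lines, an empty pattern matches every line.
printf '%s\n' 'keepalive.html' 'Googlebot' > "$PATTERNFILE"

printf '%s\n' \
  '10.0.0.1 "GET /keepalive.html HTTP/1.1" 200' \
  '10.0.0.2 "GET /index.html HTTP/1.1" 200 "Googlebot/2.1"' \
  '10.0.0.3 "GET /keepaliveXhtml HTTP/1.1" 404' > "$LOGFILE"

# -v: print the lines that do NOT match
# -F: treat patterns as literal strings, not regexes
# -f: read the patterns from a file
grep -vFf "$PATTERNFILE" "$LOGFILE" > "$TESTFILE"

cat "$TESTFILE"
```

With -F only the third sample line should survive, because the dot in keepalive.html is taken literally. Drop the -F and the dot matches any character, so that line gets filtered out too.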
 
Old 02-06-2008, 08:34 AM   #4
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 682
You could also rewrite your sed command as a sed script. Then you can modify the sed script to make changes.

Code:
/keepalive.html/d
/Yahoo! Slurp/d
/help.yahoo.com\/help\/us\/ysearch\/slurp/d
/msnbot-media/d
/MSRBOT/d
/Googlebot/d
/+c.root+c.l\[l\-1\].arrow+/d
Save that as 'filterlog.sed', then run:
Code:
sed -f filterlog.sed /usr/log/apache2/access.log > /tmp/www4-test/access-test
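
If you would rather keep one plain list of fixed strings and still use sed, the sed script can be generated from that list. This is only a sketch under some assumptions: the file names and sample lines are invented, the patterns are meant as literal strings, and it was written with GNU sed in mind.

```shell
#!/bin/sh
# Sketch: build a sed script from a file of fixed strings (all names here are made up).
WORKDIR=$(mktemp -d)
PATTERNFILE="$WORKDIR/patternfile"
SEDSCRIPT="$WORKDIR/filterlog.sed"
LOGFILE="$WORKDIR/access.log"

printf '%s\n' \
  'keepalive.html' \
  'help.yahoo.com/help/us/ysearch/slurp' \
  '+c.root+c.l[l-1].arrow+' > "$PATTERNFILE"

printf '%s\n' \
  'GET /keepalive.html' \
  'GET /a "http://help.yahoo.com/help/us/ysearch/slurp"' \
  'GET /x?+c.root+c.l[l-1].arrow+' \
  'GET /index.html' > "$LOGFILE"

# 1st expression: skip blank lines in the pattern file
# 2nd expression: backslash-escape ] [ \ / . * ^ $ so they match literally
# 3rd expression: wrap each escaped string as /string/d
sed -e '/^$/d' \
    -e 's|[][\/.*^$]|\\&|g' \
    -e 's|.*|/&/d|' "$PATTERNFILE" > "$SEDSCRIPT"

cat "$SEDSCRIPT"
sed -f "$SEDSCRIPT" "$LOGFILE"
```

Of the four sample lines only the /index.html request should be left. The pattern file stays readable, and the same file can be fed to grep -F as well.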
 
Old 02-06-2008, 10:58 AM   #5
archtoad6
Senior Member
 
Registered: Oct 2004
Location: Houston, TX (usa)
Distribution: MEPIS, Debian, Knoppix,
Posts: 4,727
Blog Entries: 15

Rep: Reputation: 234
I wonder which method is quicker & if it matters.

matyu,
Could you put one of your logs on a pastebin & post a link to it here? Please, please, please, do not clutter this thread by posting the log itself, as it surely will if it's long enough to give those of us who want to play w/ it a good sample.
 
Old 02-06-2008, 10:28 PM   #6
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
The sed method with the "d" command may be slower, as it has to "search" each line and then "delete" the ones that match.
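
Until a real log turns up, the speed question can be checked roughly on generated data. The sketch below is an assumption-heavy stand-in: the 200000 fake lines, the three patterns and the file names are invented, and the millisecond timer relies on the %N format of GNU date. Numbers will differ from machine to machine, so treat them as a hint, not a benchmark.

```shell
#!/bin/sh
# Rough timing sketch on made-up data; %N needs GNU date.
WORKDIR=$(mktemp -d)
LOGFILE="$WORKDIR/access.log"

printf '%s\n' 'keepalive.html' 'Googlebot' 'msnbot-media' > "$WORKDIR/patternfile"
printf '%s\n' '/keepalive\.html/d' '/Googlebot/d' '/msnbot-media/d' > "$WORKDIR/filterlog.sed"

# 200000 fake log lines; every 4th one mentions Googlebot
awk 'BEGIN { for (i = 1; i <= 200000; i++) print "10.0.0.1 GET /page" i (i % 4 == 0 ? " Googlebot" : "") }' > "$LOGFILE"

# Current time in milliseconds
now_ms() { echo $(( $(date +%s%N) / 1000000 )); }

t0=$(now_ms)
grep -vFf "$WORKDIR/patternfile" "$LOGFILE" > "$WORKDIR/out.grep"
t1=$(now_ms)
sed -f "$WORKDIR/filterlog.sed" "$LOGFILE" > "$WORKDIR/out.sed"
t2=$(now_ms)

echo "grep -vFf: $((t1 - t0)) ms"
echo "sed -f:    $((t2 - t1)) ms"

# Both methods should keep exactly the same lines
cmp -s "$WORKDIR/out.grep" "$WORKDIR/out.sed" && echo "outputs identical"
```

Whichever tool comes out ahead, the cmp check is worth keeping: it confirms that the pattern file and the sed script really filter the same lines before the timings are compared.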
 
  

