LinuxQuestions.org
LinuxQuestions.org > Forums > Linux Forums > Linux - Software
08-28-2008, 03:46 PM   #1
cmeyer
LQ Newbie
 
Registered: Aug 2008
Posts: 5

Rep: Reputation: 0
sed, awk - solution for filtering logs


I am trying to write a simple script to filter logs. The problem is that the lines the logging system generates do not put each piece of information in a fixed column. As a result I need to look for the text between two points.

Example (all one line):

19Aug2008 14:13:38 208.246.35.27 rule: 56; rule_uid: {7C56FACF1-638C-4BA-B400-65513F787C}; rule_name: 56 - This is a Test; service_id: smtp_30; src: 192.68.10.23; dst: 10.80.110.20; proto: tcp; service: smtp_30; s_port: 49594;

Even in this simple example you can see that the field "rule_name:" can change the number of columns in the line. I am trying to find a way to grab, line by line, the data between "rule_name:" and the next ";". Does anyone have a way to do this?

I found the example:
sed -n '/Iowa/,/Montana/p'

Unfortunately this does not get down to the granularity of a single line.

-Craig
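
Note on the sed example above: a range address such as /Iowa/,/Montana/ selects whole lines, from the first line matching the first pattern through the next line matching the second. To pull text out of the middle of a single line, a substitution with a capture group is the usual tool. A minimal sketch follows; the state names and the short log line are made-up sample data, not taken from the real logs.

```shell
# Hypothetical sample data, only to show the difference between the two forms.
states='Idaho
Iowa
Kansas
Montana
Nevada'

# Range address: prints every whole line from the first match to the second.
range_out=$(printf '%s\n' "$states" | sed -n '/Iowa/,/Montana/p')

# Substitution with a capture group: keeps only the text between
# "rule_name: " and the next ";" on the same line.
inline_out=$(printf '%s\n' 'a: 1; rule_name: 56 - This is a Test; b: 2;' |
    sed -n 's/.*rule_name: \([^;]*\);.*/\1/p')

printf '%s\n' "$range_out" "$inline_out"
```

The first command prints the three lines Iowa, Kansas and Montana; the second prints only "56 - This is a Test".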
 
08-28-2008, 04:23 PM   #2
matthewg42
Senior Member
 
Registered: Oct 2003
Location: UK
Distribution: Kubuntu 12.10 (using awesome wm though)
Posts: 3,530

Rep: Reputation: 65
The example you posted might have been wrapped strangely. If you edit the post and put the log file example in [code] tags, that would be helpful, as it preserves whitespace and does not insert newline characters.

Please include several lines of log.
 
08-28-2008, 04:31 PM   #3
marozsas
Senior Member
 
Registered: Dec 2005
Location: Campinas/SP - Brazil
Distribution: SuSE, RHEL, Fedora, Ubuntu
Posts: 1,499
Blog Entries: 2

Rep: Reputation: 68
Code:
grep -E -o "rule_name:[^;]+;"
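
For anyone trying this: -o makes grep print only the matched text instead of the whole line. It is not in POSIX, but GNU and BSD grep both support it. A small sketch with the sample line from the first post, plus a second stage (my addition, not part of the command above) that strips the label and the closing ";":

```shell
line='19Aug2008 14:13:38 208.246.35.27 rule: 56; rule_uid: {7C56FACF1-638C-4BA-B400-65513F787C}; rule_name: 56 - This is a Test; service_id: smtp_30; src: 192.68.10.23; dst: 10.80.110.20; proto: tcp; service: smtp_30; s_port: 49594;'

# -o prints only the matched text: the label, the value and the closing ";".
matched=$(printf '%s\n' "$line" | grep -E -o 'rule_name:[^;]+;')

# Second stage: remove the label and the ";" so only the value is left.
value=$(printf '%s\n' "$matched" | sed -e 's/^rule_name: *//' -e 's/;$//')

printf '%s\n' "$matched" "$value"
```

This prints "rule_name: 56 - This is a Test;" followed by "56 - This is a Test".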
 
08-28-2008, 07:01 PM   #4
cmeyer
LQ Newbie
 
Registered: Aug 2008
Posts: 5

Original Poster
Rep: Reputation: 0
The solution from marozsas is innovative, but I'm not sure I can chain it together like I can with the "-e" option of sed. It also keeps information beyond the data I want. What I am looking to do is filter a number of lines (firewall logs) that have similar data but also parts that always differ (timestamps).

One of my colleagues who does a lot of system administration suggested the following:
Code:
 sed -n 's/.*rule: \(.*\); rule_uid.*/\1/p'
This works well and can probably be chained with "-e", but I'm not sure I can confirm that "rule_uid" will **always** follow "rule:"
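
One way to avoid depending on "rule_uid" is to stop at the next ";" instead of at a named field. This is only a sketch: the helper name rule_number is made up, the input lines are shortened, and the second one (with no rule_uid) is hypothetical. It also assumes "rule: " is always preceded by a space and that the value itself never contains a ";".

```shell
# Hypothetical helper; reads log lines on standard input.
rule_number() {
    # [^;]* cannot cross a ";", so the capture ends at the first ";" after " rule: ".
    sed -n 's/.* rule: \([^;]*\);.*/\1/p'
}

# Shortened sample lines; the second one deliberately has no rule_uid field.
printf '%s\n' \
    '18Aug2008 17:35:56 accept 192.168.5.23 rule: 28; rule_uid: {DA2C3F50-499B3B32}; rule_name: VPN to Internal; src: 10.5.3.1;' \
    '18Aug2008 17:35:55 accept 192.168.10.8 rule: 22; rule_name: To Internal; src: 192.168.66.42;' |
    rule_number
```

This prints 28 and 22, one per line, whether or not rule_uid is present.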

So what I am looking for is a way to turn a line like this:
19Aug2008 14:13:38 208.246.35.27 rule: 56; rule_uid: {7C56FACF1-638C-4BA-B400-65513F787C}; rule_name: 56 - This is a Test; service_id: smtp_30; src: 192.68.10.23; dst: 10.80.110.20; proto: tcp; service: smtp_30; s_port: 49594;

Into something like this:
source: 192.68.10.23 destination: 10.80.110.20 protocol: tcp service: smtp_30

To do that I need a code snippet that can match a pattern (example: "src: "), then match the next occurrence of a character (example: ";"), and give me the data in between. I also need to be able to chain it so that I can match multiple patterns on the same line (ex: src, dst, service ...).

Here are some additional logs:
Code:
18Aug2008 17:35:56 accept 192.168.5.23 rule: 28; rule_uid: {DA2C3F50-499B3B32}; rule_name: VPN to Internal; service_id: snmp; src: 10.5.3.1; dst: 172.18.34.2; proto: udp; service: snmp; s_port: 1025;
18Aug2008 17:35:55 accept 192.168.10.8 rule: 17; rule_uid: {C730785E-6E7A2A74}; rule_name: Caching DNS to Inet; session_id: 41833; dns_query: www.google.com ; dns_type: A; service_id: domain-udp; src: 61.28.5.4; dst: 204.2.243.44; proto: udp; service: domain-udp; s_port: 33551;
18Aug2008 17:35:55 accept 192.168.10.8 rule: 22; rule_uid: {FCD9233-B070260780}; rule_name: To Internal; src: 192.168.66.42; dst: 10.24.3.6; proto: tcp; service: 1533; s_port: 44552;
-Craig
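
The "match a key, then take everything up to the next ;" requirement can be sketched as a small shell function around sed. The name field is made up for illustration. It assumes the key is preceded by a space, contains no regex metacharacters, and that values never contain ";".

```shell
# Hypothetical helper: print the value that follows " KEY: " up to the next ";".
# Lines that do not contain the key produce no output.
field() {
    sed -n "s/.* $1: \([^;]*\);.*/\1/p"
}

line='18Aug2008 17:35:55 accept 192.168.10.8 rule: 22; rule_uid: {FCD9233-B070260780}; rule_name: To Internal; src: 192.168.66.42; dst: 10.24.3.6; proto: tcp; service: 1533; s_port: 44552;'

src=$(printf '%s\n' "$line" | field src)
dst=$(printf '%s\n' "$line" | field dst)
proto=$(printf '%s\n' "$line" | field proto)
service=$(printf '%s\n' "$line" | field service)

printf 'source: %s destination: %s protocol: %s service: %s\n' \
    "$src" "$dst" "$proto" "$service"
```

For the third sample line this prints "source: 192.168.66.42 destination: 10.24.3.6 protocol: tcp service: 1533". It runs sed once per field, so it is easy to read but slow on large logs.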
 
08-28-2008, 08:33 PM   #5
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,126

Rep: Reputation: 4120
I'm learning that you can do all sorts of arcane things with sed. I tend to use it for known, well-formed (i.e. simple) data.
I'm sure awk can do what you want - personally I'd use perl.
 
08-28-2008, 09:26 PM   #6
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 682
Awk may be a better choice for log data. Log text usually has a fixed format.

Code:
awk -F\; '{ print $2 }' junkfile

 rule_uid: {DA2C3F50-499B3B32}
 rule_uid: {C730785E-6E7A2A74}
 rule_uid: {FCD9233-B070260780}
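
Splitting on ";" works by position, and in the sample logs the position of src and dst changes from line to line (the DNS line has extra fields). A sketch of the same awk idea that looks fields up by name instead: the function name summarize and the output labels are assumptions taken from the desired output in post #4, not something awk provides.

```shell
# Hypothetical helper; reads log lines on standard input.
summarize() {
    awk -F';' '{
        split("", val)                  # forget values from the previous line
        for (i = 1; i <= NF; i++) {
            f = $i
            sub(/^ +/, "", f)           # drop the space that follows each ";"
            p = index(f, ": ")          # position of the first "key: value" separator
            if (p > 0)
                val[substr(f, 1, p - 1)] = substr(f, p + 2)
        }
        if ("src" in val)               # skip lines that carry no src field
            printf "source: %s destination: %s protocol: %s service: %s\n",
                val["src"], val["dst"], val["proto"], val["service"]
    }'
}

printf '%s\n' '18Aug2008 17:35:55 accept 192.168.10.8 rule: 17; rule_uid: {C730785E-6E7A2A74}; rule_name: Caching DNS to Inet; session_id: 41833; dns_query: www.google.com ; dns_type: A; service_id: domain-udp; src: 61.28.5.4; dst: 204.2.243.44; proto: udp; service: domain-udp; s_port: 33551;' |
    summarize
```

For the DNS sample line this prints "source: 61.28.5.4 destination: 204.2.243.44 protocol: udp service: domain-udp". Usage on a file would be: summarize < junkfile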

Last edited by jschiwal; 08-28-2008 at 10:21 PM.
 
08-29-2008, 09:26 AM   #7
marozsas
Senior Member
 
Registered: Dec 2005
Location: Campinas/SP - Brazil
Distribution: SuSE, RHEL, Fedora, Ubuntu
Posts: 1,499
Blog Entries: 2

Rep: Reputation: 68
hi,

It is not clear what you want.
Based on your last post, the RE below matches the source/destination information as well as the rule_name string from your first post:

Code:
... | grep -E -o "rule_name:.+"
rule_name: VPN to Internal; service_id: snmp; src: 10.5.3.1; dst: 172.18.34.2; proto: udp; service: snmp; s_port: 1025;
rule_name: Caching DNS to Inet; session_id: 41833; dns_query: www.google.com ; dns_type: A; service_id: domain-udp; src: 61.28.5.4; dst: 204.2.243.44; proto: udp; service: domain-udp; s_port: 33551;
rule_name: To Internal; src: 192.168.66.42; dst: 10.24.3.6; proto: tcp; service: 1533; s_port: 44552;
[miguel@babylon5 ~]$
If you don't want the s_port: piece of information, cut it off using sed:

Code:
... | grep -E -o "rule_name:.+" | sed -e 's/s_port: [0-9]\+;//'
rule_name: VPN to Internal; service_id: snmp; src: 10.5.3.1; dst: 172.18.34.2; proto: udp; service: snmp; 
rule_name: Caching DNS to Inet; session_id: 41833; dns_query: www.google.com ; dns_type: A; service_id: domain-udp; src: 61.28.5.4; dst: 204.2.243.44; proto: udp; service: domain-udp; 
rule_name: To Internal; src: 192.168.66.42; dst: 10.24.3.6; proto: tcp; service: 1533; 
[miguel@babylon5 ~]$
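
A side note on portability: \+ in a basic regular expression is a GNU extension, while [0-9][0-9]* means the same thing and is accepted by any POSIX sed. The grep and sed stages can also be folded into one sed call. A sketch, with strip_to_rule_name as a made-up name:

```shell
# Hypothetical helper; reads log lines on standard input.
strip_to_rule_name() {
    # 1st expression: remove the s_port field if present (portable digit match).
    # 2nd expression: drop everything before "rule_name:" and print only
    #                 lines that actually contain it.
    sed -n -e 's/ *s_port: [0-9][0-9]*;//' -e 's/.*\(rule_name:.*\)/\1/p'
}

printf '%s\n' '18Aug2008 17:35:55 accept 192.168.10.8 rule: 22; rule_uid: {FCD9233-B070260780}; rule_name: To Internal; src: 192.168.66.42; dst: 10.24.3.6; proto: tcp; service: 1533; s_port: 44552;' |
    strip_to_rule_name
```

This prints "rule_name: To Internal; src: 192.168.66.42; dst: 10.24.3.6; proto: tcp; service: 1533;". Unlike the two-stage pipeline, it leaves no trailing space, because the space before s_port is removed as well.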
I agree with the others. For more sophisticated log processing you would be better off using perl or awk. Personally, like syg00, I prefer perl too.
 
08-29-2008, 10:00 AM   #8
Kenhelm
Member
 
Registered: Mar 2008
Location: N. W. England
Distribution: Mandriva
Posts: 360

Rep: Reputation: 170
This gives two fields but could be extended to more.
Each of the first two -e statements copies a modified field to the end of the line.
The final -e statement removes the original part of the line, leaving only the copied fields.
The fields don't always have to be in the same order on the input line.
Code:
sed -e 's/ src\(: [^;]*\);.*/& source\1/' -e 's/ dst\(: [^;]*\);.*/& destination\1/' -e 's/.*; \(source: \)/\1/'

source: 10.5.3.1 destination: 172.18.34.2
source: 61.28.5.4 destination: 204.2.243.44
source: 192.168.66.42 destination: 10.24.3.6
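
As a sketch of the "could be extended to more" remark, the same pattern with two more -e statements gives the four fields asked for in post #4. The function name summarize_sed and the labels "protocol" and "service" are assumptions based on that post. The output order follows the order of the -e statements, not the order of the fields in the input.

```shell
# Hypothetical wrapper around the extended command; reads log lines on stdin.
summarize_sed() {
    sed -e 's/ src\(: [^;]*\);.*/& source\1/' \
        -e 's/ dst\(: [^;]*\);.*/& destination\1/' \
        -e 's/ proto\(: [^;]*\);.*/& protocol\1/' \
        -e 's/ service\(: [^;]*\);.*/& service\1/' \
        -e 's/.*; \(source: \)/\1/'
}

printf '%s\n' '18Aug2008 17:35:56 accept 192.168.5.23 rule: 28; rule_uid: {DA2C3F50-499B3B32}; rule_name: VPN to Internal; service_id: snmp; src: 10.5.3.1; dst: 172.18.34.2; proto: udp; service: snmp; s_port: 1025;' |
    summarize_sed
```

For the first sample line this prints "source: 10.5.3.1 destination: 172.18.34.2 protocol: udp service: snmp". " service:" does not match "service_id:", so the extra field is not confused with it. A line with no src field passes through unchanged, because the final -e statement has nothing to anchor on.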
 
10-11-2008, 01:01 PM   #9
cmeyer
LQ Newbie
 
Registered: Aug 2008
Posts: 5

Original Poster
Rep: Reputation: 0
Excellent!!

FYI - I ended up using Kenhelm's code in the final version of my script. Thanks much!!

Cheers,

Craig


 
  


All times are GMT -5. The time now is 07:17 PM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration