LinuxQuestions.org


mijohnst 09-08-2010 10:09 PM

Parse log file with bash
 
I'm not a programmer and barely a scripter (compared to many of the gods here), so I'm having a hard time deciding how to parse the snare.log file. I only need a bit of information from certain lines. For instance, when someone logs in with SSH, I want to know when, who, whether it was successful or unsuccessful, and from where.

So one entry from the snare log looks like this:

Code:

vm-rhel5        LinuxKAudit    criticality,0  event,login_start,20100907 22:22:43    uid,0,root      id,    gid,0,root      euid,,root      egid,,root  process,32519,ssh        return,0,yes    acct,mijohnst  addr,127.0.0.1  auid,500,mijohnst      exe,/usr/sbin/sshd      hostname,vm-rhel5      msg,PAM session open  subj,system_u:system_r:unconfined_t:s0-s0:c0.c1023      terminal,ssh
Of course everything I need is in this entry, but how do I pick through what I don't need? I've written a short command to help.

Code:

egrep "return\,0\,yes" /var/log/audit/audit.log | awk '{print "Date = " $5" "$4"\t\tUID = "$13"\t\tFrom = "$14}'
The problem with this command is that the output data isn't always located in the same column. So UID might be $21 for one entry and $22 for the next. An example would be like:

Code:

Date = 22:22:43 event login_start 20100907              UID = acct mijohnst            From = addr 127.0.0.1
Date = 22:22:43 event login_auth 20100907              UID = addr 127.0.0.1            From = auid 500 mijohnst

So I guess my question is, what's the best way to proceed? I'm thinking I should somehow be able to write a function that assigns fields to variables based on how they begin, i.e. any field beginning with "acct" would automatically be written into a variable called $ACCOUNT. Does that make sense?

I don't expect anyone to know this. I guess I'm still trying to sort it out in my head, and it's helping to write it down, think about it and hope someone might have a direction to point me in. As always, thanks again!

ghostdog74 09-08-2010 10:16 PM

There should already be tools for parsing snare log files. Have you checked tools like Sawmill?

mijohnst 09-08-2010 10:46 PM

Well, you have to buy Sawmill, and I was looking for something that works with what I already have... Also, I audit and roll my logs every week. Thanks for the suggestion though.

quanta 09-09-2010 02:16 AM

Quote:

Originally Posted by mijohnst (Post 4092028)
For instance when someone logs in with SSH, I want to know when, who, successful/unsuccessful and from where.

For this purpose, I suggest you use OSSEC.

mijohnst 09-09-2010 09:29 AM

Thanks for the suggestion Quanta. OSSEC looks good but I don't want to have to install agents, enable httpd or anything like that. I have a whole security process that I would have to go through in order to allow use of something like OSSEC or Sawmill. I want to keep it simple...which means just parse the snare.log file for the week and then roll them off the machines. Anyway, thanks for the suggestion.

catkin 09-09-2010 11:48 AM

I'm reasonably fluent at parsing text strings with bash but would choose awk for this task; it could be done in bash but would be difficult to write in a transparent way and so difficult to maintain. Please say if using awk is not acceptable for you and I'll see if anything half-comprehensible can be written in bash but I'll need an example of every type of line from the log.

frieza 09-09-2010 12:37 PM

In that case I would probably use cut with a for loop and an if statement to check for the word UID, then base the rest of the parsing for that line on the result.

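Roughly, in bash:
Code:

# $line holds one log entry; 'acct' is the field the awk command above labels UID
for ((c = 1; c <= 100; c++))   # 100 is an arbitrary upper bound on the field count
do
  if [[ `cut -d ' ' -f "$c" <<< "$line"` == acct,* ]]
  then
        uidcolumn=$c
        break
  fi
done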

This might not be the best idea, but it should work. Note that the backticks (`) tell the shell that the part between them is a shell command.
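The column search described above can also be sketched in Python, which may be easier to experiment with than the shell loop. This is only an illustration under assumptions: find_column is a hypothetical helper name, and the field list is a whitespace-split excerpt of the sample entry from the first post, not a full log line.

```python
def find_column(fields, key):
    """Return the 0-based index of the first field starting with 'key,', or -1."""
    prefix = key + ","
    for index, field in enumerate(fields):
        if field.startswith(prefix):
            return index
    return -1


# Excerpt of the sample entry, split on whitespace (assumed field separator).
fields = ("process,32519,ssh return,0,yes acct,mijohnst "
          "addr,127.0.0.1 auid,500,mijohnst").split()

col = find_column(fields, "acct")
# The neighbouring fields can then be read relative to the column found.
print(col, fields[col], fields[col + 1])
```

Matching on the key plus its trailing comma keeps "uid" from matching "auid" or "euid" by accident.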

ghostdog74 09-09-2010 06:08 PM

If you have Python:
Code:

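tags = ('acct', 'addr')
with open("file") as log:
    for line in log:
        s = line.split()
        # s[3] is the event field (including the date), s[4] is the time
        print(s[3], s[4], end=' ')
        for item in s:
            # keep only fields whose first four characters are a wanted tag
            if item[:4] in tags:
                print(item, end=' ')
        print()  # end the output line for this log entry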
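Taking this one step further, the idea from the first post (store each field under the name it begins with, e.g. "acct") could be sketched with a dict, so that lookups no longer depend on column position. This is a rough sketch, not a tested parser: parse_fields is a made-up name, it assumes fields are separated by whitespace, and so it keeps only the first word of values that contain spaces (such as msg) and drops the bare time token.

```python
def parse_fields(line):
    """Map the key of each 'key,value' token in a log entry to its value text."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition(",")
        if sep:  # tokens without a comma (host name, time) are skipped
            fields.setdefault(key, value)  # keep the first occurrence of a key
    return fields


# The sample entry from the first post, with its whitespace normalised.
sample = ("vm-rhel5 LinuxKAudit criticality,0 event,login_start,20100907 22:22:43 "
          "uid,0,root id, gid,0,root euid,,root egid,,root process,32519,ssh "
          "return,0,yes acct,mijohnst addr,127.0.0.1 auid,500,mijohnst "
          "exe,/usr/sbin/sshd hostname,vm-rhel5 msg,PAM session open "
          "subj,system_u:system_r:unconfined_t:s0-s0:c0.c1023 terminal,ssh")

entry = parse_fields(sample)
print(entry["acct"], entry["addr"], entry["return"])  # prints: mijohnst 127.0.0.1 0,yes
```

Using entry.get("acct") instead of entry["acct"] would avoid an error on entries that lack the field.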


mijohnst 09-09-2010 10:46 PM

Thanks very much for the responses, guys! Some very good ideas here. I wish I knew more about Python because I've heard it's powerful. However, I think I'm going to go with the awk script, only because I know more about it and I have the sed/awk books on hand. When I figure it out I'll post it here in the hope that it will help others.

