Shell grep issue
List,
I have an ever-growing firewall logfile on my FreeBSD box, and I want to perform some actions on the events of the last 15 minutes. Here is a sample:

May _1 22:19:55 10.11.12.13 May 01 2004 20:20:29: %PIX-4-106023: Deny udp src inside:10.161.6.63/123 dst outside:131.107.1.10/123
May _1 22:19:55 10.11.12.13 May 01 2004 20:20:29: %PIX-4-106023: Deny udp src inside:10.130.71.149/1814 dst outside:193.100.2.1/22200
May _1 22:19:55 10.11.12.13 May 01 2004 20:20:29: %PIX-3-106011: Deny inbound (No xlate) icmp src inside:10.31.255.34 dst inside:10.134.9.130
May _1 22:19:55 10.11.12.13 May 01 2004 20:20:29: %PIX-4-106023: Deny tcp src inside:10.150.143.62/3719 dst outside:206.173.193.10/80
May _1 22:19:55 10.11.12.13 May 01 2004 20:20:29: %PIX-4-106023: Deny udp src inside:10.140.155.27/1652 dst outside:192.43.244.18/123

Is there a way that I can make my box show only the last 15 minutes? I tried various ways, but I think I'm making it overly complicated :( Any suggestions? Thx, Phil |
A lot depends on what you're doing with the data from the last 15 minutes.
A simple way to extract the last bits from a file (not necessarily the last 15 minutes, but...) is to use tail. You can ask for the last n lines (where n is the number of lines to select), you can 'follow' the output of a log file, etc. Check the fine man page or info for details. EX: tail /var/log/boot.log You can select as many lines as you think will get the job done. Trying to get lines that match the last 15 minutes would be work, and it's doable, but not off the top of my head. I'd also need more detail as to the specifics of what you're doing. |
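For reference, the basic tail invocations look like this. The sample file here is created just for the demo; substitute your real log path:

```shell
# Throwaway sample file for illustration only:
printf 'line1\nline2\nline3\nline4\nline5\n' > /tmp/pix_sample.log

# Ask for the last n lines:
tail -n 2 /tmp/pix_sample.log    # prints the last two lines: line4, line5

# To watch new events as they are appended to a live log:
#   tail -f /var/log/pix.log
```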
You can grep the log file for the time about 15 minutes ago, probably leaving off the seconds. Use grep -m 1 -n to get the line number of the first hit. Then parse out the line number, subtract it from the total line count (wc -l) of the file, and do a tail on that number.
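A sketch of that approach; the log contents, file name, and cutoff time below are all made up for the demo:

```shell
#!/bin/sh
# Sketch of the grep-then-tail idea: find the first line at or after the
# cutoff, then print from there to the end of the file.
LOG=/tmp/pix_demo.log
printf 'May 01 2004 20:05:01: event A
May 01 2004 20:14:02: event B
May 01 2004 20:15:03: event C
May 01 2004 20:20:04: event D
' > "$LOG"

CUTOFF='May 01 2004 20:15'                           # ~15 minutes ago, seconds dropped
FIRST=$(grep -m 1 -n "$CUTOFF" "$LOG" | cut -d: -f1) # line number of first match
TOTAL=$(wc -l < "$LOG")
tail -n "$((TOTAL - FIRST + 1))" "$LOG"              # everything from that line on
```

As a side note, `tail -n "+$FIRST"` skips the wc step entirely: with a leading `+`, tail starts output *at* the given line number instead of counting lines from the end.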
|
I have been thinking about tailing a fixed number of lines, but that is not practical; sometimes that number spans 15 minutes, sometimes 2 hours. I need it to be more accurate than 'the last 10,000 lines' or so.
wc -l on the whole file is not an option either, since the file is 10+ gigs, and if I had to grep AND wc the file, that would double the execution time. Isn't there a way to compare 2 dates? |
Not really. You usually have to write code.
Here is some C that kind of does what you want. We use it for file times, so you'll have to play with it - Code:
/* t_diff */ |
wow. that's the longest piece of code i've ever seen posted lol!! really, i'll bet that awk has an answer for you. i just started it in school, but if FreeBSD has awk (i'm new, so i only assume it does), then awk would be a much better bet. we just learned in class today how to do operations on fields, and i can give you a conceptual answer in awk, which i hope helps somewhat. i have not had a chance to try this myself just yet and i am by no means experienced with awk. but here goes. i know you can get the system date, and i know awk can parse fields within a file. it can even parse fields within fields. so like if i had
May _1 22:19:55 10.11.12.13 May 01 2004 20:20:29: in a file, i could take out just the minute portion of the system date, subtract 15 and save it. then i could compare it with all the minute fields and echo all the ones from the last 15 minutes to wherever. i'll bet that a much, much simpler solution lies in awk. i'll put up a link to an online book about awk, and maybe someone more experienced can offer such a solution. awk book sorry i couldn't be of more assistance. |
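A rough sketch of that idea, with a simplified two-line sample file (the field position, sample lines, and cutoff are made up for the demo; a real script would compute the cutoff from the system date):

```shell
#!/bin/sh
# Simplified demo of the awk idea: keep lines whose HH:MM is at or after a
# cutoff. In this made-up sample the timestamp is field $4.
printf 'May 01 2004 20:05:00: Deny udp old stuff
May 01 2004 20:18:00: Deny tcp recent stuff
' > /tmp/awk_demo.log

awk -v cutoff="20:15" '
    { t = substr($4, 1, 5) }   # "HH:MM" out of the "HH:MM:SS:" field
    t >= cutoff                # string compare works for zero-padded times
' /tmp/awk_demo.log
```

One caveat: comparing only the HH:MM string goes wrong across midnight or whenever the log spans more than one day, so the date fields have to be folded into the comparison too.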
jim:
I didn't have time to look into the long code yet, doesn't seem to be the most convenient solution. But thanks for the suggestion, if I run out of other options/suggestions I will have a look :) sphynx: FreeBSD indeed has several AWK flavors, I tried to play with it as well. The problem with your suggestion is that my logfile spans multiple days. So I need to grep on the current date first, then on time and so on. Hmmm, might need to play a bit longer to find something useful... |
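One way to fold the date in is to build a sortable numeric key from each line's date fields. Everything below is illustrative: the sample lines, file name, and hardcoded cutoff are made up, and on FreeBSD the cutoff could instead be generated with `date -v-15M '+%Y%m%d%H%M'` (the -v flag is a BSD extension):

```shell
#!/bin/sh
# Turn each event's "Mon DD YYYY HH:MM:SS" fields into a sortable
# YYYYMMDDHHMM key and keep lines whose key is at or after the cutoff.
LOG=/tmp/pix_days.log
printf 'Apr 30 2004 23:55:00: old event
May 01 2004 20:20:29: recent event
' > "$LOG"

CUTOFF=200405012005    # i.e. May 01 2004 20:05 (hardcoded for the demo)

OUT=$(awk -v cutoff="$CUTOFF" '
BEGIN { split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", m, " ")
        for (i = 1; i <= 12; i++) mon[m[i]] = sprintf("%02d", i) }
{ key = $3 mon[$1] $2 substr($4, 1, 2) substr($4, 4, 2) }   # YYYYMMDDHHMM
key >= cutoff
' "$LOG")
echo "$OUT"
```

Because the key sorts lexically the same way the timestamps sort chronologically, this keeps working when the log rolls over midnight or a month boundary.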
Hi,
My solution is a simple shell script (for now), which can be optimized later, I suppose. It's very slow (as you might imagine). But, it works. Also, I didn't understand the logfile line. For example, this line: May _1 22:19:55 10.11.12.13 May 01 2004 20:20:29: %PIX-4-106023: Deny udp src inside:10.161.6.63/123 dst outside:131.107.1.10/123 I don't know what the fields: May _1 22:19:55 10.11.12.13 are. So, I've just ignored them. I'm using fields 5,6,7 and 8. Code:
#!/bin/ksh
It could take, er, a little while to parse a 10GB file. So, here's the C version of it. It runs a little faster: it takes about a second to parse a file with 250,000 lines and about 6 seconds to parse a 26MB file with about a million lines. Should be fast enough. Beware, it does not have a LOT of checks which should be added before using it. For example, what happens when a line does not conform to the logfile format? Also, I've (being lazy) used a 10,000-character array. You should use malloc or something similar if you're going to use this in production. Here it is: Code:
#include <stdio.h> |
i was pondering an awk script to do this as i mentioned before. i have had a little time to think about it, and when finals are over i will give it a serious go. it's an interesting problem at the very least, and will give me some experience with awk. i'll let you know if/what i come up with. we did stuff that is conceptually identical to what you want in lecture today, but unfortunately i have so much due, so quickly, that i won't have time to sit down and give it serious thought for a little while.
|