Best way to extract lines from a log file in Linux

RHCE_ran · 03-11-2014, 10:59 AM

I have a log file of around 7 GB and I want to extract the lines within a certain time range, which should come to roughly half of it, about 3.5 GB. Each line starts with a timestamp like "2014-03-11 17:35:00". I want to extract the lines between "2014-03-11 17:35:00" and "2014-03-11 18:05:00". What is the best way to do this? Perhaps there is a grep or sed command for it.

I hope my question is clear.

Please reply to my query.

Regards

pan64 · 03-11-2014, 11:51 AM

There is no single best solution, and there are a lot of different languages you could use. What do you prefer? What have you tried so far?
You can use awk, for example (it will look similar in Perl too):

Code:

# pseudo code
awk '/first date filter/ { set a switch to 1 }
     /last date filter/ { set that switch to 0 }
     print lines if switch is set to 1
'

hope this helps

RHCE_ran · 03-12-2014, 02:22 AM

Thanks for your help. Could you please also explain the command, in particular the switch in { set a switch to 1 }?

Looking forward to your reply.

Regards

pan64 · 03-12-2014, 02:29 AM

That was not a script, but pseudo-code. It means you can implement your own script based on it, by translating the written text into commands. The "switch" is usually a variable.
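As a rough illustration, the pseudo-code might translate into real awk along these lines. This is only a sketch: the sample file and the variable name "show" are made up, and the switch is cleared after printing so that the line carrying the end timestamp is included (the pseudo-code order would drop it). Like the pseudo-code, it relies on both timestamps actually appearing in the log.

```shell
# Hypothetical sample data standing in for the real log.
log=$(mktemp)
cat > "$log" <<'EOF'
2014-03-11 17:34:00 line0
2014-03-11 17:35:00 line1
2014-03-11 17:36:00 line2
2014-03-11 18:05:00 line3
2014-03-11 18:06:00 line4
EOF

# "show" is the switch: an ordinary awk variable used as a flag.
out=$(awk '/^2014-03-11 17:35:00/ { show = 1 }   # start timestamp seen: switch on
           show                   { print }      # print while the switch is on
           /^2014-03-11 18:05:00/ { show = 0 }   # end timestamp seen: switch off
          ' "$log")
printf '%s\n' "$out"
rm -f "$log"
```

With the sample data this prints line1 to line3. Note that the switch goes off at the first line carrying the end timestamp, so further lines with that same timestamp would not be printed.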

descendant_command · 03-12-2014, 04:57 AM

TBH I'd just do it with a series of simple greps, appending to a file and repeating as necessary.
http://xkcd.com/1319/

chrism01 · 03-12-2014, 06:03 AM

If you can't guarantee that the exact datetime stamps you want will actually exist in the logfile, then I'd use e.g. Perl and write a program that converts each datetime stamp to epoch seconds and compares it to your desired datetimes.

OTOH, for exact matches, sed can do it:

Code:

sed -n '/2014-03-11 17:35:00/,/2014-03-11 18:05:00/p' yourfile

cliffordw · 03-12-2014, 06:13 AM

Quote:

Originally Posted by pan64

There is no single best solution, and there are a lot of different languages you could use. What do you prefer? What have you tried so far?
You can use awk, for example (it will look similar in Perl too):

Code:

# pseudo code
awk '/first date filter/ { set a switch to 1 }
     /last date filter/ { set that switch to 0 }
     print lines if switch is set to 1
'

hope this helps

Hi,

The above approach will only work if both the first & last dates are found. What if those exact times (either or both) have no entries in the log, but there are entries between these times?

Grepping for a pattern (or six) might be easier if this is a one-off requirement, or if the start & end times always follow the same rules, but it could get messy if not.

Personally I'd write a bit of Perl code to parse the timestamps into seconds (using the Time::Local module or similar) & then use that numeric timestamp to filter. May be overkill, though :-)
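For timestamps in this fixed-width, zero-padded format, plain string comparison may already be enough, because such strings sort in chronological order. A minimal sketch in awk, assuming every line begins with the 19-character timestamp and the log is in chronological order (the sample data and the variable names first, last and ts are made up for illustration):

```shell
# Hypothetical sample data: neither boundary timestamp occurs exactly.
log=$(mktemp)
cat > "$log" <<'EOF'
2014-03-11 17:34:10 line0
2014-03-11 17:35:20 line1
2014-03-11 17:53:00 line2
2014-03-11 18:04:59 line3
2014-03-11 18:06:00 line4
2014-03-11 18:07:00 line5
EOF

out=$(awk -v first='2014-03-11 17:35:00' -v last='2014-03-11 18:05:00' '
    { ts = substr($0, 1, 19) }   # leading "YYYY-MM-DD HH:MM:SS"
    ts > last   { exit }         # past the range: stop reading (needs sorted input)
    ts >= first { print }        # string comparison, inclusive lower bound
' "$log")
printf '%s\n' "$out"
rm -f "$log"
```

With the sample data this prints line1 to line3 even though neither boundary appears in the file. The early exit means the rest of a large file is not read once the range has been passed; drop that rule if the log is not strictly ordered.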

allend · 03-12-2014, 07:45 AM

If log.txt contains

Quote:

2014-03-11 17:34:00 line0
2014-03-11 17:35:00 line1
2014-03-11 17:36:00 line2
2014-03-11 17:37:00 line3
2014-03-11 17:38:00 line4
2014-03-11 17:53:00 line5
2014-03-11 18:05:00 line6
2014-03-11 18:06:00 line7

then

Code:

awk '{split($1,d,"-");split($2,t,":");e=mktime(d[1]" "d[2]" "d[3]" "t[1]" "t[2]" "t[3]);if (e>=1394519700 && e<=1394521500) print $0}' log.txt

produces

Quote:

2014-03-11 17:35:00 line1
2014-03-11 17:36:00 line2
2014-03-11 17:37:00 line3
2014-03-11 17:38:00 line4
2014-03-11 17:53:00 line5
2014-03-11 18:05:00 line6
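One caveat: mktime works in the local time zone, so the literal epoch values in the one-liner above only match these times on a machine set to UTC+11. A variant that computes the bounds with mktime as well might look like the sketch below. It assumes an awk that provides mktime (gawk does; it is not part of POSIX awk), and the helper name epoch and the sample file are made up for illustration.

```shell
# Hypothetical sample data standing in for log.txt.
log=$(mktemp)
cat > "$log" <<'EOF'
2014-03-11 17:34:00 line0
2014-03-11 17:35:00 line1
2014-03-11 17:53:00 line2
2014-03-11 18:05:00 line3
2014-03-11 18:06:00 line4
EOF

out=$(awk '
    # Convert "YYYY-MM-DD HH:MM:SS" to epoch seconds in the local time zone.
    function epoch(s) { gsub(/[-:]/, " ", s); return mktime(s) }
    # Bounds are computed the same way as the per-line values,
    # so the result does not depend on the time zone of the machine.
    BEGIN { lo = epoch("2014-03-11 17:35:00"); hi = epoch("2014-03-11 18:05:00") }
    { e = epoch(substr($0, 1, 19)) }
    e >= lo && e <= hi           # default action: print the line
' "$log")
printf '%s\n' "$out"
rm -f "$log"
```

With the sample data this prints line1 to line3, with both boundary lines included.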

cliffordw · 03-12-2014, 11:43 PM

Quote:

Originally Posted by allend

If log.txt contains

Code:

awk '{split($1,d,"-");split($2,t,":");e=mktime(d[1]" "d[2]" "d[3]" "t[1]" "t[2]" "t[3]);if (e>=1394519700 && e<=1394521500) print $0}' log.txt

produces

Beautiful! Didn't know mktime is available in awk. This is much shorter than the perl equivalent I suggested :-)