Best way to extract lines from a log file in Linux
I have a log file of around 7 GB and I want to extract the lines within a certain time range, which should come to roughly half of it, about 3.5 GB. Each line starts with a timestamp like "2014-03-11 17:35:00". I want to extract the lines from "2014-03-11 17:35:00" to "2014-03-11 18:05:00". What would be the best way to do this? Perhaps a grep or sed command can do it.
I hope my question is clear. Please reply to my query. Regards |
There is no single best solution; many tools and languages can do this. What do you prefer? What have you tried so far?
You can use awk, for example (it would look similar in Perl): Code:
# pseudo code |
Thanks for your help. Could you please also explain the command, especially the switch, as in { set a switch to 1 }?
Awaiting your reply. Regards |
That was not a script, but pseudo-code. It means you can implement your own script from it by translating the written steps into commands. That "switch" is usually just a variable that you set when the start of the range is seen and clear when the end is reached.
|
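For what it's worth, here is one way the "switch" idea could be written in awk. This is only a sketch, not the pseudo-code from the earlier post: the function name extract_with_switch, the variable printing and the five-line sample log are all made up for illustration, and it assumes both exact timestamps really occur at the start of some line.

```shell
# Hypothetical sample log, only for illustration.
log=$(mktemp)
cat > "$log" <<'EOF'
2014-03-11 17:34:59 before
2014-03-11 17:35:00 start
2014-03-11 17:50:12 middle
2014-03-11 18:05:00 end
2014-03-11 18:05:01 after
EOF

# extract_with_switch START END FILE (hypothetical helper)
extract_with_switch() {
  awk -v start="$1" -v end="$2" '
    index($0, start) == 1 { printing = 1 }   # line begins with START: set the switch
    printing              { print }          # print while the switch is set
    index($0, end) == 1   { printing = 0 }   # line begins with END: clear the switch
  ' "$3"
}

extract_with_switch "2014-03-11 17:35:00" "2014-03-11 18:05:00" "$log"
```

On the sample this prints the three lines from 17:35:00 through 18:05:00. Note that the switch is cleared at the first line carrying the end timestamp, so any further lines with that same second would be dropped.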
TBH I'd just do it with a series of simple greps appending to a file, repeating as necessary.
http://xkcd.com/1319/ :) |
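A rough sketch of what such a series of greps might look like for the range in the question. The variables log and out and the sample lines are invented for illustration, and it assumes the log is in chronological order so that the appended pieces come out in order. Each grep reads the whole file once, which is slow on 7 GB but fine for a one-off.

```shell
# Hypothetical sample log and output file, only for illustration.
log=$(mktemp)
out=$(mktemp)
cat > "$log" <<'EOF'
2014-03-11 17:34:59 before
2014-03-11 17:35:00 a
2014-03-11 17:42:10 b
2014-03-11 18:03:30 c
2014-03-11 18:05:00 d
2014-03-11 18:05:01 after
EOF

grep '^2014-03-11 17:3[5-9]'    "$log" >  "$out"   # 17:35 to 17:39
grep '^2014-03-11 17:[45][0-9]' "$log" >> "$out"   # 17:40 to 17:59
grep '^2014-03-11 18:0[0-4]'    "$log" >> "$out"   # 18:00 to 18:04
grep '^2014-03-11 18:05:00'     "$log" >> "$out"   # the final second, 18:05:00

cat "$out"
```

On the sample, the output file ends up with the lines tagged a, b, c and d.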
If you can't guarantee that the exact datetime stamps you want will actually exist in the logfile, then I'd use e.g. Perl and write a program that converts each datetime stamp to epoch seconds and compares it to your desired datetimes.
OTOH, for exact matches, sed can do it: Code:
sed -n '/2014-03-11 17:35:00/,/2014-03-11 18:05:00/p' yourfile |
Quote:
The above approach will only work if both the first and last timestamps are found. What if those exact times (either or both) have no entries in the log, but there are entries between them? Grepping for a pattern (or six) might be easier if this is a one-off requirement, or if the start and end times always follow the same rules, but it could get messy if not. Personally I'd write a bit of Perl code to parse the timestamps into seconds (using the Time::Local module or similar) and then use that numeric timestamp to filter. May be overkill, though :-) |
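A lighter alternative to converting to seconds, offered as a sketch rather than as anyone's method from this thread: because the stamps are zero-padded "YYYY-MM-DD HH:MM:SS", they sort the same way as plain strings, so awk can compare the first 19 characters of each line directly. That works even when the exact start and end times have no entries. The function name extract_by_stamp and the sample log are hypothetical, and it assumes every line begins with such a stamp (continuation lines without one would not be handled sensibly).

```shell
# Hypothetical sample log with no entry at exactly 17:35:00 or 18:05:00.
log=$(mktemp)
cat > "$log" <<'EOF'
2014-03-11 17:34:59 before the window
2014-03-11 17:35:07 first entry inside
2014-03-11 17:59:41 still inside
2014-03-11 18:04:58 last entry inside
2014-03-11 18:05:03 after the window
EOF

# extract_by_stamp START END FILE (hypothetical helper)
extract_by_stamp() {
  awk -v start="$1" -v end="$2" '
    { stamp = substr($0, 1, 19) }        # leading "YYYY-MM-DD HH:MM:SS"
    stamp >= start && stamp <= end       # string comparison; matching lines are printed
  ' "$3"
}

extract_by_stamp "2014-03-11 17:35:00" "2014-03-11 18:05:00" "$log"
```

On the sample this prints the three "inside" lines. It reads the file once and does not depend on the time zone of the machine.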
If log.txt contains
Quote:
Code:
awk 'BEGIN{lo=mktime("2014 03 11 17 35 00");hi=mktime("2014 03 11 18 05 00")}{split($1,d,"-");split($2,t,":");e=mktime(d[1]" "d[2]" "d[3]" "t[1]" "t[2]" "t[3]);if (e>=lo && e<=hi) print}' log.txt Quote:
|