Best way to extract lines from a log file in Linux
I have a log file of around 7 GB and I want to extract the lines within a certain time range, which should come to roughly half of it (about 3.5 GB). Each line starts with a timestamp like "2014-03-11 17:35:00". I want to extract the lines from "2014-03-11 17:35:00" to "2014-03-11 18:05:00". What is the best way to do this? Perhaps there is a grep or sed command for it.
There is no single best solution, and there are a lot of different languages you could use. What do you prefer? What have you tried so far?
You can use awk, for example (it would look similar in Perl):
Code:
# pseudo code
awk '/first date filter/ { set a switch to 1 }
/last date filter/ { set that switch to 0 }
print lines if switch is set to 1
'
That was not a script but pseudo-code: you can implement your script from it by translating the written text into commands. The "switch" is usually a variable.
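One way that pseudo-code might translate into real awk is sketched below. The function name extract_between and its argument order are made up for illustration. It assumes every log line begins with the timestamp and that the log is in chronological order, so instead of setting the switch back to 0 it simply stops reading at the end timestamp, which saves scanning the rest of a 7 GB file. As in the pseudo-code, the line that matches the end timestamp is not printed.

```shell
# extract_between START END FILE   (hypothetical helper, not from the thread)
# Prints lines from the first line beginning with START up to, but not
# including, the first later line beginning with END.
extract_between() {
    awk -v start="$1" -v end="$2" '
        index($0, start) == 1       { on = 1 }  # first date filter: switch on
        on && index($0, end) == 1   { exit }    # last date filter: stop reading
        on                                      # print while the switch is on
    ' "$3"
}
```

Usage would look something like: extract_between '2014-03-11 17:35:00' '2014-03-11 18:05:00' yourfile > extract.log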
If you can't guarantee that the exact timestamps you want will actually exist in the logfile, then I'd use e.g. Perl and write a program that converts each timestamp to epoch seconds and compares it with your desired start and end times.
OTOH, for exact matches, sed can do it:
Code:
sed -n '/2014-03-11 17:35:00/,/2014-03-11 18:05:00/p' yourfile
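On a file this size it may be worth making sed stop as soon as the end of the range has been printed, rather than reading on to the end of the file. A possible variant is sketched below; the wrapper name extract_range_sed is hypothetical, and it assumes the timestamps sit at the start of the line, contain no "/" or regex metacharacters, and that the log is chronological (so the end timestamp never appears before the start one).

```shell
# extract_range_sed START END FILE   (hypothetical wrapper)
# Same inclusive range as the one-liner above, plus a "q" command so sed
# quits right after the first line matching END instead of reading on.
extract_range_sed() {
    sed -n -e "/^$1/,/^$2/p" -e "/^$2/q" "$3"
}
```

Like the original command, this prints nothing if the start timestamp is never found.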
hope this helps
Hi,
The above approach will only work if both the 1st & last dates are found. What if those exact times (either/both) have no entries in the log, but there are entries between these times?
Grepping for a pattern (or six) might be easier if this is a one-off requirement, or if the start & end times always follow the same rules, but it could get messy otherwise.
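For the window in the original question, the grep route could look roughly like the sketch below. The function name grep_window is made up, and the pattern is hard-coded for 17:35:00 through 18:05:00 on 2014-03-11, so it would have to be rewritten for any other window. It does not care whether the boundary timestamps exist in the log, but it always scans the whole file and drops any continuation lines that do not start with a timestamp.

```shell
# grep_window FILE   (hypothetical helper; pattern is specific to this window)
# Three alternatives cover the range:
#   17:35:xx-17:59:xx, 18:00:xx-18:04:xx, and exactly 18:05:00.
grep_window() {
    grep -E '^2014-03-11 (17:(3[5-9]|[45][0-9]):[0-9]{2}|18:0[0-4]:[0-9]{2}|18:05:00)' "$1"
}
```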
Personally I'd write a bit of Perl code to parse the timestamps into seconds (using the Time::Local module or similar) & then use that numeric timestamp to filter. May be overkill, though :-)
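A lighter alternative to converting to epoch seconds, swapped in here for the Perl idea, is a plain string comparison. Timestamps in the "YYYY-MM-DD HH:MM:SS" form are fixed-width and zero-padded, so they sort the same way as text as they do as times. The sketch below relies on that, and it works even when the exact start and end timestamps have no entries. The function name extract_window is hypothetical, and it assumes the timestamp occupies the first 19 characters of a line, all entries are in the same timezone, and the log is chronological (needed for the early exit). Lines without a leading timestamp, such as wrapped messages, follow whatever was decided for the entry before them.

```shell
# extract_window FROM TO FILE   (hypothetical helper)
# Prints every entry whose leading timestamp satisfies FROM <= ts <= TO.
extract_window() {
    LC_ALL=C awk -v from="$1" -v to="$2" '
        # Only lines that begin with a timestamp update the keep flag.
        /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]/ {
            ts = substr($0, 1, 19)
            if (ts > to) exit        # past the window: stop reading the file
            keep = (ts >= from)      # string comparison, not numeric
        }
        keep                         # print the line while inside the window
    ' "$3"
}
```

Usage would look something like: extract_window '2014-03-11 17:35:00' '2014-03-11 18:05:00' yourfile > extract.log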