LinuxQuestions.org
Old 06-09-2021, 02:53 AM   #1
alexsec
LQ Newbie
 
Registered: Feb 2021
Posts: 3

Rep: Reputation: Disabled
Filtering logs stored in gz archive based on time range


Dear Linux friends,

Logrotate is configured to archive logs after 7 days. So, for example, mail.log.8.gz from April 18 contains logs from 11 to 18 April. I need to extract the log entries from 23 April 15:00 to 2 June and write them to a file. Is there a smart way to do this, such as a script?

Thanks
 
Old 06-09-2021, 02:56 AM   #2
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,129

Rep: Reputation: 4121
Sure there is. Start with a basic search for gzip grep - you never know what you'll learn.
 
Old 06-10-2021, 01:07 AM   #3
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,359

Rep: Reputation: 2751
... eg https://alvinalexander.com/blog/post...-gz-text-file/
 
Old 06-14-2021, 12:02 PM   #4
elgrandeperro
Member
 
Registered: Apr 2021
Posts: 415
Blog Entries: 2

Rep: Reputation: Disabled
Of course you can gunzip or zcat the files into your filter. If this is a one-off, you can just list the files in time order, then use an editor to remove the data outside your time frame. Since the entries are in time order, it should be obvious which lines to keep: you just need the entries between the first and last timestamps of your range.

If, however, you want to do this periodically, it gets more complicated. For instance, I used Perl, which has a log file ripper that can generate the time as Unix time, since comparing date strings can get really complicated. Since you know what the endpoints are going to be, you just check that each log entry falls between those values. Once an entry is past the end value, you know (since the log is in time order) that you can skip the rest of the input.

You could do it in bash, using date +%s to generate the start, end, and current values. Each entry incurs an exec of date, which could be really slow for a large log file. For a small log, it is probably acceptable.
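
A rough sketch of that bash approach, for illustration only. It assumes GNU date (for the -d option), syslog-style lines beginning "Mon DD HH:MM:SS", and input that is already in time order. The function name filter_by_epoch and its arguments are made up for this example.

```shell
# Hypothetical helper: print lines whose timestamp lies in [START, END].
# Usage: filter_by_epoch YEAR "START" "END" < logfile
# Assumes GNU date and lines such as "Apr 23 15:00:01 host prog: msg".
filter_by_epoch() {
    year=$1
    start=$(date -d "$2" +%s) || return 1
    stop=$(date -d "$3" +%s) || return 1
    while IFS= read -r line; do
        # Split off the timestamp fields without altering $line.
        read -r mon day hms rest <<EOF
$line
EOF
        # Skip blank or very short lines.
        [ -n "$hms" ] || continue
        # One date process per line: this is the slow part.
        now=$(date -d "$day $mon $year $hms" +%s 2>/dev/null) || continue
        [ "$now" -lt "$start" ] && continue
        # Input is in time order, so stop at the first entry past the end.
        [ "$now" -gt "$stop" ] && break
        printf '%s\n' "$line"
    done
}
```

Something like `zcat mail.log.3.gz | filter_by_epoch 2021 "2021-04-23 15:00:00" "2021-06-02 00:00:00"` would then print only the entries in range. The year has to be passed in because the log lines themselves do not carry one.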

I've done this exact thing about 100 times for various reasons.
 
Old 06-15-2021, 01:15 PM   #5
allend
LQ 5k Club
 
Registered: Oct 2003
Location: Melbourne
Distribution: Slackware64-15.0
Posts: 6,372

Rep: Reputation: 2750
If your log files have lines starting with something like "Jun 14 11:11:04", then this little awk script could be used on the unzipped archive.

Code:
# awk script to output log file entries between supplied dates and times
# Needs an awk that provides mktime(), such as gawk
# Call with awk -v start="YYYY MM DD HH MM SS" -v end="YYYY MM DD HH MM SS" -f <this script name> <log file>
BEGIN {
split(start,ts)
split(end,te)
if (ts[1]==te[1]) {
  st=mktime(start)
  et=mktime(end)
}
else {
  print ts[1], te[1], "Cannot cross years. Exiting"
  exit
}
# To translate text month to numeric
m["Jan"]=1
m["Feb"]=2
m["Mar"]=3
m["Apr"]=4
m["May"]=5
m["Jun"]=6
m["Jul"]=7
m["Aug"]=8
m["Sep"]=9
m["Oct"]=10
m["Nov"]=11
m["Dec"]=12
}

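# Skip lines that do not start with a month abbreviation
!($1 in m) { next }

{ split($3,t,":")
  # Syslog lines carry no year, so reuse the year given in start
  ct=mktime(ts[1]" "m[$1]" "$2" "t[1]" "t[2]" "t[3])
  # Strict comparisons: entries exactly at start or end are excluded
  if (ct > st && ct < et) {
    print
  }
}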
 
Old 06-15-2021, 10:53 PM   #6
rnturn
Senior Member
 
Registered: Jan 2003
Location: Illinois (SW Chicago 'burbs)
Distribution: openSUSE, Raspbian, Slackware. Previous: MacOS, Red Hat, Coherent, Consensys SVR4.2, Tru64, Solaris
Posts: 2,803

Rep: Reputation: 550
Quote:
Originally Posted by elgrandeperro
... since trying to compare a date string can get really complicated. Since you know what the endpoints are going to be, then you just make sure the log entry is between those values for each log entry. Once it is greater, you know (since in time order) you don't need
Well, if logrotate is rotating logs every 7 days, you'll need to examine all the logs that cover the desired time period. Concatenate the records from successive logs into one big output:
Code:
touch output.file
gzip -dc log.file.1.gz | my-date-filter >> output.file
gzip -dc log.file.2.gz | my-date-filter >> output.file
...
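
The lines above take the archives newest first (.1 before .2). If the output should end up in chronological order, one option is to walk the numbers downwards instead. This is only a sketch: cat_rotated is a made-up helper name, and it assumes logrotate's usual numbering, where a higher number means an older archive.

```shell
# Hypothetical helper: decompress rotated archives from oldest to newest.
# Usage: cat_rotated BASENAME OLDEST NEWEST
# Assumes files named BASENAME.N.gz, with larger N being older.
cat_rotated() {
    base=$1
    i=$2
    while [ "$i" -ge "$3" ]; do
        f="$base.$i.gz"
        # Silently skip numbers that have no archive.
        if [ -f "$f" ]; then
            gzip -dc "$f"
        fi
        i=$((i - 1))
    done
}
```

It could be used as `cat_rotated log.file 8 1 | my-date-filter > output.file`, so the filter runs once over a single time-ordered stream.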
When dealing with that default date/time format ("mmm dd HH:MM:SS"), I've resorted to a Perl script that reads the log file line by line, extracts everything before the hostname, formats it as "yyyy-mm-ddTHH:MM:SS", and tests the record's time/date stamp to see if it's in the desired range:
Code:
# ISO-style timestamps sort chronologically, so string comparison is enough
$tstart = "2021-04-23T15:00:00";
$tstop = "2021-06-02T00:00:00";
if ( $datetime ge $tstart ) {
    if ( $datetime le $tstop ) {
        # write original record to output
    }
}
There are many ways to create "$datetime" from the log file bits. One way to get started: use "$months = 'JanFebMarAprMayJun...'" and 'int( index( $months, $month_from_log ) / 3 ) + 1' to get the month number (index() returns a zero-based offset of 0, 3, 6, ..., hence the division by 3). Day of the month is a piece of cake.
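
For anyone who would rather not write Perl, here is the same idea as a sketch in plain awk wrapped in a shell function. The name filter_syslog_range and its arguments are invented for this example. It assumes lines starting with "Mon DD HH:MM:SS" and that the whole range falls in one year, which is passed in because the log lines have none. Note that awk's index() is 1-based, unlike Perl's.

```shell
# Hypothetical helper: keep syslog lines with START <= timestamp <= STOP.
# Usage: filter_syslog_range YEAR START STOP < logfile
# START and STOP are given as yyyy-mm-ddTHH:MM:SS.
filter_syslog_range() {
    awk -v year="$1" -v start="$2" -v stop="$3" '
    BEGIN { months = "JanFebMarAprMayJunJulAugSepOctNovDec" }
    {
        # index() is 1-based in awk: Jan -> 1, Feb -> 4, Mar -> 7, ...
        pos = index(months, $1)
        if (length($1) != 3 || pos == 0 || (pos - 1) % 3 != 0) next
        mm = (pos - 1) / 3 + 1
        # Build a sortable timestamp and compare it as a string.
        dt = sprintf("%04d-%02d-%02dT%s", year, mm, $2, $3)
        if (dt >= start && dt <= stop) print
    }'
}
```

Example: `gzip -dc log.file.2.gz | filter_syslog_range 2021 2021-04-23T15:00:00 2021-06-02T00:00:00`. No mktime() or external date calls are needed, since the comparison is purely on strings.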

If you have some control over the server, see about setting rsyslog to use the ISO 8601 date format in the log records. It makes finding dates that fall after, before, or between certain dates far, far easier. (At least you'd no longer have to convert the dates in your filtering script.) See this old page for details. This change works on newer distributions, too, and on non-Ubuntu Linuxes like openSUSE and, I strongly suspect, most others. It won't help you with older log files, but future searches for records in a given date/time range should be much easier.
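
To illustrate how much simpler the filtering gets, here is a sketch for logs that already use such timestamps. It assumes the first field of each line looks like 2021-04-23T15:00:00.123456+02:00 and that every line uses the same UTC offset; check your own log format first. The name filter_iso_range is made up for this example.

```shell
# Hypothetical helper: keep lines with START <= timestamp <= STOP.
# Usage: filter_iso_range START STOP < logfile
# START and STOP are given as yyyy-mm-ddTHH:MM:SS.
filter_iso_range() {
    awk -v start="$1" -v stop="$2" '
    {
        # Keep only yyyy-mm-ddTHH:MM:SS, dropping fractional
        # seconds and the UTC offset, then compare as strings.
        ts = substr($1, 1, 19)
        if (ts >= start && ts <= stop) print
    }'
}
```

For example: `zcat mail.log.2.gz | filter_iso_range 2021-04-23T15:00:00 2021-06-02T00:00:00 > output.file`.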

HTH...
 
  

