LinuxQuestions.org - Scan for time/date in a log file

Hi everyone,

I'm trying to monitor a log file for a certain string. I'm thinking that I'll be doing a cron job to scan the file every x minutes.

In an attempt to eliminate false positives, I want to parse out the newest part of the file. Each log entry looks something like this:

2012-06-07T12:49:42.342-0700 ErrorMessageHere

I want to find all entries newer than the last scan. I'm going to store the scan time in a file, so at the beginning of the script I will read that time and compare it against each entry's timestamp. Any entry with a time newer than the stored one is what I'm looking for.

My problem is this: I don't know how to read and compare the date and time stamp. I'm guessing I'll have to use awk, but I'm at a loss as to how.

Any assistance that can be offered is welcome.

The timestamp you have is sortable as a plain string, because its fields run from year down to milliseconds at fixed widths. That is the beauty of ASCII.
2012-06-07T12:49:42.342-0700
(but what is the -0700 bit?)

Why would you have to use awk?
Any language with string comparison will work as expected.

Code:

while read -r timestamp message; do
  # Print only the entries stamped later than the saved scan time.
  [[ $timestamp > $saved ]] && echo "$timestamp $message"
done < log-file

Alternatively, you could save the file position (the byte offset where the last scan stopped) instead. That would be easier, as long as the log is never truncated.
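A rough sketch of the file-position idea, in case it helps. The function name new_log_bytes and both file arguments are made up for illustration, and it assumes your tail and head accept -c (GNU, BSD and busybox do). It treats a log that has become shorter than the saved offset as truncated and starts over from the beginning.

```shell
#!/bin/sh
# Sketch: print only the bytes appended to a log since the previous run.
# new_log_bytes is a hypothetical helper; $1 = log file, $2 = state file.
new_log_bytes() (
    log=$1 state=$2 offset=0
    # Current size of the log in bytes (the arithmetic strips wc padding).
    size=$(( $(wc -c < "$log") ))
    # Offset saved by the previous run, if there was one.
    [ -s "$state" ] && read -r offset < "$state"
    # A log shorter than the saved offset was truncated: start over.
    [ "$offset" -gt "$size" ] && offset=0
    # tail -c +N starts output at byte N (counting from 1); head stops at
    # the size measured above so later appends wait for the next run.
    tail -c "+$((offset + 1))" "$log" | head -c "$((size - offset))"
    echo "$size" > "$state"
)
```

You would pipe the output into grep for the string you care about, for example new_log_bytes log-file scan.offset | grep ErrorMessageHere.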

Man, did I overthink that!

That while statement did the trick. Thank you very much!

It's nice to be useful occasionally!

Quote:

Originally Posted by bigearsbilly (Post 4698231)

but what is the -0700 bit?

It's the numeric representation of the timezone, ±HHMM compared to UTC.
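If the offset can change within one log (a daylight saving switch, or several hosts writing to the same file), plain string comparison can put entries in the wrong order. Here is a sketch of one way around that, converting each stamp to milliseconds since 1970-01-01 UTC before comparing. The names newer_than_utc and utc_ms are invented for this example, and the fixed character positions assume every stamp has exactly the layout 2012-06-07T12:49:42.342-0700.

```shell
#!/bin/sh
# Sketch: compare log timestamps as UTC milliseconds instead of as strings.
# newer_than_utc is a hypothetical helper; $1 = last scan time, $2 = log file.
newer_than_utc() {
    awk -v since="$1" '
    function utc_ms(ts,    y, m, d, days, secs, off) {
        y = substr(ts, 1, 4) + 0
        m = substr(ts, 6, 2) + 0
        d = substr(ts, 9, 2) + 0
        # Count years from March so the leap day falls at the end.
        if (m <= 2) { y--; m += 12 }
        # Days since 1970-01-01 in the Gregorian calendar.
        days = 365 * y + int(y / 4) - int(y / 100) + int(y / 400)
        days += int((153 * (m - 3) + 2) / 5) + d - 1 - 719468
        secs = days * 86400 + substr(ts, 12, 2) * 3600
        secs += substr(ts, 15, 2) * 60 + substr(ts, 18, 2)
        # Offset east of UTC in seconds; subtracting it yields UTC.
        off = substr(ts, 25, 2) * 3600 + substr(ts, 27, 2) * 60
        if (substr(ts, 24, 1) == "-") off = -off
        return (secs - off) * 1000 + substr(ts, 21, 3)
    }
    BEGIN { limit = utc_ms(since) }
    utc_ms($1) >= limit
    ' "$2"
}
```

Lines that do not start with a timestamp in that layout are not handled sensibly here, so a real script would want to check the format first.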

If you ignore the effect of the timezone on the time stamps, then

Code:

awk -v since="timestamp" '($1 >= since)' log-file

should also work. You could use

Code:

since="$(awk -v since="$since" '($1 >= since) { printf("%s\n", $0) > "/dev/stderr" ; if ($1 > max) max = $1 } END { printf("%s\n", max) }' log-file)"

to output all new entries in log-file to standard error, while also updating the timestamp. If since is initially empty, it will output the entire log file.

Note the >=. It means all entries matching the final timestamp of the last round will be included in the next round too, but that is intentional: that way you don't lose an error message that is logged in the same millisecond but AFTER you last read the log file. If you don't want the repeats, and are willing to risk missing an error message logged in that same millisecond, use > for the comparison instead.
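To tie this back to the original plan of keeping the scan time in a file between cron runs, here is a minimal sketch. The function name scan_log and the three arguments are my own invention. It keeps the >= comparison, so entries carrying the last seen timestamp are reported again on the next run, and it matches the search string literally rather than as a regular expression.

```shell
#!/bin/sh
# Sketch of the cron job body.
# scan_log is a hypothetical helper:
#   $1 = log file, $2 = state file holding the last timestamp, $3 = string
scan_log() (
    log=$1 state=$2 pattern=$3 since=
    # Timestamp saved by the previous run; empty on the very first run.
    [ -s "$state" ] && read -r since < "$state"
    awk -v since="$since" -v pat="$pattern" -v state="$state" '
        $1 >= since {
            # Report entries that contain the search string.
            if (index($0, pat)) print
            # Remember the newest timestamp among the scanned entries.
            if ($1 > max) max = $1
        }
        END { if (max != "") print max > state }
    ' "$log"
)
```

A crontab entry would then call a small script that runs something like scan_log log-file scan.time ErrorMessageHere and mails or logs whatever it prints.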

If you don't want to miss anything, but don't want any repeats either, you need to store more than the timestamp: either a counter or a list of hashes alongside it (still a single since variable, but with two or more whitespace-separated words in it). The counter gives the number of log lines already output for that timestamp; each hash identifies one already output log line at that timestamp. (The hash list works more reliably when the log files are rotated.) The awk script gets progressively more complicated, and I'd personally just live with the initial duplicate log line(s).
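For the curious, the timestamp-plus-counter variant might look roughly like the sketch below. The name new_entries and the state file layout ("timestamp count" on one line) are assumptions of mine. It also assumes the log is in chronological order and that lines already counted are still present in the file, which is exactly what breaks when the log is rotated.

```shell
#!/bin/sh
# Sketch: output each log line once, tracking the last timestamp plus the
# number of lines already output for it.
# new_entries is a hypothetical helper; $1 = log file, $2 = state file.
new_entries() (
    log=$1 state=$2 ts= seen=0
    # State from the previous run: "timestamp count".
    [ -s "$state" ] && read -r ts seen < "$state"
    awk -v ts="$ts" -v seen="$seen" -v state="$state" '
        BEGIN { max = ts }
        # Entries older than the saved timestamp were handled earlier.
        $1 < ts { next }
        {
            # cnt counts the lines carrying the newest timestamp so far.
            if ($1 > max) { max = $1; cnt = 0 }
            if ($1 == max) cnt++
            # At the saved timestamp, skip the lines already output.
            if ($1 > ts || cnt > seen) print
        }
        END { if (max != "") print max, cnt > state }
    ' "$log"
)
```

Each run prints only lines it has not printed before, including ones that arrive later with the same millisecond stamp as the last line of the previous run.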