LinuxQuestions.org
Old 05-01-2024, 04:19 PM   #1
johnnybao
LQ Newbie
 
Registered: May 2024
Posts: 2

Rep: Reputation: 0
awk pickle


Hi, I am trying to write an awk statement that returns only the hostname entries in a logfile dated from a week ago up to the present time.

Logfile format is like this:
27-04-2024_00:04 hostname1 EverythingElseAfterHere
28-04-2024_02:05 hostname2 EverythingElseAfterHere

I thought I could reformat the date to a single string and compare like so:

#!/bin/bash
# get the date from a week ago:
lastweek=$(date +"%Y-%m-%d" --date="1 week ago")
# run today (5/1/24), this returns:
20240424

Then I tried converting field $1 in my file via awk to a similar format:
awk 'n=split($1,a,"[-_]") {print a[3] a[2] a[1]}' mylogfile
# this also looks good, returning as an example:
20240427

Here is where I get stuck. I want to (if possible) use the value of n to compare with lastweek and see if the date (value) is greater:
awk -v lastweek="$lastweek" 'n=split($1,a,"[-_]") {print a[3] a[2] a[1]} n > lastweek {print $2}' mylogfile
# this just returns more dates like '20240427' but I want field 2 with the hostname

I don't even know if I am doing the compare correctly or if it's even possible.
I am trying to push the output from the split/print subcommand into 'n', then compare that timestamp as text to the lastweek text and, if n is greater, output $2 (the hostname). It's getting messy and I am getting confused now, as I am not very familiar with awk.

Any help would be greatly appreciated.

Thanks!
 
Old 05-01-2024, 05:43 PM   #2
boughtonp
Senior Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 3,616

Rep: Reputation: 2555

Here's the fixed version of the method you're trying to do:
Code:
awk -v lastweek="$lastweek" 'split($1,a,"[_-]") && (a[3]"-"a[2]"-"a[1]) > lastweek {print $2}' input-file
The date command you used has hyphens, hence why we are re-inserting them here.

Otherwise, the return value of split is not needed, nor is print needed to concatenate, and we use && to make it a single condition/action item.
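A minimal sketch of that one-liner, run against a throwaway copy of the sample rows so the result is reproducible. The `hostname0` row and the hard-coded cut-off `2024-04-24` are assumptions added for illustration; in real use `lastweek` would come from the `date` command in the first post. The test is a plain string comparison, which works because `YYYY-MM-DD` sorts in date order.

```shell
# Sample rows in the log format from the first post.
# hostname0 is an extra, made-up row that falls before the cut-off.
logfile=$(mktemp)
printf '%s\n' \
  '23-04-2024_23:59 hostname0 EverythingElseAfterHere' \
  '27-04-2024_00:04 hostname1 EverythingElseAfterHere' \
  '28-04-2024_02:05 hostname2 EverythingElseAfterHere' > "$logfile"

# Hard-coded cut-off (assumption) instead of $(date ... --date="1 week ago").
lastweek='2024-04-24'

# Rebuild the date as YYYY-MM-DD and compare it, as a string, to the cut-off.
recent_hosts=$(awk -v lastweek="$lastweek" '
  split($1, a, "[_-]") && (a[3] "-" a[2] "-" a[1]) > lastweek { print $2 }
' "$logfile")

printf '%s\n' "$recent_hosts"
rm -f "$logfile"
```

With these rows it should print hostname1 and hostname2, and skip hostname0.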

-

Alternatively, with GNU Awk, there are date functions available, so we can re-format the date into descending order, and use mktime to output a timestamp, e.g:
Code:
awk -F '[ _:-]' '{print mktime($3" "$2" "$1" "$4" "$5" 00")}' input-file
To set the date cut-off, there are two ways. Either subtract the appropriate number of seconds from the current time:
Code:
awk -vDaysAgo=4 'split($1,d,"[_:-]") && mktime(d[3]" "d[2]" "d[1]" "d[4]" "d[5]" 00")>(systime()-86400*DaysAgo) {print $2}' input-file
Or take advantage of a useful Gawk feature correcting out of range values:
Code:
awk -vDaysAgo=4 'split($1,d,"[_:-]") && mktime(d[3]" "d[2]" "d[1]+DaysAgo" "d[4]" "d[5]" 00")>systime() {print $2}' input-file
i.e. Adding 7 to 28 April results in "35 April" but gets corrected to "5 May"
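The out-of-range correction can be checked in isolation. This is only a sketch: it assumes `gawk` is installed (both `mktime` and `strftime` are GNU Awk extensions), and `TZ=UTC` is set purely so the result does not depend on the local time zone.

```shell
# Requires GNU Awk; fall back to a message if it is not available.
if command -v gawk >/dev/null 2>&1; then
  normalized=$(TZ=UTC gawk 'BEGIN {
    ts = mktime("2024 04 35 00 00 00")   # day 35 of April is out of range
    print strftime("%Y-%m-%d", ts)       # gawk has rolled it over into May
  }')
else
  normalized='gawk not installed'
fi
printf '%s\n' "$normalized"
```

With gawk present this prints 2024-05-05, i.e. "35 April" corrected to 5 May.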

If the hostnames are internal and can be guaranteed to not include hyphens or underscores, it can be simplified to:
Code:
awk -vDaysAgo=4 -F '[ _:-]' 'mktime($3" "$2" "$1+DaysAgo" "$4" "$5" 00")>systime() {print $6}' input-file
(Using 4 days ago here because, at the time of posting, that cut-off falls between the two rows of sample data.)
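To see why the `-F` shortcut needs hyphen-free hostnames, here is a small sketch. The hostname `web-01` is made up for the demonstration.

```shell
# "web-01" is a hypothetical hostname containing a hyphen.
line='27-04-2024_00:04 web-01 EverythingElseAfterHere'

# With -F, the hyphen is a field separator too, so the hostname is cut in two.
with_F=$(printf '%s\n' "$line" | awk -F '[ _:-]' '{ print $6 }')

# Default whitespace splitting keeps the hostname whole in $2.
with_default=$(printf '%s\n' "$line" | awk '{ print $2 }')

printf '%s\n%s\n' "$with_F" "$with_default"
```

The first result is just `web`; the second is the full `web-01`, which is why the split-based versions above are the safer choice for arbitrary hostnames.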


Last edited by boughtonp; 05-01-2024 at 05:57 PM.
 
3 members found this post helpful.
Old 05-02-2024, 12:36 PM   #3
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,333
Blog Entries: 3

Rep: Reputation: 3730
Or using a slightly different date format output from the date utility will make the comparison easier:

Code:
awk -v lastweek="$(date -d 'last week' +'%Y%m%d')" \
    '{ n=split($1,a,"[-_]"); 
       date = a[3] a[2] a[1]; } 
     date > lastweek { print; }' \
     mylogfile
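A sketch of the difference between the two `date` output formats, followed by a variation of the script above that prints only the hostname field the original question asked for. It assumes GNU `date` (for `-d`), and uses a fixed reference day, the date of the first post, instead of "today" so the output is reproducible.

```shell
# Fixed reference day (assumption) so the result does not change over time.
ref='2024-05-01'
with_hyphens=$(date -u -d "$ref 1 week ago" +'%Y-%m-%d')
compact=$(date -u -d "$ref 1 week ago" +'%Y%m%d')
printf '%s %s\n' "$with_hyphens" "$compact"

# With the compact form, the awk side needs no hyphens re-inserted.
hosts=$(printf '%s\n' \
  '27-04-2024_00:04 hostname1 EverythingElseAfterHere' \
  '28-04-2024_02:05 hostname2 EverythingElseAfterHere' |
  awk -v lastweek="$compact" '
    { split($1, a, "[-_]"); date = a[3] a[2] a[1] }
    date > lastweek { print $2 }')
printf '%s\n' "$hosts"
```

The first line of output should be `2024-04-24 20240424`, followed by both sample hostnames.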
 
2 members found this post helpful.
Old 05-02-2024, 05:04 PM   #4
johnnybao
LQ Newbie
 
Registered: May 2024
Posts: 2

Original Poster
Rep: Reputation: 0

Thank you @boughtonp - that worked perfectly!
awk -vDaysAgo=4 'split($1,d,"[_:-]") && mktime(d[3]" "d[2]" "d[1]" "d[4]" "d[5]" 00")>(systime()-86400*DaysAgo) {print $2}' input-file

@Turbocapitalist I just saw your response and will check it out also for the alternate formatting.

Thanks all!
 
Old 05-02-2024, 05:46 PM   #5
astrogeek
Moderator
 
Registered: Oct 2008
Distribution: Slackware [64]-X.{0|1|2|37|-current} ::12<=X<=15, FreeBSD_12{.0|.1}
Posts: 6,269
Blog Entries: 24

Rep: Reputation: 4206
Welcome to LQ johnnybao!

You have already attracted replies from two of the sharp pencils who share their knowledge here, so nothing to add! But I invite you to visit the Programming forum here at LQ where you may find others eager to offer help with any programming question when needed!

Again, welcome and good luck!
 
1 members found this post helpful.
Old 05-02-2024, 06:41 PM   #6
michaelk
Moderator
 
Registered: Aug 2002
Posts: 25,759

Rep: Reputation: 5930
I know you posted that it works perfectly, and I have not actually played with the code, but it depends on which dates/times you actually want to "extract". Does "1 week ago" mean, based on today 2/5 (or 5/2), anything > 25/4 or >= 25/4? Is the log file in UTC (I would guess) or local time?

boughtonp's script works on seconds so that if you were running the script at say 0900 you would not necessarily see time stamps from 25/4 (again 1 week ago from today 2/5) < 0900.

On the other hand, Turbocapitalist's script should output anything > 25/4 (based on 2/5) regardless of time.
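The difference can be sketched with plain string comparisons and fixed cut-offs, so no date functions are needed. The rows, the hostnames hostA/hostB/hostC and the 09:00 run time are all made up for illustration.

```shell
# Hypothetical rows around the one-week boundary (25/4, seen from 2/5).
rows='25-04-2024_08:00 hostA x
25-04-2024_10:00 hostB x
26-04-2024_01:00 hostC x'

# Date-only cut-off: everything on 25/4 is excluded, whatever the time.
by_date=$(printf '%s\n' "$rows" | awk -v cut='20240425' '
  { split($1, a, "[-_:]") }
  (a[3] a[2] a[1]) > cut { print $2 }')

# Cut-off that includes a time of day (09:00): later entries on 25/4 are kept.
by_time=$(printf '%s\n' "$rows" | awk -v cut='202404250900' '
  { split($1, a, "[-_:]") }
  (a[3] a[2] a[1] a[4] a[5]) > cut { print $2 }')

printf '%s\n' 'date only:' "$by_date" 'with time:' "$by_time"
```

The date-only version returns just hostC, while the version with a time of day returns hostB and hostC. Changing `>` to `>=` in the date-only version would bring in all of 25/4.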

Assuming I am awake enough to follow everything...
 
1 members found this post helpful.
Old 05-03-2024, 02:02 AM   #7
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 2,807

Rep: Reputation: 1207
split() returns the number of fields i.e. the number of resulting array elements.

A simple string concatenation is done as (a[3] a[2] a[1])
String concatenation in awk does not have an operator; for clarity I wrap it in parentheses.
An alternative is sprintf("%s%s%s", a[3], a[2], a[1])
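A quick sketch of those three points, using the first sample timestamp from the thread. It also shows why comparing `n` with `lastweek` in the original attempt could not work: `n` holds the count of pieces, not the reformatted date.

```shell
result=$(awk 'BEGIN {
  n = split("27-04-2024_00:04", a, "[-_]")
  print n                                     # number of pieces, not a date
  print (a[3] a[2] a[1])                      # concatenation by juxtaposition
  print sprintf("%s%s%s", a[3], a[2], a[1])   # the same string via sprintf
}')
printf '%s\n' "$result"
```

This prints 4, then 20240427 twice.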
 
1 members found this post helpful.
Old 05-03-2024, 09:30 AM   #8
boughtonp
Senior Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 3,616

Rep: Reputation: 2555

Quote:
Originally Posted by michaelk
Is the log file in UTC (I would guess) or local time?
Two good points I meant to mention - I got distracted by wrestling with the idiotic LQ "security" filter not letting me post.

My view is that log files should be UTC (or include timezone), but that's definitely not guaranteed, so it might be necessary to add/remove hours as appropriate.


Quote:
boughtonp's script works on seconds so that if you were running the script at say 0900 you would not necessarily see time stamps from 25/4 (again 1 week ago from today 2/5) < 0900.
This was a deliberate choice to do it that way - again I meant to make it clear but forgot.

If one wanted, they could set the hour and minute values to zero for midnight and have it work the other way. (Or indeed, some other fixed time of day if that makes sense for the use-case.)
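A rough sketch of the midnight variant, assuming `gawk` is installed. To keep the result reproducible, `systime()` is replaced by a fixed stand-in `now` (09:00 on 2 May 2024) and `TZ=UTC` is set; the rows and the hostnames hostA/hostB/hostC are made up.

```shell
# Hypothetical rows around the one-week boundary.
rows='25-04-2024_08:00 hostA x
25-04-2024_10:00 hostB x
26-04-2024_01:00 hostC x'

if command -v gawk >/dev/null 2>&1; then
  midnight_hosts=$(printf '%s\n' "$rows" | TZ=UTC gawk -v DaysAgo=7 '
    BEGIN { now = mktime("2024 05 02 09 00 00") }   # stand-in for systime()
    # Hour and minute are forced to zero, so every entry counts as midnight.
    split($1, d, "[_:-]") &&
      mktime(d[3] " " d[2] " " d[1] " 00 00 00") > (now - 86400 * DaysAgo) { print $2 }')
else
  midnight_hosts='gawk not installed'
fi
printf '%s\n' "$midnight_hosts"
```

With gawk present only hostC is printed: both 25/4 entries are treated as 25/4 00:00, which is earlier than the 25/4 09:00 cut-off, so the result no longer depends on the time of day of each log entry.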


Last edited by boughtonp; 05-03-2024 at 09:32 AM.
 
Old 05-03-2024, 09:44 AM   #9
michaelk
Moderator
 
Registered: Aug 2002
Posts: 25,759

Rep: Reputation: 5930
I had thought about setting the default time to midnight. There are a couple of odd cases where the OP might not get the exact desired data in either script. Depending on the data, the OP's timezone and when the script was set to run, the starting results could be either the day before or day after.
 
Old 05-03-2024, 05:15 PM   #10
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,142

Rep: Reputation: 4123
These are issues only the OP can determine - or more likely not give a damn about. "logs from a week ago" is sufficiently vague to not worry about IMHO. Plenty of good (awk) ideas already presented for the OP to work with.
 
  


All times are GMT -5.
