LinuxQuestions.org

vaibhav. 01-30-2017 05:11 AM

using awk want to extract logs between two time stamps
 
0
down vote
favorite
1
I'm trying to extract all log lines between two timestamps. Some lines do not have a timestamp, but I want those included as well: every line that falls between the two timestamps should be in the extracted log, including the first and last timestamped lines themselves.

Note: the exact start or end timestamp may not be present in the log, but I still want every line between those two times to be extracted.

My log time-stamp structure looks like: 25-01-2017 07:06:16:860

The awk command I've written fetches only the lines that contain a timestamp and skips all other lines. It also stops short of end_time: if I give end_time as 11:30, it fetches logs only up to 11:29, or sometimes, strangely, even earlier than that.

Please find below the command I've written:

awk -v date=${date} -v start_time=${start_time} -v end_time=${end_time} '{if (($0 >= date FS start_time) && ($0 <= date FS end_time)) print $0; }' $log.$server_name.log > $requester_email.log

Please note that this command will be inside a script that asks the user to enter the details manually. The user has to enter the start and end times in HH:MM format, e.g. 07:06 for the timestamp above.

I would really appreciate it if anyone could help me out here.

Log Example:

25-01-2017 05:51:04:010 [DEBUG] - sjhadjkshajksdhjkashdkjahsdkjhasdjkhkashdjkhaksdjhasdjhkjhasdkhkasdhkashdhasdkhkjasdhkhasdkhasdjkhas djkhasdjkhasdjkhasdjkhkasdjhasdjkhasdjkhasdjkhjkasdhjkadhkhasdkjhasdjkhasdjkhasdjkhasjkdhkasjdhjkasd hasdjhasjkdhasjkd
asdas asda

25-01-2017 05:51:04:011 [DEBUG] - sjhadjkshajksdhjkashdkjahsdkjhasdjkhkashdjkhaksdjhasdjhkjhasdkhkasdhkashdhasdkhkjasdhkhasdkhasdjkhas djkhasdjkhasdjkhasdjkhkasdjhasdjkhasdjkhasdjkhjkasdhjkadhkhasdkjhasdjkhasdjkhasdjkhasjkdhkasjdhjkasd hasdjhasjkdhasjkd

rtmistler 01-30-2017 07:37 AM

Hi vaibhav and welcome to LQ.

Please note that your question is rather disorganized and it would be helpful for you to edit it or provide an update which clarifies what you have tried and what your desired result is, in a more readable fashion.

If you use [code][/code] tags to encompass code and output sections it will help to retain formatting and make things more readable. If editing in Advanced Edit mode under the LQ site you can use the # widget in the top of the edit window to do this same action to enclose code and output.

Reorganizing your question to the form:
  1. Here is my code
  2. Here is an example of my input
  3. Here is exactly the output I'm trying to achieve
  4. Here is exactly where I'm stuck
will go a long way to being able to obtain better assistance with your question.

Your opening statement says that you wish to get the logs only between two timestamps? That is to say that you wish to delete or keep the timestamps as part of the output data?

You expect a user to input the time and date they wish to extract?

Do you need help with the conversion of input? Do you need help with attaining the input?

What script language are you using?

I also recommend that you delete the apparent extra unrelated characters at the very start of your post.

Turbocapitalist 01-30-2017 08:06 AM

Based on what you have above, the following modification would print the start time and end time and everything in between, regardless of whether it has a date - time stamp:

Code:

'$1 == date && $2 >= start_time { p = 1 }; $1 == date && $2 > end_time { p = 0 }; p { print }'
The toggle is there because you mention that not all the lines you need have date - time stamps.

You'll want a start date and an end date if you want to handle cases that span the beginning of a new day (midnight). However, your dates are in dd-mm-yyyy format, which does not sort chronologically, so to span dates you'll have to convert them to something that can be compared directly. The ISO 8601 date format (yyyy-mm-dd) does that, and to get it you'd have to add your own conversion function.
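A rough sketch of such a conversion, assuming the user enters both limits as "dd-mm-yyyy HH:MM". The names extract_range, key, lo and hi are made up for this illustration and are not from the posts above. Each stamp is rewritten as "yyyy-mm-dd HH:MM", which plain string comparison orders correctly, and lines without a stamp inherit the state of the last stamped line:

```shell
#!/bin/sh
# Illustrative helper, not from the thread.
# $1 = start "dd-mm-yyyy HH:MM", $2 = end "dd-mm-yyyy HH:MM", $3 = log file
extract_range() {
    awk -v start="$1" -v end="$2" '
        # Build "yyyy-mm-dd HH:MM" from a dd-mm-yyyy date and an HH:MM... time.
        function key(d, t,    parts) {
            split(d, parts, "-")
            return parts[3] "-" parts[2] "-" parts[1] " " substr(t, 1, 5)
        }
        BEGIN {
            split(start, s, " "); lo = key(s[1], s[2])
            split(end, e, " ");   hi = key(e[1], e[2])
        }
        # Only lines that begin with a date stamp may change the toggle.
        $1 ~ /^[0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]$/ {
            k = key($1, $2)
            p = (k >= lo && k <= hi)
        }
        # Unstamped lines are printed while the toggle is on.
        p { print }
    ' "$3"
}
```

It would be called as, for example, extract_range "24-01-2017 23:00" "25-01-2017 07:25" server.log. Seconds and milliseconds are ignored, so the whole of the end minute is included.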

allend 01-30-2017 08:12 PM

GNU awk (gawk) provides the mktime function for converting date and time strings into numerical timestamps.
If your logfile has a blank line between the end of one time stamped record and the start of the next, then perhaps this will give you some ideas.
Code:

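BEGIN {
# Range limits are hard-coded here for illustration.
start_date="24-01-2017"
start_time="01:01:01"
split(start_date,d,"-")
split(start_time,t,":")
# mktime (gawk only) expects "YYYY MM DD HH MM SS"
st=mktime(d[3]" "d[2]" "d[1]" "t[1]" "t[2]" "t[3])
end_date="31-01-2017"
end_time="01:01:01"
split(end_date,d,"-")
split(end_time,t,":")
et=mktime(d[3]" "d[2]" "d[1]" "t[1]" "t[2]" "t[3])
}

# Lines that start with a dd-mm-yyyy date stamp
/^[[:digit:]]{2}-[[:digit:]]{2}-[[:digit:]]{4}/ {
  split($1,d,"-")
  split($2,t,":")   # t[4], the milliseconds, is ignored
  ct=mktime(d[3]" "d[2]" "d[1]" "t[1]" "t[2]" "t[3])
  if (ct >= st && ct <= et) {   # inclusive, so the boundary lines are kept
    # Print the whole record, up to the next blank line or end of file
    while (NF > 0) {
      print
      if ((getline) <= 0) break
    }
  }
}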


vaibhav. 02-01-2017 12:36 AM

Hi, thanks for the post. I tried the code, but it's fetching the whole log instead of just the time range entered by the user.


Turbocapitalist 02-01-2017 12:43 AM

Quote:

Originally Posted by vaibhav. (Post 5663212)
Hi, thanks for the post... I tried the code but it's fetching whole logs instead for that specific time entered by the user...

Please show what your code looks like. Be sure to put it between [code][/code] tags so that it is easier to read.

vaibhav. 02-03-2017 12:24 AM

When I try the code below, I'm able to fetch the logs from 07:05 up to 07:25.

Please note: only the first line with a 07:25 timestamp is fetched; the rest of the lines with that timestamp are not printed.

Code:

awk ' /25-01-2017 07:05/,/25-01-2017 07:25/ ' server.log > fetched_log.log
PS: if I pass variables into that code, it doesn't work at all and fetches nothing. Please find that code below:

Code:

awk -v date=${date} -v start_time=${start_time} -v end_time=${end_time} ' /date FS start_time/,/date FS end_time/ ' $server_name > fetched_log.log

Turbocapitalist 02-03-2017 01:47 AM

Thanks, but neither of those approaches will work. In the first, you're telling it to stop after the first instance of the second pattern. In the second, the variable names sit inside /.../ regex literals, where awk does not expand them, so it searches for the literal text "date FS start_time". Please try what I sent earlier. It needs the three variables passed with -v to work; just substitute the part in the quotes.
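For reference, a small sketch of how a range pattern can be driven by -v variables. Names written inside /.../ are taken as literal text, so the test has to be spelled out as an expression; here index() checks that the line starts with the wanted string. The helper name first_to_first is hypothetical. Note that this still shows the first problem: the range closes on the first line stamped with the end minute.

```shell
#!/bin/sh
# Illustrative helper, not from the thread.
# $1 = date dd-mm-yyyy, $2 = start HH:MM, $3 = end HH:MM, $4 = log file
first_to_first() {
    awk -v date="$1" -v start_time="$2" -v end_time="$3" '
        # Range pattern: opens on the first line that starts with
        # "date start_time", closes on the first line that starts with
        # "date end_time". No action is given, so matching lines are printed.
        index($0, date FS start_time) == 1, index($0, date FS end_time) == 1
    ' "$4"
}
```

Because it stops at the first 07:25 line, this form only suits cases where a single closing line is enough; the toggle approach posted earlier avoids that limit.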

vaibhav. 02-03-2017 05:22 AM

Thanks, I tried the piece of code you provided earlier (see below) and updated the variables in it, but it's still not returning the complete logs for that time range.

Code:

awk -v date=${date} -v start_time=${start_time} -v end_time=${end_time} '$1 == date && $2 >= start_time { p = 1 }; $1 == date && $2 > end_time { p = 0 }; p { print }' $server_name > fetched_log.log
start_time entered is 07:05
end_time entered is 07:25.

The logs fetched via the above code start at 25-01-2017 07:05:08:806 and end at 25-01-2017 07:23:56:098.


Turbocapitalist 02-03-2017 05:47 AM

Quote:

Originally Posted by vaibhav. (Post 5664399)
The logs fetched via the above code has the start_time of 25-01-2017 07:05:08:806 and end_time of 25-01-2017 07:23:56:098.

Are you sure the logs have anything after 25-01-2017 07:23:56:098 at all?

Turbocapitalist 02-03-2017 06:00 AM

Also, keep in mind that awk, except for gawk, does not have any time functions so ${end_time} and ${start_time} are getting compared as strings not actual times.
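To make the string comparison point concrete: as strings, "07:25:00:123" sorts after "07:25" because it is longer and shares the same prefix, so a test like $2 > end_time is already true for the first line stamped 07:25 and the toggle switches off before any 07:25 line is printed. A possible workaround, sketched here under the assumption that the user enters HH:MM and all wanted lines share one date, is to compare only the first five characters of the time field. The helper name extract_minutes is made up for this example.

```shell
#!/bin/sh
# Illustrative helper, not from the thread.
# $1 = date dd-mm-yyyy, $2 = start HH:MM, $3 = end HH:MM, $4 = log file
extract_minutes() {
    awk -v date="$1" -v start_time="$2" -v end_time="$3" '
        # Stamped line on the requested date: decide from its HH:MM prefix,
        # so every line within the end minute still counts as in range.
        $1 == date {
            hm = substr($2, 1, 5)
            p = (hm >= start_time && hm <= end_time)
        }
        # Lines without a matching date keep the state of the last stamped line.
        p { print }
    ' "$4"
}
```

Called as extract_minutes "25-01-2017" "07:05" "07:25" server.log, this keeps everything from the first 07:05 line through the last line belonging to 07:25.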

vaibhav. 02-03-2017 06:58 AM

Yes, the log file has entries up to 25-01-2017 12:33:09:389.

Quote:

Originally Posted by Turbocapitalist (Post 5664407)
Are you sure the logs have anything after 25-01-2017 07:23:56:098 at all?


vaibhav. 02-04-2017 08:48 AM

Can you please help me with some other code?

Quote:

Originally Posted by Turbocapitalist (Post 5664413)
Also, keep in mind that awk, except for gawk, does not have any time functions so ${end_time} and ${start_time} are getting compared as strings not actual times.


Turbocapitalist 02-04-2017 09:36 AM

Sure. If it's related to what you have above, post it here and there will be people who will see it. If it is not related to what you've posted above and is a separate problem, then start a new thread and get fresh eyes on it.

vaibhav. 02-04-2017 10:03 AM

It's the same issue as above.

Quote:

Originally Posted by Turbocapitalist (Post 5665007)
Sure. If it's related to what you have above, post it here and there will be people who will see it. If it is not related to what you've posted above and is a separate problem, then start a new thread and get fresh eyes on it.


TB0ne 02-04-2017 10:10 AM

Quote:

Originally Posted by vaibhav. (Post 5665024)
It's the same above issue only...

Right...we know your issue. What we're asking for is what YOU have done/tried/written on your own to try to solve it. Post that, please.


All times are GMT -5.