How to get the result in this log format using Linux tools (awk, sed, ...)?

sylye · 05-18-2012, 12:24 AM

hi guys,

I have a log file in this format:

Code:

Fri May 18 13:13:57 MYT 2012
Variable_name   Value
Aborted_clients 1
Aborted_connects        4
Binlog_cache_disk_use   0
Binlog_cache_use        0
Bytes_received  128
Bytes_sent      200

Fri May 18 13:14:57 MYT 2012
Variable_name   Value
Aborted_clients 0
Aborted_connects        4
Binlog_cache_disk_use   0
Binlog_cache_use        0
Bytes_received  128
Bytes_sent      194

Fri May 18 13:15:57 MYT 2012
Variable_name   Value
Aborted_clients 0
Aborted_connects        4
Binlog_cache_disk_use   0
Binlog_cache_use        0
Bytes_received  128
Bytes_sent      177

If I need to get all the 'Aborted_clients' values, I can simply do a grep. But how do I associate each 'Aborted_clients' value with the date line at the top of each block? I would like to have something like this:

Code:

Fri May 18 13:13:57 MYT 2012 Aborted_clients 1
Fri May 18 13:14:57 MYT 2012 Aborted_clients 0
Fri May 18 13:15:57 MYT 2012 Aborted_clients 0

This log file is generated by me every minute using a very simple script:

Code:

LOG=./mysqlstatus.log

echo "`date`" | tee -a $LOG
mysql -u root -e 'show status' | tee -a $LOG
echo | tee -a $LOG

If changing the way I log the results would make the information easier to retrieve, advice on that is welcome as well.

Thanks a million for any help and advice.
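On the logging side, one option is to stamp every status line with a sortable timestamp, so that a plain grep already returns the date together with the value. This is only a sketch and has not been tested against a live server. The helper name stamp_lines and the timestamp format are assumptions made for this illustration; the mysql call in the comment is the one from the script above with -N added to drop the header row.

```shell
#!/bin/sh
# stamp_lines TS: prefix every non-blank line read on stdin with TS and a tab.
# (Hypothetical helper, not part of the original logging script.)
stamp_lines() {
    awk -v ts="$1" 'NF { print ts "\t" $0 }'
}

# Possible use inside the logging script:
#   mysql -u root -N -e 'show status' \
#       | stamp_lines "$(date '+%Y-%m-%d %H:%M:%S')" >> "$LOG"

# Demonstration with canned input instead of a live server:
printf 'Aborted_clients\t1\nAborted_connects\t4\n' \
    | stamp_lines '2012-05-18 13:13:57'
```

With a log written this way, grep Aborted_clients on the file gives one line per sample with its timestamp, and the year-month-day order means a plain sort keeps the lines chronological.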

pan64 · 05-18-2012, 02:44 AM

Is this ok for you?

Code:

awk ' /MYT/ { date=$0; next } length { print date " " $0 } ' file


druuna · 05-18-2012, 03:39 AM

Hi,

If you are only interested in the date/time and Aborted_clients parts:

Code:

awk '/ MYT / { date=$0 } /Aborted_clients/ { print date, $0 }' infile
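For anyone who wants to try this without the real log, here is a small self-contained sketch of the same two-rule idea. The sample data is copied from post #1 (two blocks only) and the temp file is arbitrary. Comparing $1 exactly instead of using a regex is my own variation, so that other variable names containing the same text would not match; it also assumes, like the one-liner above, that every date line contains " MYT ".

```shell
#!/bin/sh
# Sample data copied from post #1 (first two blocks only).
log=$(mktemp)
cat > "$log" <<'EOF'
Fri May 18 13:13:57 MYT 2012
Variable_name   Value
Aborted_clients 1
Aborted_connects        4

Fri May 18 13:14:57 MYT 2012
Variable_name   Value
Aborted_clients 0
Aborted_connects        4
EOF

# Rule 1 remembers the most recent date line.
# Rule 2 fires on the wanted variable and prints the remembered date with it.
result=$(awk '/ MYT / { date = $0 }
              $1 == "Aborted_clients" { print date, $1, $2 }' "$log")
printf '%s\n' "$result"
rm -f "$log"
```

awk runs every rule against every input line, so the date variable always holds the date of the block currently being read when the second rule fires.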

sylye · 05-18-2012, 05:14 AM

hi,

Thanks to both of you for the feedback, it's very helpful!

I have gone with druuna's solution; it fits my needs exactly.

sylye · 05-21-2012, 05:24 AM

I have another, similar log, but this time I would like to count the occurrences of certain lines, for example:

Code:

Fri May 18 13:13:57 MYT 2012
Aborted_clients 
Aborted_clients 
Aborted_clients 
Aborted_connects        

Fri May 18 13:14:57 MYT 2012
Aborted_clients 
Aborted_connects     
Aborted_connects   

Fri May 18 13:15:57 MYT 2012
Aborted_clients 
Aborted_clients 
Aborted_clients 
Aborted_clients 
Aborted_connects

And the output I would need this time is:

Code:

Fri May 18 13:13:57 MYT 2012 Aborted_clients 3; Aborted_connects 1
Fri May 18 13:14:57 MYT 2012 Aborted_clients 1; Aborted_connects 2
Fri May 18 13:15:57 MYT 2012 Aborted_clients 4; Aborted_connects 1

Is that doable? I tried to adapt the solution given previously, but I was not able to make it work.

I tried googling, but couldn't find how to do this in the awk docs. Is this called multi-pattern matching?

druuna · 05-21-2012, 06:17 AM

Assuming the log file in question looks like the one in post #1:

Code:

awk '/ MYT / { date=$0 } /Aborted_clients/ { abcl=$0 } /Aborted_connects/ { print date, abcl " ; " $1, $2 }' infile

Here's an example of the output using the log from post #1:

Code:

awk '/ MYT / { date=$0 } /Aborted_clients/ { abcl=$0 } /Aborted_connects/ { print date, abcl " ; " $1, $2 }' infile
Fri May 18 13:13:57 MYT 2012 Aborted_clients 1 ; Aborted_connects 4
Fri May 18 13:14:57 MYT 2012 Aborted_clients 0 ; Aborted_connects 4
Fri May 18 13:15:57 MYT 2012 Aborted_clients 0 ; Aborted_connects 4

sylye · 05-21-2012, 06:36 AM

hi druuna,

Apologies for not making the question clearer. Your solution is in fact not what I want this time. What I need is something a bit different, for another log format like this:

Code:

Fri May 18 13:13:57 MYT 2012
Sleep
Sleep
Locked
Locked      

Fri May 18 13:14:57 MYT 2012
Sleep
Sleep
Sleep
Locked      

Fri May 18 13:15:57 MYT 2012
Sleep
Sleep
Sleep
Sleep
Sleep
Sleep

So I need to count the occurrences of each state at each time. The end result needs to look like this:

Code:

Fri May 18 13:13:57 MYT 2012 Sleep 2; Locked 2
Fri May 18 13:14:57 MYT 2012 Sleep 3; Locked 1
Fri May 18 13:15:57 MYT 2012 Sleep 6; Locked 0

pan64 · 05-21-2012, 06:48 AM

Code:

awk '/ MYT / { date=$0; next } /.../ { arr[$1]++; next } { printf date; for ( i in arr ) { printf ", %s %s", i, arr[i] }; printf "\n" } ' filename

Nylex · 05-21-2012, 06:52 AM

Rather than just using stuff people provide, you should try and understand it. There's a tutorial for AWK here that will help with that.

druuna · 05-21-2012, 07:30 AM

Code:

#!/bin/bash

awk '
BEGIN { RS="\n\n" ; FS="\n" }
{ 
for (i=2;i<=NF;i++)
  if   ( $i == "Sleep" )  { sleep[$1]++ }
  else { locked[$1]++ } 
} 
END { for (x in sleep) print x, "Sleep " sleep[x], "; Locked " locked[x] }
' infile | sort

Two possible issues with the above code:
1 - If a state isn't encountered, nothing is printed for it instead of 0.
2 - The sort part will only work if all entries are from the same day and month (the date format in the examples isn't sort-friendly, so sorting across days will be hard; sorting the arrays from within gawk without losing indices might be possible).

Example output:

Code:

./blaat
Fri May 18 13:13:57 MYT 2012 Sleep 2 ; Locked 2
Fri May 18 13:14:57 MYT 2012 Sleep 3 ; Locked 1
Fri May 18 13:15:57 MYT 2012 Sleep 6 ; Locked

Maybe someone will come up with a cleaner solution.
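One possible variant that sidesteps both issues is sketched below. It is not a replacement for the script above, just an illustration. It uses awk's paragraph mode (RS set to the empty string), restarts the counters for every block so that a missing state prints as 0, and prints each block as soon as it is read, so no sort is needed. The function name count_states and the counter names ns and nl are made up for this example, and it assumes the blocks are separated by truly empty lines.

```shell
#!/bin/sh
# count_states FILE: print one summary line per blank-line-separated block.
count_states() {
    awk '
    BEGIN { RS = ""; FS = "\n" }   # paragraph mode: record = block, field = line
    {
        ns = 0; nl = 0             # counters restart at zero for every block
        for (i = 2; i <= NF; i++) {
            state = $i
            sub(/[ \t]+$/, "", state)   # the posted sample has trailing spaces
            if (state == "Sleep") ns++
            else if (state == "Locked") nl++
        }
        printf "%s Sleep %d; Locked %d\n", $1, ns, nl
    }' "$1"
}

# Demonstration on two blocks taken from the sample log:
log=$(mktemp)
cat > "$log" <<'EOF'
Fri May 18 13:14:57 MYT 2012
Sleep
Sleep
Sleep
Locked

Fri May 18 13:15:57 MYT 2012
Sleep
Sleep
EOF
count_states "$log"
rm -f "$log"
```

Because the output follows the order of the input, the lines stay chronological as long as the log itself is chronological.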

PS: I just tried pan64's solution and it seems the sleep/locked states aren't counted correctly.

pan64 · 05-21-2012, 07:35 AM

Quote:

Originally Posted by druuna

PS: I just tried pan64's solution and it seems the sleep/locked states aren't counted correctly.

You are right, a delete arr is missing:

Code:

awk '/ MYT / { date=$0; next }
     /.../ { arr[$1]++; next }
     { printf date;
       for ( i in arr ) 
           { printf ", %s %s", i, arr[i] };
       printf "\n";
       delete arr; } ' filename

schneidz · 05-21-2012, 07:39 AM

For the question in post #1, this is somewhat of a hack but might help [untested]:

Code:

d=`date`; echo $d - `mysql -u root -e 'show status'` | tee -a $LOG

Edit: this should work with the original log method [untested]:

Code:

# keep the date and Aborted_clients lines, then join each pair into one line
grep -E "MYT|Aborted_clients" "$LOG" | paste -d " " - -

sylye · 05-21-2012, 07:49 AM

Quote:

Originally Posted by pan64

Code:

awk '/ MYT / { date=$0; next } /.../ { arr[$1]++; next } { printf date; for ( i in arr ) { printf ", %s %s", i, arr[i] }; printf "\n" } ' filename

pan64,

I don't quite understand the part where you do

Code:

/.../

What does that mean?

My problems with making awk read a block of text are:

i) I don't know how to make it print the result at the end of each block. As I understand it, awk is a stream processor, so how do we make it print the result only at the end of each block? For instance:

Code:

Fri May 18 13:14:57 MYT 2012 -> read input, date=$0, but don't print
Sleep -> read input, arr[Sleep]++, but don't print
Sleep -> read input, arr[Sleep]++, but don't print
Sleep -> read input, arr[Sleep]++, but don't print
Locked  -> read input, arr[Locked]++, but don't print
"\n" -> read input, but this time, print the result!

ii) How does awk know WHEN to reset the counter to zero and start counting again when it finds another 'MYT'?

druuna's way uses RS to do that, but I don't quite understand how pan64's way makes awk aware of (i) and (ii) above. pan64, would you mind elaborating?

Nylex,

Thanks for your advice. I do know that blindly taking other people's code is not a good habit, and if you have noticed my questions on LQ so far, I don't fall into that category. I will try to understand and discuss the feedback and to report my own findings. I'm slow at understanding awk even though I have gone through many tutorials; I hope you guys can bear with my weakness here.

schneidz · 05-21-2012, 08:01 AM

Here's mine for the Sleep/Locked counts [untested]:

Code:

i=1;for d in `grep -n ^$ style.log | sed s/:/""/`; do   sed -n "$i","$d"p style.log | sort | uniq -c;  i=$d; done

pan64 · 05-21-2012, 08:24 AM

As post #9 says, you should try to understand it.
A '.' matches any character, so /.../ matches any three characters; in other words, it matches a line containing at least 3 characters.
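A quick way to see this for yourself is sketched below. The two-letter state "OK" is invented for the demonstration: it shows that /.../ skips lines shorter than three characters, while the pattern NF (true for any line with at least one field) would keep them.

```shell
#!/bin/sh
# Four input lines: a two-letter word, a state, an empty line, another state.
by_regex=$(printf 'OK\nSleep\n\nLocked\n' | awk '/.../ { print }')
by_nf=$(printf 'OK\nSleep\n\nLocked\n' | awk 'NF { print }')

printf 'lines matched by /.../:\n%s\n' "$by_regex"
printf 'lines matched by NF:\n%s\n' "$by_nf"
```

For the Sleep/Locked log both patterns select the same lines, since every state name there is longer than two characters.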
Otherwise my awk works exactly as you described:
first, look for MYT to start counting;
next, look for non-empty lines and count what is found in them;
last, on empty lines, print the result, reset the counters and start over.
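The three steps can also be written out as a commented sketch. This is one possible variant, not the exact script posted earlier: the function names summarize and emit and the order array are additions for illustration. It remembers the order in which states first appear, so the output order is predictable. It empties the arrays with split("", arr), which also works in awks that lack a whole-array delete. An END rule covers a last block that is not followed by an empty line.

```shell
#!/bin/sh
# summarize FILE: stream the log line by line, one summary line per block.
summarize() {
    awk '
    function emit(   i, sep, line) {
        if (date == "") return                 # nothing collected yet
        line = date
        for (i = 1; i <= n; i++) {
            sep = (i == 1) ? " " : "; "
            line = line sep order[i] " " count[order[i]]
        }
        print line
        split("", count); split("", order)     # empty both arrays
        n = 0; date = ""
    }
    / MYT / { emit(); date = $0; next }        # step 1: a date line starts a block
    NF      { if (!($1 in count)) order[++n] = $1
              count[$1]++; next }              # step 2: count the first word
            { emit() }                         # step 3: empty line, print and reset
    END     { emit() }                         # last block may lack an empty line
    ' "$1"
}

# Demonstration on part of the sample log:
log=$(mktemp)
cat > "$log" <<'EOF'
Fri May 18 13:13:57 MYT 2012
Sleep
Sleep
Locked
Locked

Fri May 18 13:15:57 MYT 2012
Sleep
Sleep
EOF
summarize "$log"
rm -f "$log"
```

This also answers questions (i) and (ii): awk never looks ahead, it simply reacts to the empty line (or the next date line) by printing what it has collected and clearing the counters.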