LinuxQuestions.org

sylye 05-18-2012 12:24 AM

How to get the result in such log format using linux tools (awk,sed..) ?
 
hi guys,

I have a log file in the following format:

Code:

Fri May 18 13:13:57 MYT 2012
Variable_name  Value
Aborted_clients 1
Aborted_connects        4
Binlog_cache_disk_use  0
Binlog_cache_use        0
Bytes_received  128
Bytes_sent      200

Fri May 18 13:14:57 MYT 2012
Variable_name  Value
Aborted_clients 0
Aborted_connects        4
Binlog_cache_disk_use  0
Binlog_cache_use        0
Bytes_received  128
Bytes_sent      194

Fri May 18 13:15:57 MYT 2012
Variable_name  Value
Aborted_clients 0
Aborted_connects        4
Binlog_cache_disk_use  0
Binlog_cache_use        0
Bytes_received  128
Bytes_sent      177

If I only need the 'Aborted_clients' values, I can simply use grep. But how do I associate each 'Aborted_clients' value with the date line at the top of its block? I would like output like this:

Code:

Fri May 18 13:13:57 MYT 2012 Aborted_clients 1
Fri May 18 13:14:57 MYT 2012 Aborted_clients 0
Fri May 18 13:15:57 MYT 2012 Aborted_clients 0

The log file is generated every minute by this very simple script:
Code:

LOG=./mysqlstatus.log

echo "`date`" | tee -a $LOG
mysql -u root -e 'show status' | tee -a $LOG
echo | tee -a $LOG

If changing the way I log the results would make the information easier to retrieve, I would welcome advice on that as well.

Thanks a million for any help and advice :)

pan64 05-18-2012 02:44 AM

Is this ok for you?
Code:

awk ' /MYT/ { date=$0; next } length { print date " " $0 } ' file

druuna 05-18-2012 03:39 AM

Hi,

If you are only interested in the date/time and Aborted_clients parts:
Code:

awk '/ MYT / { date=$0 } /Aborted_clients/ { print date, $0 }' infile
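
To see why this works: awk runs every pattern/action pair against each input line, and variables keep their value from one line to the next, so `date` still holds the most recent timestamp when a later line matches. Below is a small sketch of the same idea wrapped in a shell function that takes the status variable as an argument. The function name `stamp_status` and the awk variable `key` are hypothetical, not from the posts above, and the sketch assumes every block starts with a line containing ` MYT `.

```shell
# Hypothetical helper: print "<date> <name> <value>" for one status variable.
# Usage: stamp_status NAME LOGFILE
stamp_status() {
    awk -v key="$1" '
        / MYT /   { date = $0 }              # remember the latest timestamp line
        $1 == key { print date, $1, $2 }     # exact match on the variable name
    ' "$2"
}
```

For example, `stamp_status Aborted_clients ./mysqlstatus.log` should print one line per block. Comparing `$1` exactly, rather than using a regex, avoids also matching longer variable names that merely contain the string.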

sylye 05-18-2012 05:14 AM

hi,

Thanks to both of you for the feedback, it's very helpful!

I went with druuna's solution; it fits my need exactly :)

sylye 05-21-2012 05:24 AM

This is for another, similar log, but this time I would like to count the occurrences of certain lines, for example:

Code:

Fri May 18 13:13:57 MYT 2012
Aborted_clients
Aborted_clients
Aborted_clients
Aborted_connects       

Fri May 18 13:14:57 MYT 2012
Aborted_clients
Aborted_connects   
Aborted_connects 

Fri May 18 13:15:57 MYT 2012
Aborted_clients
Aborted_clients
Aborted_clients
Aborted_clients
Aborted_connects

And the output I need this time is:
Code:

Fri May 18 13:13:57 MYT 2012 Aborted_clients 3; Aborted_connects 1
Fri May 18 13:14:57 MYT 2012 Aborted_clients 1; Aborted_connects 2
Fri May 18 13:15:57 MYT 2012 Aborted_clients 4; Aborted_connects 1

Is that doable? I tried to adapt the solutions given previously, but could not make it work :( I also tried googling, but could not find how to do this in the awk docs. Is this called multi-pattern matching?

druuna 05-21-2012 06:17 AM

Assuming the log file in question looks like the one in post #1:
Code:

awk '/ MYT / { date=$0 } /Aborted_clients/ { abcl=$0 } /Aborted_connects/ { print date, abcl " ; " $1, $2 }' infile
Here's an example of the output using the log from post #1:
Code:

awk '/ MYT / { date=$0 } /Aborted_clients/ { abcl=$0 } /Aborted_connects/ { print date, abcl " ; " $1, $2 }' infile
Fri May 18 13:13:57 MYT 2012 Aborted_clients 1 ; Aborted_connects 4
Fri May 18 13:14:57 MYT 2012 Aborted_clients 0 ; Aborted_connects 4
Fri May 18 13:15:57 MYT 2012 Aborted_clients 0 ; Aborted_connects 4


sylye 05-21-2012 06:36 AM

hi druuna,

Apologies for not making the question clearer. Your solution is not quite what I need this time. What I need is something a bit different, for a log in another format like this:

Code:

Fri May 18 13:13:57 MYT 2012
Sleep
Sleep
Locked
Locked     

Fri May 18 13:14:57 MYT 2012
Sleep
Sleep
Sleep
Locked     

Fri May 18 13:15:57 MYT 2012
Sleep
Sleep
Sleep
Sleep
Sleep
Sleep

So I need to count the occurrences of each state at each timestamp. The end result should look like this:
Code:

Fri May 18 13:13:57 MYT 2012 Sleep 2; Locked 2
Fri May 18 13:14:57 MYT 2012 Sleep 3; Locked 1
Fri May 18 13:15:57 MYT 2012 Sleep 6; Locked 0


pan64 05-21-2012 06:48 AM

Code:

awk '/ MYT / { date=$0; next } /.../ { arr[$1]++; next } { printf date; for ( i in arr ) { printf ", %s %s", i, arr[i] }; printf "\n" } ' filename

Nylex 05-21-2012 06:52 AM

Rather than just using stuff people provide, you should try and understand it. There's a tutorial for AWK here that will help with that.

druuna 05-21-2012 07:30 AM

Code:

#!/bin/bash

awk '
BEGIN { RS="\n\n" ; FS="\n" }
{
for (i=2;i<=NF;i++)
  if  ( $i == "Sleep" )  { sleep[$1]++ }
  else { locked[$1]++ }
}
END { for (x in sleep) print x, "Sleep " sleep[x], "; Locked " locked[x] }
' infile | sort

Two possible issues with the above code:
1 - If a state isn't encountered, nothing is printed for it instead of 0.
2 - The sort step only works if all entries are from the same day and month (the date format in the examples isn't sort-friendly, so sorting it will be hard; sorting the arrays from within gawk without losing the indices might be possible).

Example output:
Code:

./blaat
Fri May 18 13:13:57 MYT 2012 Sleep 2 ; Locked 2
Fri May 18 13:14:57 MYT 2012 Sleep 3 ; Locked 1
Fri May 18 13:15:57 MYT 2012 Sleep 6 ; Locked

Maybe someone will come up with a cleaner solution.

PS: I just tried pan64's solution and it seems the sleep/locked states aren't counted correctly.
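
Both issues listed above can probably be sidestepped by printing each block as soon as it has been read, instead of collecting everything for `END`. A minimal sketch, assuming blocks are separated by truly empty lines and that only the two states `Sleep` and `Locked` matter (the function name `count_states` and the counter names are hypothetical): setting `RS=""` switches awk to its standard paragraph mode, and resetting both counters to 0 for every block means a missing state prints as 0.

```shell
# Hypothetical helper: one output line per block, in input order, no sort needed.
count_states() {
    awk '
        BEGIN { RS = ""; FS = "\n" }           # paragraph mode: one record per block
        {
            nsleep = 0; nlocked = 0            # reset the counters for every block
            for (i = 2; i <= NF; i++) {        # field 1 is the date line
                state = $i
                gsub(/[ \t]+$/, "", state)     # strip trailing blanks such as "Locked   "
                if (state == "Sleep")  nsleep++
                if (state == "Locked") nlocked++
            }
            printf "%s Sleep %d; Locked %d\n", $1, nsleep, nlocked
        }
    ' "$1"
}
```

Because the output follows the order of the log itself, it no longer matters that the date format does not sort well.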

pan64 05-21-2012 07:35 AM

Quote:

Originally Posted by druuna (Post 4683888)

PS: I just tried pan64's solution and it seems the sleep/locked states aren't counted correctly.

You are right, a delete arr is missing:

Code:

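awk '/ MYT / { date=$0; next }            # date line: remember it
    /.../ { arr[$1]++; next }             # line with 3+ chars: count its first word
    { printf "%s", date;                  # any other (empty) line: print the block
      for ( i in arr )
          { printf ", %s %s", i, arr[i] };
      printf "\n";
      delete arr }                        # reset the counters for the next block
' filename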


schneidz 05-21-2012 07:39 AM

Quote:

Originally Posted by sylye (Post 4681437)

This is somewhat of a hack, but it might help [untested]:
Code:

d=`date`; echo $d - `mysql -u root -e 'show status'` | tee -a $LOG
Edit: this should work with the original logging method [untested]:
Code:

grep -E "(MYT|Aborted_clients)" "$LOG" | paste -d " " - -
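
Another option on the logging side, sketched here under the assumption that repeating the timestamp on every line is acceptable: prefix each line of the `show status` output with the date when it is written, so that a plain `grep` later returns the date and the value together. The helper name `stamp_lines` is hypothetical; the `mysql` call is the one from the first post.

```shell
LOG=./mysqlstatus.log

# Hypothetical helper: prefix every line read from stdin with the given timestamp.
stamp_lines() {
    awk -v ts="$1" '{ print ts, $0 }'
}

# Intended use (needs a running MySQL server, so it is left commented out here):
# mysql -u root -e 'show status' | stamp_lines "$(date)" | tee -a "$LOG"
# grep Aborted_clients "$LOG"
```

The trade-off is a larger log file in exchange for not needing awk at all when reading it back.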

sylye 05-21-2012 07:49 AM

Quote:

Originally Posted by pan64 (Post 4683862)
Code:

awk '/ MYT / { date=$0; next } /.../ { arr[$1]++; next } { printf date; for ( i in arr ) { printf ", %s %s", i, arr[i] }; printf "\n" } ' filename

pan64,

I don't quite understand the part where you use /.../ as a pattern. What does that mean?

My problems with making awk read a block of text are:

i) I don't know how to make it print the result only at the end of each block. As I understand it, awk is a stream processor, so how do we make it hold back the output until a block is complete? For instance:
Code:

Fri May 18 13:14:57 MYT 2012 -> read input, date=$0, but don't print
Sleep -> read input, arr[Sleep]++, but don't print
Sleep -> read input, arr[Sleep]++, but don't print
Sleep -> read input, arr[Sleep]++, but don't print
Locked  -> read input, arr[Locked]++, but don't print
"\n" -> read input, but this time, print the result!

ii) How does awk know WHEN to reset the counter to zero and start counting again when it finds another 'MYT'?

druuna's way uses RS to do that, but I don't quite understand how pan64's way makes awk aware of (i) and (ii) above. pan64, would you mind elaborating?

Nylex,

Thanks for your advice. I know that blindly taking other people's code is not a good habit, and if you have noticed my questions on LQ so far, I don't fall into that category. I will try to understand and discuss the feedback and share my own findings. I'm slow at understanding awk even though I have gone through many tutorials; I hope you guys will bear with my weakness here.

schneidz 05-21-2012 08:01 AM

Quote:

Originally Posted by sylye (Post 4683858)

Here's mine [untested]:
Code:

i=1;for d in `grep -n ^$ style.log | sed s/:/""/`; do  sed -n "$i","$d"p style.log | sort | uniq -c;  i=$d; done

pan64 05-21-2012 08:24 AM

Just see post #9: you should try to understand it.
. means any character, so /.../ matches 3 characters; in other words, it matches any line containing at least 3 characters.
Otherwise my awk works exactly as you described:
first, look for MYT to start counting;
next, look for non-empty lines and count what is found in them;
last, on empty lines, print the result, reset the counters and start over.
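
The three steps above can be sketched as an annotated script. This is a variation rather than pan64's exact code: it matches any non-empty line with `NF` instead of `/.../` (so states shorter than three characters are counted too), prints the counts in the `Sleep 2; Locked 2` style asked for earlier, and adds an `END` rule so the last block is still printed when the file does not end with an empty line. The names `count_blocks`, `emit`, `order` and `count` are hypothetical. Since the order of `for (s in count)` is unspecified in awk, the sketch records the order in which states first appear.

```shell
count_blocks() {
    awk '
        # Print the finished block, then reset the state for the next one.
        function emit(   i) {
            if (date == "") return
            line = date
            for (i = 1; i <= n; i++)
                line = line (i == 1 ? " " : "; ") order[i] " " count[order[i]]
            print line
            for (i = 1; i <= n; i++) delete count[order[i]]
            n = 0; date = ""
        }
        / MYT / { emit(); date = $0; next }    # step 1: a date line starts a block
        NF      { if (!($1 in count)) order[++n] = $1
                  count[$1]++; next }          # step 2: count each non-empty line
                { emit() }                     # step 3: an empty line ends the block
        END     { emit() }                     # last block, if no empty line follows
    ' "$1"
}
```

Note that a state which never appears in a block is simply left out of that line rather than shown as 0; printing zeros would require knowing the full list of states in advance.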

