How well do you know awk?
This is what I came up with, running the following awk program with three file arguments (log.*):
Code:
$ awk '{count[substr($2,1,2)] += 1 }
ENDFILE { for (c in count) result[c] = result[c] " " count[c] " " c; delete count }
END { for (r in result) print result[r] }' log.*
1 08 1 08 1 08
2 00 3 00 2 00
2 10 2 10 2 10
1 03 1 03 1 03
1 22 1 22 1 22
1 23 1 23 2 23
The program is based on one of awk's most powerful features, associative arrays.
The first line counts the lines for each hour. The hour is used as the index of the array count; to get the hour, I use the substr() function to peel the first two characters off the second field ($2) in each line.
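To see the counting step on its own, here is a minimal sketch. The three sample lines are made up, and it assumes the second field is a time such as 08:15:02; your log format may differ.

```shell
# Hypothetical sample lines; field 2 is assumed to be a time such as 08:15:02.
out=$(printf '%s\n' \
    '2024-01-01 08:15:02 start' \
    '2024-01-01 08:47:10 stop' \
    '2024-01-01 10:02:33 start' |
  awk '{ count[substr($2, 1, 2)] += 1 }   # index = first two characters of field 2
       END { for (h in count) print h, count[h] }' |
  sort)
printf '%s\n' "$out"
```

With these sample lines it prints "08 2" and "10 1": two lines fall into hour 08 and one into hour 10.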
Second line:
ENDFILE only exists in the GNU version of awk (gawk), which is the default awk on many, but not all, Linux distros. It is not part of POSIX awk, so it may not work in BSD or other UNIXes, or on distros whose default awk is mawk.
ENDFILE signals that the end of a file is reached. At this point, I go through the count array and append each entry to another array named result. c is an hour, count[c] is the number of times the hour occurred in the file that just ended. After that, I delete the count array so that I can start from scratch with a new file.
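If your awk has no ENDFILE, one possible workaround is to detect the start of the next file with FNR == 1 and to flush once more in END. This is only a sketch: the function name flush_counts and the sample files are made up, field 2 is again assumed to be a time, and an empty input file would go unnoticed because it never produces a line with FNR == 1.

```shell
# Hypothetical sample data; field 2 is assumed to be a time such as 08:15:02.
dir=$(mktemp -d)
printf '%s\n' 'x 08:15:02 a' 'x 10:02:33 b' 'x 10:40:00 c' > "$dir/log.1"
printf '%s\n' 'x 08:01:00 a' 'x 10:05:00 b' > "$dir/log.2"

out=$(awk '
function flush_counts(   c) {
    # append the per-file counts to result, as the ENDFILE rule does
    for (c in count) result[c] = result[c] " " count[c] " " c
    split("", count)                 # portable way to empty an array
}
FNR == 1 && NR > 1 { flush_counts() }    # a new file begins: close the previous one
{ count[substr($2, 1, 2)]++ }
END { flush_counts(); for (r in result) print result[r] }
' "$dir"/log.* | sort -k2,2n)
printf '%s\n' "$out"
rm -r "$dir"
```

The sort -k2,2n at the end orders the rows numerically by hour, which is the second field of each row.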
Third line:
END signals the overall end of input. At that point, I dump out the result array. Unfortunately, the order in which for (r in result) walks through an associative array is unspecified, so the rows come out unsorted.
Exercises: Correct sorting (I'd pipe the output into the sort command), and labeling the columns with the file names (needs to be added to the awk program, I think).
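One way to approach both exercises at once, as a sketch rather than the definitive answer: keep one counter per (file, hour) pair, remember each FILENAME for the header, and loop over the hours 00 to 23 in END so no external sort is needed. This uses a different layout from the program above (hour first, then one count per file), prints 0 where an hour is missing from a file, and works without ENDFILE. The names nfiles, name, and seen and the sample files are made up, and it assumes the keys are two-digit hours.

```shell
# Hypothetical sample data; field 2 is assumed to be a time such as 08:15:02.
dir=$(mktemp -d)
printf '%s\n' 'x 08:15:02 a' 'x 10:02:33 b' 'x 10:40:00 c' > "$dir/log.1"
printf '%s\n' 'x 08:01:00 a' 'x 23:59:59 b' > "$dir/log.2"

out=$(cd "$dir" && awk '
FNR == 1 { nfiles++; name[nfiles] = FILENAME }   # first line of each file
{
    hour = substr($2, 1, 2)
    count[nfiles, hour]++                        # one counter per (file, hour)
    seen[hour] = 1
}
END {
    header = "hour"
    for (f = 1; f <= nfiles; f++) header = header " " name[f]
    print header
    for (h = 0; h < 24; h++) {                   # fixed order, no sort needed
        key = sprintf("%02d", h)
        if (!(key in seen)) continue
        row = key
        for (f = 1; f <= nfiles; f++) row = row " " (count[f, key] + 0)
        print row
    }
}' log.*)
printf '%s\n' "$out"
rm -r "$dir"
```

With the sample files this prints the header "hour log.1 log.2" followed by the rows "08 1 1", "10 2 0" and "23 0 1".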
I warmly recommend the awk guide referenced in my signature.
EDIT: I wrote the program based on your comment in the original post:
Quote:
I want the resulting output in column wise as below-
473 8 395 8 462 8
765 9 642 9 704 9
957 10 877 10 906 10
which is not what you say in post 6.