Oct 7 02:21:48 ipb named[2677]: client 38.229.33.47#59569: query (cache) 'a998207098p59569i39337.d2016100618000222958.t12135.dnsresearch.cymru.com/A/IN' denied
Oct 7 02:39:12 ipb named[2677]: client 183.56.172.145#20000: query (cache) '2054061883.www.baidu.com/A/IN' denied
Oct 7 04:31:44 ipb named[2677]: client 141.212.122.111#38457: query (cache) 'c.afekv.com/A/IN' denied
Oct 7 05:34:21 ipb named[2677]: client 95.215.60.214#43977: query (cache) 'm24.pl/ANY/IN' denied
Oct 7 06:39:09 ipb named[2677]: client 185.94.111.1#46130: query (cache) 'com/ANY/IN' denied
Oct 7 08:22:08 ipb named[2677]: client 209.126.136.2#52517: query (cache) 'a.gtld-servers.net/A/IN' denied
Oct 7 09:00:09 ipb named[2677]: client 185.141.24.209#42825: query (cache) 'leth.cc/ANY/IN' denied
Oct 7 09:28:25 ipb named[2677]: client 124.232.142.220#38773: query (cache) 'www.google.com/A/IN' denied
Oct 7 12:31:08 ipb named[2677]: client 124.232.142.220#38332: query (cache) 'www.google.it/A/IN' denied
Oct 7 01:36:57 ipb postfix/anvil[15006]: statistics: max connection count 1 for (smtp:223.74.42.35) at Oct 7 01:33:36
Oct 7 03:14:45 ipb postfix/anvil[13320]: statistics: max connection count 1 for (submission:169.56.71.47) at Oct 7 03:11:24
Oct 7 04:16:04 ipb postfix/anvil[7596]: statistics: max connection count 1 for (smtp:223.74.42.155) at Oct 7 04:12:43
Oct 7 09:03:20 ipb postfix/anvil[357]: statistics: max connection count 1 for (smtp:62.219.225.141) at Oct 7 09:00:00
Oct 7 11:47:26 ipb postfix/anvil[28328]: statistics: max connection count 1 for (smtp:81.240.248.53) at Oct 7 11:44:03
Oct 7 13:54:54 ipb postfix/anvil[1113]: statistics: max connection count 1 for (smtp:210.211.102.38) at Oct 7 13:51:33
Oct 7 22:28:26 ipb postfix/anvil[31118]: statistics: max connection count 1 for (smtp:80.82.64.102) at Oct 7 22:25:00
Oct 7 03:11:25 ipb postfix/submission/smtpd[13318]: SSL_accept error from unknown[169.56.71.47]: lost connection
Yes, it is possible, but I think the approach of changing the Field Separator is not helpful. I would leave the FS as whitespace and work on finding the right patterns followed by a little manipulation with gsub or something. Then you'd have to collect your cumulative stats in a different array for each of the three categories.
Code:
#!/usr/bin/awk -f
/situation1/ { something1 }
/situation2/ { something2 }
/situation3/ { something3 }
END {
for(i in a) { summary1 }
for(i in b) { summary2 }
for(i in c) { summary3 }
}
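To make that skeleton concrete, here is a minimal sketch that sticks to plain POSIX awk with the default whitespace FS. The function name summarize and the comma-separated output layout are my own choices for illustration, and the field positions ($7, $(NF-4), $(NF-2)) are read off the sample lines above, so they would need checking against the full log.

```shell
#!/bin/sh
# Sketch only: summarize is a hypothetical helper name, not something from the thread.
summarize() {
    awk '
    # named: field 7 looks like 38.229.33.47#59569: so cut from "#" onwards
    /named\[/ && $NF == "denied" { ip = $7; sub(/#.*/, "", ip); a[ip]++ }

    # postfix/anvil: the 5th field from the end looks like (smtp:223.74.42.35)
    /max connection count/ { ip = $(NF-4); gsub(/[^0-9.]/, "", ip); b[ip]++ }

    # smtpd: the 3rd field from the end looks like unknown[169.56.71.47]:
    /SSL_accept error/ { ip = $(NF-2); gsub(/[^0-9.]/, "", ip); c[ip]++ }

    END {
        for (i in a) print i ",query_denied," a[i]
        for (i in b) print i ",max_connection_count," b[i]
        for (i in c) print i ",SSL_accept_error," c[i]
    }' "$@"
}
```

It could then be called as summarize "$InFile" | sort > "$OutFile". The sort is there because awk does not promise any particular order for "for (i in a)".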
Also, which version of "awk" are you using for this?
Personally I would go with Perl, which would make this quite simple, but you can do it in awk too.
In awk I would use a single, common FS for all the cases you have, containing []()#: and whitespace. Then you only need to handle the required columns (as described in post #2).
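A rough sketch of the single-FS idea, in case it helps. The function name summarize_fs is made up, the FS uses a plain space rather than every kind of whitespace, and the column numbers (10, 16, 13) are counted from the sample lines in the first post, so treat them as assumptions to verify against the real log.

```shell
#!/bin/sh
# Sketch only: summarize_fs is a hypothetical name; column numbers come from the sample lines.
summarize_fs() {
    # One FS for every line type: any run of ] [ ( ) # : or space separates fields.
    awk -F'[][()#: ]+' '
    # named[2677]: client IP#port: ...  ->  $7 is "named", $10 is the IP
    $7 == "named" && $NF == "denied"    { count[$10 ",query_denied"]++ }

    # ... max connection count 1 for (smtp:IP) ...  ->  $16 is the IP
    $10 == "max" && $11 == "connection" { count[$16 ",max_connection_count"]++ }

    # ... SSL_accept error from unknown[IP]: ...  ->  $13 is the IP
    $9 == "SSL_accept"                  { count[$13 ",SSL_accept_error"]++ }

    END { for (k in count) print k "," count[k] }
    ' "$@"
}
```

Because the timestamp is split on ":" as well, the columns shift compared with whitespace splitting. That is the price of having one FS for all three line types.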
Why the need for a one liner? Not for nothin', but your three lines are hard enough to read as is, and I'd hate to have to debug or modify it.
awk lets you run scripts from a file. If you are concerned about cluttering your bash script, why not follow Turbocapitalist's lead and put it in a separate awk script?
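For example, something along these lines. The file name count_denied.awk is hypothetical, and the program only covers the denied-query case to keep the sketch short.

```shell
#!/bin/sh
# Write the awk program to its own file (the name count_denied.awk is hypothetical).
cat > count_denied.awk <<'EOF'
# Count denied DNS queries per client address.
/named\[/ && $NF == "denied" {
    ip = $7
    sub(/#.*/, "", ip)      # strip "#port:" from 38.229.33.47#59569:
    denied[ip]++
}
END { for (ip in denied) print ip, denied[ip] }
EOF

# The calling bash script then shrinks to a single readable line:
#   awk -f count_denied.awk "$InFile" > "$OutFile"
```

The awk file can carry as many comments as you like without making the calling script any longer.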
#!/bin/
... this awk ...
Code:
awk -F'client |[][)(#]|smtp:|submission:|SSL_accept error from unknown' \
'{/denied/ ?a[$4]++:0;
/max connection/?b[$5]++:0;
/accept/ ?c[$5]++:0;}
END{for(i in a) printf "%-15s%-27s%-2s\n",i,",query_denied",a[i]
for(i in b) printf "%-15s%-27s%-2s\n",i,",max_connection_count",b[i]
for(i in c) printf "%-15s%-27s%-2s\n",i,",SSL_accept_error",c[i]}' \
"$InFile" > "$OutFile"
Try putting it into a file and running it as a script, so you can comment it and still understand what all the tasks are in a month's time.
Here is a slightly longer version in script format so you can see how laying it out makes it easier to follow (I'll leave the comments to you).
Code:
#!/usr/bin/awk -f
{
    regex = "[^0-9.]"
    switch ($0) {
    case /query/:
        if ($NF == "denied") {
            field = $7
            name = "query_denied"
            regex = "#.*"
        }
        break
    case /max connection/:
        field = $(NF-4)
        name = "max_connection_count"
        break
    case /SSL_accept/:
        field = $(NF-2)
        name = "SSL_accept_error"
    }
    ip = gensub(regex, "", "g", field)
    output[name][ip]++
}
END {
    for (i in output)
        for (j in output[i])
            print j","i, output[i][j]
}
I find well-named variables can also help eliminate the need for extensive comments.
I added a default case for lines that match nothing, and changed print to printf.
Code:
#!/usr/bin/awk -f
{
    regex = "[^0-9.]"
    switch ($0) {
    case /query/:
        if ($NF == "denied") {
            field = $7
            name = "query_denied"
            regex = "#.*"
        }
        break
    case /max connection/:
        field = $(NF-4)
        name = "max_connection_count"
        break
    case /SSL_accept/:
        field = $(NF-2)
        name = "SSL_accept_error"
        break
    default:
        field = $(NF-2)
        name = "No_match"
        break
    }
    ip = gensub(regex, "", "g", field)
    output[name][ip]++
}
END {
    for (i in output)
        for (j in output[i])
            printf "%-15s %-27s %-2s\n", j, ","i, output[i][j]
}
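The IPs with the extra port numbers stuck on them are a by-product of my script not catching the lines we don't care about. The last two lines of the main block always run, so a line that matched no case reused the field and name left over from an earlier line, but with the default regex, which strips the "#" and leaves the port digits glued to the address. A simple default case fixed it for me: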
Code:
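#!/usr/bin/awk -f
# switch, gensub() and arrays of arrays are gawk extensions, so awk must be gawk here.
{
    regex = "[^0-9.]"                   # default: keep only digits and dots
    switch ($0) {
    case /query/:
        if ($NF == "denied") {
            field = $7                  # e.g. 38.229.33.47#59569:
            name = "query_denied"
            regex = "#.*"               # drop "#port:" instead
        } else {
            # a query line that was not denied: do not reuse the previous record's values
            name = "NOT TRACKED"
            ip = "0.0.0.0"
        }
        break
    case /max connection/:
        field = $(NF-4)                 # e.g. (smtp:223.74.42.35)
        name = "max_connection_count"
        break
    case /SSL_accept/:
        field = $(NF-2)                 # e.g. unknown[169.56.71.47]:
        name = "SSL_accept_error"
        break
    default:
        name = "NOT TRACKED"
        ip = "0.0.0.0"
    }
    if (name != "NOT TRACKED")
        ip = gensub(regex, "", "g", field)
    output[name][ip]++
}
END {
    for (i in output)
        for (j in output[i])
            print j","i, output[i][j]
}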
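Since the question of which awk is in use came up earlier: switch, gensub() and arrays of arrays are gawk extensions. If the script ever has to run under another awk such as mawk, a portable sketch could look like the one below. It swaps in if/else, sub()/gsub() and a combined name SUBSEP ip key in place of the gawk-only features. The function name summarize_portable is made up, and unlike the script above it simply skips untracked lines instead of counting them.

```shell
#!/bin/sh
# Sketch only: summarize_portable is a hypothetical name; POSIX awk features only.
summarize_portable() {
    awk '
    {
        name = ""                       # reset per record so nothing leaks from the previous line
        if (/query/ && $NF == "denied") {
            ip = $7; sub(/#.*/, "", ip)
            name = "query_denied"
        } else if (/max connection/) {
            ip = $(NF-4); gsub(/[^0-9.]/, "", ip)
            name = "max_connection_count"
        } else if (/SSL_accept/) {
            ip = $(NF-2); gsub(/[^0-9.]/, "", ip)
            name = "SSL_accept_error"
        }
        if (name != "") output[name SUBSEP ip]++    # stands in for output[name][ip]
    }
    END {
        for (key in output) {
            split(key, part, SUBSEP)    # part[1] is the name, part[2] is the ip
            print part[2] "," part[1], output[key]
        }
    }' "$@"
}
```

Usage would be summarize_portable "$InFile" > "$OutFile", and the output lines have the same shape as the print in the gawk script.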