LinuxQuestions.org

elmo219 03-08-2012 04:14 AM

Shell Script issue
 
Hi guys, newbie here :-D

I'm trying to create a script that will pull the 20 top queried domains from my DNS server.

The only issue is that I have 99 query logs for any one day. This is what I have so far:

Code:

#!/bin/bash
 
echo "Top 20"
echo ""
for file in $(find /var/dns/log/ -iname "dns-query.log.*");
do cat "${file}"|awk '{print $6}'|sed 's/^www\.//'|awk -F. '{ print $1"."$2"."$3"." }'| sort | uniq -c | sort -nr | head -n 20
done

Now this works in a way... However, it prints the top 20 for each of the 99 files, one after another. What I want to do is somehow add these 99 files together and then run my command for sorting the queries.

At the moment it gives me output like this:

Quote:

Top 20 Domains

5557 google.com..
4852 api-read.facebook.com.
3817 orcart.facebook.com.
3319 api.facebook.com.
3028 facebook.com..
2577 m.hotmail.com.
2389 pop3.live.com.
2088 profile.ak.fbcdn.
1936 fbcdn-profile-a.akamaihd.net.
1899 mtalk.google.com.
1836 static.ak.fbcdn.
1691 m.facebook.com.
1424 google.co.uk.
1423 ksn2-12.kaspersky-labs.com.
1408 fbcdn-photos-a.akamaihd.net.
1336 google-analytics.com..
1213 android.clients.google.
1177 s-static.ak.facebook.
1168 api.twitter.com.
1095 wpad...
5586 google.com..
4781 api-read.facebook.com.
3638 orcart.facebook.com.

...and so on for the rest of the 99 files.
Anyone have any ideas, or even understand what I'm blabbing on about?

TB0ne 03-08-2012 10:16 AM

Quote:

Originally Posted by elmo219 (Post 4621635)
Hi guys, newbie here :-D
I'm trying to create a script that will pull the 20 top queried domains from my DNS server. The only issue is that I have 99 query logs for any one day. This is what I have so far:

Now this works in a way... However, it prints the top 20 for each of the 99 files, one after another. What I want to do is somehow add these 99 files together and then run my command for sorting the queries.

Anyone have any ideas, or even understand what I'm blabbing on about?

Seems like you're most of the way there. If they're just text/log files you're talking about, why not just use a low-tech solution? Something like this:
Code:

#!/bin/bash
 
echo "Top 20"
echo ""
cat $(find /var/dns/log/ -iname "dns-query.log.*") > big-log-file.log
cat big-log-file.log |awk '{print $6}'|sed 's/^www\.//'|awk -F. '{ print $1"."$2"."$3"." }'| sort | uniq -c | sort -nr | head -n 20
rm big-log-file.log

Just combine all of them into one big file, then run your operation on it.
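
For what it's worth, the intermediate file is probably not needed: the logs can be streamed straight into one pipeline. A minimal sketch, assuming field 6 holds the queried name as in the scripts above. top_domains is a made-up helper name, and the three-label trimming step is left out to keep it short:

```shell
#!/bin/bash
# Sketch only: stream every rotated log through a single pipeline,
# so there is one combined top-N list and no temporary file.
# top_domains is a hypothetical helper; field 6 is assumed to be the
# queried name, matching the awk '{print $6}' used in this thread.
top_domains() {
    local log_dir="$1"
    local limit="${2:-20}"
    # -exec cat {} + concatenates all matching files onto stdout
    find "$log_dir" -type f -iname 'dns-query.log.*' -exec cat {} + |
        awk '{ print $6 }' |
        sed 's/^www\.//' |
        sort | uniq -c | sort -nr | head -n "$limit"
}

# Example: top_domains /var/dns/log 20
```

Because all files feed the same sort | uniq -c, the counts are totals across the whole day rather than per file.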

elmo219 03-09-2012 02:36 AM

Hi, thanks for replying...

Yeah, I got the Bash script working like so:

Code:

#!/bin/bash
 
echo "Top 20 Domains"
echo ""
cat /var/dns/log/dns-query.log.*|awk '{print $6}'|sed 's/^www\.//'|awk -F. '{ print $1"."$2"."$3"." }' > /tmp/collectorstats
cat /tmp/collectorstats| sort | uniq -c | sort -nr | head -n 20
rm /tmp/collectorstats

As I stripped out the info first, it made the collectorstats file a lot smaller.

However, it seems after running it that I will have to rewrite this in Perl, as the sort is taking 99% usage of one processor :-/

Downside is I'm completely useless at pattern matching in Perl :-/
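
Before reaching for Perl, the sort load mentioned above might be eased inside the shell. One option is to let awk keep the tally in an associative array, so that sort only has to order one line per distinct name rather than every logged query. A rough sketch, again assuming field 6 is the queried name; count_domains is a hypothetical helper name, and whether it is fast enough on the real logs is untested:

```shell
#!/bin/bash
# Sketch only: count names in awk, then sort just the distinct names.
# count_domains is a hypothetical helper; field 6 is assumed to be the
# queried name, as in the scripts in this thread.
count_domains() {
    # Reads log lines on stdin, writes "count name", highest count first.
    awk 'NF >= 6 {
            name = $6
            sub(/^www\./, "", name)   # drop a leading "www." only
            count[name]++
         }
         END { for (name in count) print count[name], name }' |
        LC_ALL=C sort -k1,1nr         # C locale avoids collation overhead
}

# Example: cat /var/dns/log/dns-query.log.* | count_domains | head -n 20
```

The trade-off is memory: awk holds one counter per distinct name, which is usually far smaller than the full list of queries.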

elmo219 03-09-2012 08:59 AM

OK guys, quick update. I'm struggling through converting this simple bash script to Perl, but I'm having trouble with the sort:

Code:

#!/usr/bin/perl
#use strict;
#use warnings;
print "Top 20 Queried Domains";
print "\n";
 
#my $txtfile = '/var/dns/log/dns-query.log.1';
my $txtfile = '/var/dns/log/dns-test-2';
 
my $url_queries = {};
my $ip_queries  = {};
 
open (READ, "$txtfile") || die "Can't open logs\n";
while ($line = <READ>){
chomp ($line);
$line =~ s/#/ /g;
$line =~ s/www\./ /g;
($date,$time,$client,$ip,$qn,$query,$dnsname,@d)=split(" ",$line);
 
if ( defined $url_queries->{$dnsname} )
{
  $url_queries->{$dnsname}=$url_queries->{$dnsname}+1;
}
else
{
  $url_queries->{$dnsname}=1;
}
 
if ( defined $ip_queries->{$ip} )
{
  $ip_queries->{$ip}=$ip_queries->{$ip}+1;
}
else
{
  $ip_queries->{$ip}=1;
}
}
close(READ) || die "Couldn't close logs\n";
 
# Sort
 
 
# Show 20
$count=5;
while ( (($key, $value) = each(%$url_queries)) && ($count>0) )
#foreach $value (sort{$url_queries{$a} cmp $url_queries{$b}} keys %$url_queries)
{
    print "$key\t\t$value\n";
    #print "$value\t\t$url_queries{$value}\n";
    $count--;
}
 
$count=5;
while ( (($key, $value) = each(%$ip_queries)) && ($count>0) )
{
    print "$key\t\t$value\n";
    $count--;
}
exit (0);

which produces output like this:

Quote:

./counter.pl
Top 20 Queried Domains
40-courier.push.apple.com 1
api-read.facebook.com 1
147.66.194.173.in-addr.arpa 1
ssl.google-analytics.com 1
download965.avast.com 13
xxx.xxx.xxx.xxx 1
xxx.xxx.xxx.xxx 3
xxx.xxx.xxx.xxx 1
xxx.xxx.xxx.xxx 1
xxx.xxx.xxx.xxx 1
I've obviously blanked out the IPs.

*Sorry, edited and cleaned up a bit*

To make it clearer: where would I put the sort here so that the output is ordered by the count value?

elmo219 03-09-2012 10:08 AM

ffs


All times are GMT -5.