Nominal Animal
05-06-2012 02:55 PM
dig is your best bet, but I'd use a different approach.
First, I'd use awk to keep only the valid, public IPv4 addresses from your list (skipping loopback, private, link-local, and multicast ranges), and convert them to the reversed in-addr.arpa form used for reverse DNS requests:
Code:
awk '#
    BEGIN {
        RS="[\t\v\f ]*(\r\n|\n\r|\r|\n)[\t\v\f ]*"
        FS="[.]"
    }
    (NF==4 && $1>=0 && $1<=255 && $2>=0 && $2<=255 && $3>=0 && $3<=255 && $4>=0 && $4<=255) {
        # Loopback address?
        if ($1 == 127) next
        # Private address?
        if ($1 == 10) next
        if ($1 == 172 && $2 >= 16 && $2 <= 31) next
        if ($1 == 192 && $2 == 168) next
        # Link-local address?
        if ($1 == 169 && $2 == 254) next
        # Multicast address?
        if ($1 >= 224 && $1 <= 239) next
        # This seems like a real IP address.
        printf("%d.%d.%d.%d.in-addr.arpa.\n", $4, $3, $2, $1)
    }
' original-file > ipv4.list
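As a quick sanity check, here is a cut-down sketch of the same idea run on a few made-up addresses. The file names sample-input and sample.list and the addresses are my own placeholders (documentation ranges), and only the 10.x.x.x private check is kept to stay short:

```shell
# Hypothetical sample input: two public-looking addresses, one private, one invalid.
printf '%s\n' 192.0.2.10 10.1.2.3 300.1.1.1 198.51.100.7 > sample-input

# Reduced version of the filter: range check, skip 10.x.x.x, reverse the octets.
awk -F'[.]' '
    NF == 4 && $1 >= 0 && $1 <= 255 && $2 >= 0 && $2 <= 255 &&
    $3 >= 0 && $3 <= 255 && $4 >= 0 && $4 <= 255 {
        if ($1 == 10) next
        printf("%d.%d.%d.%d.in-addr.arpa.\n", $4, $3, $2, $1)
    }
' sample-input > sample.list

cat sample.list
# 10.2.0.192.in-addr.arpa.
# 7.100.51.198.in-addr.arpa.
```

The private and the out-of-range addresses are dropped, and the remaining ones come out with their octets reversed, which is the form dig expects for reverse lookups.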
Now you can use dig to go through the IPv4 address list in batch mode; it is about the most lightweight option available. If you want to reduce the load on your usual name servers, you could install dnscache, so that the queries go directly to the target name servers instead of relying on your normal ones, but I would not bother. The command to run is
Code:
dig +noall +answer -t ptr -f ipv4.list > ipv4.lookup
After that completes, you can edit the lookup results so they are easier to process:
Code:
awk '#
    BEGIN {
        RS = "[\t\v\f ]*(\r\n|\n\r|\r|\n)[\t\v\f ]*"
        FS = "[\t\v\f ]+"
    }
    NF > 3 {
        if (split($1, ip, ".") < 6) next
        name = $NF
        sub(/\.$/, "", name)
        printf("%d.%d.%d.%d %s\n", ip[4], ip[3], ip[2], ip[1], name)
    }
' ipv4.lookup > ipv4.names
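To see what that conversion does, here is a small sketch run on a single hand-written answer line. The line itself, the sample.lookup and sample.names file names, and the host name are made up for illustration; real dig output may contain other record types as well:

```shell
# Hypothetical dig answer line: owner name, TTL, class, type, target.
printf '10.2.0.192.in-addr.arpa.\t3600\tIN\tPTR\thost.example.com.\n' > sample.lookup

awk '
    NF > 3 {
        # "10.2.0.192.in-addr.arpa." splits into 7 parts; the last one is empty.
        if (split($1, ip, ".") < 6) next
        name = $NF
        sub(/\.$/, "", name)        # drop the trailing root dot
        printf("%d.%d.%d.%d %s\n", ip[4], ip[3], ip[2], ip[1], name)
    }
' sample.lookup > sample.names

cat sample.names
# 192.0.2.10 host.example.com
```

The octets are swapped back into their normal order, so the first column can be matched against the addresses in the original list.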
At this point, you have a list of IPv4 addresses and matching hostnames in ipv4.names. Now you can repeat the filtering step you did first, except that this time you use the name list to classify each address:
Code:
awk -v names="ipv4.names" '#
    BEGIN {
        RS="[\t\v\f ]*(\r\n|\n\r|\r|\n)[\t\v\f ]*"
        FS="[\t\v\f ]+"
        # Load the address-to-name table produced by the previous step.
        while ((getline < names) > 0)
            if (NF == 2)
                name[$1] = $2
    }
    (NF > 1) {
        printf("%s BAD_INPUT\n", $1)
        next
    }
    (NF == 1) {
        # An IPv4 address has exactly four dot-separated parts.
        if (split($1, ip, ".") != 4) {
            printf("%s NO_IP\n", $1)
            next
        }
        if (ip[1] < 0 || ip[1] > 255 || ip[2] < 0 || ip[2] > 255 ||
            ip[3] < 0 || ip[3] > 255 || ip[4] < 0 || ip[4] > 255) {
            printf("%s NO_IP\n", $1)
            next
        }
        if (ip[1] == 127) {
            printf("%s LOOPBACK\n", $1)
            next
        }
        if ((ip[1] == 10) ||
            (ip[1] == 172 && ip[2] >= 16 && ip[2] <= 31) ||
            (ip[1] == 192 && ip[2] == 168)) {
            printf("%s PRIVATE\n", $1)
            next
        }
        if (ip[1] == 169 && ip[2] == 254) {
            printf("%s LINK_LOCAL\n", $1)
            next
        }
        if (ip[1] >= 224 && ip[1] <= 239) {
            printf("%s MULTICAST\n", $1)
            next
        }
        addr = sprintf("%d.%d.%d.%d", ip[1], ip[2], ip[3], ip[4])
        if (addr in name)
            printf("%s KNOWN %s\n", $1, name[addr])
        else
            printf("%s UNKNOWN\n", $1)
    }
' original-file > final-results
In the final-results file, the first column holds the original IP address and the second column its classification; if the second column is KNOWN, the third column holds the host name.
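Since the classification is always in the second column, a quick tally per class is easy. This is only a sketch: the sample-results and sample-summary files and their contents are invented here to mimic the format described above; on real data you would point awk at final-results instead.

```shell
# Hypothetical sample in the final-results format (address, class, optional name).
printf '%s\n' \
    '192.0.2.10 KNOWN host.example.com' \
    '10.1.2.3 PRIVATE' \
    '198.51.100.7 UNKNOWN' \
    '10.9.9.9 PRIVATE' > sample-results

# Count the lines per classification; sort, because awk's array order is unspecified.
awk '
    { count[$2]++ }
    END { for (c in count) printf("%s %d\n", c, count[c]) }
' sample-results | sort > sample-summary

cat sample-summary
# KNOWN 1
# PRIVATE 2
# UNKNOWN 1
```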
Note: The above scriptlets have not been thoroughly tested, so there might be typos.