LinuxQuestions.org


psycroptic 10-30-2012 01:52 AM

clients CAN resolve hostname of server w/nslookup, but CANNOT access by hostname WTF?
 
1 Attachment(s)
Ok, so I'd really like some help into what's going on here.

I have a simple BIND DNS server set up on a LAN; it's the only one, running on an Arch Linux system. It's up to date. It's been working completely fine as an internal network server for over a year. Now all of a sudden, COMPLETELY AT RANDOM, all of my Win7 clients lost the ability to access the server by hostname. IP address access still works, and I can even resolve the hostname of the server via the nslookup command on each system. But pinging the actual hostname (which worked completely fine as of last week) randomly stopped working, as did typing \\servername (in this case pLAN9-server1). What the frick? Nothing was changed by me on the server or the clients.

I've attached a picture of Command Prompt commands that illustrate what's going on.

Any ideas as to what happened? As usual, it's an issue that started happening completely at RANDOM?????

acid_kewpie 10-30-2012 03:28 AM

Not convinced at all this isn't a Windows issue here, but maybe there are search domain mismatches, or the DNS server simply isn't being queried. It's certainly possible to legitimately see this sort of thing on Linux, e.g. dig will ONLY use DNS servers and won't go through the actual NSS stack (hosts files etc.), so output can differ. Fire up Wireshark on that box and run both commands again. Inspect the protocol details to see if there's a difference in the traffic to and from the server.
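To make the nslookup-versus-ping distinction concrete: nslookup (like dig) builds a DNS packet and sends it straight to a name server, whereas ping asks the operating system's resolver, which may consult caches, hosts files, or other name services first. Here is a minimal Python sketch of both paths. The helper names (`build_dns_query`, `rcode_from_server`, `os_resolver_works`) are made up for illustration, and the server address and hostname in the usage note are simply the ones mentioned in this thread.

```python
import socket
import struct

def build_dns_query(name, txid=0x1234, qtype=1):
    """Build a minimal DNS query packet (RFC 1035); qtype 1 is an A record."""
    # Header: ID, flags (only "recursion desired" set), QDCOUNT=1, other counts 0
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QTYPE, QCLASS=IN
    return header + question

def rcode_from_server(name, server, timeout=2.0):
    """Query `server` directly on UDP port 53; return the RCODE, or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(build_dns_query(name), (server, 53))
        try:
            reply, _ = s.recvfrom(512)
        except socket.timeout:
            return None
    if len(reply) < 4:
        return None
    return reply[3] & 0x0F  # RCODE is the low 4 bits of the flags: 0 = NOERROR, 3 = NXDOMAIN

def os_resolver_works(name):
    """Ask the OS resolver, which is the path ping takes."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except socket.gaierror:
        return False
```

On an affected client, calling `rcode_from_server("plan9-server1.plan9.site", "172.16.16.2")` next to `os_resolver_works("plan9-server1")` would show whether the direct DNS path and the OS resolver path disagree.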

psycroptic 10-31-2012 09:21 PM

tcpdump shows that when I try to ping "plan9-server1" there is a DNS request from the client but no response back from the server. Whenever this happens, I can't even ping using an FQDN (in this case "pLAN9-Server1.pLAN9.site"). Additionally, I've found that running "ipconfig /flushdns" on the clients fixes the problem for a time, but then it's like something expires or gets reset and the clients lose the ability to ping JUST the server by name.
This is stupidly annoying, if for nothing else than the complete irrationality of it. I would ignore it, but whenever the issue happens people lose the ability to access the main server on the network, and most of them don't know how to access it by IP, so they just conclude that the server is down and email me about it.

acid_kewpie 11-01-2012 04:41 AM

So if there's no response AT ALL, then hmm, not sure; you should always get something back if it's working. Look at the actual DNS request details in Wireshark and see if a domain is being added, etc.
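As an illustration of what "a domain is being added" looks like in practice: a stub resolver usually expands a short name using its search list before trying the name as typed. The sketch below is a simplified model of that expansion, not the exact Windows algorithm. `candidate_names` is a hypothetical helper, and the `plan9.site` suffix is just the domain used in this thread.

```python
def candidate_names(name, search_list):
    """Return the fully qualified names a simplified stub resolver would try, in order."""
    if name.endswith("."):
        return [name]  # already fully qualified, so no suffix is appended
    # Try each search suffix first, then the bare name
    candidates = [f"{name}.{suffix.strip('.')}." for suffix in search_list]
    candidates.append(name + ".")
    return candidates

print(candidate_names("plan9-server1", ["plan9.site"]))
# ['plan9-server1.plan9.site.', 'plan9-server1.']
```

If the client's suffix were missing or different, the query name visible in the capture for ping would differ from the one nslookup sends.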

psycroptic 11-05-2012 04:09 PM

Bumping. Seriously, does no one have ANY ideas about what's going on here? This defies all logic to me. If the client can resolve the hostname, AND access it by IP, why can't it access it by hostname? Makes no f'ing sense.....

DutchGeek 11-09-2012 04:30 AM

try:

ipconfig /displaydns

and look at the entries; if they are there, check their TTL.

Maybe try wireshark/tcpdump in both cases: when you do an nslookup, and when you ping. Then compare the DNS requests of each and see if you can find a difference.

You also might want to check the logs for the BIND server to see why it doesn't respond.

psycroptic 11-09-2012 04:49 PM

Alright, so an ipconfig /displaydns on a machine that currently is having this problem shows the following for plan9-server1:

Code:

    plan9-server1.plan9.site
    ----------------------------------------
    Name does not exist.

Why the frick not? Of course it exists, nslookup confirms that. And worse, now, when I do an ipconfig /flushdns on this particular system, I STILL can't access by hostname, even fully qualified name. I actually had to restart the whole machine to get it back working.

When attempting to ping plan9-server1, I get the following on a tcpdump of the server:

Code:

IP 172.16.16.13.137 > 172.16.255.255.137: UDP, length 50

Where .13 is the machine currently having the problem. No response is sent from the server. This is making absolutely no sense....
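One detail worth noting in that capture: both ports are 137, which is the NetBIOS name service port (UDP 137), not DNS (UDP 53). The destination 172.16.255.255 also looks like a subnet broadcast address (it would be one for a /16 netmask) rather than the server at 172.16.16.2. BIND would not be expected to answer a packet like this, since it is not a DNS query. Below is a small Python sketch for sorting tcpdump lines of this shape by name-resolution protocol. `classify` is a hypothetical helper, and the regex only covers the simple `IP a.b.c.d.port > a.b.c.d.port:` form shown in the capture.

```python
import re

# Well-known UDP ports used for name resolution
NAME_PORTS = {53: "dns", 137: "netbios-ns", 5355: "llmnr"}

LINE_RE = re.compile(
    r"IP (?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) >\s+"
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+):"
)

def classify(line):
    """Return source, destination, and protocol guess for one tcpdump line, or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    dport = int(m.group("dport"))
    return {
        "src": m.group("src"),
        "dst": m.group("dst"),
        "dport": dport,
        "protocol": NAME_PORTS.get(dport, "other"),
    }

result = classify("IP 172.16.16.13.137 > 172.16.255.255.137: UDP, length 50")
print(result["protocol"])  # netbios-ns
```

Running the same filter over a capture taken during a working nslookup should show port 53 traffic to 172.16.16.2 instead, which is the comparison suggested earlier in the thread.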

psycroptic 11-09-2012 05:02 PM

Just for info, posting config files

/etc/samba/smb.conf
Code:

[global]
        workgroup = pLAN9
        server string = "pLAN9-Server1 :: Linux File Server"
        security = share
        interfaces = eth0 tun0 10.11.12.0/24 lo
        hosts deny = 192.168.68.0/24
#      bind interfaces only = yes
        load printers = no
        log file= /var/log/samba/%m.log
        max log size = 1024
        local master = yes
        os level = 65
        preferred master = yes
        domain master = yes
        local master = yes
        printing = bsd
        printcap name = /dev/null
        create mask = 0775

[Shared]
        comment = Shared server directory (150GB)
        path = /smb/shared
        browseable = yes
        writable = yes
        public = yes

[Videos]
        comment = Video files & downloads
        path = /mnt/videos
        writable = yes
        public = yes
        #valid users = wil htpc

[Administrators]
        comment = Server admin directory (150GB)
        path = /smb/admin
        writable = yes
        valid users = wil

[Music]
        comment = Network-wide music
        path = /mnt/music
        browseable = yes
        writable = yes
        public = yes

[Podcasts]
        comment = Podcasts HDD
        path = /mnt/podcasts
        browseable = yes
        writable = yes
        public = yes

[DVD Rips]
        comment = Uncompressed rips from DVDs
        path = /mnt/dvdrips
        browseable = yes
        writable = yes
        public = yes

[Saved Games]
        comment = Saved games (150GB)
        path = /smb/savedgames
        browseable = yes
        writable = yes

[Backgrounds]
        comment = Widescreen desktop wallpaper
        path = /smb/backgrounds
        browseable = yes
        writable = yes

[RESTORE]
        comment = Access to Win7 restore images
        path = /smb/admin/RESTORE
        browseable = no
        writable = yes

Forward BIND zone (/var/named/pLAN9.zone)

Code:

$ORIGIN .
$TTL 2419200    ; 4 weeks
pLAN9.site              IN SOA  pLAN9-Server1.pLAN9.site. root.pLAN9.site. (
                                2011082128 ; serial
                                28800      ; refresh (8 hours)
                                7200      ; retry (2 hours)
                                2419200    ; expire (4 weeks)
                                86400      ; minimum (1 day)
                                )
                        NS      pLAN9-Server1.pLAN9.site.
$ORIGIN pLAN9.site.
$TTL 3600      ; 1 hour
CERT                    A      172.16.16.31
                        TXT    "00fa7be6910d449153c0bbf9db8852928f"
$TTL 2419200    ; 4 weeks
Galt-PC                A      172.16.16.15
HP102E6C                A      172.16.16.253
pLAN9-Gateway          A      172.16.16.1
pLAN9-HTPC              A      172.16.16.11
pLAN9-Laptop            A      172.16.16.12
pLAN9-Server1          A      172.16.16.2
pLAN9-Server2          A      172.16.16.3
pLAN9-Server3          A      172.16.16.4
$TTL 86400      ; 1 day
pLAN9-WAP              A      172.16.16.254
$TTL 2419200    ; 4 weeks
pLAN9-Wil              A      172.16.16.10

Reverse BIND zone (/var/named/pLAN9.rev)

Code:

$ORIGIN .
$TTL 86400      ; 1 day
16.16.172.in-addr.arpa  IN SOA  pLAN9-Server1.pLAN9.site. root.pLAN9.site. (
                                2011082115 ; serial
                                28800      ; refresh (8 hours)
                                7200      ; retry (2 hours)
                                2419200    ; expire (4 weeks)
                                86400      ; minimum (1 day)
                                )
                        NS      pLAN9-Server1.pLAN9.site.
$ORIGIN 16.16.172.in-addr.arpa.
$TTL 2419200    ; 4 weeks
1                      PTR    pLAN9-Gateway.pLAN9.site.
$TTL 3600      ; 1 hour
10                      PTR    pLAN9-Wil.pLAN9.site.
11                      PTR    pLAN9-HTPC.pLAN9.site.
12                      PTR    pLAN9-Laptop.pLAN9.site.
15                      PTR    Galt-PC.pLAN9.site.
$TTL 86400      ; 1 day
2                      PTR    pLAN9-Server1.pLAN9.site.
$TTL 3600      ; 1 hour
253                    PTR    HP102E6C.pLAN9.site.
254                    PTR    pLAN9-WAP.pLAN9.site.
$TTL 86400      ; 1 day
3                      PTR    pLAN9-Server2.pLAN9.site.
$TTL 3600      ; 1 hour
31                      PTR    CERT.pLAN9.site.
$TTL 2419200    ; 4 weeks
4                      PTR    pLAN9-Server3.pLAN9.site.
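A quick way to sanity-check zone data like the two files above is to confirm that every A record has a matching PTR record. The sketch below hard-codes a few of the records posted here rather than parsing the zone files. `check_reverse` is a hypothetical helper, and the comparison ignores case because DNS names are case-insensitive.

```python
# A handful of records copied from the forward and reverse zones in this post
FORWARD = {
    "pLAN9-Gateway": "172.16.16.1",
    "pLAN9-Server1": "172.16.16.2",
    "pLAN9-Wil": "172.16.16.10",
}
REVERSE = {
    "1": "pLAN9-Gateway.pLAN9.site.",
    "2": "pLAN9-Server1.pLAN9.site.",
    "10": "pLAN9-Wil.pLAN9.site.",
}

def check_reverse(forward, reverse, domain="pLAN9.site", net="172.16.16."):
    """Return (host, address, reason) tuples for A records lacking a matching PTR."""
    problems = []
    for host, addr in forward.items():
        if not addr.startswith(net):
            problems.append((host, addr, "outside reverse zone"))
            continue
        last_octet = addr[len(net):]
        expected = f"{host}.{domain}.".lower()
        actual = reverse.get(last_octet, "").lower()
        if actual != expected:
            problems.append((host, addr, actual or "no PTR"))
    return problems

print(check_reverse(FORWARD, REVERSE))  # []
```

An empty list means the sampled forward and reverse entries agree, which is consistent with nslookup resolving the server correctly.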

/etc/named.conf

Code:

//
// /etc/named.conf
//

options {
        directory "/var/named";
        pid-file "/var/run/named/named.pid";
        auth-nxdomain yes;
        datasize default;
// Uncomment these to enable IPv6 connections support
// IPv4 will still work:
//      listen-on-v6 { any; };
// Add this for no IPv4:
//      listen-on { none; };

        // Default security settings.
        allow-recursion { localnets; localhost; };
        allow-transfer { localnets; localhost; };
        allow-query { localnets; localhost; };
        allow-update { key dhcpupdate; };
        listen-on { 127.0.0.1; 172.16.16.2; };
        forward first;
        forwarders { 208.67.220.220; 208.67.222.222; };

        version none;
        hostname none;
        server-id none;
};

key dhcpupdate {
        algorithm HMAC-MD5;
        secret "xxxxxxxxxxxxxxxx";
};

zone "localhost" IN {
        type master;
        file "localhost.zone";
        allow-transfer { any; };
};

zone "0.0.127.in-addr.arpa" IN {
        type master;
        file "127.0.0.zone";
        allow-transfer { any; };
};

zone "." IN {
        type hint;
        file "root.hint";
};

zone "pLAN9.site." IN {
        type master;
        file "pLAN9.zone";
};

zone "16.16.172.in-addr.arpa" IN {
        type master;
        file "pLAN9.rev";
};

//zone "example.org" IN {
//      type slave;
//      file "example.zone";
//      masters {
//              192.168.1.100;
//      };
//      allow-query { any; };
//      allow-transfer { any; };
//};

logging {
        channel xfer-log {
                file "/var/log/named.log";
                print-category yes;
                print-severity yes;
                print-time yes;
                severity debug;
        };
        category xfer-in { xfer-log; };
        category xfer-out { xfer-log; };
        category notify { xfer-log; };
//      category queries { xfer-log; };
};


DutchGeek 11-10-2012 05:23 AM

Hmmm, strange. As acid_kewpie said, it might be a Windows issue.

So when you do ipconfig /displaydns, it shows you that the name was not found, i.e. it previously tried to resolve the name but got a negative answer from the server.
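For context on how long such a negative answer can stick around: under RFC 2308, a DNS "name does not exist" reply may be cached for the smaller of the SOA record's own TTL and its MINIMUM field, and caches are allowed to cap that lower. Here is a minimal sketch using the values from the forward zone posted above (SOA TTL 2419200, minimum 86400). `negative_cache_ttl` and its `client_cap` parameter are illustrative names, and the thread has not established whether the Windows entry actually came from a DNS reply.

```python
def negative_cache_ttl(soa_ttl, soa_minimum, client_cap=None):
    """Upper bound in seconds on caching a negative DNS answer (RFC 2308 rule)."""
    ttl = min(soa_ttl, soa_minimum)
    if client_cap is not None:
        # A cache may apply its own, lower limit
        ttl = min(ttl, client_cap)
    return ttl

# Values from pLAN9.zone: $TTL 2419200 applies to the SOA, minimum is 86400
print(negative_cache_ttl(2419200, 86400))  # 86400
```

With that zone, a negative answer from the server could in principle be cached for up to a day unless the client applies a lower limit, which is why checking the TTL shown by ipconfig /displaydns is worthwhile.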

You said that when you reboot this machine, it works for a while and then stops. Try rebooting it and saving all the relevant info you can from ipconfig; then, when it stops working, see what the difference is.


Are you logging the queries on your BIND server? If not, then run:
# rndc querylog

then do: tail -f /var/log/messages

while you are pinging the host, and again while using nslookup.

psycroptic 11-10-2012 11:24 AM

Windows would make sense as the problem here, because I recently (a few days ago) loaded Linux onto my laptop just to test for this particular issue. It's been a few days and the laptop hasn't had the problem.

If it is a Windows issue, then I find it quite peculiar that I've never heard or run into this problem ANYWHERE else.

When I have time I'll see if BIND has any log info for when this happens... I suspect it won't though, since the packet captures showed zero response from the server.


All times are GMT -5.