3rd try.... windows 7 clients randomly failing to resolve address of nameserver
I've posted about this twice and got zero responses, so... third time's the charm, right?
I have a BIND server set up for my home network. It's named "pLAN9-Server1", address 172.16.16.2, with a local domain name of "pLAN9.site". This machine is also the dhcpd server.
After what seems like a random amount of time, and with no apparent cause, Windows 7 clients stop being able to resolve the address of the nameserver itself. Any other address gets resolved fine; it's just the nameserver itself. Even weirder, doing an "nslookup" from a Win7 box that's having the problem always returns the proper result, but pinging the server by name or trying to access it via Windows Explorer fails. That doesn't make ANY sense to me: if nslookup resolves the name, why is every other means of accessing the server failing? Of course, the Windows boxes don't log anything remotely helpful.
Doing an "ipconfig /renew" on the Windows machines always fixes the problem temporarily, but it eventually comes back, once again with no comprehensible reason. It seems like it's a DNS-only problem, as accessing the server by IP always works.
Here's output on the Windows command line from a machine that currently has no access (nslookup -d2):
Code:
> plan9-server1
server config files
/etc/named.conf:
Code:
//
Code:
option domain-name "pLAN9.site";
/var/named/pLAN9.zone (forward)
Code:
$TTL 3D
Code:
$TTL 3D
|
what does your hosts file in win7 look like?
Microsoft is stupid when it comes to LAN side DNS resolution unless it is part of an AD network in full native mode. |
Code:
# Copyright (c) 1993-2009 Microsoft Corp. |
Do these clients use a wpad file to authenticate or proxy with?
|
Actually no, if you're referring to the "option wpad" thing in dhcpd.conf. I put that there because all of the Win7 clients were semi-flooding the DHCP server with a bunch of spurious "DHCPINFORM" messages, and adding those lines stopped that.
Could that be part of the problem? |
What response are you getting from an nslookup, and what does your nameserver log when your Windows clients make a request?
|
Are your DNS servers being set on the Windows clients? What is the output from ipconfig /all on one of the Windows clients?
|
Quote:
and ipconfig /all Code:
Windows IP Configuration |
Just adding that when the systems lose access, running "rndc querylog" on the DNS server shows absolutely nothing in the log when the clients try to ping/access the server by name. Nothing at all. So it seems like the Windows clients aren't even trying to send a DNS query to my server... but only when the name being looked up is the nameserver's own. This makes no sense.
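Since nslookup sends its queries straight to a DNS server while ping and Explorer go through the operating system's resolver, one way to narrow this down is to compare the two paths from an affected client. What follows is a minimal sketch rather than a tested diagnostic tool: the fully qualified name "plan9-server1.pLAN9.site" is assumed from the hostname and domain given above, and the helper names (build_query, rcode_from, resolve_via_os, compare) are made up for illustration.

```python
import socket
import struct

# A few common DNS response codes; anything else is reported numerically.
RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query packet asking for the A record of `name`."""
    # Header: ID, flags (only "recursion desired" set), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A, QCLASS 1 = IN.
    return header + qname + struct.pack("!HH", 1, 1)


def rcode_from(server, name, timeout=2.0):
    """Query one specific DNS server directly and return its response code.

    Note: NOERROR only means the server did not report an error; this sketch
    does not parse the answer section.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable hosts
            return "NO REPLY"
    if len(reply) < 4:
        return "NO REPLY"
    rcode = struct.unpack("!H", reply[2:4])[0] & 0x000F
    return RCODES.get(rcode, "RCODE %d" % rcode)


def resolve_via_os(name):
    """Resolve through the operating system's resolver (the path ping takes)."""
    try:
        return socket.gethostbyname(name)
    except OSError:
        return None


def compare(name, servers):
    """Collect the OS resolver's answer and each server's direct response code."""
    result = {"os": resolve_via_os(name)}
    for server in servers:
        result[server] = rcode_from(server, name)
    return result
```

Calling compare("plan9-server1.pLAN9.site", ["172.16.16.2", "208.67.222.222"]) on an affected client would show whether the OS path fails while the internal server still answers a direct query, and what the second server listed in dhcpd.conf says about the same name.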
|
An update on this:
I have since removed the second DNS server from dhcpd.conf, the OpenDNS address that I had as the secondary DNS. It's been almost 3 days without a problem. If it doesn't happen for a few more days, I'll mark this solved. |
Quote:
The reason is the way Windows handles multiple DNS entries. It works like this:
Such events would trigger a switch of DNS server on the Windows client, and from then on it would be using the OpenDNS server exclusively. Of course, the OpenDNS server knows nothing of the contents of your internal DNS zone, and will return NXDOMAIN for any queries related to the "pLAN9.site" domain. The only way to make Windows switch back to the primary server is to deactivate and reactivate the network connection, reset the IP configuration (as you've discovered), or reboot the OS. Oh, and sometimes Windows may decide that the primary DNS server is not the one you've put first in your DHCP zone. In other words, never mix internal and external DNS servers. |
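The failover behaviour described above can be pictured with a toy model. This is only a sketch of the behaviour as the post describes it, not Windows' actual resolver logic; the StickyResolver class and the dictionaries standing in for DNS servers are invented for illustration, and the fully qualified name is assumed from the hostname and domain given earlier in the thread.

```python
class StickyResolver:
    """Toy client that moves to the next DNS server on a timeout and stays there."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.active = 0  # index of the server currently in use

    def query(self, name, network):
        """Look up `name`; `network` maps server address -> zone dict, or None if down."""
        for _ in range(len(self.servers)):
            server = self.servers[self.active]
            zone = network.get(server)
            if zone is not None:
                # The server is reachable: its answer is final, even if it is NXDOMAIN.
                return server, zone.get(name.lower(), "NXDOMAIN")
            # No reply: switch to the next server and do NOT switch back later.
            self.active = (self.active + 1) % len(self.servers)
        return None, "TIMEOUT"

    def reset(self):
        """Stand-in for ipconfig /renew, re-enabling the connection, or a reboot."""
        self.active = 0


if __name__ == "__main__":
    name = "plan9-server1.plan9.site"
    internal = {name: "172.16.16.2"}  # the internal zone knows the server
    external = {}                     # the public resolver knows nothing about it

    all_up = {"172.16.16.2": internal, "208.67.222.222": external}
    rebooting = {"172.16.16.2": None, "208.67.222.222": external}

    client = StickyResolver(["172.16.16.2", "208.67.222.222"])
    print(client.query(name, all_up))     # normal operation
    print(client.query(name, rebooting))  # a query made while the server reboots
    print(client.query(name, all_up))     # the server is back, the client has not noticed
    client.reset()
    print(client.query(name, all_up))     # after resetting the IP configuration
```

In this model a single query sent while the internal server is down is enough to pin the client to the external server until reset() is called, which matches the symptoms reported in the thread.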
Actually, it isn't dropped queries or packets or anything; I typically reboot this and all of my servers fairly frequently, as I update them nearly constantly. During the reboot, if any client tried to query DNS for any reason, it would fail, and from then on it would never try my internal server again.
Am I the only one who thinks this is fairly illogical behavior on the part of Windows? I would think that at least every 50 queries or so it would try the first one again to see if it's back up, since after all, Windows itself calls the first DNS server "primary"... In any case, I guess it wasn't exactly a "Linux" question, but thanks for all the help, folks. |
2 Attachment(s)
I could be wrong, but I believe your problem there was the fact that Linux isn't syntax picky... it's syntax fucking crazy!
Correct me if I'm wrong but here is the difference between my home setup (running CentOS6 w/ bind) and your setup (what're you running?):
Mine: option domain-name-servers 192.168.1.5, 192.168.1.1, 75.75.75.75, 76.76.76.76, 8.8.8.8, 8.8.4.4;
Yours: option domain-name-servers 172.16.16.2,208.67.222.222;
Could it be as simple as not having a space that separates them? It caused a problem for me when I first configured bind! Anyways I hope it helps. Here's iPhone & Windows7 Pro devices getting the correct information (in the correct order) in the attachments... |
hmm, i'll try that and see if the problem comes back.
marking as solved, thanks for the info all |
So, did it solve anything?
|