LinuxQuestions.org


elliot01 10-01-2009 09:54 AM

Can't ping from a DNS server
 
Hi All,

Old Mandrake system here.
"Linux my.my-domain.co.uk 2.6.8.1-12mdksmp #1 SMP Fri Oct 1 11:24:45 CEST 2004 i686 Intel(R) Xeon(TM) MP CPU 3.00GHz unknown GNU/Linux"

This has been successfully acting as a DHCP, DNS, intranet and business systems server for years in a production environment (installed and configured by a separate third party software company before my time). It sits at 10.11.1.1 and serves a number of subnets ranging from 10.11.1.0 to 10.11.49.0.

Very recently we started having many problems with DNS, which mainly began with DNS resolver requests to the server timing out. As I have been looking into it, it just seems to have gotten worse.

I suddenly find myself unable to ping hostnames on my network from the DNS server itself, even names that are explicitly declared in '/var/named/pz/localdomain'.

For instance, I have a samba server on the network with hostname 'samba1'. In my DNS server's '/var/named/pz/localdomain' file I have an entry:

Code:

samba1                  A      10.11.22.210

My '/var/named/pz/localnet' contains:

Code:

$ORIGIN 22.11.10.in-addr.arpa.
210                    PTR    samba1.my-domain.co.uk.

The named service is running, as per usual. But if I ping...

Code:

[root@server pz]# ping samba1
ping: unknown host samba1

I'm, quite frankly, stumped. Is ping broken!?

Nslookup reveals:
Code:

[root@hayley pz]# nslookup samba1
Server:        10.11.1.1
Address:        10.11.1.1#53

** server can't find samba1: NXDOMAIN

I can't fathom how it's failing to resolve 'samba1' against the DNS server (itself), when I know for a fact that the host is listed in the relevant zone files.

I must have added/maintained this server's localdomain/localnet files dozens and dozens of times when declaring static hosts (such as this samba1 host) and have never seen a problem like this.

Can anyone help or push me in the right direction? I'd be extremely grateful.

kbp 10-01-2009 10:18 AM

Hi there Elliot,

Firstly, can you check the server's /etc/resolv.conf? Does it list itself as the nameserver, and is the domain / search domain correct?

cheers
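
A minimal sketch of that check, assuming a POSIX shell with awk available. The function name 'active_resolver_settings' is made up for illustration; it only prints the resolv.conf lines that are actually in effect, so commented-out nameserver entries drop out of view.

```shell
# Print the resolver settings that are in effect; commented-out lines are skipped
# because their first field is "#" (or "#nameserver"), not a keyword.
# Usage: active_resolver_settings [FILE]   (FILE defaults to /etc/resolv.conf)
active_resolver_settings() {
    awk '$1 == "nameserver" || $1 == "search" || $1 == "domain" { print }' \
        "${1:-/etc/resolv.conf}"
}
```

Running 'active_resolver_settings' with no argument on the server lists the search domain and only those nameservers the resolver will really try, in the order it tries them.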

elliot01 10-01-2009 11:32 AM

kbp, you are awesome!

Truth be told I had just nosed into there before checking back to my post (but you still get full points for knowing where I needed to go!) and changed it from:

Code:

search my-domain.co.uk
nameserver 10.11.1.1
# nameserver 10.11.2.2
# nameserver 10.11.254.1

to

Code:

search my-domain.co.uk
nameserver 10.11.1.1
nameserver 10.11.2.2
# nameserver 10.11.254.1

This indeed worked, but I am confused. Does this mean that 10.11.1.1 (itself) shouldn't be listed in the resolv.conf, or is it a rule that you must have two servers active in here or the service breaks down?

The problem is that 10.11.2.2 will be shut down permanently soon, and 10.11.254.1 is an Active Directory/DNS server in a different domain (my-domain.local) which I only added earlier while 'tinkering', so I don't know whether it's prudent to use it.

Any additional comments and advice very much appreciated.

kbp 10-01-2009 11:47 AM

It looks like 10.11.1.1 is not serving DNS requests. Your names are now being resolved by 10.11.2.2, so the problem is still present.

Could you run:

Code:

netstat -tunlp

on 10.11.1.1 and check which addresses named is listening on?

thanks
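
To cut the netstat output down to the DNS sockets only, something like the following could be used. It is a sketch, assuming the usual Linux netstat column layout where the fourth column is the local address; 'dns_listeners' is a hypothetical helper name.

```shell
# List sockets bound to port 53: protocol, local address, and owning process.
# The PID/program column is only filled in when run as root.
dns_listeners() {
    netstat -tunlp 2>/dev/null | awk '$4 ~ /:53$/ { print $1, $4, $NF }'
}
```

If 10.11.1.1:53 does not appear for both tcp and udp, named is not listening on the address that clients are pointed at.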

chrism01 10-01-2009 08:59 PM

It is normal to have at least two DNS servers in resolv.conf in case one dies.
You can use dig (http://linux.die.net/man/1/dig) to do DNS lookup checks and specify which DNS server to ask.
Very handy for your situation.
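
As a rough illustration of asking each server separately with dig, here is a small sketch. The helper names 'ask_server' and 'compare_servers' are invented for this example; the dig options used (+search, +short, +time, +tries) are standard ones.

```shell
# Ask one specific DNS server for an A record, bypassing the resolv.conf server list.
# +search still applies the resolv.conf search list, so a short name like "samba1"
# is tried with the search domain appended; +time/+tries stop a dead server stalling.
# Usage: ask_server SERVER NAME
ask_server() {
    dig "@$1" "$2" A +search +short +time=2 +tries=1
}

# Query each server in turn and label the replies.
# An empty reply (for example NXDOMAIN) is shown as "no answer".
# Usage: compare_servers NAME SERVER...
compare_servers() {
    name=$1
    shift
    for server in "$@"; do
        answer=$(ask_server "$server" "$name")
        echo "$server: ${answer:-no answer}"
    done
}
```

For the hosts in this thread, 'compare_servers samba1 10.11.1.1 10.11.2.2' puts the two servers' replies side by side, which makes it obvious which one is actually doing the resolving.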

elliot01 10-02-2009 03:59 AM

Thank you once again for everyone's input, I am sincerely very grateful.

'netstat -tunlp | grep named' produces (this server has 3 NICs):

Code:

tcp        0      0 10.11.1.1:53            0.0.0.0:*              LISTEN      9138/named
tcp        0      0 10.11.2.1:53            0.0.0.0:*              LISTEN      9138/named
tcp        0      0 10.11.3.1:53            0.0.0.0:*              LISTEN      9138/named
tcp        0      0 127.0.0.1:53            0.0.0.0:*              LISTEN      9138/named
udp        0      0 10.11.1.1:53            0.0.0.0:*                          9138/named
udp        0      0 10.11.2.1:53            0.0.0.0:*                          9138/named
udp        0      0 10.11.3.1:53            0.0.0.0:*                          9138/named
udp        0      0 127.0.0.1:53            0.0.0.0:*                          9138/named
udp        0      0 0.0.0.0:34272          0.0.0.0:*                          9138/named
udp        0      0 :::34273                :::*                                9138/named

A little out of my depth here, but I assume the above confirms that 10.11.1.1 is listening. Even so, the manual nslookups I have run against it recently (while troubleshooting the whole DNS problem) were timing out.

However, this morning, it's not even timing out:
Code:

[root@server etc]# nslookup samba1 10.11.1.1
Server:        10.11.1.1
Address:        10.11.1.1#53

** server can't find samba1: NXDOMAIN

As you have advised though, my other server appears to be working:

Code:

[root@server etc]# nslookup samba1
;; Got SERVFAIL reply from 10.11.1.1, trying next server
Server:        10.11.2.2
Address:        10.11.2.2#53

Name:  samba1.my-domain.co.uk
Address: 10.11.22.210

So I guess my next question would be, how can I troubleshoot the DNS service on 10.11.1.1? It seems to me that something must be corrupted? I've manually scanned through the localdomain, localnet, named.conf and resolv.conf files and can't see anything obvious (the former two are quite large though).

The service restarts as it has always done:

Code:

[root@server etc]# service named restart
Stopping named: rndc: connect failed: connection refused
                                                                [  OK  ]
Starting named:                                                [  OK  ]

The apparent error above has always appeared. As I remember, when I researched (read: 'googled') it last year it turned out to be harmless.

Any further assistance would be very well received.

kbp 10-03-2009 09:07 AM

Hey Elliot,

Try running named-checkzone against your zone file; if there's any corruption it should show up. Then again, I think named will tell you when it starts up if it has any issues with a zone. It may be worth posting your named.conf in case there's something syntactically correct but not functional.

Just had another thought: could you check whether iptables is running? If it is, turn it off and test again. Maybe something snuck into your rules.

cheers

kbp
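
A sketch of the named-checkzone run, using the zone names and file paths that appear elsewhere in this thread. named-checkzone takes the zone name first and the zone file second. The function names and the CHECKZONE override variable are hypothetical, added only so the checker can be swapped out.

```shell
# Validate one zone file.
# $1 = zone name as declared in named.conf, $2 = path to its zone file
check_zone() {
    "${CHECKZONE:-named-checkzone}" "$1" "$2"
}

# Check both master zones from this thread; return non-zero if either fails.
check_all_zones() {
    status=0
    check_zone my-domain.co.uk /var/named/pz/localdomain || status=1
    check_zone 11.10.in-addr.arpa /var/named/pz/localnet || status=1
    return "$status"
}
```

For the firewall side, 'iptables -L -n -v' run as root lists the active rules with packet counters, which is a less drastic first step than switching iptables off.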

elliot01 10-05-2009 09:39 AM

Hi kbp,

Thanks for the additional info. I wasn't aware of the 'named-checkzone' command, which did highlight a problem: I had used a hash (#) to comment a line out, not realising that zone files only accept a semicolon (;) for comments. Though to be honest, I made that change after the problems began, so it can only be partially to blame.

Now, with that tidied up and the resolv.conf corrected, things appear to be stabilizing. At least the nslookups I have just tested are responding quickly and correctly, which is great, so thank you once again!
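
For anyone hitting the same thing, a quick way to spot hash-style comments in a zone file is sketched below, assuming a grep that supports -E. 'find_hash_comments' is a made-up name.

```shell
# Print, with line numbers, every line whose first non-blank character is '#'.
# Zone files use ';' for comments, so named tries to parse these lines as records.
# Exit status is non-zero when no such line exists (plain grep behaviour).
# Usage: find_hash_comments ZONEFILE
find_hash_comments() {
    grep -nE '^[[:space:]]*#' "$1"
}
```

Running it against '/var/named/pz/localdomain' and '/var/named/pz/localnet' only catches this one mistake; named-checkzone remains the more thorough check.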

Regarding named.conf, I will still post it, in case a discerning eye picks up on anything unusual!

Code:

options {
        directory "/var/named";
        forward first;
        forwarders { 158.152.1.43; 158.152.1.58; };
        pid-file "/var/run/named/named.pid";
};


logging {
  channel systemlog {
      file "/var/log/named/named.log";
      severity debug;
      print-time yes;
  };
  channel audit_log {
      file "/var/log/named/security.log";
      severity debug;
      print-time yes;
  };
  channel xfer_log {
      file "/var/log/named/xfer.log";
      severity debug;
      print-time yes;
  };
  category default { systemlog; };
  category security { audit_log; systemlog; };
  category config { systemlog; };
  category xfer-in { xfer_log; };
  category xfer-out { xfer_log; };
  category notify { audit_log; };
  category update { audit_log; };
  category queries { audit_log; };
  category lame-servers { audit_log; };
};


zone "." {
        type hint;
        file "root.hints";
};

zone "0.0.127.in-addr.arpa" {
        type master;
        notify no;
        file "pz/127.0.0";
};

zone "my-domain.co.uk" {
        type master;
        notify no;
        file "pz/localdomain";
        allow-update { 10.11.1.1; 127.0.0.1; };
};

zone "11.10.in-addr.arpa" {
        type master;
        notify no;
        file "pz/localnet";
        allow-update { 10.11.1.1; 127.0.0.1; };
};


avijitp 10-05-2009 09:54 AM

A few things that are worth checking:

1. Did you find anything in the /var/log/named/named.log file?
2. Have you touched the TCP wrappers configuration recently?
3. There might be an issue with rndc. What is the BIND version on the server?
4. Are you able to resolve names using your forwarders from this server?
5. You might run tcpdump to see exactly what is happening to the DNS traffic. That might help you fix the issue.
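
Two of the checks above could be sketched as follows. The log path comes from the named.conf posted earlier; the search keywords are only a guess at what is worth looking for, and 'recent_named_problems' is a hypothetical helper name.

```shell
# Show the most recent log lines that mention a likely problem (case-insensitive).
# Usage: recent_named_problems [LOGFILE] [COUNT]
recent_named_problems() {
    grep -iE 'error|fail|refused|denied' "${1:-/var/log/named/named.log}" |
        tail -n "${2:-20}"
}

# To watch DNS traffic on all interfaces (needs root), run:
#   tcpdump -n -i any port 53
# -n disables name lookups, so the capture itself generates no DNS queries.
```

With tcpdump running, a test lookup from a client shows whether the query reaches the server at all and whether any reply leaves it.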

kbp 10-05-2009 05:59 PM

Hi Elliot,

Glad to hear things are improving. After looking at your config I have a couple of suggestions; please note that I have no idea what your environment looks like, though:


Code:

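options {
        directory "/var/named";
        version "666";                  // bogus version string reported to clients
        // the key "ns01-ns02.key" must be defined in its own key { } statement
        allow-transfer { key "ns01-ns02.key"; };
        allow-query { any; };
        // recursion only for the local 10.11.x.x subnets and localhost
        allow-recursion { 10.11.0.0/16; 127.0.0.1; };
        forwarders { 158.152.1.43; 158.152.1.58; };
        pid-file "/var/run/named/named.pid";
};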

- take out the 'forward first' option, may slow down local resolution a tad
- restrict recursive lookups
- use key based auth for transfer between master and slave
- report bogus version ( old school but anyway... )
- restrict who's allowed to query if you wish ( could be a little excessive )

A good reference book is available online to brush up if you need to:
http://www.zytrax.com/books/dns/

cheers,

kbp
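
Before restarting named with an edited config, it may be worth validating it first. The sketch below assumes the config lives at /etc/named.conf (not confirmed anywhere in this thread) and uses named-checkconf, which ships with BIND 9. 'apply_named_conf' and the CHECKCONF override are invented names.

```shell
# Restart named only if the config file passes a syntax check.
# Usage: apply_named_conf [CONF]   (CONF defaults to /etc/named.conf, an assumption)
apply_named_conf() {
    conf=${1:-/etc/named.conf}
    if "${CHECKCONF:-named-checkconf}" "$conf"; then
        service named restart
    else
        echo "not restarting: $conf failed the config check" >&2
        return 1
    fi
}
```

This way a typo in an option such as allow-recursion leaves the running named untouched instead of taking DNS down on restart.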


All times are GMT -5.