LinuxQuestions.org
Linux - Server This forum is for the discussion of Linux Software used in a server related context.

Old 09-13-2006, 11:58 PM   #1
vasillalov
LQ Newbie
 
Registered: Jul 2004
Posts: 16

Rep: Reputation: 0
DNS Server Losing Information


Ok folks,

I have a very interesting issue here with my DNS server. I am running CentOS 4.4 Final with the default RPM for named. I have the DNS do 2 things:

1. Serve as a caching DNS for my LAN
2. Serve as an authoritative server for the hostnames in my LAN

I do not have the named-chroot environment installed.

So the issue at hand is that every so often my DNS server will lose parts of its cache. Let's say I have resolved domain.com previously and it is cached in my DNS. When I resolve this domain from any of my computers on the LAN I get a super fast (0.2ms) response from my DNS.

However, 2-3 days later, when I try to resolve that same domain.com, my DNS will return an empty value, but it will still respond super fast (about 0.2ms). Clearly, it is fetching the blank response from its cache. It will keep returning this empty value until I restart the daemon, which clears the DNS cache.

Better yet, named will lose only parts of its cache. It will return blank values only for some domains but not all.

What causes this? What can I do to fix it? Obviously, it's a bug with named.

I have 2GB of RAM but it's rarely filled more than 60% and I run a bunch of junk on that server.

Last edited by vasillalov; 09-14-2006 at 12:00 AM.
 
Old 09-14-2006, 01:25 AM   #2
generic_user
Member
 
Registered: Sep 2006
Location: San Francisco, Ca.
Distribution: Redhat/Fedora/CentOS
Posts: 39

Rep: Reputation: 15
A couple of things...

First, what are the settings in /etc/resolv.conf?
Second, have you queried the server directly using something like nslookup or dig?

E.g. nslookup
server 127.0.0.1
domain.com.

or

dig @127.0.0.1 domain.com.

Can you please show a good output of dig, and a later bad output of dig using the same domain that you think is "missing" from the cache?

Other bits of info would be the version of named, was it from a package like rpm, and what distro are you using?

If you're using rpms, please run:
rpm -qV bind
and
rpm -qV caching-nameserver

That will tell me which files have been altered from their defaults. Include your named.conf file too if possible.
 
Old 09-14-2006, 02:40 AM   #3
sysconfig
Member
 
Registered: Sep 2006
Location: (.)
Posts: 44

Rep: Reputation: 15
What is the default time for a zone to stay in the DNS cache?

thx
 
Old 09-14-2006, 07:48 AM   #4
vasillalov
LQ Newbie
 
Registered: Jul 2004
Posts: 16

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by sysconfig
What is the default time for a zone to stay in the DNS cache?

thx

Good question! I have absolutely no idea! Where do you configure this? Do you think it's a timing issue?
 
Old 09-14-2006, 08:18 AM   #5
sysconfig
Member
 
Registered: Sep 2006
Location: (.)
Posts: 44

Rep: Reputation: 15
A Caching Only Server is a server that is not authoritative for any zone. This server services queries and asks other servers, who have the authority, for the information needed. All servers keep data in their cache until the data expires, based on a TTL ("Time To Live") field which is maintained for all resource records.
 
Old 09-14-2006, 01:37 PM   #6
vasillalov
LQ Newbie
 
Registered: Jul 2004
Posts: 16

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by sysconfig
A Caching Only Server is a server that is not authoritative for any zone. This server services queries and asks other servers, who have the authority, for the information needed. All servers keep data in their cache until the data expires, based on a TTL ("Time To Live") field which is maintained for all resource records.
Sorry, I misinterpreted the timing thing.

I have TTL set to 3600 all over the place.

When this happens again, I will do dig before and after I reset DNS and post the results here.
 
Old 09-15-2006, 12:05 AM   #7
sysconfig
Member
 
Registered: Sep 2006
Location: (.)
Posts: 44

Rep: Reputation: 15
This is not set by your caching server; it depends on the TTL value of the particular record!

Let's say www.abc.com has a TTL of 1400; then this record will stay in the cache only for that period (1400 seconds). If a particular record has a TTL of 0, it will not be kept in your cache, as far as I know.

thx
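
To check what TTL a given record actually carries, note that the second column of each answer line in dig output is the remaining TTL in seconds. Below is a minimal sketch, assuming dig's default text layout; `answer_ttls` is a made-up helper name and the sample record (including the 192.0.2.10 address) is purely illustrative.

```shell
# answer_ttls: read dig output on stdin and print "name ttl type" for each
# record in the ANSWER section. Hypothetical helper, not part of BIND.
answer_ttls() {
    awk '
        /^;; ANSWER SECTION:/ { in_answer = 1; next }  # section starts here
        /^$/ || /^;;/         { in_answer = 0 }        # blank line or next header ends it
        in_answer             { print $1, $2, $4 }     # owner name, TTL, record type
    '
}

# Illustrative sample in the format dig prints (not real data).
sample=';; ANSWER SECTION:
www.abc.com. 1400 IN A 192.0.2.10

;; AUTHORITY SECTION:
abc.com. 1400 IN NS ns1.abc.com.'

printf '%s\n' "$sample" | answer_ttls
```

Against a live server this would be used as `dig @127.0.0.1 www.abc.com | answer_ttls`; running it twice a few seconds apart should show the TTL counting down if the answer comes from the cache.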
 
Old 09-15-2006, 08:44 AM   #8
vasillalov
LQ Newbie
 
Registered: Jul 2004
Posts: 16

Original Poster
Rep: Reputation: 0
Ok folks,

It happened again. So here are the outputs:

BEFORE RESTARTING NAMED:

$ dig flvhosting.com

; <<>> DiG 9.2.4 <<>> flvhosting.com
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 4433
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;flvhosting.com. IN A

;; Query time: 22 msec
;; SERVER: 192.168.2.3#53(192.168.2.3)
;; WHEN: Fri Sep 15 09:35:22 2006
;; MSG SIZE rcvd: 32

AFTER RESTARTING NAMED:

First time fetch:

$ dig flvhosting.com
; <<>> DiG 9.2.4 <<>> flvhosting.com
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12942
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 0

;; QUESTION SECTION:
;flvhosting.com. IN A

;; ANSWER SECTION:
flvhosting.com. 86400 IN A 130.94.247.186

;; AUTHORITY SECTION:
flvhosting.com. 86400 IN NS ns1.flvhosting.com.
flvhosting.com. 86400 IN NS ns2.flvhosting.com.

;; Query time: 307 msec
;; SERVER: 192.168.2.3#53(192.168.2.3)
;; WHEN: Fri Sep 15 09:39:20 2006
;; MSG SIZE rcvd: 84


Second fetch:

$ dig flvhosting.com

; <<>> DiG 9.2.4 <<>> flvhosting.com
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9473
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 0

;; QUESTION SECTION:
;flvhosting.com. IN A

;; ANSWER SECTION:
flvhosting.com. 86330 IN A 130.94.247.186

;; AUTHORITY SECTION:
flvhosting.com. 86330 IN NS ns2.flvhosting.com.
flvhosting.com. 86330 IN NS ns1.flvhosting.com.

;; Query time: 17 msec
;; SERVER: 192.168.2.3#53(192.168.2.3)
;; WHEN: Fri Sep 15 09:40:30 2006
;; MSG SIZE rcvd: 84
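
The two fetches above look like a normal cache counting down a TTL: the first answer carried 86400 at 09:39:20 and the second carried 86330 at 09:40:30, which is 70 seconds later. A small sketch of that arithmetic follows; `to_seconds` and `remaining_ttl` are illustrative names, and it assumes both timestamps fall on the same day.

```shell
# to_seconds: convert HH:MM:SS to seconds since midnight.
# awk reads fields such as "09" as decimal numbers, which avoids the
# shell's arithmetic treating a leading zero as octal.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# remaining_ttl: TTL left after a record has sat in the cache.
# $1 = original TTL, $2 = time cached (HH:MM:SS), $3 = time queried (HH:MM:SS)
remaining_ttl() {
    elapsed=$(( $(to_seconds "$3") - $(to_seconds "$2") ))
    echo $(( $1 - elapsed ))
}

remaining_ttl 86400 09:39:20 09:40:30   # prints 86330
```

So the healthy case behaves as expected; the odd part is only the earlier SERVFAIL response.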



Ok,
So you see when DNS gets a brainfreeze it still responds superfast, but it just does not provide anything for the Answer Section. BTW, flvhosting is NOT the domain that my DNS is authoritative for. It is hosted somewhere else.

Here is my /etc/named.conf:

## named.conf - configuration for bind
#
# Generated automatically by redhat-config-bind, alchemist et al.
# Any changes not supported by redhat-config-bind should be put
# in /etc/named.custom
#
controls {
inet 127.0.0.1 allow { localhost; } keys { rndckey; };
};

#include "/etc/named.custom";
#it is written below so no need to be included

include "/etc/rndc.key";

options {
directory "/var/named";
};

zone "." {
type hint;
file "/var/named/named.ca";
};

zone "0.0.127.in-addr.arpa" IN {
type master;
file "named.local";
allow-update { none; };
};

zone "2.168.192.in-addr.arpa" IN {
type master;
file "db.192.168.2";
allow-update { none; };
};

zone "dubpix.com" {
type master;
file "dubpix.com.zone";
notify no;
allow-query { any; };
};
 
Old 09-16-2006, 01:48 AM   #9
generic_user
Member
 
Registered: Sep 2006
Location: San Francisco, Ca.
Distribution: Redhat/Fedora/CentOS
Posts: 39

Rep: Reputation: 15
issue this command the next time it fails:
rndc querylog

and do another lookup. You should now see an entry in /var/log/messages for each query, and it's possible that might help clue you in to what's happening.

rndc trace (run this multiple times) will increase debugging info.. also another helper when parsing the logs.

instead of restarting named, issue this:
rndc flush

that should flush your name server cache.. if it IS a corrupt cache, this should make things work again.

Also, at least for the time being, you might consider adding your ISP's name servers to the list in your /etc/resolv.conf file. At least things should continue to resolve if your name server dies again.

Btw, why is dig using 192.168.2.3 instead of 127.0.0.1? Normally a system that runs a name server uses the localhost to do queries... I.E. 127.0.0.1 is the first entry in /etc/resolv.conf
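
When dig is run without an explicit `@server`, it picks its server from /etc/resolv.conf, which is why the SERVER line in the output is worth a glance. A quick way to see which address is tried first is sketched below; `first_nameserver` is a hypothetical helper, and the demo writes a temporary file rather than touching the real configuration.

```shell
# first_nameserver: print the first "nameserver" address listed in a
# resolv.conf-style file. $1 = path (defaults to /etc/resolv.conf).
first_nameserver() {
    awk '$1 == "nameserver" { print $2; exit }' "${1:-/etc/resolv.conf}"
}

# Self-contained demo using the two addresses mentioned in this thread.
tmp=$(mktemp)
printf 'nameserver 127.0.0.1\nnameserver 192.168.2.3\n' > "$tmp"
first_nameserver "$tmp"   # prints 127.0.0.1
rm -f "$tmp"
```

Calling `first_nameserver` with no argument reads the system file directly.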

Finally, rpm -qV bind should run a check on the files of that rpm to verify that you don't have some type of corruption of the binaries going on.



Quote:
Originally Posted by vasillalov
$ dig flvhosting.com

; <<>> DiG 9.2.4 <<>> flvhosting.com
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 4433
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;flvhosting.com. IN A
If I didn't know better, I'd think named was having trouble communicating with the outside world, but since all you're doing is restarting the service, that's probably not the case.
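
The header quoted above is the key difference between the two runs: status SERVFAIL with ANSWER: 0 before the restart, versus NOERROR with ANSWER: 1 after it. SERVFAIL is the server reporting that it failed to resolve the name, which is not the same as a successful lookup with no records (that would be NOERROR with ANSWER: 0). Here is a rough filter that pulls just those two fields out of dig output; `dig_summary` is a made-up name and it assumes dig's default header format.

```shell
# dig_summary: read dig output on stdin, print "<status> <answer-count>".
# Hypothetical helper; relies on the header lines dig prints by default.
dig_summary() {
    awk '
        /status:/ { sub(/.*status: /, ""); sub(/,.*/, ""); status = $0 }
        /ANSWER:/ { sub(/.*ANSWER: /, ""); sub(/,.*/, ""); answers = $0 }
        END       { print status, answers }
    '
}

# Header lines copied from the two dig runs posted in this thread.
bad=';; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 4433
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0'
good=';; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12942
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 0'

printf '%s\n' "$bad"  | dig_summary   # prints SERVFAIL 0
printf '%s\n' "$good" | dig_summary   # prints NOERROR 1
```

Piping a live query through it, e.g. `dig @127.0.0.1 flvhosting.com | dig_summary`, gives a one-line record that is easy to log with a timestamp while waiting for the failure to recur.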
 
Old 09-16-2006, 11:20 AM   #10
generic_user
Member
 
Registered: Sep 2006
Location: San Francisco, Ca.
Distribution: Redhat/Fedora/CentOS
Posts: 39

Rep: Reputation: 15
Quote:
Originally Posted by vasillalov
Ok folks,

It happened again.

One last thing... I almost forgot about this... do you have the nscd service running? I've had trouble with that screwing with named in the past.
 
Old 09-16-2006, 12:05 PM   #11
vasillalov
LQ Newbie
 
Registered: Jul 2004
Posts: 16

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by generic_user
One last thing... I almost forgot about this... do you have the nscd service running? I've had trouble with that screwing with named in the past.
Thanks for the advice. NSCD is not running on my machine.

Some more details:

1. I don't think it makes any difference if I use 127.0.0.1 or 192.168.2.3. The first IP is my loopback interface and the second is my eth0 interface on the same machine. Anyway, I changed the resolv.conf file so now resolving goes over the loopback interface.

2. rndc flush produces:
rndc: connection to remote host closed
This may indicate that the remote server is using an older version of
the command protocol, this host is not authorized to connect,
or the key is invalid.

3. rpm -qV bind
S.5..... c /etc/rndc.key
missing /var/named/data
missing /var/named/slaves

So this is becoming more interesting... Any ideas? I'll try with reinstalling bind from scratch and hopefully that will fix the issue.
 
Old 09-16-2006, 02:42 PM   #12
generic_user
Member
 
Registered: Sep 2006
Location: San Francisco, Ca.
Distribution: Redhat/Fedora/CentOS
Posts: 39

Rep: Reputation: 15
Quote:
Originally Posted by vasillalov
Thanks for the advice. NSCD is not running on my machine.

Some more details:

1. I don't think it makes any difference if I use 127.0.0.1 or 192.168.2.3. The first IP is my loopback interface and the second is my eth0 interface on the same machine. Anyway, I changed the resolv.conf file so now resolving goes over the loopback interface.
Your named.conf file is set for special access on the loopback adapter... From your named.conf file:

Code:
controls {
inet 127.0.0.1 allow { localhost; } keys { rndckey; };
};
Quote:
2. rndc flush produces:
rndc: connection to remote host closed
This may indicate that the remote server is using an older version of
the command protocol, this host is not authorized to connect,
or the key is invalid.
Does this command work correctly after a restart? Note that rndc needs to connect over your loopback address and authenticate with the key in rndc.key (again, per that line in your named.conf file).

Quote:
3. rpm -qV bind
S.5..... c /etc/rndc.key
missing /var/named/data
missing /var/named/slaves

So this is becoming more interesting... Any ideas? I'll try with reinstalling bind from scratch and hopefully that will fix the issue.
Well, this doesn't look too bad actually. rndc.key will differ since it's supposed to be unique (S and 5 mean its size and MD5 checksum changed), and the c just marks it as a config file. Got SELinux running, by the way?

I'm not certain what the data dir is meant for... mine is empty, but it's possible temp files are created there once in a while? The slaves dir is meant for slave zone files that need to be altered (read: written to from named).

It will be interesting to see what, if anything, comes from turning on debug a few levels and turning on the querylog. Maybe you can do this before it starts to misbehave.
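
For reference, the rpm man page describes the verify output as a string of attribute flags (S size, M mode, 5 MD5 sum, D device, L symlink target, U user, G group, T mtime) optionally followed by a file-type marker, where `c` means a config file. A small decoder sketch follows; `decode_rpm_verify` is an invented name, and it only handles flag lines, not the "missing" lines.

```shell
# decode_rpm_verify: explain one flag line of `rpm -V` output.
# $1 = attribute string (e.g. S.5.....), $2 = optional file-type marker (e.g. c)
decode_rpm_verify() {
    flags=$1
    marker=$2
    out=""
    case "$flags" in *S*) out="$out size" ;; esac
    case "$flags" in *M*) out="$out mode" ;; esac
    case "$flags" in *5*) out="$out md5" ;; esac
    case "$flags" in *D*) out="$out device" ;; esac
    case "$flags" in *L*) out="$out symlink" ;; esac
    case "$flags" in *U*) out="$out user" ;; esac
    case "$flags" in *G*) out="$out group" ;; esac
    case "$flags" in *T*) out="$out mtime" ;; esac
    [ "$marker" = "c" ] && out="$out (config file)"
    echo "changed:${out:- nothing}"
}

# The line reported for /etc/rndc.key in this thread:
decode_rpm_verify S.5..... c   # prints: changed: size md5 (config file)
```

A changed size and checksum on a per-host key file is expected, so that line by itself does not point to corruption.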
 
Old 09-16-2006, 03:37 PM   #13
vasillalov
LQ Newbie
 
Registered: Jul 2004
Posts: 16

Original Poster
Rep: Reputation: 0
Well now rndc spills this:

rndc flush
rndc: connection to remote host closed
This may indicate that the remote server is using an older version of
the command protocol, this host is not authorized to connect,
or the key is invalid.

This is after I changed /etc/resolv.conf to point at the loopback interface.

I also reinstalled the bind rpm and that fixed the problem with the missing data and slaves folders. They are both empty now ...

Something is definitely up...


How can I turn up the logging levels without the use of the rndc command?
 
Old 09-17-2006, 01:57 AM   #14
generic_user
Member
 
Registered: Sep 2006
Location: San Francisco, Ca.
Distribution: Redhat/Fedora/CentOS
Posts: 39

Rep: Reputation: 15
Quote:
Originally Posted by vasillalov
Well now rndc spills this:

rndc flush
rndc: connection to remote host closed
This may indicate that the remote server is using an older version of
the command protocol, this host is not authorized to connect,
or the key is invalid.

This is after I changed /etc/resolv.conf to point at the loopback interface.

I also reinstalled the bind rpm and that fixed the problem with the missing data and slaves folders. They are both empty now ...

Something is definitely up...


How can I turn up the logging levels without the use of the rndc command?
You're sure you don't have the bind-chroot rpm installed?

Can you rpm -qV the other bind rpms?
bind-utils
bind-libs

You should try to get rndc working first... the fact that that isn't means something is screwy with your configuration. Check your rndc.key file to make sure that looks legit... A failed rpm install might have only created a "stub" of that file.

What distro are you running? FC allows you to pass extra "options" via /etc/sysconfig/named
Just add a line OPTIONS="-d ???" where ??? is the debug level you want. I'm not certain there's a way to turn on querylogging via the cli though.
 
Old 09-17-2006, 08:43 AM   #15
vasillalov
LQ Newbie
 
Registered: Jul 2004
Posts: 16

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by generic_user
You're sure you don't have the bind-chroot rpm installed?
I am positive:

$ rpm -qa | grep bind
bind-9.2.4-16.EL4
bind-libs-9.2.4-16.EL4
bind-utils-9.2.4-16.EL4
ypbind-1.17.2-8


I am running CentOS 4.4 Final. Here is the /etc/rndc.key:

key "rndckey" {
algorithm hmac-md5;
secret "German Engineering In Da Haus, Yaaa!";
};

Last edited by vasillalov; 09-17-2006 at 08:29 PM.
 
  


All times are GMT -5.
