
caksin 03-16-2010 10:13 AM

Dns problem with bind9
I have one server that does everything: web server, DNS server, SQL server and mail server, for a couple of domains.
For about the last month there have been complaints that people can't reach the sites hosted on the server, and it's getting more and more frequent.
They also say that from time to time they can't send or receive e-mail. I can never replicate or catch this error, because whenever they say "I can't reach the site", I can.
I had them try the IP address when they couldn't reach the site by name and voila, they could access it.
In daemon.log there are 3 types of errors, and I wonder if one of these could be causing my headaches:
  1. Quote:

    date time www named[4551]: FORMERR resolving '': some.ip#53
    I searched and read that this error is supposed to appear when someone tries to resolve an IPv6 address. But the names involved are mostly typos like "hotmai" or "hormail".
    Is that all there is to it?
  2. Quote:

    date time www named[4551]: unexpected RCODE (REFUSED) resolving 'ns3.some.domain/A/IN': some.ip#53
    I searched for this one too and found out that it is not about my server: some other server refused to resolve the given address.
  3. Quote:

    date time www named[4551]: lame server resolving 'some.sub.domain' (in 'same.sub.domain'?): some.ip#53
    This comes with a "last message repeated x times" note; "x" is mostly 3, if not always.
    I figured out that "some.ip" is supposed to be authoritative for "some.sub.domain" but is not, and this is logged on my server because my server queried this "lame server".
    Is this supposed to happen when recursive queries are allowed? I have now set recursion to no through Webmin, and I guess it reloads BIND by itself when I click save, without requiring me to reload it separately.
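
To see which of the three message types dominates, the daemon.log lines can be tallied by category. The following is a minimal sketch, assuming the messages look exactly like the excerpts quoted above; the function names `classify` and `summarize` are made up for illustration, and other BIND versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns built from the three log excerpts quoted in the post.
# Assumption: the wording matches this BIND 9 installation's output.
PATTERNS = {
    "formerr": re.compile(r"named\[\d+\]: FORMERR resolving "),
    "refused": re.compile(r"named\[\d+\]: unexpected RCODE \(REFUSED\) resolving "),
    "lame": re.compile(r"named\[\d+\]: lame server resolving "),
}


def classify(line):
    """Return the category name for a named log line, or None if no pattern matches."""
    for name, pattern in PATTERNS.items():
        if pattern.search(line):
            return name
    return None


def summarize(lines):
    """Count how many lines fall into each of the three categories."""
    counts = Counter()
    for line in lines:
        category = classify(line)
        if category is not None:
            counts[category] += 1
    return counts
```

Feeding it the open log file, for example `summarize(open("/var/log/daemon.log"))`, returns a `Counter` keyed by category. All three messages concern lookups this server makes towards other name servers, so a high count does not by itself explain why outsiders cannot resolve the hosted domains.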

I also read somewhere that Postfix and blacklisted IPs could be causing this problem. I didn't get what he meant, so I checked my syslog and found billions! :p of lines like this:

www dovecot: pop3-login: Disconnected: user=<Aaaaaa>, method=PLAIN, rip=, lip=server.ip
www dovecot: pop3-login: Disconnected: user=<Abcdef>, method=PLAIN, rip=, lip=server.ip
www dovecot: pop3-login: Disconnected: user=<Action>, method=PLAIN, rip=, lip=server.ip
So someone was trying to send mail through my server using something like a brute-force attack?
Thanks all in advance.
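
A quick way to gauge the scale of these attempts is to count the disconnect lines per username and per remote address. This is a rough sketch that assumes the syslog lines have exactly the shape quoted above; `disconnect_counts` is a hypothetical helper name. A "Disconnected" line is not necessarily a failed login, so the counts are only an indication.

```python
import re
from collections import Counter

# Matches the dovecot pop3-login lines quoted in the post.
# rip (remote IP) may be empty, as it is in the sanitized excerpts.
LOGIN_RE = re.compile(
    r"dovecot: pop3-login: Disconnected: "
    r"user=<(?P<user>[^>]*)>, method=(?P<method>[^,]*), "
    r"rip=(?P<rip>[^,]*), lip=(?P<lip>\S*)"
)


def disconnect_counts(lines):
    """Return (per-user, per-remote-address) counters of pop3-login disconnects."""
    users = Counter()
    sources = Counter()
    for line in lines:
        match = LOGIN_RE.search(line)
        if match is None:
            continue
        users[match.group("user")] += 1
        sources[match.group("rip") or "(unknown)"] += 1
    return users, sources
```

Many different usernames arriving from a handful of remote addresses would be consistent with a dictionary attack on the POP3 service.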

spampig 03-16-2010 12:07 PM


Originally Posted by caksin (Post 3900482)
I've a server for all;.... checked my syslog and found out billions! :p of lines like this;

www dovecot: pop3-login: Disconnected: user=<Aaaaaa>, method=PLAIN, rip=, lip=server.ip
www dovecot: pop3-login: Disconnected: user=<Abcdef>, method=PLAIN, rip=, lip=server.ip
www dovecot: pop3-login: Disconnected: user=<Action>, method=PLAIN, rip=, lip=server.ip
So someone was trying to send mail through my server using something like a brute force attack?
Thnx all in advance.

It looks like a brute-force attack, but __not__ to send mail (well, not directly anyway). Dovecot is a POP/IMAP server that stores mail, so by the look of it someone is trying to read other people's mail. That said, if Postfix is using the Dovecot SASL library, a valid username and password would be useful for logging in to your server and sending authenticated mail.

Ideally there needs to be a limit on connection attempts to Dovecot as it looks like a weak link here. There is some talk of adding a feature:

You've pretty much explained your BIND errors for yourself. Users often do dumb stuff and then blame the host (hootmail or hotmall instead of hotmail, just like you've seen). With regard to the lame server, this says it best:


caksin 03-18-2010 07:59 AM

OK, I get the Dovecot part; apparently there are no built-in options in Dovecot for this problem.
One of the guys on that mailing list wrote a script to limit it. That's certainly a problem, but for the time being my major problem keeps growing: more and more often people can't send or receive e-mail and can't reach the sites hosted on the server. I wondered whether the login attempts could be slowing down the server because of the frequent queries, but I guess that's not very likely.
So it should be the DNS problem. How is it possible that I can reach the sites and send/receive e-mail when people in our other office cannot? Does anyone have any idea how this could happen? It's driving me crazy :)

spampig 03-18-2010 08:09 AM

Can I ask what makes you so sure it's a DNS issue rather than a plain connectivity issue? Where is the server: is it a hosting company offering, or do you have it in your office/building? To really nail it you would need a user to say 'at 13:05 today I was unable to connect to the website x.y.z hosted on the server' and then look in the Apache access logs to see if the request even made it that far. Ditto with POP/IMAP. With regard to Postfix blocking IP addresses, this would only happen if they were on some kind of blocklist and Postfix had been configured to use it. It would normally show a message in the mail log to indicate that it had done so.
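
The access-log check described above can be sketched as a small filter that keeps only the requests logged within a few minutes of the reported time. This assumes the default Apache common/combined log timestamp format (for example `[18/Mar/2010:13:05:12 -0500]`); the function names and the 5-minute window are illustrative choices, not anything from the thread.

```python
import re
from datetime import datetime, timedelta

# Apache common/combined log format puts the timestamp in square brackets.
TIMESTAMP_RE = re.compile(r"\[([^\]]+)\]")
TIMESTAMP_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def parse_timestamp(line):
    """Return the timezone-aware timestamp of an access-log line, or None."""
    match = TIMESTAMP_RE.search(line)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), TIMESTAMP_FORMAT)
    except ValueError:
        return None


def requests_near(lines, reported, window_minutes=5):
    """Return the log lines stamped within window_minutes of the reported time.

    `reported` must be timezone-aware so it can be compared with log timestamps.
    """
    delta = timedelta(minutes=window_minutes)
    hits = []
    for line in lines:
        stamp = parse_timestamp(line)
        if stamp is not None and abs(stamp - reported) <= delta:
            hits.append(line)
    return hits
```

If the user's address shows up in the returned lines, the request reached the web server and the failure lies elsewhere; if nothing from that user appears around the reported time, the request never arrived, which fits either a DNS or a connectivity problem.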

caksin 03-18-2010 08:44 AM

Yes, it's a hosting company, a co-location service. OK, I will check the access logs and report back :). I thought it was a DNS problem because of the thing I mentioned in the first post:

I had them try with the ip when they couldn't reach with address and voila, they can access the site.
I might be mistaken of course, I'm not sure about the problem.
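
The "name fails but IP works" test can be made repeatable on an affected client with a short script that separates the two failure modes. This is only a sketch: `diagnose` and its helpers are hypothetical names, the lookup uses whatever resolver the client machine is configured with, and port 80 is assumed because the sites are web sites.

```python
import socket


def default_resolve(hostname):
    """Resolve a hostname to an IPv4 address using the system resolver."""
    return socket.gethostbyname(hostname)


def default_connect(address, port, timeout=5.0):
    """Open and immediately close a TCP connection to address:port."""
    with socket.create_connection((address, port), timeout=timeout):
        return True


def diagnose(hostname, port=80, resolve=default_resolve, connect=default_connect):
    """Return 'dns-failure', 'connect-failure' or 'ok' for the given hostname.

    The resolve and connect callables can be swapped out, e.g. for testing.
    """
    try:
        address = resolve(hostname)
    except OSError:
        # socket.gaierror (name lookup failed) is a subclass of OSError.
        return "dns-failure"
    try:
        connect(address, port)
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return "connect-failure"
    return "ok"
```

Run from a machine that is currently failing, a result of `dns-failure` supports the DNS theory, while `connect-failure` points at the network path or the server itself.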

spampig 03-18-2010 08:55 AM

I understand that, caksin, but you've pointed out you have some users who can't spell things like 'hotmail' :-) From my own experience with 'discount' hosting/VPS hosting, I've seen issues with BIND running out of allocated memory; the named process uses quite a bit of it. If this is a full-power independent dedicated server it should be fine and dandy. If it's a VPS offering, I would bear in mind that they can be flaky.

caksin 03-18-2010 09:29 AM

Hm, thanks for the heads up spampig, but the company is a very good one and we have our machine located in their building. It's also quite a good system (Xeon 1.60, 8 cores), but as I mentioned everything is on it, all the servers. Maybe it really is overloaded. The traceroute is problematic by the way; it cannot reach the server. It gets stuck at the 11th hop using dnsstuff's traceroute check. Could this be it?

spampig 03-18-2010 09:46 AM

Assuming you have all of that to yourself (rather than a slice or share of it), I would be really surprised if a Xeon 1.60 8-core machine failed to answer DNS requests. I'm also slightly baffled that you don't have any issues reaching it yourself. I guess that you are running more than one name server and they are authoritative for the domains you host? Is something else perhaps occasionally answering with NXDOMAIN for the affected customers?

caksin 03-18-2010 09:56 AM

Yes, the system is totally ours. If I understood you correctly: there is a bind9 installation with only 4 virtual domains, and the same system is authoritative for those domains, you're right. Do you mean there could be something else, other than BIND, trying to respond to queries? I've got absolutely no idea what's going wrong.

spampig 03-18-2010 10:14 AM

So I'm not confused here, are you using BIND to give authoritative answers for your domains to the rest of the world, or resolving queries to other domains for your clients (or both)???

It would be normal to provide more than one authoritative DNS server for a domain for redundancy. I guess you have a second one set up some place to resolve your domains in case of load/failure?

So I'm totally clear: your clients try to access ''. Their software runs off to resolve '' using their ISP's name servers. If those don't already have the answer in cache, they recurse the request until they get it either from another server's cached copy that still has time to live (TTL) left, or directly from your AUTHORITATIVE BIND server if nothing else along the way knows. Somewhere in that process something fails, and on occasion clients don't get an IP for the name. Is this the scenario you are facing, or are you trying to force them to query your server directly?

What I would do is just run a quick check to see what the rest of the world thinks should be answering for your domain(s):
"nslookup -querytype=ns" (this will check what the google name servers think your domain name servers should be)
"nslookup -querytype=ns" this will use your defaults. Any differences?

caksin 03-18-2010 10:42 AM

Yes, just our domains; we don't do anything for anyone else (that sounds so selfish, but it's not true, it only applies to this particular question :D). We don't have any slave DNS servers. I know the risks, but well, what can you do.
That's the correct scenario you wrote, that's the one killing me :p
Ok here are the results of what you asked;
- without the google ip:

DNS request timed out.
timeout was 2 seconds.
Server: UnKnown

DNS request timed out.
timeout was 2 seconds.
DNS request timed out.
timeout was 2 seconds.
- with the google ip

Address: nameserver = nameserver =
But the 192.168 address makes me think the query wasn't really successful, since that's a local IP. Do you think so?
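
Comparing what two resolvers report as the name servers for a domain is easier when the two nslookup outputs are reduced to sets of names. The sketch below assumes the usual `nameserver = <name>` lines that nslookup prints for NS queries; the function names and the example.com names in the usage note are placeholders, not the real domains from this thread.

```python
import re

# nslookup prints one "domain  nameserver = ns.name." line per NS record.
NS_RE = re.compile(r"nameserver\s*=\s*(\S+)", re.IGNORECASE)


def nameservers(nslookup_output):
    """Extract the set of name-server names from nslookup NS query output."""
    return {
        match.group(1).rstrip(".").lower()
        for match in NS_RE.finditer(nslookup_output)
    }


def compare(output_a, output_b):
    """Return which name servers appear in only one output or in both."""
    a = nameservers(output_a)
    b = nameservers(output_b)
    return {"only_a": a - b, "only_b": b - a, "both": a & b}
```

For instance, if one output lists `ns1.example.com` and `ns2.example.com` and the other only `ns1.example.com`, `compare` reports `ns2.example.com` under `only_a`. An output with nothing but timeouts yields an empty set, which matches the first result posted above.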

spampig 03-18-2010 10:55 AM

What the top result tells me is that the system you ran it on could not find an authoritative server for your domain(s), but the Google server did. Two things spring from this. First, where did you run that first query? On your own PC? It's a concern that it can't find an authoritative name server for your domain in the global DNS system. Second, the results from Google show two name servers, but you tell me you have one {did I read somewhere you had aliased a second to point to the first?}. The next question is, does:

nslookup -querytype=A
nslookup -querytype=A

return the same single IP for your name server (both queries)?

If you are only hosting a few domains, and not resolving any odd internal IP addresses on your name server, can you not make use of the name servers of the domain registrar? In the UK, if you register a domain with one of the normal reputable companies, you can usually use their name servers and build your simple zones on their infrastructure via a simple web config page. Is that different from what happens where you are? I appreciate the situation in different parts of the world may not be the same. What I'm trying to get at is: is there any need for you to run a name server at all?

caksin 03-18-2010 11:18 AM

They both return a non-existent domain error, what the heck? :D
But they should in fact point to the same IP.
Things are the same here too; running our own name server was just a choice. If I cannot resolve this problem soon, that might be what we'll have to do. :S

spampig 03-18-2010 11:33 AM

It's probable that you are going to need to visit that domain control panel anyway, to tell it where your NS is. It may be worth letting their infrastructure carry the brunt and setting it up to resolve your A, AAAA, MX etc. The only thing you would need to take care of is the reverse DNS mapping, and that is likely to be dealt with by the hosting company that gave you the IP. Don't overlook this, as incorrect/mismatched reverse DNS (PTR) will result in non-delivered email in great quantity :-)
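
Whether the PTR record and the forward record agree can be checked with a forward-confirmed reverse DNS lookup: resolve the IP to a name, resolve that name back, and see if the original IP is among the results. This is a minimal IPv4-only sketch using the system resolver; `ptr_matches` and its helper names are illustrative.

```python
import socket


def reverse_names(ip):
    """Return the host names the PTR lookup gives for an IP address."""
    name, aliases, _addresses = socket.gethostbyaddr(ip)
    return [name] + aliases


def forward_addresses(hostname):
    """Return the IPv4 addresses a host name resolves to."""
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return addresses


def ptr_matches(ip, reverse=reverse_names, forward=forward_addresses):
    """Return True if some PTR name for ip resolves back to the same ip."""
    try:
        names = reverse(ip)
    except OSError:
        # No PTR record, or the lookup failed (socket.herror is an OSError).
        return False
    for name in names:
        try:
            if ip in forward(name):
                return True
        except OSError:
            continue
    return False
```

A result of `False` for the mail server's public IP means the reverse and forward records do not line up, which is the kind of mismatch many receiving mail servers penalise.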

Glad you're getting closer to the pot of gold :-)

caksin 03-18-2010 04:54 PM

:) Thanks, but you know what? Today at work I told the guys from the hosting company to enter a reverse DNS record for our IP to resolve to and guess what, we cannot send mail to Hotmail any longer. Our mails used to get treated as spam, but now Hotmail refuses them outright, and of course that goes for all our domains because they all use the same mail server. I checked the blacklists and couldn't find anything. The only thing that used to look wrong before was the reverse DNS; now that is OK too, but everything seems to be worse :S
It sucks when you can't understand things :D
Btw I want to thank you for your attention and all the time you took to respond, there; I have :p

All times are GMT -5.