LinuxQuestions.org
07-14-2006, 09:15 AM   #1
ggal
LQ Newbie
 
Registered: Jul 2006
Posts: 1

Rep: Reputation: 0
named times out, only kill/restart helps, can't find the reason


Hi All!

I run two Dell PE650 servers on two independent local networks as mail/DNS/DHCP servers.
The OS is Red Hat Linux release 9 (Shrike),
Linux **** 2.4.20-31.9 #1

bind-9.2.1-16,
dhcp-3.0pl1-23,
sendmail-8.12.8-9.90
mailman-2.1.1-5
imap-2001a-18
ipop3d

Each server has one interface with one IP address on it.
60 - 80 users per server,
60 - 80 client machines on the local network.

They run without errors for months or years. But there are periods, lasting some weeks, when a strange error appears (not at the same time on the two servers):

Connections time out, named stops serving, I can ssh in only after a very long connection delay, clients cannot get their mail, and the network freezes.
When this happens I can't stop named with the 'service' command, so I stop it with kill -9 and then try to restart it. Sometimes I have to repeat this 2-3 times until named starts answering normally. Some days it happens only once, other days 3 - 4 times, with half an hour, hours, or half a day between occurrences.

As a temporary measure I tried a script, run from crontab, that kills named and restarts it when it is hung, but I found that cron doesn't work well while this error is occurring. If I try
'crontab -l', it doesn't respond.
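A minimal sketch of the kind of watchdog described above, not the poster's actual script. It assumes `dig` is installed and exits non-zero when the server does not reply, and that the init script lives at /etc/init.d/named; the names `check_dns`, `restart_named`, `watch_once` and `PROBE_NAME` are made up for illustration. Because cron itself stalls during the failure, something like this would probably have to run as its own loop instead of from crontab, and it only treats the symptom.

```shell
#!/bin/sh
# Hypothetical watchdog sketch: probe the local named and restart it if it
# does not answer. Paths and names below are assumptions, not from the post.

PROBE_NAME="${PROBE_NAME:-localhost}"   # a name the local server should resolve
DIG="${DIG:-dig}"                       # query tool; overridable for testing
INIT="${INIT:-/etc/init.d/named}"       # assumed Red Hat init script path

# Succeeds only if the server on 127.0.0.1 answers within 3 seconds, one try.
check_dns() {
    "$DIG" @127.0.0.1 "$PROBE_NAME" +time=3 +tries=1 >/dev/null 2>&1
}

# The post reports that a normal stop hangs, so kill hard and start again.
restart_named() {
    killall -9 named 2>/dev/null
    sleep 2
    "$INIT" start
}

# One watchdog pass: returns 0 if healthy, 1 if a restart was attempted.
watch_once() {
    if check_dns; then
        return 0
    fi
    logger -t named-watchdog "named not answering, restarting" 2>/dev/null || true
    restart_named
    return 1
}

# Possible standalone loop (left commented out so sourcing has no side effects):
#   while :; do watch_once; sleep 60; done
```

The probe uses a short timeout and a single try so that the check itself cannot hang for long when named is unresponsive.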

Maybe it is some resource problem, but where should I look?

I collected data while it was not functioning, but I can't find any reason for it.
If anybody has met an error like this, please help; I have no more ideas.

Thanks,

Geza

- there is nothing strange in the named or other logs
- load is 0.5 - 5.0; the upper value is very rare

free
             total       used       free     shared    buffers     cached
Mem:        255252     251704       3548          0      69012     129780
-/+ buffers/cache:      52912     202340
Swap:      2104432      42080    2062352
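For readers checking the "resource problem" idea against this output: the "-/+ buffers/cache" row is the one that shows how much memory applications actually hold. A small sketch of that arithmetic, using the figures copied from the output above (values in KB; the variable names are only illustrative):

```shell
#!/bin/sh
# Figures copied from the posted `free` output (KB).
total=255252; used=251704; free_kb=3548; buffers=69012; cached=129780

# Memory held by applications: used minus reclaimable buffers and page cache.
app_used=$((used - buffers - cached))
# Memory available to applications: free plus reclaimable buffers and cache.
app_free=$((free_kb + buffers + cached))

echo "used by applications:      ${app_used} KB"
echo "available to applications: ${app_free} KB"
echo "share of RAM held by applications: $((app_used * 100 / total))%"
```

The two results, 52912 and 202340, match the "-/+ buffers/cache" row, and swap is almost unused. That suggests, without proving it, that memory was not short at the moment this snapshot was taken.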

vmstat
   procs                      memory      swap          io     system      cpu
 r  b  w   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id
 1  0  0  42080   4784  64616 131696    0    0     0     0  101    16  0  0 100
 0  0  0  42080   4796  64632 131692    0    0     0    41  190    37  0  3  97
 0  0  0  42080   4796  64632 131692    0    0     0     0  103    18  0  0 100

Sometimes cs (context switches) goes up to 180-200.

Number of sockets in each connection state:
netstat -a -n|grep -E "^(tcp)"| cut -c 68-|sort|uniq -c|sort -n
      4 LAST_ACK
     16 LISTEN
     33 ESTABLISHED
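The pipeline above depends on the State column starting at character 68, which can shift between netstat versions or with long addresses. A sketch of an alternative that keys on the sixth whitespace-separated field instead (the function name `count_tcp_states` is made up for illustration):

```shell
#!/bin/sh
# Count TCP sockets per connection state, keyed on the 6th whitespace-
# separated field rather than a fixed character column.
# Lines whose first field starts with "tcp" are counted (this includes tcp6).
count_tcp_states() {
    awk '$1 ~ /^tcp/ { n[$6]++ } END { for (s in n) print n[s], s }' | sort -n
}

# Typical use on a live system:
#   netstat -a -n | count_tcp_states
```

Running it every minute or so and keeping the output would show whether a particular state (for example LAST_ACK or SYN_RECV) piles up just before named stops answering.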
 
07-14-2006, 03:23 PM   #2
peter_robb
Senior Member
 
Registered: Feb 2002
Location: Szczecin, Poland
Distribution: Gentoo, Debian
Posts: 2,458

Rep: Reputation: 47
If you lose DNS on the server, everything that uses it will slow down while waiting for replies, e.g. mail, ssh and inetd, which rely on reverse DNS checks or DNS resolution for logins.
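One way to check whether the slow logins coincide with slow name resolution is to time a forward and a reverse lookup while the problem is happening. A rough sketch, assuming the `host` utility is installed; `time_cmd` is a made-up helper, and the name and address in the usage comments are placeholders to be replaced with real local ones:

```shell
#!/bin/sh
# Hypothetical helper: run a command, discard its output, and report how many
# whole seconds it took together with its exit status.
time_cmd() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    status=$?
    end=$(date +%s)
    echo "$((end - start))s exit=$status: $*"
}

# Example use (placeholders, not values from this thread):
#   time_cmd host www.example.com    # forward lookup
#   time_cmd host 192.0.2.10         # reverse lookup
```

If both lookups take many seconds or fail only during the bad periods, that would be consistent with the explanation above; if they stay fast, the cause is probably elsewhere.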

I suggest re-installing bind or using a DNS proxy rather than a full-blown DNS server.

That kernel version is quite old, and there are many well-known, published exploits for the kernel and ssh.
If you are going to maintain a long-term, secure server, I suggest moving to a distro that provides updates and supports each version for a long time, so you don't have to upgrade every year.

Last edited by peter_robb; 07-14-2006 at 03:25 PM.
 
  

