named timeouts, just kill/start helps, can't find reason
It works with one interface, one IP address on it.
60 - 80 users on a server,
60 - 80 client machines on the localnet.
They run without errors for months or years. But there are periods, lasting some weeks, when a strange error appears (not at the same time on the 2 servers):
connections time out, named stops answering, I can ssh in only after a very long connection delay, clients cannot get their mail, and the network freezes.
When this happens I can't use the 'service' command to stop named, so I stop it with kill -9 and then try to restart it. Sometimes I have to repeat this 2-3 times until named starts to answer normally. Some days it happens only once, but on other days it happens 3-4 times, with half an hour, a few hours, or half a day between occurrences.
As a temporary measure I tried a script, run from crontab, that kills named and restarts it when it is hung, but I found that cron doesn't work well when this error occurs. If I run 'crontab -l' during the problem, it doesn't respond.
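For illustration only, here is a minimal sketch of such a watchdog check in Python. It is not the script I actually used. It sends one DNS query over UDP and treats a missing reply as "named is hung". The function names, the 3-second timeout, the test name "localhost", and the `pkill` / `service named start` commands are all assumptions and would need adjusting for the real system.

```python
import socket
import struct
import subprocess


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: id, flags (recursion desired), 1 question, no other records
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.strip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN)
    return header + qname + struct.pack("!HH", 1, 1)


def named_answers(server="127.0.0.1", port=53, name="localhost", timeout=3.0):
    """Return True if a reply carrying our query id arrives within the timeout."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, port))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and "connection refused"
            return False
    return reply[:2] == query[:2]


def restart_named():
    """Hypothetical restart step; both commands are assumptions."""
    subprocess.call(["pkill", "-9", "named"])
    subprocess.call(["service", "named", "start"])
```

A caller would run `if not named_answers(): restart_named()`. Since cron itself seems to hang during the fault, the check might be better run from a long-lived loop with a sleep between checks than from crontab.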
Maybe it is some resource problem, but where should I look?
I collected data while it was not working, but I can't find any reason for it.
If anybody has met an error like this, please help; I have no more ideas.
- there is nothing strange in the named or other logs
- load is 0.5 - 5.0; the upper value is very rare
If you lose dns facilities on the server, everything that uses them will slow down waiting for replies, e.g. mail, ssh & inetd, which rely on reverse dns checks or dns resolution for logins.
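A rough way to see this effect is to time a reverse lookup from the server itself. The sketch below is only an illustration: the function name is made up, and it measures the system resolver through Python's `socket.gethostbyaddr`, not any particular service's own lookup code.

```python
import socket
import time


def time_reverse_lookup(ip):
    """Return (hostname or None, seconds spent) for a reverse lookup of ip."""
    start = time.monotonic()
    try:
        hostname = socket.gethostbyaddr(ip)[0]
    except OSError:  # lookup failed or the resolver gave up
        hostname = None
    return hostname, time.monotonic() - start
```

If a call such as `time_reverse_lookup("192.168.0.10")` (a made-up client address) takes many seconds while named is stuck, the slow ssh logins and mail sessions are most likely just waiting on the same lookups.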
I suggest re-installing bind or using a dns proxy rather than a full blown dns server.
That kernel version is quite old and there are many well known & published exploits for the kernel and ssh.
If you are going to maintain a long term & secure server, I suggest moving to a distro that keeps updates, and keeps versions for a long time so you don't have to keep upgrading every year.
Last edited by peter_robb; 07-14-2006 at 04:25 PM.