NTP - syncs initially, eventually stops syncing, uses LOCAL clock
I've been beating my head against the wall for 2 days dealing with NTP issues on my RHEL box. I suppose I could ask our vendor for assistance but I wanted to work through this with somebody on these forums because I find that method the most effective.
My ntp.conf is (comments removed):

    server time.service.softlayer.com prefer
    server 0.north-america.pool.ntp.org burst iburst
    server 1.north-america.pool.ntp.org burst iburst
    server 2.north-america.pool.ntp.org burst iburst
    server 3.north-america.pool.ntp.org burst iburst
    fudge 127.127.1.0 stratum 10
    driftfile /var/lib/ntp/drift
    broadcastdelay 0.008
    keys /etc/ntp/keys

I read elsewhere not to worry about the "restrict" lines for now while I'm debugging my issues so I've commented all restrict lines out. This config was working for me just fine. I was getting a time adjustment every 15 minutes, but then, it just stopped. Here's when sync'ing was first successful:

    Feb 8 15:08:10 lin-10 ntpd[4402]: time reset +1.271467 s
    Feb 8 17:27:22 lin-10 ntpd[21306]: time reset +0.278015 s
    Feb 8 17:43:01 lin-10 ntpd[21306]: time reset +30.067615 s
    Feb 8 17:58:32 lin-10 ntpd[21306]: time reset +28.418495 s
    Feb 8 18:14:16 lin-10 ntpd[21306]: time reset +31.885751 s
    Feb 8 18:30:04 lin-10 ntpd[21306]: time reset +30.506775 s

The adjustments happened every 15 minutes through the night until...

    Feb 9 01:32:21 lin-10 ntpd[21306]: time reset +29.880494 s
    Feb 9 01:48:10 lin-10 ntpd[21306]: time reset +32.030575 s
    Feb 9 02:03:42 lin-10 ntpd[21306]: time reset +28.456379 s

And they just stopped. Checking the logs for other ntpd messages, the only thing out of the ordinary that I see is:

    Feb 9 02:15:39 lin-10 ntpd[21306]: kernel time sync enabled 0001

...which is 12 minutes after the last sync!
Every ntpd message beyond this one is along these lines:

    Feb 9 02:22:21 lin-10 ntpd[21306]: synchronized to 72.250.128.138, stratum 3
    Feb 9 02:23:08 lin-10 ntpd[21306]: synchronized to 173.255.219.242, stratum 2
    Feb 9 02:23:15 lin-10 ntpd[21306]: synchronized to 64.69.38.107, stratum 2
    Feb 9 02:24:13 lin-10 ntpd[21306]: synchronized to 173.255.219.242, stratum 2
    Feb 9 02:24:18 lin-10 ntpd[21306]: synchronized to 64.69.38.107, stratum 2
    Feb 9 02:25:17 lin-10 ntpd[21306]: synchronized to 173.255.219.242, stratum 2

But now, more and more, it's choosing LOCAL as its sync partner:

    Feb 9 07:21:03 lin-10 ntpd[21306]: synchronized to LOCAL(0), stratum 10
    Feb 9 07:38:11 lin-10 ntpd[21306]: synchronized to 64.69.38.107, stratum 2
    Feb 9 07:38:21 lin-10 ntpd[21306]: synchronized to LOCAL(0), stratum 10
    Feb 9 07:55:27 lin-10 ntpd[21306]: synchronized to LOCAL(0), stratum 10
    Feb 9 08:12:20 lin-10 ntpd[21306]: synchronized to 64.69.38.107, stratum 2
    Feb 9 08:12:31 lin-10 ntpd[21306]: synchronized to 72.250.128.138, stratum 3
    Feb 9 08:29:16 lin-10 ntpd[21306]: synchronized to LOCAL(0), stratum 10
    Feb 9 08:29:23 lin-10 ntpd[21306]: synchronized to 64.69.38.107, stratum 2
    Feb 9 08:46:22 lin-10 ntpd[21306]: synchronized to LOCAL(0), stratum 10
    Feb 9 08:46:39 lin-10 ntpd[21306]: synchronized to LOCAL(0), stratum 10
    Feb 9 09:03:34 lin-10 ntpd[21306]: synchronized to 64.69.38.107, stratum 2
    Feb 9 09:03:45 lin-10 ntpd[21306]: synchronized to LOCAL(0), stratum 10
    Feb 9 09:19:27 lin-10 ntpd[21306]: synchronized to 64.69.38.107, stratum 2

Looking at ntpq -p shows:

         remote           refid           st t when poll reach   delay   offset   jitter
    ====================================================================================
     10.0.77.54       70.84.160.232      2 u   82 1024  377    0.332  876489.  34131.4
    xamelia.felixker  128.233.154.245    2 u  189 1024  377   42.405  872593.  195.085
    xnew.dawson.edu   67.18.187.111      3 u  126 1024  377   45.793  874558.  297.871
    xnear.fatelectro  192.12.19.20       2 u  132 1024  377   32.562  874424.  241.504
    xvega.jeffkaplan  69.164.222.108     3 u  112 1024  377   41.820  875222.  166.518
    *LOCAL(0)         LOCAL(0)          10 l   65   64  377    0.000    0.000    0.001

And now my system time is off again! I am not able to keep time on this server for some reason. Any suggestions?

Forgot to add...ntpd is still running:

    [root@lin-10 ~]# ps -aef | grep ntpd
    root      4107 12648  0 09:38 pts/1    00:00:00 grep ntpd
    ntp      21306     1  0 Feb08 ?        00:00:00 ntpd -u ntp:ntp -p /var/run/ntpd.pid

Thanks,
Pete |
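For anyone reading the ntpq -p table above: the first character of each row is the tally code, and the offset column is in milliseconds. Here is a rough Python sketch (not anything ntpq ships with; the function name and the sample text layout are my own assumptions) that pulls those two things out of the output posted above. It assumes the row for 10.0.77.54 originally had a blank tally column, which is how ntpq prints a rejected peer.

```python
# Meaning of the first column ("tally code") in `ntpq -p` output.
TALLY = {
    " ": "rejected",
    "x": "falseticker",
    ".": "excess",
    "-": "outlier",
    "+": "candidate",
    "#": "selected (backup)",
    "*": "system peer",
    "o": "PPS peer",
}

# Rows copied from the post above (leading blank restored on the first peer).
SAMPLE = """\
 remote refid st t when poll reach delay offset jitter
=========================================================
 10.0.77.54 70.84.160.232 2 u 82 1024 377 0.332 876489. 34131.4
xamelia.felixker 128.233.154.245 2 u 189 1024 377 42.405 872593. 195.085
xnew.dawson.edu 67.18.187.111 3 u 126 1024 377 45.793 874558. 297.871
xnear.fatelectro 192.12.19.20 2 u 132 1024 377 32.562 874424. 241.504
xvega.jeffkaplan 69.164.222.108 3 u 112 1024 377 41.820 875222. 166.518
*LOCAL(0) LOCAL(0) 10 l 65 64 377 0.000 0.000 0.001
"""


def parse_peers(text):
    """Return one dict per peer row; header and separator lines are skipped."""
    peers = []
    for line in text.splitlines():
        if not line.strip():
            continue
        tally, fields = line[0], line[1:].split()
        # A peer row has 10 fields and a numeric stratum in the third one.
        if len(fields) != 10 or not fields[2].isdigit():
            continue
        peers.append({
            "remote": fields[0],
            "status": TALLY.get(tally, "unknown"),
            "stratum": int(fields[2]),
            "offset_ms": float(fields[8]),
        })
    return peers


for peer in parse_peers(SAMPLE):
    print(f"{peer['remote']:<18} {peer['status']:<12} "
          f"offset {peer['offset_ms'] / 1000:8.1f} s")
```

Read this way, every pool server is either rejected or marked as a falseticker with an offset of roughly 875 seconds, and the only source ntpd still trusts is the local clock at stratum 10.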
So the logs say that your system clock is losing about 30 seconds every 15 minutes? Something is wrong.
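To put a number on it, here is a small Python sketch using the "time reset" entries from the first post (Feb 8, 17:27 through 18:30). It treats each step as the drift accumulated since the previous step, which is only an approximation, and compares the result with the 500 ppm frequency error that ntpd is able to correct by slewing. The function and constant names are just for illustration.

```python
# (time of day, step in seconds) for consecutive "time reset" log entries,
# all from Feb 8 in the first post.
RESETS = [
    ("17:27:22", 0.278015),
    ("17:43:01", 30.067615),
    ("17:58:32", 28.418495),
    ("18:14:16", 31.885751),
    ("18:30:04", 30.506775),
]

NTPD_MAX_SLEW_PPM = 500.0  # largest frequency error ntpd can slew away


def to_seconds(hms):
    """Convert 'HH:MM:SS' to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def drift_rates_ppm(resets):
    """Apparent drift in ppm over each interval between two resets."""
    rates = []
    for (prev_time, _), (time, step) in zip(resets, resets[1:]):
        elapsed = to_seconds(time) - to_seconds(prev_time)
        rates.append(step / elapsed * 1e6)
    return rates


for rate in drift_rates_ppm(RESETS):
    print(f"{rate:7.0f} ppm, about {rate / NTPD_MAX_SLEW_PPM:.0f}x the slew limit")
```

A clock that is off by roughly 3% is far outside what ntpd can discipline, which would explain why it can only step the clock and eventually gives up on the remote servers.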
|
The solution was to apply a patch to VMware itself, and also a patch to RHEL. This may also be a case of a known RHEL bug, which is why it's better to check with your vendor. |
Thanks for the responses. I'll check with the vendor and try to let you know what they say. It's the "hardware node" of a virtual environment. It's RHEL, but it is described in the environment as "Parallels Virtuozzo Containers for Linux (x86)".
(sorry for the delayed response...I thought the forum would email me when I received a response but it didn't and I was just now able to get back around to this issue) |
I contacted Parallels, since it's a Parallels Virtuozzo environment, and they said, and I quote:

"I could see that you are having the issue with the NTP server time syncing on the Hardware Node. I discussed this with our Senior Engineer and could see that Virtuozzo does not affect the functionality of NTP in the Hardware Node. So this is out of scope from our Parallels Support Policy. We are suggesting you to contact Operating System support to help on this issue further."

But, the problem is that we use OS templates provided to us by Parallels. We've never contacted RedHat to purchase and install their OS, we just do everything through Parallels. Seems like I'm in no-man's land with this one. That aside, Parallels has always been able to help us through our problems so I'm not sure why they are wiping their hands clean of this one.

I think I'm going to just do a job every 15 minutes that stops and starts NTPD. The time is off 30 seconds every 15 minutes so hopefully, nothing gets hurt too bad. I am, of course, open to any other suggestions.

Thanks for all the help and advice I've received thus far.

Pete |
I was in that same boat with VMware, until they released the patch, in conjunction with RedHat. I turned off NTP, since it would die frequently, and just cron'ed a job to do an ntpdate (although sntp -d is used these days), every two minutes, since the clock drifted that badly. Trying to keep NTP running is a losing proposition, until the underlying problem is resolved. |
Hardware node
|
My solution turned out to be creating a job that runs every 15 minutes of every day and restarts ntpd. I chose every 15 minutes because the clock would be off a full 30 seconds by then already. It's not how I want to solve the problem at hand but I have no other option.
Thanks for the input. Too bad NTP doesn't work as advertised (on this machine), seems to work on others just fine. |
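The actual job was not posted in the thread, so the following is only a guess at what such a workaround might look like. This Python sketch builds the crontab line for a job that runs every N minutes; the restart command `/sbin/service ntpd restart` is an assumption based on this being a RHEL box, and `cron_entry` is a made-up helper name.

```python
def cron_entry(every_minutes, command):
    """Build a crontab line that runs `command` every `every_minutes` minutes."""
    if not 1 <= every_minutes <= 59:
        raise ValueError("every_minutes must be between 1 and 59")
    # Fields: minute hour day-of-month month day-of-week command
    return f"*/{every_minutes} * * * * {command}"


# Assumed restart command; adjust to whatever init system the host uses.
print(cron_entry(15, "/sbin/service ntpd restart"))
```

The printed line would go into root's crontab. Keep in mind this only resets the clock periodically; it still drifts by about 30 seconds between runs.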