LinuxQuestions.org


cvillepete 02-09-2012 08:44 AM

NTP - syncs initially, eventually stops syncing, uses LOCAL clock
 
I've been beating my head against the wall for 2 days dealing with NTP issues on my RHEL box. I suppose I could ask our vendor for assistance but I wanted to work through this with somebody on these forums because I find that method the most effective.

My ntp.conf is (comments removed):
server time.service.softlayer.com prefer
server 0.north-america.pool.ntp.org burst iburst
server 1.north-america.pool.ntp.org burst iburst
server 2.north-america.pool.ntp.org burst iburst
server 3.north-america.pool.ntp.org burst iburst
fudge 127.127.1.0 stratum 10
driftfile /var/lib/ntp/drift
broadcastdelay 0.008
keys /etc/ntp/keys

I read elsewhere not to worry about the "restrict" lines for now while I'm debugging my issues so I've commented all restrict lines out.

This config was working for me just fine. I was getting a time adjustment every 15 minutes, but then, it just stopped. Here's when sync'ing was first successful:

Feb 8 15:08:10 lin-10 ntpd[4402]: time reset +1.271467 s
Feb 8 17:27:22 lin-10 ntpd[21306]: time reset +0.278015 s
Feb 8 17:43:01 lin-10 ntpd[21306]: time reset +30.067615 s
Feb 8 17:58:32 lin-10 ntpd[21306]: time reset +28.418495 s
Feb 8 18:14:16 lin-10 ntpd[21306]: time reset +31.885751 s
Feb 8 18:30:04 lin-10 ntpd[21306]: time reset +30.506775 s

The adjustments happened every 15 minutes through the night until...

Feb 9 01:32:21 lin-10 ntpd[21306]: time reset +29.880494 s
Feb 9 01:48:10 lin-10 ntpd[21306]: time reset +32.030575 s
Feb 9 02:03:42 lin-10 ntpd[21306]: time reset +28.456379 s

And they just stopped. Checking the logs for other ntpd messages, the only thing out of the ordinary that I see is:

Feb 9 02:15:39 lin-10 ntpd[21306]: kernel time sync enabled 0001

...which is 12 minutes after the last sync! Every ntpd message beyond this one is along these lines:

Feb 9 02:22:21 lin-10 ntpd[21306]: synchronized to 72.250.128.138, stratum 3
Feb 9 02:23:08 lin-10 ntpd[21306]: synchronized to 173.255.219.242, stratum 2
Feb 9 02:23:15 lin-10 ntpd[21306]: synchronized to 64.69.38.107, stratum 2
Feb 9 02:24:13 lin-10 ntpd[21306]: synchronized to 173.255.219.242, stratum 2
Feb 9 02:24:18 lin-10 ntpd[21306]: synchronized to 64.69.38.107, stratum 2
Feb 9 02:25:17 lin-10 ntpd[21306]: synchronized to 173.255.219.242, stratum 2

But now, more and more, it's choosing LOCAL as its sync partner:

Feb 9 07:21:03 lin-10 ntpd[21306]: synchronized to LOCAL(0), stratum 10
Feb 9 07:38:11 lin-10 ntpd[21306]: synchronized to 64.69.38.107, stratum 2
Feb 9 07:38:21 lin-10 ntpd[21306]: synchronized to LOCAL(0), stratum 10
Feb 9 07:55:27 lin-10 ntpd[21306]: synchronized to LOCAL(0), stratum 10
Feb 9 08:12:20 lin-10 ntpd[21306]: synchronized to 64.69.38.107, stratum 2
Feb 9 08:12:31 lin-10 ntpd[21306]: synchronized to 72.250.128.138, stratum 3
Feb 9 08:29:16 lin-10 ntpd[21306]: synchronized to LOCAL(0), stratum 10
Feb 9 08:29:23 lin-10 ntpd[21306]: synchronized to 64.69.38.107, stratum 2
Feb 9 08:46:22 lin-10 ntpd[21306]: synchronized to LOCAL(0), stratum 10
Feb 9 08:46:39 lin-10 ntpd[21306]: synchronized to LOCAL(0), stratum 10
Feb 9 09:03:34 lin-10 ntpd[21306]: synchronized to 64.69.38.107, stratum 2
Feb 9 09:03:45 lin-10 ntpd[21306]: synchronized to LOCAL(0), stratum 10
Feb 9 09:19:27 lin-10 ntpd[21306]: synchronized to 64.69.38.107, stratum 2

Looking at ntpq -p shows:

     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 10.0.77.54      70.84.160.232    2 u   82 1024  377    0.332  876489. 34131.4
xamelia.felixker 128.233.154.245  2 u  189 1024  377   42.405  872593. 195.085
xnew.dawson.edu  67.18.187.111    3 u  126 1024  377   45.793  874558. 297.871
xnear.fatelectro 192.12.19.20     2 u  132 1024  377   32.562  874424. 241.504
xvega.jeffkaplan 69.164.222.108   3 u  112 1024  377   41.820  875222. 166.518
*LOCAL(0)        LOCAL(0)        10 l   65   64  377    0.000    0.000   0.001

And now my system time is off again! I am not able to keep time on this server for some reason. Any suggestions?
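
For anyone reading the peer table above: the first character of each row is ntpq's tally code ("x" means the peer was discarded as a falseticker, "*" marks the peer currently selected), and the offset column is in milliseconds. The sketch below is a minimal parser, assuming unmodified `ntpq -p` output in which a row with no tally code starts with a space. The names TALLY, parse_peers and SAMPLE are made up for illustration, and SAMPLE holds three rows copied from this post with that leading space restored.

```python
# Minimal sketch: read `ntpq -p` output and report each peer's status
# and offset. Assumes the raw ntpq layout, where column 0 is the tally code.

TALLY = {
    " ": "rejected",
    "x": "falseticker",
    ".": "excess",
    "-": "outlier",
    "+": "candidate",
    "#": "selected (backup)",
    "*": "system peer",
    "o": "PPS peer",
}

# Three rows from the post; the leading space on the first row is an assumption.
SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 10.0.77.54      70.84.160.232    2 u   82 1024  377    0.332  876489. 34131.4
xamelia.felixker 128.233.154.245  2 u  189 1024  377   42.405  872593. 195.085
*LOCAL(0)        LOCAL(0)        10 l   65   64  377    0.000    0.000   0.001
"""


def parse_peers(text):
    """Return one dict per peer row; header and separator lines are skipped."""
    peers = []
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith(("remote", "=")):
            continue
        tally, fields = line[0], line[1:].split()
        if tally not in TALLY or len(fields) != 10:
            continue  # not a peer row in the expected layout
        peers.append({
            "remote": fields[0],
            "status": TALLY[tally],
            "stratum": int(fields[2]),
            "offset_s": float(fields[8]) / 1000.0,  # ntpq prints milliseconds
        })
    return peers


if __name__ == "__main__":
    for peer in parse_peers(SAMPLE):
        print(f"{peer['remote']:<18}{peer['status']:<14}{peer['offset_s']:>10.3f} s")
```

Read this way, every remote server in the table is roughly 875 seconds away from the local clock, and the only peer left selected is LOCAL(0), which is the machine's own clock.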

Forgot to add...ntpd is still running:
[root@lin-10 ~]# ps -aef | grep ntpd
root 4107 12648 0 09:38 pts/1 00:00:00 grep ntpd
ntp 21306 1 0 Feb08 ? 00:00:00 ntpd -u ntp:ntp -p /var/run/ntpd.pid

Thanks,
Pete

Berhanie 02-09-2012 09:11 AM

So the logs say that your system time is losing 30 seconds every 15 minutes? Something is wrong.
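
For scale, the figures in the log can be turned into a drift rate. This is only back-of-the-envelope arithmetic on two consecutive "time reset" lines from the first post. The function name drift_ppm is made up for illustration, and the 500 ppm figure is ntpd's documented limit on frequency correction.

```python
# Rough drift estimate from two consecutive "time reset" log lines.
# Illustrative arithmetic only; drift_ppm is a hypothetical helper name.

NTPD_MAX_SLEW_PPM = 500  # ntpd's limit on frequency correction


def drift_ppm(offset_seconds, interval_seconds):
    """Return the clock drift in parts per million."""
    return offset_seconds / interval_seconds * 1_000_000


# Feb 8 17:43:01 (reset) -> Feb 8 17:58:32 (reset of +28.418495 s)
interval = (17 * 3600 + 58 * 60 + 32) - (17 * 3600 + 43 * 60 + 1)
ppm = drift_ppm(28.418495, interval)

if __name__ == "__main__":
    print(f"interval: {interval} s")
    print(f"drift:    {ppm:.0f} ppm")
    print(f"ratio:    {ppm / NTPD_MAX_SLEW_PPM:.0f}x the slew limit")
```

The result is on the order of 30,000 ppm, dozens of times more than ntpd can correct by slewing. That would be consistent with the clock being stepped every poll interval rather than disciplined gradually.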

TB0ne 02-09-2012 11:46 AM

Quote:

Originally Posted by cvillepete (Post 4598084)
I've been beating my head against the wall for 2 days dealing with NTP issues on my RHEL box. I suppose I could ask our vendor for assistance but I wanted to work through this with somebody on these forums because I find that method the most effective.

It isn't, really...if you're paying for support, you should use it. Otherwise, it may take some time for folks here to walk through your problem. You really should call them.

Quote:

I read elsewhere not to worry about the "restrict" lines for now while I'm debugging my issues so I've commented all restrict lines out. This config was working for me just fine. I was getting a time adjustment every 15 minutes, but then, it just stopped. Here's when sync'ing was first successful: The adjustments happened every 15 minutes through the night until...And they just stopped. Checking the logs for other ntpd messages, the only thing out of the ordinary that I see is:

But now, more and more, it's choosing LOCAL as its sync partner: And now my system time is off again! I am not able to keep time on this server for some reason. Any suggestions?

For your time to drift that much that quickly, something is very wrong. You say RHEL, but don't say what version, or whether this is a virtual machine. There were problems some time back with RHEL on VMware, which would cause time to drift badly. When it gets too far off, the only thing you can do is manually reset the clock using 'ntpdate' or a similar method, use 'hwclock --systohc' to write that time to the hardware clock, then restart NTP. Repeat forever, until you fix the problem.

The solution was to apply a patch to VMware itself, and also a patch to RHEL. This may also be a case of a known RHEL bug, which is why it's better to check with your vendor.

cvillepete 02-14-2012 12:40 PM

Thanks for the responses. I'll check with the vendor and try to let you know what they say. It's the "hardware node" of a virtual environment. It's RHEL but considered in the environment as "Parallels Virtuozzo Containers for Linux (x86)".

(sorry for the delayed response...I thought the forum would email me when I received a response but it didn't and I was just now able to get back around to this issue)

TB0ne 02-14-2012 01:31 PM

So it is a virtual machine, then. If it's VMware, check the VMware site for a patch to the virtual engine, as well as RedHat's site for a patch to the kernel.

cvillepete 02-15-2012 09:07 AM

I contacted Parallels, since it's a Parallels Virtuozzo environment, and they said, and I quote:

"I could see that you are having the issue with the NTP server time syncing on the Hardware Node. I discussed this with our Senior Engineer and could see that Virtuozzo does not affect the functionality of NTP in the Hardware Node. So this is out of scope from our Parallels Support Policy. We are suggesting you to contact Operating System support to help on this issue further."

But, the problem is that we use OS templates provided to us by Parallels. We've never contacted RedHat to purchase and install their OS, we just do everything through Parallels. Seems like I'm in no-man's land with this one.

That aside, Parallels has always been able to help us through our problems so I'm not sure why they are washing their hands of this one. I think I'm going to just set up a job that runs every 15 minutes and stops and starts ntpd. The time is off 30 seconds every 15 minutes so hopefully nothing gets hurt too badly.

I am, of course, open to any other suggestions.

Thanks for all the help and advice I've received thus far.

Pete

TB0ne 02-15-2012 10:11 AM

Hmm...I'd ask Parallels to put you in touch with RedHat, since THEY are the ones that provided you WITH it. That said, you say "templates provided to you"...not sure what you mean by that, in this context. Does such a template include the OS, or is it a suggested layout/config, that you use to LOAD the OS?? If that's the case, you may be running RHEL, but unsupported...a quick call to RedHat will get you supported and get you updates, which may fix the issue.

I was in that same boat with VMware, until they released the patch, in conjunction with RedHat. I turned off NTP, since it would die frequently, and just cron'ed a job to do an ntpdate (although sntp -d is used these days), every two minutes, since the clock drifted that badly. Trying to keep NTP running is a losing proposition, until the underlying problem is resolved.

Berhanie 02-15-2012 02:18 PM

Quote:

It's the "hardware node" of a virtual environment. It's RHEL but considered in the environment as "Parallels Virtuozzo Containers for Linux (x86)".

can you explain the quote above? is the system running ntpd the Parallels Virtuozzo hardware node (host), or is it a PV container (guest)? i use openvz (similar to virtuozzo) and had to set the sys_time capability for a container running ntpd to enable it to modify the system time.

cvillepete 02-18-2012 09:04 PM

Hardware node
 
It's the hardware node. I think it's RHEL (might be CentOS) with Virtuozzo installed on it: http://www.parallels.com/products/pv...ent-resources/ That's one thing I've never understood how to obtain...the flavor of Linux that is actually running. I only know of "uname -a" which returns this: "Linux my.server.company.com 2.6.18-028stab094.3-PAE #1 SMP Thu Sep 22 13:05:01 MSD 2011 i686 i686 i386 GNU/Linux" but I'm sure that's another discussion.
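
On the side question of finding the distribution: `uname -a` only reports the kernel, while the distribution name usually lives in a release file under /etc. The sketch below assumes a Python 3 interpreter is available and simply prints the first line of whichever common release file it finds. The function name describe_distro and the order of the candidate files are illustrative choices, and which of these files exists on this particular node is not known from the thread.

```python
# Minimal sketch: report the distribution from common release files.
# describe_distro and RELEASE_FILES are hypothetical names for illustration.
import os

# /etc/redhat-release is used by RHEL and CentOS; /etc/os-release is found
# on newer systems; /etc/issue is a last resort.
RELEASE_FILES = ("/etc/redhat-release", "/etc/os-release", "/etc/issue")


def describe_distro(root="/"):
    """Return 'path: first line' for the first readable release file, else None."""
    for path in RELEASE_FILES:
        full = os.path.join(root, path.lstrip("/"))
        try:
            with open(full) as handle:
                first = handle.readline().strip()
        except OSError:
            continue  # file missing or unreadable, try the next candidate
        if first:
            return f"{path}: {first}"
    return None


if __name__ == "__main__":
    print(describe_distro() or "no release file found")
```

The root parameter exists only so the function can be pointed at a different directory tree for testing; on a live system the default is enough.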

cvillepete 02-21-2012 03:03 PM

My solution turned out to be creating a job that runs every 15 minutes of every day and restarts ntpd. I chose every 15 minutes because the clock would be off a full 30 seconds by then already. It's not how I want to solve the problem at hand but I have no other option.

Thanks for the input. Too bad NTP doesn't work as advertised on this machine; it seems to work on others just fine.


All times are GMT -5.