LinuxQuestions.org
Old 01-27-2021, 08:17 AM   #1
HTop
Member
 
Registered: Mar 2019
Posts: 44

Rep: Reputation: Disabled
NTP sometimes stops syncing


Hello,
We have a dozen CentOS 7.7 virtual machines on VMware ESXi 6.x hosts (with vCenter). They run with open-vm-tools.
All servers run the ntpd service, which synchronizes with a Windows domain controller (PDC).
I noticed that some of them do not have their time synchronized.
Windows, Ubuntu, and SuSE Linux machines do not experience this problem.
On the out-of-sync servers the ntpd service is active (and the process is running).
I also noticed that in the system log (i.e. dmesg) the last ntp message was recorded several weeks ago. On working systems, ntpd logs messages daily.

ntp runs as /usr/sbin/ntpd -u ntp:ntp -g

/etc/ntp.conf contains:

include /etc/ntp/crypto/pw
restrict 127.0.0.1
keys /etc/ntp/keys
disable monitor
server pdc.bludomain.local iburst


pdc.bludomain.local is the Windows domain controller.

Any ideas on how to make ntpd work?
 
Old 01-27-2021, 04:16 PM   #2
rnturn
Senior Member
 
Registered: Jan 2003
Location: Illinois (SW Chicago 'burbs)
Distribution: openSUSE, Raspbian, Slackware. Previous: MacOS, Red Hat, Coherent, Consensys SVR4.2, Tru64, Solaris
Posts: 2,795

Rep: Reputation: 550
If you manually restart the ntpd process, can you then successfully issue "ntpq -p"? Does the ntpd process disappear immediately? After a time? Is there anything ntp-related in the log files that might indicate that it's exiting or being killed? I'd open a couple of terminal windows. In one, tail the log file where ntp is logging and grep for ntp. On CentOS 7, ntpd normally logs through syslog to /var/log/messages rather than /var/log/dmesg, so: "tail -f /var/log/messages | grep -i ntp". In the second window, check that ntp is running ("ps -ef | grep -i ntp") and if it's not, restart it. Watch for activity in the window that is tailing the log file.

Consider tweaking the ntpd command line to include at least one "-d" (debug) switch; you can include more than one for increasing amounts of debugging information. Then restart the service. (Remember, you're tailing the log file, right?)
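
To put a date on "the last ntp message", something like the following might help. It is only a sketch: the function name last_ntpd_entry is made up for illustration, and /var/log/messages is an assumed log location (the usual syslog default on CentOS 7), so adjust the path if your systems log elsewhere.

```shell
#!/bin/sh
# Print the most recent ntpd entry found in a syslog-style log file.
# The default path below is an assumption; pass a different file as $1.
last_ntpd_entry() {
    logfile=${1:-/var/log/messages}
    # syslog tags daemon lines as "ntpd[PID]:", so match on that tag
    grep 'ntpd\[' "$logfile" | tail -n 1
}

# Example usage (as root on the affected VM):
#   last_ntpd_entry /var/log/messages
# And to keep watching while you restart the service in another window:
#   tail -f /var/log/messages | grep --line-buffered -i ntp
```

Running it on a syncing VM and on a non-syncing one should show quickly how far apart the last ntpd entries are.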

I typically use only the NTP server's IP address in the config file, rarely the FQDN. If all of the ntp.conf files are set up the same, though, using the FQDN is likely not the problem. But... if changing the server record in "ntp.conf" to the IP address of "pdc.bludomain.local" on the troublesome servers corrects the problem, I'd check if there is something "different" about the DNS settings on those systems from what's configured on the servers that are syncing. Do the non-syncing servers get the same IP address when they look up the PDC server as, say, the Ubuntu server(s)? Try pinging and running nmap from one of the non-syncing systems and targeting both the FQDN and IP address of the PDC. (You should see UDP port 123, the NTP port, in both sets of nmap's results. Since NTP runs over UDP, that means a UDP scan, e.g. "nmap -sU -p 123".) Perhaps the non-syncing systems are looking for an NTP connection where none is available.

There also could be something "off" in the VMs' network configuration. If, though, you are able to reach other areas of the network from the CentOS systems, then this isn't your problem. (Caveat: not a VM "expert" so I could be wrong about that.)

HTH...
 
Old 01-31-2021, 05:11 AM   #3
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 2,768

Rep: Reputation: 1192
Check time sources and status with
Code:
ntpq -np
Ensure that the ESXi has the correct time (it might be used as a hidden source, not shown by ntpq).
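
If you want to check this on a dozen VMs without reading every table by eye, the output can be filtered. A rough sketch, assuming the usual ntpq peer table layout, where a leading "*" marks the peer currently selected for synchronization and the ninth column is the offset in milliseconds. The function names are made up for illustration.

```shell
#!/bin/sh
# Both functions read `ntpq -np` output on stdin.

# Succeeds (exit 0) if some peer carries the '*' tally code,
# i.e. ntpd has selected a system peer.
has_sync_peer() {
    awk '/^\*/ { found = 1 } END { exit(found ? 0 : 1) }'
}

# Prints the offset column (milliseconds) of the selected peer, if any.
sys_peer_offset_ms() {
    awk '/^\*/ { print $9 }'
}

# Example usage:
#   ntpq -np | has_sync_peer || echo "no peer selected on $(hostname)"
#   ntpq -np | sys_peer_offset_ms
```

A VM that shows no "*" line long after ntpd started, or a reach column stuck at 0, is not actually syncing even though the service is active.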
 
Old 02-24-2021, 09:28 AM   #4
HTop
Member
 
Registered: Mar 2019
Posts: 44

Original Poster
Rep: Reputation: Disabled
The time source is reachable. However, I have added another NTP server (not Windows) as a time source; over the next few days I will see whether the problem is related to a bad Windows time server.
 
Old 08-24-2021, 07:27 AM   #5
HTop
Member
 
Registered: Mar 2019
Posts: 44

Original Poster
Rep: Reputation: Disabled
The problem has virtually solved itself "automagically".
I think the servers hosting the virtual machines had high processor usage because they were overloaded. After we reduced the number of virtual machines, and therefore the load on the virtualization hosts, the problem practically disappeared.
 
Old 08-24-2021, 05:55 PM   #6
scottieH
Member
 
Registered: Mar 2021
Posts: 58

Rep: Reputation: Disabled
I have seen this behavior before on VMs. The issue is, basically, that the machine isn't really "ON" or "RUNNING" (whichever term you prefer) 100% of the time. Occasionally, the VM is "SLEEPING" (or whatever term you prefer). In other words, it is not getting any bare-metal CPU cycles. Then, it comes back on. This is normal behavior for a VM. The bare-metal CPU has to divide its time among all of the virtual CPUs. Not all of them can be active at the same time.

The clock on the VM can only update when it is "RUNNING." When the VM is "SLEEPING," its clock stops. It does not update its time.

When the VM transitions from "SLEEPING" to "RUNNING," its clock is behind actual time. NTP then updates the time to keep it current.
By default, if the time is "too far" off (more than the panic threshold, which is 1000 seconds by default), the daemon will stop. This is the scenario you are encountering.

In this case, I tell NTP to ignore large clock differences. To do that, I change the ntp.conf file to contain the line:
tinker panic 0

For reference, see this RedHat article:
Avoiding clock drift on VMs
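
If you have to make that change on several VMs, a small helper along these lines could do it. Treat it as a sketch under assumptions: the function name ensure_tinker_panic is made up, and putting the line at the very top of the file follows the usual VMware/Red Hat guidance for guests rather than anything specific to your setup. Back up ntp.conf first.

```shell
#!/bin/sh
# Add "tinker panic 0" as the first line of an ntp.conf-style file,
# unless an equivalent line is already present (safe to run twice).
ensure_tinker_panic() {
    conf=$1
    # Already configured? Then leave the file alone.
    grep -Eq '^[[:space:]]*tinker[[:space:]]+panic[[:space:]]+0([[:space:]]|$)' "$conf" && return 0
    tmp=$(mktemp) || return 1
    # Build the new content in a temp file, then write it back in place
    # so the original file keeps its owner, mode and SELinux context.
    { printf 'tinker panic 0\n'; cat "$conf"; } > "$tmp" && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example usage (as root), followed by a service restart:
#   cp -p /etc/ntp.conf /etc/ntp.conf.bak
#   ensure_tinker_panic /etc/ntp.conf && systemctl restart ntpd
```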
 
  

