Monitoring for Time Drift in your enterprise


 

Time sync is critical in today’s networks.  Time drift across devices can cause authentication breakdowns and reporting miscalculations, and can wreak havoc on interconnected systems.  This article shows a demo management pack that monitors time sync across your Windows devices.

The basic idea is to monitor all systems and compare each one’s local time against a target reference time server, using W32Time.  Here is the command as used in the PowerShell script:

$cmd = w32tm /stripchart /computer:$RefServer /dataonly /samples:$Samples

The script takes two parameters: the reference server, and the threshold (in seconds) for how much time drift is allowed.
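
To make the parsing step easier to follow, here is a minimal sketch of it in isolation. The sample lines only illustrate the general shape of the w32tm /stripchart /dataonly output and were not captured from a real system. Get-AverageDriftSeconds is a hypothetical helper name and is not part of the management pack.

```powershell
# Hypothetical helper: average the offsets from the last N stripchart lines
# and return the drift as a positive number of seconds, rounded to 2 places.
function Get-AverageDriftSeconds {
    param(
        [string[]]$Lines,
        [int]$Samples = 1
    )

    # Only the last $Samples lines hold "time, offset" pairs
    $offsets = @(
        $Lines | Select-Object -Last $Samples | ForEach-Object {
            # Capture a signed decimal followed by "s", e.g. "+00.0012345s"
            if ($_ -match ',\s*([+-]\d+\.\d+)s') {
                [double]::Parse($Matches[1], [cultureinfo]::InvariantCulture)
            }
        }
    )

    # Nothing parsed (for example, the command returned an error line)
    if ($offsets.Count -eq 0) { return $null }

    $average = ($offsets | Measure-Object -Average).Average

    # Drift can be positive or negative; report the magnitude only
    [math]::Round([math]::Abs($average), 2)
}

# Illustrative sample output (assumed format, made-up values)
$sample = @(
    'Tracking dc1.opsmgr.net [10.0.0.10:123].',
    'Collecting 2 samples.',
    'The current time is 1/1/2020 10:15:23 AM.',
    '10:15:23, +01.0000000s',
    '10:15:25, -03.0000000s'
)

Get-AverageDriftSeconds -Lines $sample -Samples 2
```

Like the full script, the sketch averages the signed offsets first and only then takes the absolute value, so offsets in opposite directions partly cancel out.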

Here is the PowerShell script:

#=================================================================================
#  Time Skew Monitoring Script
#  Kevin Holman
#  Version 1.0
#=================================================================================
param([string]$RefServer,[int]$Threshold)

#=================================================================================
# Constants section - modify stuff here:

# Assign script name variable for use in event logging
$ScriptName = "Demo.TimeDrift.PA.ps1"

# Set samples to the number of w32time samples you wish to include
[int]$Samples = '1'

# For testing - assign values instead of parameters to the script
#[string]$RefServer = 'dc1.opsmgr.net'
#[int]$Threshold = '10'
#=================================================================================

# Gather script start time
$StartTime = Get-Date

# Gather who the script is running as
$WhoAmI = whoami

# Load MomScript API and PropertyBag function
$momapi = new-object -comObject 'MOM.ScriptAPI'
$bag = $momapi.CreatePropertyBag()

#Log script event that we are starting task
$momapi.LogScriptEvent($ScriptName,9250,0, "Starting script")

#Start MAIN body of script:

#Getting the required data
$cmd = w32tm /stripchart /computer:$RefServer /dataonly /samples:$Samples

IF ($cmd -match 'error')
{
  #Log error and quit
  $momapi.LogScriptEvent($ScriptName,9250,2, "Getting TimeDrift from Reference Server returned an error. Reference server is ($RefServer). Output of command is ($cmd)")
  exit
}
ELSE
{
  #Assume we got good results from cmd
  #Keep only the last $Samples lines, split them into Time and Skew, and average the skew
  $Skew = $cmd[-1..($Samples * -1)] | ConvertFrom-Csv -Header "Time","Skew" | Select -ExpandProperty Skew
  $Result = $Skew | % { $_ -replace "s","" } | Measure-Object -Average | select -ExpandProperty Average
}

#The problem is that you can have time skew in two directions: positive or negative. You can do two
#things: create an IF statement that does check both or just create a positive number.
IF ($Result -lt 0) { $Result = $Result * -1 }

$TimeDriftSeconds = [math]::Round($Result,2)

#Determine if the average time skew is higher than your threshold and report this back to SCOM.
IF ($TimeDriftSeconds -gt $Threshold)
{
  $bag.AddValue("TimeSkew","True")
  $momapi.LogScriptEvent($ScriptName,9250,2, "Time Drift was detected. Reference server is ($RefServer). Threshold is ($Threshold) seconds. Value is ($TimeDriftSeconds) seconds")
}
ELSE
{
  $bag.AddValue("TimeSkew","False")
  #Log good event for testing
  #$momapi.LogScriptEvent($ScriptName,9250,0, "Time Drift was OK. Reference server is ($RefServer). Threshold is ($Threshold) seconds. Value is ($TimeDriftSeconds) seconds")
}

#Add stuff into the propertybag
$bag.AddValue("RefServer",$RefServer)
$bag.AddValue("Threshold",$Threshold)
$bag.AddValue("TimeDriftSeconds",$TimeDriftSeconds)

#Log an event for script ending and total execution time.
$EndTime = Get-Date
$ScriptTime = ($EndTime - $StartTime).TotalSeconds
$ScriptTime = [math]::Round($ScriptTime,2)
$momapi.LogScriptEvent($ScriptName,9250,0,"`n Script has completed. `n Reference server is ($RefServer). `n Threshold is ($Threshold) seconds. `n Value is ($TimeDriftSeconds) seconds. `n Runtime was ($ScriptTime) seconds.")

#Output the propertybag
$bag

 

Next, we put the script into a probe action, which is called by a data source with a scheduler.  We break this out because we want to “share” the data source between a monitor and a rule.  The monitor watches for time skew, while the rule collects the skew as a perf counter, so we can track trends in the environment.

 

So the key components of the MP are the data source (DS), the probe action (PA) containing the script, the MonitorType and the Monitor, the perf collection rule, and some views to show this off:

 

image

 

When a threshold is breached, the monitor raises an alert:

image

 

The performance view will show you the trending across your systems:

image

 

On the monitor (and rule) you can modify the reference server:

image

 

One VERY IMPORTANT concept: if you change anything, you must make identical overrides on BOTH the monitor and the rule.  Otherwise you will break cookdown, and the script will run twice for each interval.  So be sure to set IntervalSeconds, RefServer, and Threshold the same on both the monitor and the rule.  If you want the monitor to run much more frequently than the default of once an hour, that’s fine, but you might not want the perf data collected more than once per hour.  Using different intervals does break cookdown, but the extra script run only happens once per hour, which is probably less of an impact than overcollecting performance data.
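
As a quick sanity check on the values you plan to set, here is a small sketch that compares two sets of parameter values and lists any that differ. It does not talk to SCOM at all: Compare-CookdownParameters is a hypothetical helper, and the hashtables are made-up values that you would fill in by hand from your own overrides.

```powershell
# Hypothetical helper: list the parameters whose values differ between
# the monitor and the rule. It works only on the values passed in.
function Compare-CookdownParameters {
    param(
        [hashtable]$Monitor,
        [hashtable]$Rule,
        [string[]]$Names = @('IntervalSeconds', 'RefServer', 'Threshold')
    )

    foreach ($name in $Names) {
        # Compare as strings so 3600 and '3600' count as equal;
        # -cne is case-sensitive, to be on the safe side
        if ([string]$Monitor[$name] -cne [string]$Rule[$name]) {
            [pscustomobject]@{
                Parameter = $name
                Monitor   = $Monitor[$name]
                Rule      = $Rule[$name]
            }
        }
    }
}

# Example values (made up for illustration)
$monitorConfig = @{ IntervalSeconds = 900;  RefServer = 'dc1.opsmgr.net'; Threshold = 10 }
$ruleConfig    = @{ IntervalSeconds = 3600; RefServer = 'dc1.opsmgr.net'; Threshold = 10 }

$mismatches = @(Compare-CookdownParameters -Monitor $monitorConfig -Rule $ruleConfig)
$mismatches
```

An empty result means the values you entered line up. Any row that comes back is a parameter where the monitor and the rule would be configured differently.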

From here, you could add in a recovery to force a resync of w32time if you wanted, or add in additional alert rules for w32time events.
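
As a rough idea of what such a recovery script might do, here is a minimal sketch. It is not part of the example MP. Invoke-TimeResync is a hypothetical function name, and the sketch assumes that w32tm /resync signals failure through a non-zero exit code.

```powershell
# Hypothetical recovery sketch: ask the Windows Time service to resync
# and return whether the command reported success, plus its output.
function Invoke-TimeResync {
    param(
        # The command is a parameter so it can be swapped out when testing
        [scriptblock]$Command = { w32tm /resync }
    )

    # Run the command and capture everything it writes as one string
    $output = & $Command 2>&1 | Out-String

    [pscustomobject]@{
        # Assumption: exit code 0 means the resync request succeeded
        Succeeded = ($LASTEXITCODE -eq 0)
        Output    = $output.Trim()
    }
}
```

Calling Invoke-TimeResync with no arguments runs w32tm /resync on the local machine.  In a real recovery you would probably also log the result with LogScriptEvent, the same way the monitoring script does.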

 

The example MP is available here:

https://gallery.technet.microsoft.com/SCOM-Management-Pack-to-bca30237

