DNS MP – Noisy resolution time alerts, and how to deal with them


This is a problem in the 6.0.6480.0 version of the DNS MP.

You will likely see a lot of DNS Resolution Time alerts popping into your console – then disappearing.

This is because these alerts are generated by a monitor, which is frequently changing state.  These alerts get auto-resolved when the monitor flips back to healthy status, by design. 
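To make the flapping concrete, here is a minimal sketch of a single-sample, two-state monitor. It is not the MP's actual implementation: the function name `alert_events`, the sample values, and the assumption that a breach means "strictly greater than the threshold" are all hypothetical and for illustration only.

```python
def alert_events(samples, threshold=1.0):
    """Return (sample_index, event) pairs for a single-sample monitor.

    Assumption: the monitor is unhealthy when a sample is strictly
    greater than the threshold, and healthy otherwise.
    """
    events = []
    healthy = True  # assume the monitor starts out healthy
    for index, seconds in enumerate(samples):
        now_healthy = seconds <= threshold
        if healthy and not now_healthy:
            events.append((index, "alert raised"))
        elif not healthy and now_healthy:
            events.append((index, "alert auto-resolved"))
        healthy = now_healthy
    return events


# Hypothetical resolution times in seconds, one per polling interval.
samples = [0.4, 1.3, 0.5, 0.6, 1.1, 0.7]
for index, event in alert_events(samples):
    print(index, event)
```

With a threshold sitting right on top of normal response times, two brief spikes are enough to raise and auto-resolve two alerts in six samples.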

The root cause of the problem is often that the server is busy when we run our script to check the local DNS resolution response… and the default threshold is set to 1 second.
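If you want a feel for how a timed check like this behaves on a busy box, the rough sketch below times a single lookup and compares it to the 1 second default. This is not the script the MP runs; `timed_lookup` and `breaches` are hypothetical helper names, and `socket.gethostbyname` goes through the operating system's resolver rather than querying the local DNS service directly.

```python
import socket
import time


def timed_lookup(hostname, resolve=socket.gethostbyname):
    """Return the wall-clock seconds taken to resolve hostname.

    The resolve parameter is injectable so the timing logic can be
    exercised without touching the network.
    """
    start = time.perf_counter()
    resolve(hostname)
    return time.perf_counter() - start


def breaches(elapsed_seconds, threshold_seconds=1.0):
    """Assumption: a breach is any time strictly above the threshold."""
    return elapsed_seconds > threshold_seconds


if __name__ == "__main__":
    elapsed = timed_lookup("localhost")
    print(f"{elapsed:.3f}s breach={breaches(elapsed)}")
```

Running it a few times while the server is under load gives a quick sense of how much the measured time depends on what else the machine is doing at that moment.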

 


 

 

Even in some of the best DNS environments, with good hardware… we will find DNS servers on Domain Controllers can get busy… and this is compounded by SCOM running multiple scripts at the same time – from the ADMP and DNS MP… so sometimes we cannot return results in less than 1 second.

 

The best thing to do – is to chart out your current environment, using the provided performance views in the MP… and adjust this monitor for your servers.

 

What I can see – is that my Server 2003 DC/DNS server, with only 1 zone, but running on a PIII 933 MHz CPU, with 512 MB of RAM… is taking a baseline of 2-3 seconds.  I will override this monitor for this SERVER, or for ALL 2003 DNS servers… to be 5 seconds.

 


 

Granted – our expectation is that our DNS servers can respond to a DNS query faster than 5 seconds – but this number is relative… due to how OpsMgr is collecting it.  So the goal here, is to look at what is normal when the server is functioning well, establish that as our baseline, and set the threshold just above it.
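One way to turn "just above the baseline" into a number is sketched below. This is only an illustration under stated assumptions: the function name `suggested_threshold`, the use of a nearest-rank 95th percentile as the baseline, the 1.5x headroom factor, the rounding up to whole seconds, and the sample values are all my own choices, not something the MP prescribes. In practice you would take the samples from the performance views in the MP.

```python
import math


def suggested_threshold(samples, headroom=1.5):
    """Suggest a threshold in whole seconds from healthy-period samples.

    Assumptions: the baseline is the nearest-rank 95th percentile of
    the samples, and the threshold is baseline * headroom, rounded up.
    """
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(0.95 * len(ordered))  # nearest-rank percentile
    baseline = ordered[rank - 1]
    return math.ceil(baseline * headroom)


# Hypothetical resolution times in seconds while the servers are healthy.
samples_2003 = [2.1, 2.4, 2.9, 3.0, 2.2, 2.6, 2.8, 2.3, 2.7, 2.5]
samples_2008 = [0.2, 0.2, 0.3, 0.2, 0.1]

print(suggested_threshold(samples_2003))  # 5
print(suggested_threshold(samples_2008))  # 1
```

With these made-up numbers the slow server lands on a 5 second override, while the fast one stays at the 1 second default and needs no override at all.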

 

Now – my Server 2008 DC/DNS server, which has 1 GB of RAM, and is a VM on very fast disk, and has a better CPU available, has a baseline of 0.2 seconds… so I will leave this monitor alone, since it is obviously not changing state so frequently.

 


 

When a real problem arises, load increases, or DNS is performing poorly, we will be alerted – because we will breach our *baseline*.

Comments (24)

  1. Kevin Holman says:

    Yes – the new DNS MP for Server 2008 R2 should be shipping very soon.  There are not many changes in it, as DNS has not changed much functionally from 2008 to 2008R2.

  2. Kevin Holman says:

    Then I would simply follow the logic here… if spikes are that frequent – I would create a baseline off the spikes.

    In the customer environments I have worked in… we were always able to tune this monitor using the method outlined above.  The only rare cases were older DCs that were having trouble with memory pressure.

  3. Kevin Holman says:

    The current DNS MP does not support 2008 R2.

    There is supposed to be an updated MP in the works – but I don't know anything about release timing.

  4. Anonymous says:

    This may sound strange.. But what if I'm getting this alert from my WSUS?

  5. Bob Panick says:

    Very true. Unfortunately the monitor uses only a single sample, so spikes will throw the alert only to be cleared on the next sample 15 minutes later.

    Unfortunately since the data is from a script it isn’t practical to do this over a number of samples.  The false alerts are so high on this monitor on some machines that we simply turned it off.

  6. Craig says:

    I’m having trouble seeing the value in this monitor being rolled up as Availability, since it does not indicate availability at all but rather performance.

    It would probably be better to have this as an alert generating rule so you can specify consecutive occurrences for suppression.

    IMHO

  7. Tim Nichols says:

    I’m getting a lot of alerts on one of my DNS servers that is running Windows Server 2008 R2.  It is a DNS Resolution Time alert.  When I look at the event view, I see a lot of errors with the description "DNS Server Resolution Check: The DNS service is not running.  Monitoring script cannot continue."  But when I log on to the server and look at the services, the DNS Server service is started and running.  Any idea if this is an error?

  8. chaselton says:

    Has anyone run into an increase in DNS resolution time after virtualizing a server?

  9. SC says:

    We have. Our physical server is fine and doesn't generate DNS resolution alerts. However, we get them from our VMs.

  10. Roel says:

    Same here, physical servers don’t generate this alert but virtual ones do.

  11. darwin says:

    Good posting, I appreciate it.  We enabled this and got a bunch.  This posting was a good quick explanation.

  12. dustin says:

    Thanks for posting these tips; it helped me solve this problem for a few servers.

  13. Davy says:

    Yep, thanks from me too, adjusted both 2003 and 2008 response times to 10 secs and all good 🙂

  14. Bix says:

    Greetings,

    I know this article is quite old but I see recent comments on it so I permit myself to do so 🙂

    I have created an override for the *critical* state of this monitor just like you suggested and everything works fine;

    However I have an issue with the *warning* state of this monitor :

    On the Knowledge tab of the health explorer I see this:

    Green:Worst Time less than Threshold and Success Count greater than 0 and Failure Count=0

    Yellow:Failure Count greater than 0 and Success Count greater than 0

    Red:Best Time >Threshold or Success Count=0

    So my warning state (Yellow) is when I have some failures and, indeed, I have a failure count of 1.

    In our environment, having such failure is not a problem at all and can happen quite frequently. I would like to override this value to 3 or 5 but I do not find where I can do that. In the Override Properties window I have lots of settings, but nothing seeming to be related with the failure count.

    Any idea how I can do that ?  

  15. nik says:

    Hi

    We see the same issue as Bix. Zeros in all categories. Any ideas?

  16. Cabi says:

    Hello*,  when will there be a new release of DNS-MP that is suited for Windows 2008 R2?

  17. Adrian C says:

    Hello,

    On my server this alert is generated when the Local Resolution time gets a 0 (zero) value!

    Any ideas?

    Br,

    Adrian

  18. Noel F says:

    Hey guys, I've installed the new Windows 2008/R2 DNS Management Pack (v6.0.7000.0) and I'm still getting the same problem.  Could it be that Windows 2008 R2 still isn't fully supported in the new MP?  Anyway – for your records and so this is all linked, I've created a TechNet Forum post here with the issue: social.technet.microsoft.com/…/54f35d15-e004-45b1-8e9c-444fa30cdb59

    Hopefully this can be resolved.

  19. Rajasekhar says:

    Can we ignore the below scom alert. any idea

    Alert: DNS 2008/R2 Resolution Time Alert Priority: 1 Severity: 2 Resolution state: New

  20. Ram Kumar says:

    Hi Kevin,

    I need to monitor DNS 2000 zone, but i am using the latest MP for DNS management pack. Could you please let me know how to monitor this.

    Thanks in advance.

  21. Anonymous says:

    DNS MP – Noisy resolution time alerts, and how to deal with them – Kevin Holman’s System Center Blog – Site Home – TechNet Blogs

  22. Dwayne R says:

    I know it's a bit late, but in Adrian’s case look for errors in the event logs for the agent (event 1163). You may find that it's reporting the DNS server is stopped on ongoing 0 second responses. This may or may not be correct.

    I had the same issue: the DNS server was responding quickly all the time (Windows 2003), SCOM showed 0 second responses all the time, and the event logs showed the DNS service without issue, BUT the Operations Manager logs showed DNS to be stopped (i.e. event 1163 : DNS Server Resolution Check : The DNS service is not running. Monitoring script cannot continue.)

    The trick is to work out why it's not working, so once again, where possible, extract what the monitor is actually doing and duplicate it.

    In my case trying the nslookup via the NslookupAllTests.js script took forever, i.e. 20 minutes later it was still running, and we were seeing alerts for it doing so. So I had to look there to find the real problem.

  23. Nile says:

    Here is another take on this issue from elsewhere: basically DNS listens on IPv4 and IPv6 interfaces. If you don’t have both reverse and forward zones configured (which is usually how your IPv6 will look), this causes the delays and extra noise. There are two ways to resolve it: either turn off listening on IPv6 or create zones to accommodate the monitoring script.

  24. subrata says:

    Can we change the failure count to 2 or 3 for 2 checkings multiple times with 15mins interval ?
